Given the trial X with sample space {x_i}, the task now is to assign probabilities p_i to the outcomes x_i.
By definition, the probability that trial X produces an outcome in the sample space is certainty, conventionally represented by the value 1. If outcomes are symmetric (that is, fundamentally indistinguishable), then the associated probabilities should also be symmetric (that is, indistinguishable), which means p_i is 1/n, where n is the sample-space size.
?For example, rolling a fair die produces outcomes that are fundamentally indistinguishable; the number of pips on a die face has no influence on the outcome, although it does help distinguish one outcome from another. Similarly, the outcome of drawing a card from a well-shuffled deck is uninfluenced by the cards’ rank and suit. In contrast, the outcomes that result from measuring the height of the next person you meet are not indistinguishable; you’re more likely to meet someone between five and six feet tall than someone outside that range.
?It’s not clear why a particular value is being assigned to symmetric probabilities, rather than just requiring that p_i = p_j for i, j ∈ {1, …, n}. You might want to offer rolling a fair die as a real-life motivation for equal probabilities, but that raises two problems. First, the probabilities are being developed without reference to real-world motivations, making any motivation suspect. Second, and more importantly, it doesn’t seem that you can deduce any particular probability value from any real-world motivation with the single-event model (which would be begging the question in any event); you need to use the frequency model, which was rejected in Section 1.3. Indistinguishable outcome probabilities are a consequence of matching indistinguishable outcomes.
Symmetry is a judgment. Perceived symmetry among sample-space outcomes determines the outcomes’ probabilities; without it you’re stuck, at least for the moment. Symmetry is determined largely by the absence of contrary evidence; if a sample space is not clearly asymmetric, assume it’s symmetric.
The two axioms used to assign probabilities are 1) a trial X produces an outcome in its sample space with certainty, represented by probability 1, and 2) a trial X produces an outcome x_i from a symmetric sample space of size n with probability 1/n.
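As a minimal sketch of the two axioms, assume a fair six-sided die as the symmetric sample space. The function name `assign_symmetric_probabilities` is hypothetical, introduced here only for illustration; exact fractions are used so that both properties can be checked without rounding.

```python
from fractions import Fraction


def assign_symmetric_probabilities(sample_space):
    """Give each outcome of a symmetric sample space probability 1/n.

    Hypothetical helper: it assumes the caller has already judged the
    sample space to be symmetric; nothing here tests for symmetry.
    """
    outcomes = list(sample_space)
    n = len(outcomes)
    if n == 0:
        raise ValueError("sample space must contain at least one outcome")
    if len(set(outcomes)) != n:
        raise ValueError("outcomes must be distinct")
    return {outcome: Fraction(1, n) for outcome in outcomes}


# Assumed example: the six faces of a fair die.
die = assign_symmetric_probabilities(range(1, 7))

# Axiom 2: every outcome carries the same probability, 1/n.
assert all(p == Fraction(1, 6) for p in die.values())

# Axiom 1: some outcome in the sample space occurs with certainty.
total = sum(die.values())
assert total == 1
```

The check on the total already relies on adding probabilities, so it illustrates the axioms under that assumption and does not justify them.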
Because probabilities are additive, the probability of a combination of symmetric outcomes is the sum of the probabilities of the individual outcomes.
?Why are probabilities additive? Although probabilities are equated with numbers, where was it established that addition is defined on those numbers? This seems like it should be another axiom. You might want to argue that additive probabilities are most consistent with the real world, but that seems a judgment possible only with the rejected frequency model. You might want to argue that outcome probabilities should combine to 1, but so what? I can define each outcome probability as 1 and multiply them to get 1. You might want to argue that assigning every outcome probability 1 is nonsensical, but why is that so? What is it about the single-event model that brings out the absurdity?
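Taking additivity as given, as the text does, the rule for combinations of outcomes can be sketched as follows. The die example and the helper name `probability_of` are assumptions made for illustration only.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
sample_space = [1, 2, 3, 4, 5, 6]
p = {x: Fraction(1, len(sample_space)) for x in sample_space}


def probability_of(event, probabilities):
    """Sum the probabilities of the distinct outcomes making up the event."""
    return sum((probabilities[x] for x in set(event)), Fraction(0))


# "Roll an even number" combines three outcomes: 1/6 + 1/6 + 1/6.
even = probability_of({2, 4, 6}, p)

# Combining every outcome recovers certainty.
everything = probability_of(sample_space, p)
```

Here `even` evaluates to 1/2 and `everything` to 1. The sketch shows what additivity yields once it is assumed; it does not settle whether additivity should be a separate axiom.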
Using intuition to battle circularity, consider equally likely outcomes to occur “at random” (intuition will apparently be replaced with more careful reasoning in Section 1.9).