Tuesday, May 31, 2011
What is the probability of cutting two cards of the same rank from a well-shuffled deck?
Monday, May 30, 2011
Section 2.4 - The Binomial Distribution—Bernoulli Trials
A Bernoulli trial produces an outcome that satisfies a criterion with probability \(p\), and so fails to satisfy the criterion with complement probability \(q = 1 - p\). What is the probability that \(n\) successive, independent Bernoulli trials result in exactly \(k\) satisfactory outcomes?
The compound outcome is an arrangement of \(k\) satisfactory outcomes and \(n - k\) unsatisfactory outcomes. Because the Bernoulli-trial outcomes are independent, the outcome probabilities are multiplied to get the combined-outcome probability, and because multiplication is commutative, any individual compound outcome has probability \(p^kq^{n-k}\). There are \(C(n, k)\) such compound outcomes, so the probability that \(n\) successive, independent Bernoulli trials result in exactly \(0 \leq k \leq n\) satisfactory outcomes, when satisfactory outcomes have probability \(p\), is \[ b(k; n, p) = C(n, k)p^kq^{n - k}\] The compound-outcome sample space is not uniform. Even if \(p = 1/2\) (that is, \(p = q\)), \(C(n, k)\) varies with \(k\); the variation becomes more complex when \(p \neq q\).
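To make the formula concrete, here is a minimal Haskell sketch (the names choose and b are mine, not the text's):
-- choose n k = C(n, k), computed as a falling factorial over k!.
choose :: Integer -> Integer -> Integer
choose n k = product [n - k + 1 .. n] `div` product [1 .. k]

-- b k n p = C(n, k) * p^k * q^(n - k), the probability of exactly k successes
-- in n independent Bernoulli trials with success probability p (0 <= k <= n).
b :: Integer -> Integer -> Double -> Double
b k n p = fromIntegral (choose n k) * p ^ k * q ^ (n - k)
  where q = 1 - p
For example, b 2 4 0.5 evaluates to 0.375, the probability of exactly two heads in four fair coin flips.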
Sunday, May 29, 2011
Section 2.3 - Combinations
Given \(n\) distinct items, there are \(P(n, k)\) permutations of \(0 \leq k \leq n\) items. Each permutation of \(k\) items can be re-arranged \(k!\) ways. If item order doesn’t matter, there are
\[ \frac{P(n, k)}{k!} = \frac{n!}{k!(n-k)!} = C(n, k) \]
orderless permutations, or combinations. Sanity check: \(C(n, n) = n!/(n!(n - n)!) = 1/0! = 1\).
Combinations are fun to play with. First, \(C(n, k) = C(n, n - k)\), and so \(C(n, 0) = C(n, n - 0) = 1\). Then
\[\begin{eqnarray}
C(n, k + 1) & = & \frac{n!}{(k + 1)!(n - k - 1)!} \\
& = & \frac{n!(n - k)}{(k + 1)k!(n - k)(n - k - 1)!} \\
& = & \frac{n - k}{k + 1}C(n, k)
\end{eqnarray}\]
\(C(n, k + 1)\) is largest when \((n - k)/(k + 1)\) is closest to 1, which occurs when \(k\) is closest to \((n - 1)/2\). Similarly
\[\begin{eqnarray}
C(n + 1, k) & = & \frac{(n + 1)!}{k!(n + 1 - k)!} \\
& = & \frac{(n + 1)n!}{k!(n + 1 - k)(n - k)!} \\
& = & \frac{n + 1}{n + 1 - k}C(n, k)
\end{eqnarray}\]
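As a check on the first recurrence, here is a small Haskell sketch (the naming is mine) that builds a whole row of \(C(n, k)\) values starting from \(C(n, 0) = 1\):
-- Build the row C(n,0), C(n,1), ..., C(n,n) using the recurrence
-- C(n, k+1) = (n - k) / (k + 1) * C(n, k); each division is exact.
chooseRow :: Integer -> [Integer]
chooseRow n = scanl step 1 [0 .. n - 1]
  where step c k = c * (n - k) `div` (k + 1)
chooseRow 5 evaluates to [1,5,10,10,5,1]; the largest entries sit at the \(k\) values closest to \((n - 1)/2\), as noted above.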
Saturday, May 28, 2011
Section 2.2 - Permutations
How to count the outcomes in a sample space.
Sampling \(n\) distinct items with replacement \(k\) times produces a sample space of size \(n^k\). Ordered sampling without replacement is a permutation, and produces a sample space of size \[ P(n, k) = n(n - 1)(n - 2)\cdots(n - k + 1) = (n)_k \] The first item can be picked \(n\) ways, the second \(n - 1\), and so on; \((n)_k\) is known as the falling factorial. As a sanity check, \(P(n, 1)\) is \(n\), and \(P(n, n + 1)\) is 0.
Multiplying the falling factorial \((n)_k\) by \((n - k)!\) turns it into the full factorial \(n!\). Equivalently, multiplying \(P(n, k)\) by 1 in the form \((n - k)!/(n - k)!\) gives an alternate formula for permutation counting: \[ P(n, k) = \frac{n!}{(n - k)!} \] For more sanity, \(P(n, n) = n!/(n - n)! = n!/0! = n!\) (because \(0! = 1\)) and \(P(n, 0) = n!/(n - 0)! = n!/n! = 1\).
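A small Haskell sketch of the two equivalent formulas (the function names are mine):
-- P(n, k) as the falling factorial n (n-1) ... (n-k+1).
permFalling :: Integer -> Integer -> Integer
permFalling n k = product [n - k + 1 .. n]

-- P(n, k) as n! / (n - k)!; only meaningful for 0 <= k <= n.
permFactorial :: Integer -> Integer -> Integer
permFactorial n k = factorial n `div` factorial (n - k)
  where factorial m = product [1 .. m]
permFalling 5 2 and permFactorial 5 2 both evaluate to 20, and permFalling n (n + 1) is 0 because the product picks up a zero factor, matching the sanity checks above.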
Friday, May 27, 2011
Section 2.1 - Introduction
We avoid Peirce’s observation (“[…] but it may be doubted if there is a single extensive treatise on probabilities in existence which does not contain solutions absolutely indefensible”) through careful modeling, consistent methods, developed intuition, result analysis, and reasonableness tests. We must also prepare for non-rational probabilities and non-uniform probability distributions.
Uniform probability assignments allow computations to be either the ratio of satisfactory outcomes over all outcomes, or the sum of individual satisfactory outcome probabilities. Non-uniform probability calculations can only be carried out by summing individual probabilities. Where you end up depends on the direction you start walking. “Random” without modifier is the code-word for uniform probability assignments.
We now discover tools for computing probabilities over finite discrete sample spaces, the related uniform solution methods (uniform across the various probability models, one of which has non-uniform probability assignments), and the intuition to guide their use, in part by appealing to the frequency interpretation of probability.
Thursday, May 26, 2011
Example 1.9-3 - The Four-Card Deck
A card deck contains four cards: AC, 2C, AD, and 2D. The deck is shuffled, and a pair of cards is dealt (and mistakes may be made).
Example 1.9-2 - The Two Children Problem
A family has two children, one of which is a boy. What is the probability the other child is a boy?
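Under the usual symmetric reading of the problem, a minimal Haskell enumeration sketch (my own, not necessarily the book's treatment) gives the answer:
-- Assumed model: each child is a boy (B) or girl (G), all four orderings
-- equally likely, and "one of which is a boy" restricts the sample space.
data Child = B | G deriving (Eq, Show)

atLeastOneBoy :: [(Child, Child)]
atLeastOneBoy = [ (c1, c2) | c1 <- [B, G], c2 <- [B, G], c1 == B || c2 == B ]

-- length [ f | f <- atLeastOneBoy, f == (B, B) ] is 1, and
-- length atLeastOneBoy is 3, so the probability is 1/3 under this reading.
The answer is sensitive to exactly how the condition “one of which is a boy” is modeled, which is presumably why the example appears in the section on randomness.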
Wednesday, May 25, 2011
Section 1.10 - Critique of the Model
The probability model is based on finite, symmetric sample spaces, which are easy to manipulate and provide simple outcome-probability calculations. However, using rational numbers for ratios can be limiting (for example, it is unlikely that the probability that a flipped thumb tack lands point up is expressible as a ratio), as can unbounded sample spaces (for example, the probability that a sequence of coin tosses eventually produces a heads involves outcomes of potentially unbounded length).
Separating the probabilistic model from its mathematical underpinnings promotes philosophical clarity, and emphasizes that the mathematics have wider use than just supporting the model. Also missing is the connection between the model based on sample-space symmetry and the intuitive frequency-based probability interpretation; Section 2.8 makes the connection.
Tuesday, May 24, 2011
Section 1.9 - Randomness
Like symmetry and independence, randomness is recognizable by the absence of contrary evidence. Concisely describing a pattern would be suitable contrary evidence. Randomness is best taken as a mathematical concept to avoid philosophical knots and physical antinomies. It’s like quantum mechanics: all outcomes are equally possible until the trial, when all possibilities collapse into the outcome (at which point randomness disappears). The concise pattern criterion seems to prohibit any kind of finitely defined random-number generator.
To some extent, randomness is relative: if you can’t figure out a pattern, then you should assume randomness, even though somebody else might be able to find a pattern. Relative randomness may be unpalatable, but it does allow a pragmatic justification for getting on with the task. Randomness can be interpreted as symmetry (identical probabilities, or uniformity), but symmetry can’t be interpreted as randomness. Sticking to symmetric sample spaces with independence helps sidestep semantic and philosophical problems with randomness. A random sample is actually an average over the sample space.
Monday, May 23, 2011
Section 1.8 - Conditional Probability
Let the trial T≥2 be flipping 10 coins such that at least one comes up heads. What is the probability that an outcome of T≥2 has at least two heads?
The requirement that at least one coin land heads-up is a condition, and the resulting probability is known as a conditional probability. (Don’t confuse the condition on any trial outcome (“contains at least one head”) with the predicate on the satisfactory outcomes (“contains at least two heads”); the condition is satisfied by every outcome in the sample space, while the predicate may or may not be satisfied by an outcome.) There are at least three ways to find conditional probabilities: direct counting, indirect counting, and complement probability. (In fact, these three ways can solve any probability problem, not just conditional-probability problems.)
Before getting to the solutions, it’s important to recall a fact, and understand why it’s important. What is T≥2’s sample space? The sample space is the set of all possible trial outcomes. T≥2’s sample space is not { ti | 0 ≤ i < 1024 } (the subscript i is interpreted as a 10-bit binary number, with 1 bits interpreted as heads) because t0 is not a possible outcome for T≥2: t0 contains no heads, which contradicts the condition that T≥2 produce outcomes with at least one head. Instead, T≥2’s sample space is { ti | 0 < i < 1024 }, because now every ti has at least one heads. The sample space is symmetric, and of size 1023, which means ti for 0 < i < 1024 has probability 1/1023. These are the two most common errors when dealing with conditional probabilities (that is, these are the two mistakes I make the most): forgetting to adjust the sample space to accommodate the conditions, and misadjusting the sample space so it contains too many or too few outcomes.
The simplest and most straightforward solution (in some sense) is to directly count all the outcomes satisfying the predicate P(ti) = ti has at least two heads and add their probabilities together. There are 1013 such outcomes (this result courtesy of the Haskell expression
length [ i | i <- [1..1023], ones i > 1 ]
where ones n returns the number of one bits in the binary representation of n), and the resulting probabilities sum to 1013/1023 ≈ 0.99.
Direct counting doesn’t scale well with sample-space size or outcome-predicate complexity. However, because the predicate partitions the sample space into two subsets, it may be possible to indirectly count the satisfactory outcomes by counting the unsatisfactory outcomes and using the fact that
unsatisfactory size + satisfactory size = sample-space size
to get
satisfactory size = sample-space size - unsatisfactory size
In this case, the unsatisfactory size is 10 (there are ten ways to toss ten coins so that exactly one heads comes up), and so satisfactory size = 1023 - 10 = 1013, and the probability is again 1013/1023 ≈ 0.99.
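For reference, a runnable version of those counts (the definition of ones here is my own guess at the helper the post mentions, using Data.Bits; the original definition isn't shown):
import Data.Bits (popCount)

-- ones i: the number of one bits (heads) in the binary representation of i.
ones :: Int -> Int
ones = popCount

-- Direct count: outcomes with at least two heads (evaluates to 1013).
satisfactory :: Int
satisfactory = length [ i | i <- [1..1023], ones i > 1 ]

-- Indirect count: outcomes with exactly one heads (evaluates to 10).
unsatisfactory :: Int
unsatisfactory = length [ i | i <- [1..1023], ones i == 1 ]
Both counts agree with the arithmetic above: 1013 + 10 = 1023.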
The third solution is a variant of the indirect counting because it too relies on outcome predicate partitioning:
Pr(unsatisfactory outcome) + Pr(satisfactory outcome) = Pr(sample-space outcome)
Pr(sample-space outcome) is certainty, that is, 1, and so
Pr(satisfactory outcome) = 1 - Pr(unsatisfactory outcome)
Pr(unsatisfactory outcome) is known as the complement probability. In this case, Pr(unsatisfactory outcome) = 10/1023 ≈ 0.01, to get Pr(satisfactory outcome) = 1 - 0.01 = 0.99.
Indirect counting and complement probability are useful to the extent that counting the unsatisfactory outcomes is easier than counting the satisfactory outcomes. If counting the two kinds of outcomes is of similar difficulty, it’s likely that none of the three solution techniques is going to have a clear advantage over the others.
Sunday, May 22, 2011
Section 1.7 - Subsets of Sample Spaces
Given a predicate over a sample space (a predicate partitions the sample space into two subsets, one containing the outcomes that satisfy the predicate and the other containing the outcomes that don’t), what is the probability a trial will produce an outcome satisfying the predicate? That outcome probabilities add is an important fact to exploit when answering this question. Because outcome probabilities in a symmetric sample space are identical, the probability of a satisfactory outcome occurring is the sum of the probabilities of the satisfactory outcomes, or the ratio of the number of satisfactory outcomes over the sample-space size. Handling complex predicates over large sample spaces is difficult and requires more math; just go for the main ideas now.
Probabilities are scaled to be in the range [0 (impossibility) .. 1 (certainty)], but other ranges are possible, such as [1 (certainty) .. ∞ (impossibility)) or [0 (certainty) .. ∞ (impossibility)). (The [1 .. ∞) range is the inverse of the [0 .. 1] range.) In either alternative range, combining outcome probabilities is a problem because simple addition doesn’t behave properly (combining to certainty, for example). Venn diagrams do not scale with the number of sample-space coordinates, and will not be considered further.
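As a small worked example (my own, not from the text): draw one card from a well-shuffled 52-card deck and ask whether it is a spade. The 13 satisfactory outcomes each have probability 1/52, so the probability is the sum 13 × (1/52), or equivalently the ratio 13/52 = 1/4.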
Saturday, May 21, 2011
Section 1.6 - Independence
Given two sample spaces X and Y, the Cartesian product X × Y is the sample space { (xi, yj) | xi ∈ X ∧ yj ∈ Y }. Sample space products may be extended to any number of constituent sample spaces.
Outcomes are independent if prior outcomes have no influence over future outcomes. Independence is a tricky concept, and is unlikely to be realistic.
The Cartesian product of symmetric, independent sample spaces (that is, a sample space of independent outcomes) is a symmetric sample space. The probability of an outcome from a Cartesian-product sample space is the product of the probabilities of the constituent outcomes, or you can assume symmetry and assign the resulting probability.
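As a quick illustration (my own, not from the text): rolling two fair dice is the Cartesian product of two six-outcome sample spaces, giving 36 pairs. Independence assigns each pair the product probability (1/6)(1/6) = 1/36, which is the same value that symmetry over the 36-outcome product space assigns directly.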
Understand what things are, but then understand why they are that way.
Friday, May 20, 2011
Section 1.5 - Symmetry as the Measure of Probability
Given the trial X with sample space {xi}, the task now is to assign probabilities pi to the outcomes xi.
By definition, the probability that trial X produces an outcome in the sample space is certainty, conventionally represented by the value 1. If outcomes are symmetric, that is, fundamentally indistinguishable, then the associated probabilities should also be symmetric, that is, indistinguishable, which means pi is 1/n, where n is the sample-space size. (For example, rolling a fair die produces outcomes that are fundamentally indistinguishable; the number of pips on the die faces has no influence on the outcome, although the pips do help distinguish one outcome from another. Similarly, the outcome of drawing a card from a well-shuffled deck is uninfluenced by the cards’ rank and suit. In contrast, the outcomes that result from measuring the height of the next person you meet are not indistinguishable; you’re more likely to meet someone between five and six feet tall than someone outside that range.)
An aside: it’s not clear why a particular value is being assigned to symmetric probabilities, rather than just requiring that pi = pj for i, j ∈ {1, …, n}. You might want to offer rolling a fair die as a real-life motivation for equal probabilities, but that raises two problems. First, the probabilities are being developed without reference to real-world motivations, making any motivation suspect. Second, and more importantly, it doesn’t seem that you can deduce any particular probability value from any real-world motivation within the single-event model (which would be begging the question in any event); you need the frequency model, which was rejected in Section 1.3. Indistinguishable outcome probabilities are a consequence of matching indistinguishable outcomes.
Symmetry is a judgment. Perceived symmetry among sample-space outcomes determines the outcome’s probabilities, otherwise you’re stuck, at least for the moment. Symmetry is determined largely by the absence of contrary evidence; if a sample space is not clearly asymmetric, assume it’s symmetric.
The two axioms used to assign probabilities are 1) a trial X produces an outcome in the sample space X with certainty, represented by probability 1, and 2) a trial X produces an outcome xi from a symmetric sample space X of size n with probability 1/n.
Because probabilities are additive, the probability of a combination of symmetric outcomes is the sum of the probabilities of the individual outcomes. (But why are probabilities additive? Although probabilities are equated with numbers, where was it established that addition is defined on those numbers? This seems like it should be another axiom. You might want to argue that additive probabilities are most consistent with the real world, but that seems a judgment possible only with the rejected frequency model. You might want to argue that outcome probabilities should combine to 1, but so what? I can define each outcome probability as 1 and multiply them to get 1. You might want to argue that assigning every outcome probability 1 is nonsensical, but why is that so? What is it about the single-event model that brings out the absurdity?)
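A small concrete check of the two axioms plus additivity (my own example): a fair die has a symmetric sample space of size 6, so axiom 2 assigns each face probability 1/6; by additivity, the probability of rolling a 1 or a 2 is 1/6 + 1/6 = 1/3, and the sum over all six faces is 1, matching axiom 1.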
Using intuition to battle circularity, consider equally likely outcomes to occur “at random” (intuition will apparently be replaced with more careful reasoning in Section 1.9).
Thursday, May 19, 2011
Section 1.4 - The Single-Event Model
A trial X produces an outcome drawn from a finite set of possible outcomes { xi }, 1 ≤ i ≤ n. (For example, if X represents the roll of a single die, the set of possible outcomes is { xi } for 1 ≤ i ≤ 6, where xi represents the outcome “the upper-most face of the die shows exactly i spots”.) The set of possible outcomes of X, { xi }, is known as the sample space, sometimes also called X. A single trial outcome can be generalized to a sequence containing finitely, countably, or non-countably many outcomes representing a similarly-sized sequence of trials.
Given a trial X, the first task is to find the sample space. Do not confuse the map and the territory: abstraction is at play when defining the sample space for X. (For instance, to continue the die example, a tossed die may get caught in a hole and land point up. However, such an occurrence is so unlikely that most die-roll models can ignore it.) In addition, xi is not a number; the operation xi + xj doesn’t make any sense if + is the usual integer addition.
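One way to see the “outcomes are not numbers” point in code (a sketch of my own, not from the text): model the die’s sample space as a plain enumeration, on which no arithmetic is defined.
-- The die sample space as labels rather than numbers; there is no (+) on DieOutcome.
data DieOutcome = Pips1 | Pips2 | Pips3 | Pips4 | Pips5 | Pips6
  deriving (Eq, Show, Enum, Bounded)

sampleSpace :: [DieOutcome]
sampleSpace = [minBound .. maxBound]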
Wednesday, May 18, 2011
Section 1.3 - The Frequency Approach Rejected
Common probability models include single-event probabilities and probability as the ratio of the number of occurrences of a particular outcome over the large, and possibly infinite, number of trials (the frequency model). The odds a to b is a third probability model, equivalent to the probabilities a/(a + b) and b/(a + b).
Science has a strong bias towards the measurable. Mathematics has beautiful and interesting theorems with hypotheses that are incompatible with potential applications. There is a tendency to push for practical results using constructivist or computational methods. New applications of probability should be carefully considered to ensure appropriate use.
The frequency model runs into mathematical difficulty in both definition and execution; in particular, it is difficult to deal with large or infinite possibilities, and difficult to characterize the resulting probability values. Thus the frequency model is rejected as a hypothesis, and will be derived as a thesis from other models.
Tuesday, May 17, 2011
Section 1.2 - Models in General
We think about the world using models we construct in our heads. Models are neither true nor false, just more or less consistent with our observations of the world. In addition to consistency, models may be more or less complex (Occam’s Razor), have structural criteria (dichotomy, or Levi-Strauss’ groups of three), and be subject to other aesthetic criteria (beauty or crispness).
Models abstract the world, leaving out details considered irrelevant. Probabilistic thinking will be presented as a sequence of models, each model less abstract (more detailed) and more consistent than its predecessor. Successor models remove inconsistencies (paradoxes) from predecessor models. Pay attention to the sequence, and how the sequence grows, so you can develop your own models as needed.
Monday, May 16, 2011
Section 1.1 - Introduction
Few circumstances are completely uncertain; there's usually some underlying certainty involved. Probabilistic thinking recognizes and exploits the underlying certainty. Normal life uses probabilistic thinking for prediction; scientific life also uses it in other ways, such as for modeling, as in errors in measurements, or fundamental explanations, as in quantum mechanics a la Copenhagen.
Several probabilistic thinking styles exist, and several are needed. For example, classical probabilistic thinking excludes independence, an important idea for quantum probabilistic thinking. In addition, other areas - such as mathematics and computation - have thinking styles related to, but different from, probabilistic thinking.
Probabilistic thinking can be subjective (belief based) or scientific (objective). Scientific objectivity recognizes and accommodates the judgments underlying probabilistic thinking; subjective thinking leaves these judgments unrecognized. Studying probability as an end is largely unhelpful; probabilistic thinking must be grounded in the uncertainty of the world to be helpful. Probabilistic thinking is difficult, but less formal styles are easier to learn and use.
But before we can quantify uncertainty, we need some tools.
Sunday, May 15, 2011
Table of Contents
1 - Probability
2 - Some Mathematical Tools
3 - Methods for Solving Problems
4 - Countably Infinite Sample Spaces
5 - Continuous Sample Spaces
6 - Uniform Probability Assignments
7 - Maximum Entropy
8 - Models of Probability
9 - Some Limit Theorems
10 - An Essay on Simulation
Saturday, May 14, 2011
Preface
What is probability?
Fifth, I also found that there is a large assortment of personal probabilists, the most prominent being the Bayesians, at least in volume of noise. Just what the various kinds of Bayesians are proclaiming is not always clear to me, and at least one said that while nothing new and testable is produced still it is “the proper way to think about probability”. (p. vi)
Friday, May 13, 2011
The Object of Our Veneration
This blog is a diary of a trip through The Art of Probability by Richard Hamming, published in 1991 by Addison-Wesley.