Monday, June 10, 2024

Gambler's Fallacy vs. Law of Large Numbers

Tom Kando

Here is a baffling mathematical problem which I have pondered for years: the “Gambler’s Fallacy” and the “Law of Large Numbers” seem to contradict each other.

Consider two statements: 
(1) “Previous outcomes do not affect the probabilities of the next (similar) event.” Take coin tosses for example, each having a fifty-fifty probability of head or tail, right? 

(2) The larger the number of coin tosses, the more likely you are to approach a fifty-fifty distribution of heads and tails, right? 

Statement #2 seems to imply that if you have just tossed a coin twelve times, and ALL twelve tosses have come up heads (a probability of 1 in 4,096), then as you proceed to toss the coin for the thirteenth time, you should expect it to come out a tail, and bet accordingly, as some gamblers do. That expectation is the Gambler’s Fallacy.

But actually, the smart gambler might be better off betting on head because, given the outcome of the first twelve tosses, there is a chance that the coin was tampered with and is loaded towards “head.” 
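The smart gambler's reasoning can be made concrete with Bayes' rule. The sketch below is only an illustration under assumptions that are not part of the argument above: a hypothetical prior chance of 1 in 1,000 that the coin is loaded, and a hypothetical loaded coin that always lands heads. The function names are likewise made up for this example.

```python
def posterior_loaded(prior_loaded: float, p_heads_loaded: float, n_heads: int) -> float:
    """Probability that the coin is loaded after n_heads heads in a row (Bayes' rule)."""
    evidence_loaded = prior_loaded * p_heads_loaded ** n_heads
    evidence_fair = (1.0 - prior_loaded) * 0.5 ** n_heads
    return evidence_loaded / (evidence_loaded + evidence_fair)


def prob_next_heads(prior_loaded: float, p_heads_loaded: float, n_heads: int) -> float:
    """Chance that the next toss is heads, averaging over 'loaded' and 'fair'."""
    post = posterior_loaded(prior_loaded, p_heads_loaded, n_heads)
    return post * p_heads_loaded + (1.0 - post) * 0.5


if __name__ == "__main__":
    # Twelve heads in a row from a fair coin: 1 chance in 4,096.
    print(1 / 0.5 ** 12)
    # Assumed prior of 1 in 1,000 that the coin is loaded (hypothetical number).
    print(posterior_loaded(0.001, 1.0, 12))
    print(prob_next_heads(0.001, 1.0, 12))
    # If the coin is known to be fair (prior 0), the streak changes nothing.
    print(prob_next_heads(0.0, 1.0, 12))
```

Under these assumed numbers, twelve straight heads push the chance that the coin is loaded to about 0.8, so betting on head is the better bet. With a coin that is certainly fair, the thirteenth toss stays at exactly fifty-fifty.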

Statisticians try to explain the irreconcilability of the two statements above by quoting the “law of large numbers.” In probability theory, the law of large numbers (LLN) is a mathematical theorem which states that the average of the results obtained from a large number of independent and identical random samples converges to the true value, if it exists. More formally, the LLN states that given a sample of independent and identically distributed values, the sample mean converges to the true mean (Wikipedia).

In other words, the “true” value of coin tosses coming out heads, in the long run, is 50%, as is that for tails. The larger the number of coin tosses is, the closer the distribution of outcomes will come to fifty-fifty. 

But this doesn’t help me very much, in my desire to reconcile the LLN with the outcome of each individual coin toss.


I am talking here about two independent realities: 

1. There is the reality of me tossing a coin ONE single time: Every time I do this, the probabilities of the toss coming out head or tail are equal, namely 50%. A coin has no memory. Its behavior cannot be influenced by previous events. 

2. And then there is the law of large numbers: The more often I toss the coin, the more likely it is that the number of heads and the number of tails will each approach roughly half of the total number of tosses. 

#1 and #2, above, are independent of each other. They are ruled by different rules, and they apply to different events. One is a one-time event. The other is about a large number of events. Within that SERIES of events, each toss of the coin is independent, but the prediction is about the collective outcome of, say, a hundred or a million tosses, or even an infinite number of tosses, in which case the ratio of heads to tails converges to 1 (50/50).

Does the LLN govern the outcome of coin tosses? Yes, but only when many tosses occur. The larger that number is, the more the outcome conforms to the law. But for any individual coin toss, the law has no applicability. That’s why it is called the law of large numbers. My desire to reconcile the independence of single tosses with the law of large numbers cannot be satisfied because the law does not apply to single individual events. 
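One way to see that the two ideas do not collide is to notice that a streak is diluted by later tosses, not compensated. The sketch below assumes a fair coin and uses illustrative function names. The first function computes the expected fraction of heads when a streak of heads is followed by more fair tosses. The second simulates tosses with a fixed seed and measures how often a head follows a run of heads.

```python
import random


def expected_heads_fraction(streak: int, extra_tosses: int) -> float:
    """Expected share of heads after `streak` heads plus `extra_tosses` fair tosses.

    The later tosses are expected to split evenly, so the early surplus of
    heads is never paid back; it just becomes a smaller part of the total.
    """
    return (streak + 0.5 * extra_tosses) / (streak + extra_tosses)


def heads_rate_after_streak(n_tosses: int, streak_len: int, seed: int = 0) -> float:
    """Simulated frequency of heads on tosses that follow >= streak_len heads."""
    rng = random.Random(seed)
    run = 0          # current run of consecutive heads
    followed = 0     # tosses made right after a long enough run
    heads_after = 0  # how many of those tosses were heads
    for _ in range(n_tosses):
        is_heads = rng.random() < 0.5
        if run >= streak_len:
            followed += 1
            heads_after += is_heads
        run = run + 1 if is_heads else 0
    return heads_after / followed if followed else float("nan")


if __name__ == "__main__":
    for extra in (0, 12, 1_000, 1_000_000):
        print(extra, expected_heads_fraction(12, extra))
    print(heads_rate_after_streak(200_000, 3))
```

After twelve heads, twelve more fair tosses are expected to leave the share of heads at 0.75, and a million more bring it very close to 0.5, even though the expected surplus of twelve heads never disappears. The simulated frequency of heads right after a streak should come out near one half.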

I appeal to mathematicians and statisticians to correct, amplify or otherwise comment on my analysis.

8 comments:

Anonymous said...

Statistics do not apply to individuals. The desire to extrapolate the Law of Large Numbers to individual events only shows how tricky the human mind's need to invent connections can be. Small samples are not representative of larger populations, despite most humans' desire for them to be. Same problem we see in the "just world hypothesis" fallacy.

Tom Kando said...

Right. Even so, it’s good to know that surveys using relatively small samples are often close to being correct.

Gallup’s opinion surveys, for example, often use samples of about 1,000, and their margin of error is only around 4 percentage points.

Clearly, the larger your sample is, the smaller your random error is likely to be - assuming that everything else is done properly. There are many ways to maximize the probability that your sample is representative of the population which it represents. Ideally, all sample cases are selected randomly. When this is not possible, other methods exist (stratified, quota, etc.).
But of course, the only way to be sure that your random error is zero is to interview the entire population.
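The link between sample size and random error can be sketched with the standard textbook formula for a proportion. This assumes a simple random sample and a 95% confidence level (z = 1.96); the function name and defaults are illustrative, and real pollsters may compute their margins differently.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p from a simple random sample of size n.

    p = 0.5 is the worst case; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    for n in (100, 1_000, 4_000, 10_000):
        print(n, round(100 * margin_of_error(n), 1), "percentage points")
```

Under these assumptions a sample of 1,000 gives roughly 3.1 percentage points, and quadrupling the sample only halves that. Published margins can be somewhat larger than this idealized figure, for instance when the sample is weighted.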

It does seem that political projections and predictions are more often wrong now than in the past. Most pollsters’ (even Nate Silver’s 538) failure to predict Hillary Clinton’s loss to Trump in 2016 was a spectacular example.
But before that, major pollsters’ blunders had been rare, going back for example to the prediction of Dewey’s victory over Truman in 1948.

I don’t know why polls may have become less accurate in recent decades. Maybe new technology has made it more difficult to construct truly representative samples. In the past, many surveys were done by phone. That no longer works. The Nielsen ratings are obtained with audimeters attached to TV sets.

All in all, polls and surveys based on samples are an important tool to gather knowledge, even though results need to be taken with a grain of salt and there is always the need to test and retest.

Edric said...

The fallacy you seem to recognize in the “law of large numbers” is tied to the absence of any definition of what is meant by “large number.” You toss coins and, after a while, you have a disproportion of “heads.” From that point on, the chance of having a disproportion of “tails” is still as great as having another round with a disproportion of “heads.” So, starting the count after a disproportion of “heads” would seem to imply that, after another round with the same number of tosses, you are likely to have more heads than tails, but you could also have a similar run with now “tails” in the lead, evening out the score at this stage. Keep that up long enough and the absolute difference between “heads” and “tails” tends to keep growing. But saying that the difference will keep growing does not imply that the difference will be in favor of “heads.” Things may have changed over time, with “tails” catching up and overtaking “heads,” and then “heads” once again gaining ground and overtaking “tails.” But in proportion to the now very large number of tosses, that difference will be minimal. That’s what the law of large numbers recognizes: it refers to proportion, not to absolute numbers. The 50/50 ratio you refer to is not an absolute, and the difference between 49.99% and 50.01% may by now be in the millions. The law of large numbers is a statement about 49.99% compared to 50.01%, not about the millions that is the difference between 49.99% and 50.01%.
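The distinction between proportion and absolute numbers can be sketched numerically. For n fair tosses, the standard deviation of (heads minus tails) is the square root of n, while the standard deviation of the fraction of heads is 0.5 divided by the square root of n. The code below assumes a fair coin; the function names and the seed are illustrative.

```python
import math
import random


def typical_spread(n_tosses: int) -> tuple[float, float]:
    """Standard deviation of (heads - tails) and of the heads fraction for a fair coin."""
    return math.sqrt(n_tosses), 0.5 / math.sqrt(n_tosses)


def simulate(n_tosses: int, seed: int = 0) -> tuple[int, float]:
    """Toss a simulated fair coin; return |heads - tails| and the heads fraction."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    tails = n_tosses - heads
    return abs(heads - tails), heads / n_tosses


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, typical_spread(n), simulate(n))
```

Going from 100 to 1,000,000 tosses, the typical gap between heads and tails grows from about 10 to about 1,000, while the typical deviation of the proportion from 50% shrinks from 5 percentage points to 0.05.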

Anonymous said...

A question to ask is why we assume that the Law of Large Numbers applies to small samples. I read somewhere that it is due to a cognitive bias, our brain trying to cut corners to make a decision. We know that something with large teeth and an ominous growl has previously eaten one of our loved ones, so we assume that when we meet something we have never seen before, but that has large teeth and an angry growl, it is not our friend. We could be totally wrong, but it is a shortcut to spare us the time and energy that further investigation would require.

Gail said...

Hello Dr. Tom:

I am thinking… Thanks for sharing this piece. I’m on my 2nd cup of coffee and my second reading; hopefully I’ll have something of substance to comment later.

Great topic!

Gail😊

Anonymous said...

Polls are statistics, not actual mathematics. Stats is no more real maths than is accounting. Polling, as all stats, belongs to epistemology. The law of large numbers says nothing about independent events; extrapolating it to individual outcomes is simply the human mind making connections it wants, where there are none. Lazy philosophy, if you will.

Scott said...

I almost flunked statistics because I could not believe that for every flip there was still a 50/50 chance of the same result. Then, I got two rolls of pennies and proved to myself that was indeed the case.

Tom Kando said...

I thank Edric, anonymous, Gail and Scott for their comments. Good points by Edric regarding “proportion, not absolute numbers” and by anonymous about “cognitive bias.” Gail, I’m sure, has her “sociological brain” in gear. Another anonymous brings up the fact that polling and statistics are methods to acquire knowledge (epistemology). Scott did the empirical test.
