We've explained in another article that probabilities are all in our minds. They're an expression of our beliefs about something.
Variance is a measure of how uncertain these beliefs are. Large variance means we're not very sure. Small variance means we're pretty certain but there's some doubt.
All possible outcomes are included in the variance, but they are weighted according to their probability (see below).
For example, if we roll 1,000 dice, the variance of the mean value is very small. We're pretty sure that the average will be close to 3.5. We're almost certain it won't be below 3, but we can't predict with certainty whether it will be a little above or a little below 3.5.
Here's the definition of variance:
Definition
V = E[ (A-m)^2 ]
Here E is the expected value, or expectation, and m is the mean value, m = E(A). Expressed in words, the variance is the expectation of the squared distance of the individual values from the mean value.
Luckily, it can also be formulated like this.
V = E(A^2) - E(A)^2
In other words, it's the expectation of the square minus the square of the expectation. This is easier to calculate, as we'll see below.
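For readers who want to see why the two formulas agree, here is a short derivation sketch. It only assumes that the expectation is linear and that m = E(A) is a constant; the symbols are the same as in the definition above.

```latex
\begin{aligned}
V &= E\left[(A - m)^2\right] \\
  &= E\left[A^2 - 2mA + m^2\right] \\
  &= E(A^2) - 2m\,E(A) + m^2 \\
  &= E(A^2) - 2m^2 + m^2 \\
  &= E(A^2) - E(A)^2
\end{aligned}
```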
Example: Rolling a die
All outcomes are equally likely, so this is a uniform distribution over all possible values A.
A | A^2 |
1 | 1 |
2 | 4 |
3 | 9 |
4 | 16 |
5 | 25 |
6 | 36 |
E(A) = 1/6*1 + 1/6*2 + 1/6*3 + 1/6*4 + 1/6*5 + 1/6*6 = 1/6 * 21 = 3.5
E(A^2) = 1/6*1 + 1/6*4 + 1/6*9 + 1/6*16 + 1/6*25 + 1/6*36 = 1/6 * 91 = 15.17
V = E(A^2) - E(A)^2 = 15.17 - 3.5*3.5 = 15.17 - 12.25 = 2.92, or about 2.9
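The calculation for one die can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and exact fractions are used so that no rounding sneaks in. It computes the variance both ways, from the definition and from the shortcut formula.

```python
from fractions import Fraction

# One fair six-sided die: every face has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)

mean = sum(p * a for a in outcomes)        # E(A)
mean_sq = sum(p * a**2 for a in outcomes)  # E(A^2)

# Shortcut formula: E(A^2) - E(A)^2
var_shortcut = mean_sq - mean**2
# Definition: E[(A - m)^2]
var_definition = sum(p * (a - mean) ** 2 for a in outcomes)

print(mean, mean_sq, var_shortcut, var_definition)
print(float(var_shortcut))
```

Both formulas give exactly 35/12, which is about 2.92 and rounds to the 2.9 used in this article.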
Example: Rolling two dice
Now the outcomes are not all equally likely. For the sum A of the two dice, we have the following probability distribution:
A | A^2 | P(A) |
1 | 1 | 0 |
2 | 4 | 1/36 |
3 | 9 | 2/36 |
4 | 16 | 3/36 |
5 | 25 | 4/36 |
6 | 36 | 5/36 |
7 | 49 | 6/36 |
8 | 64 | 5/36 |
9 | 81 | 4/36 |
10 | 100 | 3/36 |
11 | 121 | 2/36 |
12 | 144 | 1/36 |
The expectation of this distribution is
E(A) = 252/36 = 7
Not surprising, since 7 is the most probable outcome and the distribution is symmetrical.
To find the variance we first need to calculate the expectation of the squared values:
E(A^2) = 1974/36 = 54.8
In the same manner as above, we arrive at the following variance:
V = E(A^2) - E(A)^2 = 54.8 - 7*7 = 5.8
In this example we see clearly how all possible outcomes are included in the variance, weighted according to their probability. The outcome 12 gets a weight of 1/36, while 7 gets six times that weight (6/36).
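The two-dice table and the resulting variance can be reproduced as follows. Again this is a sketch with illustrative names; it builds the distribution by listing all 36 equally likely pairs rather than typing in the table by hand.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) pairs give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dist = {total: Fraction(n, 36) for total, n in counts.items()}

mean = sum(p * a for a, p in dist.items())        # E(A)
mean_sq = sum(p * a**2 for a, p in dist.items())  # E(A^2)
variance = mean_sq - mean**2                      # E(A^2) - E(A)^2

print(mean, mean_sq, variance)
```

The exact variance comes out as 35/6, about 5.8, which is twice the variance of a single die.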
What does variance mean?
Now I'm sure you wonder, what does this tell me? Thing is, interpreting the variance isn't easy. It's not something you can use straight off the shelf.
Instead, you can put it to use in further analysis, for example to find a 90% confidence interval for an outcome, or to figure out what kind of bankroll you need to play in a certain game.
By comparing two players' variance in the same game, you can tell who has the larger variations, or swings.
Next we'll look at the standard deviation, a related concept that is a little more natural and easier to interpret.
Standard deviation
When talking about variation in a probability distribution, instead of variance people often refer to the standard deviation.
The standard deviation, s, is defined as the square root of the variance:
s = sqrt(V)
The advantage is that the standard deviation is easier to compare to the mean value, since they have the same unit (such as meters or kg).
In many cases you can use the standard deviation to compute some meaningful confidence intervals:
- Mean value +- 3s is a 99.7% confidence interval
- Mean value +- 2s is a 95% confidence interval
- Mean value +- s is a 68% confidence interval
For this to work, your outcomes need to be "normally distributed". This is very common, but it's not always the case. I'm just mentioning it as an example here.
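The three percentages can be checked against the normal distribution itself. The sketch below uses NormalDist from Python's standard statistics module (Python 3.8 or later); the helper name coverage is just an illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

def coverage(k):
    """Probability that a normal outcome lands within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(coverage(k), 4))
```

The results are roughly 0.68, 0.95 and 0.997, matching the list above. They apply only to the extent that the outcomes really are normally distributed.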
Examples
When rolling a die, the standard deviation is sqrt(2.9) = 1.7 (see above). This is close to half the mean value.
When rolling two dice and summing up the dots, the expected value is 7 and the standard deviation is 2.4. Relative to the mean value this is smaller than for a single die, which tells us that this distribution is relatively narrower.
When rolling ten dice and summing up the dots, the expected value is 35 and the standard deviation is 5.4. Relative to the mean value, the distribution becomes even narrower.
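These three examples follow one pattern, sketched below. The sketch relies on an assumption not spelled out in this article: the dice are independent, so the variance of the sum is the number of dice times the variance of one die. The function name sum_of_dice is made up for illustration.

```python
from math import sqrt

DIE_MEAN = 3.5          # E(A) for one fair die
DIE_VARIANCE = 35 / 12  # exact variance of one fair die, about 2.9

def sum_of_dice(n):
    """Return (mean, standard deviation) of the sum of n independent fair dice."""
    return n * DIE_MEAN, sqrt(n * DIE_VARIANCE)

for n in (1, 2, 10):
    mean, sd = sum_of_dice(n)
    # Last column: standard deviation relative to the mean.
    print(n, mean, round(sd, 1), round(sd / mean, 2))
```

The last column shrinks as more dice are added, which is the narrowing described above.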
Variation of a sample
When talking about probability and statistics, some things are hard to keep apart.
For example, two things that we shouldn't mix up are the variation in a probability distribution and the variation in a series of actual outcomes.
The first is called variance, as we saw above. The second is called "sample variance". It's similar to variance, but we can only know it "afterwards" when the outcomes are known.
Variance is a theoretical value that describes the variation in a probability distribution. It's a measure of our uncertainty.
Sample variance is a parameter of an actual sample. It's a measure of variation in the actual outcomes.
For example, when rolling one die, the variance of the probability distribution is 2.9. But if we roll one die a number of times, the sample variance of the series of outcomes won't necessarily be 2.9 (see below).
Definition: The definition of the sample variance resembles the one for variance, but it includes the actual outcomes. For a series of N trials, the sample variance V is:
V = [(A1 - m)^2 + (A2 - m)^2 + (A3 - m)^2 + (A4 - m)^2 + ...] / N
For each outcome in the series, you take its distance from the mean value of the series, m, square this value, add up all the results and divide by the number of trials.
Or, if you like, it's the average of the squared distances from the average.
Note: As usual, to get the mean value, we add up all outcomes and divide the sum by their number:
m = (A1 + A2 + A3 + ...) / N
In the long run we can expect the variance and the sample variance to get ever closer. But in short series the sample variance may be far away from the theoretical variance in the same way as the average can be far away from the expected value.
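A small simulation illustrates this. It is a sketch only: the seed value is arbitrary and the function name sample_variance is an assumption, and the exact numbers printed depend on the random rolls. The point is that short series can land far from the theoretical 2.9 while long ones end up close to it.

```python
import random

def sample_variance(outcomes):
    """Average squared distance from the sample mean (divides by N, as in this article)."""
    m = sum(outcomes) / len(outcomes)
    return sum((a - m) ** 2 for a in outcomes) / len(outcomes)

rng = random.Random(2024)  # fixed, arbitrary seed so the run is repeatable

for n in (10, 100, 10_000, 100_000):
    rolls = [rng.randint(1, 6) for _ in range(n)]
    print(n, round(sample_variance(rolls), 3))
```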
Example: Rolling a die
When writing this article, I rolled a die and had the following outcomes: 3, 5, 6, 5, 2, 1, 4, 6, 2, 1
A | A - m | (A - m)^2 |
3 | -0.5 | 0.25 |
5 | 1.5 | 2.25 |
6 | 2.5 | 6.25 |
5 | 1.5 | 2.25 |
2 | -1.5 | 2.25 |
1 | -2.5 | 6.25 |
4 | 0.5 | 0.25 |
6 | 2.5 | 6.25 |
2 | -1.5 | 2.25 |
1 | -2.5 | 6.25 |
We calculate the mean and sample variance:
m = 35 / 10 = 3.5
V = 34.5 / 10 = 3.45
The mean value is right on the expected 3.5, while the sample variance of 3.45 is higher than the expected 2.9 (see above).
If we roll a die many times, we expect the sample variance to approach the theoretical variance of 2.9. Read more about the long run here.
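The mean and sample variance of the ten rolls can also be computed with Python's standard statistics module. In this sketch, pvariance divides by N, like the formula in this article. The variance function divides by N - 1 instead; that is a common alternative convention which this article does not use, shown here only for comparison.

```python
from statistics import mean, pvariance, variance

rolls = [3, 5, 6, 5, 2, 1, 4, 6, 2, 1]  # the ten outcomes from the example

m = mean(rolls)                # 35 / 10
v = pvariance(rolls)           # divides by N, like the formula in this article
v_n_minus_1 = variance(rolls)  # divides by N - 1 instead

print(m, v, round(v_n_minus_1, 2))
```

With these rolls the mean is 3.5 and the sample variance is 3.45, as in the hand calculation; the N - 1 version comes out somewhat higher, at about 3.83.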
Analyzing poker sessions
When you look back at your poker sessions, you're looking at a series of actual outcomes with a certain mean value and a certain sample variance.
By calculating these values you'll get a good picture of how you've been doing. Let's work through an example.
/Charlie River
This article is part of the poker math series:
- Basic Probability Theory
- Where do Probabilities Live?
- Average, Expected Value, Variance and More
- What Is Odds?
- What is Outs in Poker?
- The Math behind Calling and Folding
- EV - How To Calculate It
- Variance - How to Calculate It
Comments on this Article
BB (Nov 02, 2010)
I can't find your definition for variance anywhere? Can you prove your formula in simpler terms please?
I don't think it's correct.