One can often draw interesting consequences from the information that a certain random variable is almost constant. In many contexts, a good definition of "is almost constant" is "has small variance". The reason this is a good definition is that of all measures of near constancy, the variance is arguably the most natural and is often the easiest to calculate. This is best illustrated with some examples.
Two equivalent definitions of variance of a random variable, basic definitions concerning sets, graphs etc.
This example may seem slightly artificial, but in fact it is an important argument in additive combinatorics.
Let denote the set of integers mod . Let be a subset of of size . Define an additive quadruple in to be a quadruple of elements of with the property that . Suppose that the number of additive quadruples of elements of is at most . We shall show, provided is small enough, that must contain an arithmetic progression of length 3. More precisely, we shall show that there must exist and such that neither nor is equal to 0 mod , and , and all belong to .
To do this, we define a function as follows: for each , is the number of ways of writing as with and belonging to . The following facts about are easy to see. First,
which is the number of additive quadruples of elements of which we are assuming to be at most .
It follows that the average value of is and the average value of is at most . Therefore, the variance of , which is is at most .
We now use Chebyshev's inequality, which tells us that if is any random variable, then Here is a quick proof of the inequality, which is also discussed in the article Bounding probabilities by expectations. We know that This is certainly at least times the probability that so the inequality follows. For us the random variable is , where is chosen randomly and uniformly from Applying Chebyshev's inequality, we find that the proportion of such that is at most Therefore, if the number of such that is at most
Now has elements, so there must be at least elements such that This implies that the number of triples such that is at least Each such triple corresponds to an arithmetic progression as long as and . The number of these degenerate triples is at most , so as long as is sufficiently large (it is enough if which is the case if ) there must be non-degenerate ones, as claimed.
Let us think briefly about how the above argument might have been discovered. The idea was to find some fairly general condition on a set that would guarantee that contained an arithmetic progression of length 3. Now will certainly contain an arithmetic progression of length 3 if we can find some such that can be written in many ways as , with and also belonging to . And this will certainly be the case if the function defined above is close to its average value for more than values of . Speaking more loosely, we would like to be roughly constant. So it is natural to look at the variance of , which turns out to be closely related to the number of additive quadruples in the set .