Quick description
One can often draw interesting consequences from the information that a certain random variable is almost constant. In many contexts, a good definition of "is almost constant" is "has small variance". The reason this is a good definition is that of all measures of near constancy, the variance is arguably the most natural and is often the easiest to calculate. This is best illustrated with some examples.
Prerequisites
Two equivalent definitions of variance of a random variable, basic definitions concerning sets, graphs etc.
Example 1
This example may seem slightly artificial, but in fact it is an important argument in additive combinatorics.
Let denote the set of integers mod
. Let
be a subset of
of size
. Define an additive quadruple in
to be a quadruple
of elements of
with the property that
. Suppose that the number of additive quadruples of elements of
is at most
. We shall show, provided
is small enough, that
must contain an arithmetic progression of length 3. More precisely, we shall show that there must exist
and
such that neither
nor
is equal to 0 mod
, and
,
and
all belong to
.
To do this, we define a function as follows: for each
,
is the number of ways of writing
as
with
and
belonging to
. The following facts about
are easy to see. First,

Secondly,

which is the number of additive quadruples of elements of which we are assuming to be at most
.
It follows that the average value of is
and the average value of
is at most
. Therefore, the variance of
, which is
is at most
.
We now use Chebyshev's inequality, which tells us that if is any random variable, then
Here is a quick proof of the inequality, which is also discussed in the article Bounding probabilities by expectations. We know that
This is certainly at least
times the probability that
so the inequality follows. For us the random variable is
, where
is chosen randomly and uniformly from
Applying Chebyshev's inequality, we find that the proportion of
such that
is at most
Therefore, if
the number of
such that
is at most
Now has
elements, so there must be at least
elements
such that
This implies that the number of triples
such that
is at least
Each such triple corresponds to an arithmetic progression
as long as
and
. The number of these degenerate triples is at most
, so as long as
is sufficiently large (it is enough if
which is the case if
) there must be non-degenerate ones, as claimed.
General discussion
Let us think briefly about how the above argument might have been discovered. The idea was to find some fairly general condition on a set that would guarantee that
contained an arithmetic progression of length 3. Now
will certainly contain an arithmetic progression of length 3 if we can find some
such that
can be written in many ways as
, with
and
also belonging to
. And this will certainly be the case if the function
defined above is close to its average value
for more than
values of
. Speaking more loosely, we would like
to be roughly constant. So it is natural to look at the variance of
, which turns out to be closely related to the number of additive quadruples in the set
.
![]() |
Comments
I may be dense
Wed, 15/06/2011 - 07:29 — kodlu.. but I can't understand why "the proportion of x such that | f(x) - delta N | \geq (delta N / 2)" has delta N instead of delta^2 N which is the average value of f(x) as stated previously. And I haven't been able to turn inline commenting on in google chrome, my apologies.