a repository of mathematical know-how

The second-moment method

Quick description

One can often draw interesting consequences from the information that a certain random variable is almost constant. In many contexts, a good definition of "is almost constant" is "has small variance". The reason this is a good definition is that of all measures of near constancy, the variance is arguably the most natural and is often the easiest to calculate. This is best illustrated with some examples.


Two equivalent definitions of variance of a random variable, basic definitions concerning sets, graphs etc.

Example 1

This example may seem slightly artificial, but in fact it is an important argument in additive combinatorics.

Let \mathbb{Z}_N denote the set of integers mod N. Let A be a subset of \mathbb{Z}_N of size \delta N. Define an additive quadruple in A to be a quadruple (a_1,a_2,a_3,a_4) of elements of A with the property that a_1+a_2=a_3+a_4. Suppose that the number of additive quadruples of elements of A is at most (\delta^4+c)N^3. We shall show, provided c is small enough, that A must contain an arithmetic progression of length 3. More precisely, we shall show that there must exist a and d such that neither d nor 2d is equal to 0 mod N, and a, a+d and a+2d all belong to A.

To do this, we define a function f as follows: for each x\in\Z_N, f(x) is the number of ways of writing x as a_1+a_2 with a_1 and a_2 belonging to A. The following facts about f are easy to see. First,




which is the number of additive quadruples of elements of A, which we are assuming to be at most (\delta^4+c)N^3.

It follows that the average value of f(x) is \delta^2 N and the average value of f(x)^2 is at most (\delta^4+c)N^2. Therefore, the variance of f, which is \mathbb{E}f(x)^2-(\mathbb{E}f(x))^2, is at most cN^2.

We now use Chebyshev's inequality, which tells us that if X is any random variable, then \mathbb{P}[|X-\mathbb{E}X|\geq t]\leq\mathrm{Var} X/t^2. Here is a quick proof of the inequality, which is also discussed in the article Bounding probabilities by expectations. We know that \mathrm{Var} X=\mathbb{E}(X-\mathbb{E}X)^2. This is certainly at least t^2 times the probability that |X-\mathbb{E}X|\geq t, so the inequality follows. For us the random variable is f(x), where x is chosen randomly and uniformly from \mathbb{Z}_N. Applying Chebyshev's inequality, we find that the proportion of x such that |f(x)-\delta N|\geq\delta N/2 is at most 4cN^2/\delta^2N^2=4c/\delta^2. Therefore, if c\leq\delta^3/8, the number of x such that f(x)\leq\delta N/2 is at most \delta N/2.

Now A has \delta N elements, so there must be at least \delta N/2 elements x\in A such that f(2x)\geq\delta N/2. This implies that the number of triples (a_1,a_2,x)\in A^3 such that a_1+a_2=2x is at least \delta^2N^2/4. Each such triple corresponds to an arithmetic progression (a_1,x,a_2), as long as x\ne a_1 and a_1\ne a_2. The number of these degenerate triples is at most 2\delta N, so as long as N is sufficiently large (it is enough if \delta^2N^2/4>2\delta N, which is the case if N>8/\delta) there must be non-degenerate ones, as claimed.

General discussion

Let us think briefly about how the above argument might have been discovered. The idea was to find some fairly general condition on a set A\subset\mathbb{Z}_N that would guarantee that A contained an arithmetic progression of length 3. Now A will certainly contain an arithmetic progression of length 3 if we can find some x\in A such that 2x can be written in many ways as a_1+a_2, with a_1 and a_2 also belonging to A. And this will certainly be the case if the function f defined above is close to its average value \delta^2N for more than (1-\delta)N values of x. Speaking more loosely, we would like f to be roughly constant. So it is natural to look at the variance of f, which turns out to be closely related to the number of additive quadruples in the set A.

Note iconIncomplete This article is incomplete. Several more examples wanted.


I may be dense

.. but I can't understand why "the proportion of x such that | f(x) - delta N | \geq (delta N / 2)" has delta N instead of delta^2 N which is the average value of f(x) as stated previously. And I haven't been able to turn inline commenting on in google chrome, my apologies.

Post new comment

(Note: commenting is not possible on this snapshot.)

Before posting from this form, please consider whether it would be more appropriate to make an inline comment using the Turn commenting on link near the bottom of the window. (Simply click the link, move the cursor over the article, and click on the piece of text on which you want to comment.)