a repository of mathematical know-how

Averaging arguments

Quick description

Given n real numbers a_1,\dots,a_n with average a, there must be some i such that a_i\geq a, and there must also be some i such that a_i\leq a. This extremely simple principle is also very useful indeed. In particular, it sometimes leads to proofs in situations where it is very hard to give an explicit example of an i that works.


Basic definitions of graph theory.

Example 1

Let G be a graph with n vertices and m edges. Is it possible to partition the vertices into two sets X and Y in such a way that the number of edges from X to Y is at least m/2?

Here are two proofs that show that the answer is yes. For the first, we enumerate the vertices as x_1,x_2,\dots,x_n, and then assign each vertex in turn to either X or Y. We do the assignment as follows. If we have already assigned x_1,\dots,x_{k-1}, let X_{k-1} be the set of vertices we have assigned to X, and let Y_{k-1} be the set of vertices we have assigned to Y. Now assign x_k to X if it is joined to more vertices in Y_{k-1} than in X_{k-1}, and assign x_k to Y otherwise. This ensures that at least half the new edges (that is, edges from x_k to the already assigned vertices) are between X and Y. Therefore, when the process is complete, we have at least m/2 edges between X and Y.

That was not an averaging argument. Here is how to do it with an averaging argument. Let us randomly decide, for each vertex, whether to put it into X or Y. Given any edge vw, the probability that it goes between X and Y is the probability that we made different decisions for u and v, which is 1/2. Therefore, the expected number of edges between X and Y is m/2. But then, by the basic principle above, there must be a way of assigning vertices to X and Y such that the number of edges between the two is at least m/2.

The advantage of the first argument above is that it is algorithmic: it actually tells you how to partition the vertices. The advantage of the second is that it is so short and conceptual that it makes it obvious that the result is true. And in fact, though that is another story, the second argument can easily be "derandomized".

General discussion

The method we have just applied can be described as follows. You would like to build a structure with a certain property, and that property can be expressed as the requirement that a certain real parameter must take at least (or at most) a certain value. In order to achieve this, you define a random structure (according to a probability distribution of your choice) and prove that the expected value of the parameter is at least (or at most) the given value. It follows that there must exist a structure with the property you want.

Example 2

The following is a famous argument of Paul Erdős. Suppose you take the complete graph with N vertices and you colour its edges randomly in the simplest possible way: for each edge xy you toss a coin and you colour it red if the coin comes up heads and blue if it comes up tails. Given k vertices x_1,\dots,x_k in the graph, we say that they form a monochromatic clique if all the edges x_ix_j with 1\leq i<j\leq k have the same colour.

The probability that any given collection of k vertices forms a monochromatic clique is 2.2^{-\binom k2}, since the probability that they form a red clique is 2^{-\binom k2} (as there are \binom k2 edges that all independently require the coin to come up heads), and the probability that they form a blue clique is also 2^{-\binom k2}. Therefore, by linearity of expectation, the expected number of monochromatic cliques is 2.2^{-\binom k2}\binom nk.

So far we have not used the simple principle, but now we do. If the average number of monochromatic cliques is 2.2^{-\binom k2}\binom nk, then there must exist a red/blue colouring of the complete graph with n vertices that contains at most s 2.2^{-\binom k2}\binom nk monochromatic cliques of size k. In particular, if the parameters n and k are chosen in such a way that 2.2^{-\binom k2}\binom nk<1, then there must exist a red/blue colouring with no monochromatic cliques.

A small calculation shows that there is therefore a red/blue colouring of the complete graph on n vertices with no monochromatic clique of size 2\log_2n. Indeed, the requirement can be rearranged to \binom nk<2^{\binom k2-1}. If we use the inequality \binom nk\leq(en/k)^k and take logs to base 2, then we see that it is sufficient if 2(\log_2n+\log_2e-\log_2k)<k-1-2/k, which is certainly the case if k\geq 2\log_2n.

The remarkable thing about this argument is that, although it is extremely simple, nobody knows how to define a red/blue colouring with this property.

Note iconIncomplete This article is incomplete. More examples needed.

See also "If a parameter is generating undesirable boundary terms, try averaging over many choices of that parameter".