Tricki

## Double counting

### Quick description

If you have two ways of calculating the same quantity, then you may well end up with two different expressions. If these two expressions were not obviously equal in advance, then your calculations provide a proof that they are. Moreover, this proof is often elegant and conceptual. Double counting can also be used to prove inequalities, via a simple result about bipartite graphs.

### Example 1: sums of binomial coefficients

Prove that .

The left-hand side of this well-known identity counts all the subsets of size 0 from a set of size , then all the sets of size 1, then all the sets of size 2, and so on. By the time it has finished, it has counted all the subsets of a set of size . But an entirely different argument shows that the number of subsets of a set of size is : for each element there are two possibilities: belongs to the subset or does not belong to the subset. Therefore, in determining a subset of we have independent binary choices, so possibilities. (If this argument sounds a bit woolly, then you can prove it by induction. Let be the set . Then every subset of is obtained by taking a subset of and either adding to it or leaving it alone. Therefore, the number of subsets of is twice the number of subsets of . And the number of subsets of is 2.)

### Example 2: counting the edges of an icosahedron

Prove that the number of edges of an icosahedron is 30.

The idea of this example is to deduce the answer very simply from just the fact that an icosahedron has 20 triangular faces. The rough idea of the proof is that each face has three edges, while each edge is an edge of two faces, so the ratio of faces to edges ought to be 2 to 3. To be sure that this argument is correct, we count pairs , where is an edge and is a face that has as one of its edges. Since the icosahedron has 20 triangular faces, the number of such pairs is 60. If is the number of edges, then the number of such pairs is also , since each edge is part of two faces. Therefore, , as claimed.

### General discussion

Many combinatorial applications of the double-counting principle can be regarded as consequences of the following very simple result about bipartite graphs: if is a bipartite graph with vertex sets and , then the sum of the degrees of the vertices in is equal to the sum of the degrees of the vertices in . The proof is a very simple double count, since the two sums are different ways of counting the edges in (one by focusing on the vertex in and the other by focusing on the vertex in ).

For instance, for the icosahedron we could define a bipartite graph where one vertex set consisted of the edges of the icosahedron and the other of the faces, joining an edge to a face if the edge belonged to the face. The edges of this bipartite graph are precisely the edge-face pairs considered above (so this is not a different proof, but a different way of looking at the same proof).

The following equivalent reformulation of the bipartite-graphs principle can often be used very directly: if is the average degree of the vertices in and is the average degree of the vertices in , then .

Many applications of the principle use the following consequence. Let be a bipartite graph with vertex sets and . Suppose that every vertex in has degree and every vertex in has degree . Let be any subset of , and let be the set of all vertices in that are joined to at least one vertex in . Then . To prove this, let be the subgraph of with vertex sets and , with the edges inherited from . Then every vertex in has degree and every vertex in has degree at most .

It often happens that we are in the following situation. We have a set whose size we are trying to bound above, but we do not have an obvious bipartite graph to which to apply an inequality such as the one in the last paragraph. In that case, we must construct one. It is often easier to think about the task in an equivalent way: given a bipartite graph with vertex sets and , each vertex in determines a neighbourhood (defined to be the set of all vertices in that are joined to ), and therefore the bipartite graph itself determines a collection of subsets of , possibly with repeats, so strictly speaking it determines a multiset of subsets of . Conversely, given a collection of subsets of , we can use it to define a bipartite graph, by choosing for each set in the collection a vertex (or set of vertices), and putting these vertices together to form a set . This thought leads to another principle. Let be a collection of subsets of . If each element of is contained in at least sets in and each set in contains at most elements of , then . (To prove this directly, count pairs , where , , and .)

### Example 3: the sizes of upper shadows

Let be a collection of subsets of , with every set in of size . Let and define the -upper shadow to be the collection of all sets of size that contain at least one set in . Then the size of is at least .

To prove this assertion, let us define a bipartite graph with vertex sets and , joining to if and only if . Then each has degree , so in particular this is the average degree. And each has degree at most , so this is an upper bound for the average degree of vertices in . By the bipartite-graphs principle, it follows that the size of is at least . (Here we have applied the first reformulation of the principle, but we could have applied the second one even more directly.)

One can check by direct calculation that . Here is a double-counting argument instead, which shows that . This time we are not presented with an obvious bipartite graph, so let us look at the right-hand side and see what it might be counting. The first term counts subsets of size from a set of size , and the second counts subsets of size from a set of size . So we could regard the product as the number of ways of choosing a set of size from the set and then choosing a subset of of size .

Now another way of choosing such a pair is to choose first. We can do this in ways, and then there will be ways of choosing a set of size that contains , since we need to choose more elements from the elements we still have left.

### Example 4: the Erdős-Ko-Rado theorem

Let be a collection of subsets of such that every has size and such that no two sets in are disjoint. Suppose also that . Then .

Note that the collection of all sets of size that contain the element has size , so this result is best possible. Note also that , so what we are trying to show is that the proportion of sets of size that we can choose if no two are disjoint is at most . Thirdly, note that if then any two sets of size intersect, so that condition is necessary. For convenience, let us us the standard notation for the set and for the collection of subsets of of size . Also, we shall call a set system in which any two sets have non-empty intersection an intersecting system.

To solve this problem let us try to find a collection of subsets of with the following three properties. First, if is any element of (note that we have three levels of sets here, so an element of is a subset of and therefore a collection of subsets of ), then the proportion of sets in that belong to is at most ; second, every element of belongs to the same number of sets ; and third, every element of contains the same number of sets. If we can do that, then we can use the above results, but the nicest way to finish the argument is to reason probabilistically as follows. Let be the proportion of elements of that belong to . Then if we pick a random element of , it has a probability of belonging to . But another way of choosing a random element of is to choose a random element of and then a random set . This gives the uniform distribution on , since every element of has an equal probability of belonging to the randomly chosen set , and it then has a probability of being chosen from , and all the sets have the same size. But if no set in has a proportion greater than of elements that belong to , then the probability of choosing an element of by the second method is at most (since this is true whatever set you begin by choosing). It follows that , which is what we wanted to prove.

It remains to choose . Here we use another trick: exploiting symmetry. Suppose we can find just a single set with the property that no intersecting set-system can contain more than elements. Then the same will be true if we permute the set and take the corresponding transformation of . (To be precise, if is a permutation of we would take the collection of all sets such that .) If we define to be the set of all such transformations of , then the symmetry of the situation means that every element of is contained in the same number of sets in . Thus, the problem is solved if we can find just one set that cannot contain a proportion greater than of sets that all intersect.

We are closing in on our target. How can a proportion of arise? If is a multiple of , then it is quite easy to see how to do it. For instance, if then we could take five disjoint sets of size , and obviously at most one of these sets can belong to an intersecting system. But if is coprime to , as it may well be, then we will need the number of sets in to be a multiple of . Are there any natural examples of subsets of that have a number of elements that is a multiple of ?

It's not quite clear what "natural' means here, so let's jump straight to the answer and then see if we can justify it somehow. The answer is to think of the set as the group of integers mod and to take all sets of the form , where this is interpreted mod .

To see that this works, let us suppose we have an intersecting family of these sets. Without loss of generality one of the sets in the family is . Then all other sets in the family must be intervals that begin or end at one of the numbers from to . Let us write for the interval of length that begins at and for the interval of length that ends at . Then if belongs to the family, we know that does not. Thus, we can have at most one out of each pair . There are such pairs and every interval of length that intersects is contained in one of them, apart from the interval itself. Therefore, the family can have size at most , and the proof is complete.

Why was the idea of choosing a cyclic collection of intervals a natural one? One answer is that it is a fairly natural generalization of the partitioning that worked when was a multiple of . Another is that if we are trying to find a nice subset of then one potential source of examples is orbits of group actions. If we want our subset to have size , then an obvious group to try is the cyclic group of order , since we can easily define an action by cyclically permuting the ground set. And finally, if we are arranging the elements of in a cycle, then the most natural set of size to consider is an interval. So even if it isn't instantly obvious that this example is going to work, it is one of the first examples one is likely to try.

In case that long discussion leaves the impression that the proof is rather long, here is the proof minus the accompanying justification. Let be an intersecting family. Let be the family of intervals of length mod . Let be a random permutation and let be the set of all such that . Then the size of cannot be more than , by symmetry combined with the argument above that proves it when is the identity. On the other hand, the expected size of is . It follows that .

This theorem is called the Erdős-Ko-Rado theorem, and the proof by double counting is due to Gyula Katona.

### Example 5: an identity concerning Euler's totient function

Euler's totient function is the function that takes a positive integer to the number of positive integers less than and coprime to . For example, the integers less than and coprime to are and , so .

An important identity in elementary number theory is that , where "" means that we are summing over all factors of . For example, the factors of are and , and .

To prove this identity, we shall count the set in two different ways. The first way is the utterly obvious way: it has size by definition of "has size ". The definition being used here is that a set has size if and only if there is a one-to-one correspondence between and the set , which there obviously is if actually equals .

For the second way, we shall partition the set into subsets and add up the sizes of those subsets. To do this, we would like to define a subset for each factor of . Is there a natural way of doing this that will give rise to disjoint sets?

The answer to any question like this will be yes if we can find a function from to the set of all factors of . Then for each we take the set of all elements that map to . So is there a natural function of this kind?

Yes, and it's not hard to think of because there aren't any obvious alternatives: we define to be , the highest common factor of and . This certainly maps to a factor of , and it also depends on in a natural way. So let's see if it works.

Given a factor of , which integers map to ? That is, which integers have the property that ? Well, they obviously have to be multiples of . But they also mustn't be multiples of any larger factor of . Since , a multiple will have the property if and only if , that is, if and only if has no factors in common with . Therefore, the number of such that is equal to .

We have now shown that , which is not quite what we were trying to show. However, the map is a bijection from the set of factors of to itself, so it is in fact equivalent to what we wanted to show. (Another way of arguing would be to replace the map by the map . Then the inverse image of would have size .)

### Example 6: evaluating zeta(2)

Euler's famous theorem that can also be proved by calculating the same quantity in two different ways. In this case the problem is no longer combinatorial, so the phrase "double count" is less appropriate, but the general philosophy is the same. This example requires some Fourier analysis.

The quantity we shall calculate is the square of the -norm of the function defined on the interval by taking if and if . Here we shall define the square of the -norm to be . The calculation is extremely easy, since is identically 1, so we obtain the answer 1.

So far, this is not very interesting, but it becomes interesting if we exploit Parseval's identity, which tells us that the -norm of a function is equal to the -norm of its sequence of Fourier coefficients. Thus, a second way of calculating the square of the -norm of is to work out its Fourier coefficients and add up the squares of their absolute values. Therefore, must be equal to , where is defined to be .

Now . If is even, then , so we get 0. If is odd, then , and then we get . After a very similar calculation for the part of the integral between and , and after taking account of the factor we find that if is odd and if is even. It follows that the sum of the absolute values of the Fourier coefficients is . Since the square of an odd number is equal to the square of minus that odd number, we can write this as . Therefore, .

This is already a pretty interesting identity. To obtain Euler's theorem, we just note that the left-hand side is equal to , which is three quarters of . So , which equals .

### Another point of view

Giving a bipartite graph with vertex sets and is the same as giving a relation between and : we declare and to be related if there is an edge joining and . Equivalently, if we identify the relation with its graph (i.e. if we think of as being the subset of consisting of those ordered pairs whose elements are related by ), then is precisely the set of edges in the bipartite graph.

Now there are natural maps and , given by projecting onto the first and second factor respectively. Restricting these maps to , we obtain projections and

If then the degree of (in the bipartite graph view-point) is just the order of the preimage , and similarly the degree of is simply the order of . The sum over of the orders of all the preimages is just the order of , as is the sum of the orders of all the preimages , as ranges over the points of . Thus we get the basic equality

(Both sides are equal to .) This gives another point of view on the basic principle of double counting discussed above.

This point of view, using the graph of the relation and the maps and , can be applied to extend the idea of double counting to other contexts. For example, in algebraic geometry it gives rise to the technique of dimension counting via incidence varieties.

### Example 7: fixed points and orbits of group actions

Here is an example from the theory of group actions. Suppose that the finite group acts on the finite set . There is a rather natural relation to consider on , related to the study of fixed points for the action. Namely, we define and to be related if fixes . More symbolically, we define a relation via

Now as above we have the projections and Let us see what we can deduce from the double counting formula applied in this context.

We begin by interpreting , for . This is just the set of fixed points of acting on , which we denote by .

What about , for ? This is precisely the stabilizer of , i.e. the subgroup of consisting of points which fix , which we denote by .

Thus double counting gives the identity

We can reinterpret this formula by recalling that if and lie in the same orbit of , then and are conjugate. Indeed, if then a simple calculation shows that . Thus we can regroup the second sum into a sum over a set of orbit class representatives and so rewrite it as

(Here denotes the orbit of , and we have used the formula ) Altogether, we conclude that

where denotes the number of orbits of acting on . Dividing both sides by , we find that

or, in words:

The average number of fixed points of an element of acting on is equal to the number of orbits of acting on .

This useful relation between fixed points and orbits is known