### Quick description

If you want to find the maximum of a combinatorial quantity, and if the maximum seems to be very far from unique, then try to relate the quantity to the dimension of a certain vector space.

### Prerequisites

Basic linear algebra over finite fields. The notions of set, subset, element.

### Example 1 (with general discussion)

Let $n$ be an even integer. How many subsets of the set $\{1,2,\dots,n\}$ can you pick if they all have to have odd size but the intersection of any two of them has to have even size?

Let us try to attack this problem directly, and to begin with let us just try to find the biggest example we can in the greediest way we can. To save space, let us use notation such as 134 for the set with elements 1, 3 and 4 rather than proper notation such as $\{1,3,4\}$. The obvious set to begin with is 1. Since the next set has to have even-sized intersection with 1 and odd size, the obvious choice is 2. Continuing, the greedy algorithm gives us the sets 1, 2, ..., n, and we then find that we cannot choose any more sets (since any other set of odd size will have an odd intersection with any singleton formed from one of its elements).

Can we do better than this? Let's experiment by starting with a different set and then continuing greedily. If we start with 123, then the next set has to have odd size and even-sized intersection with 123, so the two most obvious choices are 124 and 4. If we go for 124, then the obvious continuation is 125, 126, ..., 12n. Can we choose any more sets? Yes we can: the set 134567...n, which a moment's thought will reveal to have odd size and an even intersection with all the other sets. And then we notice that we can also choose the set 234567...n, at which point it is not hard to prove that we cannot choose any more. (Proof: any set that contains 12 cannot contain any of 3,4,5,...,n, so must be 12, which has odd intersection with 134567...n. Any set that contains 13 must contain all of 4,5,6,...,n because if it doesn't contain m then it has odd intersection with 12m. And any set that contains 23 must contain all of 4,5,6,...,n for a similar reason.) Once again, we have ended up with $n$ sets.
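If you want to experiment further, a quick computer search is handy. The following Python sketch (an illustration, not part of the proof; the encoding of sets as frozensets is just one convenient choice) checks the two greedy families above for $n=6$:

```python
from itertools import combinations

def check(family):
    """Every set odd-sized, and every pairwise intersection even-sized?"""
    return (all(len(A) % 2 == 1 for A in family) and
            all(len(A & B) % 2 == 0 for A, B in combinations(family, 2)))

n = 6
# first greedy run: the singletons 1, 2, ..., n
singletons = [frozenset({i}) for i in range(1, n + 1)]
# second greedy run: 123, 124, ..., 12n, then 1345...n and 2345...n
other = ([frozenset({1, 2, m}) for m in range(3, n + 1)] +
         [frozenset({1}) | frozenset(range(3, n + 1)),
          frozenset({2}) | frozenset(range(3, n + 1))])

assert check(singletons) and len(singletons) == n
assert check(other) and len(other) == n   # both runs stop at exactly n sets
```

Both families pass the check and both have exactly $n$ members, in line with the discussion above.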

If you continue experimenting in this way, you will find the same thing every time: if you have chosen fewer than $n$ sets then you can choose more, but once you get to $n$ sets then you get stuck. But there seems to be a great deal of freedom about *how* you choose the sets. This contrasts with many problems in extremal combinatorics, where the best possible examples are often unique, or unique up to some obvious symmetry. (A good example of the latter: how many subsets of $\{1,2,\dots,n\}$ can you choose if none of your subsets is contained in any other? The unique best possible collection is the collection of all sets of size $n/2$.)

Are there any other circumstances where you have a collection of objects, and a condition on pairs of those objects, such that (i) whenever you have chosen objects $x_1,\dots,x_k$ with $k<n$ there are many different ways of choosing $x_{k+1}$ such that the condition holds for each pair $(x_i,x_{k+1})$, and (ii) it is impossible to choose more than $n$ of them if any two have to satisfy the condition? Yes there are: a common one is when the objects are vectors in $\mathbb{R}^n$ and the condition on a pair is that $x$ and $y$ should be orthogonal. Since orthogonality implies independence, we cannot choose more than $n$ non-zero vectors that are all orthogonal to each other.

Can we relate these two situations? To do so we would need to associate vectors with our subsets of $\{1,2,\dots,n\}$ and interpret the even-intersections condition as a kind of orthogonality. The obvious way to associate a vector with a subset $A$ of $\{1,2,\dots,n\}$ is to take its characteristic function, that is, the function that takes the value 1 for every $x$ that belongs to $A$ and 0 for every $x$ that does not belong to $A$. Now the size of the intersection of two sets is equal to the standard scalar product of their characteristic functions (a fact that has many uses). Therefore, if two sets have even intersection, then their scalar product is an even integer.
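This fact is easy to check by computer. Here is a small Python illustration (the encoding of characteristic functions as 0/1 lists is just one convenient choice):

```python
def char_vec(A, n):
    """Characteristic vector of a subset A of {1, ..., n} as a 0/1 list."""
    return [1 if i in A else 0 for i in range(1, n + 1)]

n = 8
A, B = {1, 2, 3, 5}, {2, 3, 5, 8}
# the standard scalar product counts positions where both vectors are 1,
# i.e. exactly the elements of the intersection
dot = sum(a * b for a, b in zip(char_vec(A, n), char_vec(B, n)))
assert dot == len(A & B) == 3
```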

But we wanted orthogonality. That is, we wanted a scalar product of zero. Can we think of an even integer as being "zero" somehow? Yes we can, if we look at it mod 2. So we are led to interpret the characteristic functions of the sets as elements of the set $\{0,1\}^n$, and to think of that set as an $n$-dimensional vector space over the field $\mathbb{F}_2$.

Of course, we must check that orthogonality still implies independence in this context. The usual proof is this: if $\sum_i\lambda_iv_i=0$, then the scalar product with any $v_j$ must be 0; but the orthogonality of the $v_i$ implies that this scalar product is $\lambda_j\langle v_j,v_j\rangle$; since $v_j$ is a non-zero vector, its scalar product with itself is non-zero, so $\lambda_j$ must be zero; since $j$ was arbitrary we are done. In the mod-2 context the only part of that proof that fails is the assertion that if $v$ is a non-zero vector, then $\langle v,v\rangle$ is non-zero: it is equal to the number of 1s in $v$, reduced mod 2. But if $v$ is the characteristic function of a set of odd size, then this will be 1, so the proof of linear independence works in this case.

A quick summary of the above proof: you can choose at most $n$ sets of odd size with even intersections because the characteristic functions of such sets must be linearly independent in the vector space $\mathbb{F}_2^n$.
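For the sceptical, the independence claim can be verified numerically. The sketch below (illustrative only; it encodes characteristic vectors as integer bitmasks and computes rank with a standard XOR-basis routine) confirms that the second greedy family found above has full rank over $\mathbb{F}_2$:

```python
def gf2_rank(vectors):
    """Rank over F_2 of 0/1 vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)   # eliminate leading bits using the basis
        if v:
            basis.append(v)
    return len(basis)

def mask(A, n):
    """Characteristic vector of A as a bitmask (bit i-1 for element i)."""
    return sum(1 << (i - 1) for i in A)

n = 6
family = ([frozenset({1, 2, m}) for m in range(3, n + 1)] +
          [frozenset({1, 3, 4, 5, 6}), frozenset({2, 3, 4, 5, 6})])
vecs = [mask(A, n) for A in family]
# independent over F_2, so no such family can have more than n sets
assert gf2_rank(vecs) == len(family) == n
```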

### Further general discussion

How can we recognise extremal problems where linear algebra is likely to be useful? One answer was touched on above. It is that such problems are often characterized by the presence of a wide variety of best possible examples. The reason this suggests linear algebra as a possible tool is that the problem, "How many linearly independent vectors can you find in $\mathbb{F}_2^n$?" has a very similar quality: you can't find more than $n$, but there are many ways of finding $n$. Of course, if your problem has that quality, there is no guarantee that linear algebra can be used, and even when it can, one often needs much more ingenuity to recast it in terms of vectors in a vector space than was needed above.

### Example 2

An obvious follow-up problem to the first is this. How many subsets of $\{1,2,\dots,n\}$ can you pick if their sizes have to be *even* and their intersections also even?

Note that the approach above now fails, since if $A$ has even size and if $v$ is its characteristic function, then $\langle v,v\rangle=0$. Furthermore, a moment's thought reveals that much bigger collections of sets are possible. For instance, take the collection of all possible unions of the sets 12, 34, 56, ..., (n-1)n. There are $2^{n/2}$ such unions, they all have even size, and the intersection of any two of them is even. That collection of sets turns out to be merely the simplest and most natural of a wide variety of examples of collections of size $2^{n/2}$ with the desired property. Combining that observation with the fact that we can express the even-intersections property as a condition about scalar products mod 2, it seems likely that linear algebra will help us.
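A short Python sketch (illustrative only) confirms these three claims for $n=6$: there are $2^{n/2}$ unions, all of even size, with all pairwise intersections even.

```python
from itertools import combinations

n = 6
blocks = [{2 * i + 1, 2 * i + 2} for i in range(n // 2)]   # 12, 34, 56
# all unions of subcollections of the blocks (including the empty union)
unions = [set().union(*c) for r in range(len(blocks) + 1)
          for c in combinations(blocks, r)]

assert len(unions) == 2 ** (n // 2)
assert all(len(A) % 2 == 0 for A in unions)
assert all(len(A & B) % 2 == 0 for A in unions for B in unions)
```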

There are two approaches we could try: one would be to find some $2^{n/2}$-dimensional space $X$ and find some way of converting any system of sets that satisfies the property we want into a collection of linearly independent vectors in $X$. The other is to stick with our previous way of turning sets into vectors and see where it gets us. Since it is not at all obvious what we would choose as $X$ in the first approach, let us try the second.

Thus, we are trying to choose as many vectors in $\mathbb{F}_2^n$ as we can, with the property that every $v$ that we choose has an even number of 1s, and $\langle v,w\rangle=0$ for any $v$ and $w$ that we choose. Since the parity of the number of 1s in $v$ is $\langle v,v\rangle$, we can say this quite neatly: we want a collection of vectors with the property that $\langle v,w\rangle=0$ for any two vectors $v$ and $w$ in the collection (not necessarily distinct).

Imagine that we are choosing a collection of such vectors in a greedy way: we just keep going until we cannot go any further. If we have chosen vectors $v_1,\dots,v_k$, then any vector $w$ we add to the collection will have to have the property that $\langle v_i,w\rangle=0$ for $i=1,\dots,k$. Now the condition $\langle v_i,w\rangle=0$ is a linear condition on $w$, so we are placing $k$ linear conditions on all subsequent vectors. It might seem as though they must therefore all live in an $(n-k)$-dimensional subspace of $\mathbb{F}_2^n$, but of course that would only be the case if the linear conditions were *independent*, and there is no reason for them to be. But this does give us a huge clue about how to proceed with the proof: if we are trying to keep going for as long as possible, then we should try to make the conditions we impose as dependent as possible, which comes to the same thing as saying that we should try to cram the vectors into as small a subspace of $\mathbb{F}_2^n$ as we can.
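The point about dependent conditions can be seen concretely. In the following Python sketch (an illustration with made-up condition vectors, encoded as bitmasks), the solution space of the conditions $\langle v_i,w\rangle=0$ has dimension $n$ minus the *rank* of the $v_i$, which exceeds $n-k$ exactly when the conditions are dependent:

```python
def gf2_rank(vectors):
    """Rank over F_2 of 0/1 vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

def solution_space(n, conds):
    """All w in F_2^n with <c, w> = 0 (mod 2) for every condition mask c."""
    return [w for w in range(2 ** n)
            if all(bin(w & c).count('1') % 2 == 0 for c in conds)]

n = 6
independent = [0b000011, 0b001100, 0b110000]
dependent = [0b000011, 0b001100, 0b001111]   # third = XOR of the first two

# dimension of the solution space is n minus the rank of the conditions
assert len(solution_space(n, independent)) == 2 ** (n - gf2_rank(independent))
assert len(solution_space(n, dependent)) == 2 ** (n - gf2_rank(dependent))
```

With three independent conditions the solutions form a subspace of dimension 3; with the dependent triple the rank drops to 2 and the solution space doubles to dimension 4.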

From here the proof more or less writes itself. If we have chosen more than $2^{n/2}$ vectors, then at least $n/2+1$ of them must be linearly independent (since a subspace of $\mathbb{F}_2^n$ of dimension $d$ has size $2^d$). From this it follows that all vectors in the collection are confined to a subspace of dimension at most $n-(n/2+1)=n/2-1$, which is an obvious contradiction. So the simple example we started with is indeed best possible.
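For very small $n$ this conclusion can also be checked by brute force. The sketch below (illustrative only; it enumerates only even-sized subsets, since no other set can appear in such a family) verifies that no family for $n=4$ beats $2^{n/2}=4$:

```python
from itertools import combinations

n = 4
# only even-sized subsets can appear in such a family
evens = [frozenset(c) for r in range(0, n + 1, 2)
         for c in combinations(range(1, n + 1), r)]

def ok(family):
    # every pairwise intersection (including each set with itself) even
    return all(len(A & B) % 2 == 0 for A in family for B in family)

best = max((fam for r in range(len(evens) + 1)
            for fam in combinations(evens, r) if ok(fam)), key=len)
assert len(best) == 2 ** (n // 2)   # the bound 2^{n/2} is exact for n = 4
```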

### Example 3

To give a glimpse of how much more sophisticated the linear-algebra method can be, here is a beautiful proof, due to Peter Frankl and János Pach, of the famous Sauer-Shelah lemma. This is a solution to the following problem. Let $\mathcal{A}$ be a collection of subsets of $\{1,2,\dots,n\}$. Given any subset $E\subset\{1,2,\dots,n\}$, define the *projection* $\mathcal{A}|_E$ of $\mathcal{A}$ onto $E$ to be the collection of all sets $A\cap E$ with $A\in\mathcal{A}$. How large can $\mathcal{A}$ be if there is no set $E$ of size $k$ for which $\mathcal{A}|_E$ is the set of all subsets of $E$? (Many authors say that $E$ is *shattered* by $\mathcal{A}$ if $\mathcal{A}|_E$ is the set of all subsets of $E$.)

A moment's thought leads to an obvious maximal example (by which I mean an example that cannot be extended rather than an example of maximal size, though it will eventually turn out to be of maximal size too). It is to take $\mathcal{A}$ to be the set of all subsets of $\{1,2,\dots,n\}$ of size less than $k$, which gives $\sum_{i<k}\binom{n}{i}$ sets.

Imagine now that we have experimented a bit and been unable to find any larger examples. Given the naturalness of that example, we begin to suspect that it may be of maximal size. If we have experimented sufficiently diligently, we will also see that if our hunch is correct, then there are many examples of maximal size. So once again we wonder about using linear algebra. The most obvious approach would be to associate with each set in $\mathcal{A}$ a vector in some space of dimension $\sum_{i<k}\binom{n}{i}$ in such a way that the condition we are requiring $\mathcal{A}$ to satisfy implies that all those vectors are linearly independent.

But how do we associate $\sum_{i<k}\binom{n}{i}$ coordinates with a subset of $\{1,2,\dots,n\}$? The least unnatural way would seem to be to relate these coordinates in some way to the sets of size at most $k-1$. With this thought in mind, the following idea is quite a natural one: let $\mathcal{E}$ be the set of all subsets of $\{1,2,\dots,n\}$ of size at most $k-1$, and associate with any subset $A$ of $\{1,2,\dots,n\}$ the function $f_A:\mathcal{E}\to\mathbb{R}$ that takes a set $E\in\mathcal{E}$ to 1 if $E\subset A$ and 0 otherwise.

You may have wondered why I described this as a function to $\mathbb{R}$ rather than to $\mathbb{F}_2$. The reason was that it turns out to be best to regard the functions $f_A$ as belonging to a vector space over $\mathbb{R}$ rather than over $\mathbb{F}_2$. (This is often true, so this example is a useful illustration.) Indeed, it turns out that if $\mathcal{A}$ has no full projections onto sets of size $k$, then the functions $f_A$ with $A\in\mathcal{A}$ are linearly independent.
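One can test this independence claim numerically for the extremal example. The following sketch (illustrative only; it computes the rank of the matrix of the functions $f_A$ by exact rational Gaussian elimination) takes $n=5$, $k=3$ and $\mathcal{A}$ equal to all sets of size less than $k$, a family that certainly shatters no set of size $k$:

```python
from fractions import Fraction
from itertools import combinations

def rank_q(rows):
    """Row rank over the rationals, computed by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

n, k = 5, 3
# coordinates: all subsets E of size at most k-1
E_list = [frozenset(c) for r in range(k) for c in combinations(range(1, n + 1), r)]
family = E_list   # the extremal family: all subsets of size less than k
# M[a][e] = f_A(E) = 1 if E is a subset of A, else 0
M = [[1 if e <= a else 0 for e in E_list] for a in family]
assert rank_q(M) == len(family)   # the functions f_A are independent over Q
```

(Ordering both index sets by size makes the matrix triangular with 1s on the diagonal, which is one way to see why the rank comes out full here.)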

Let us see how to prove this. To do this, we take a linear dependence $\sum_{A\in\mathcal{A}}\lambda_Af_A=0$ and try to prove that every $\lambda_A$ is zero. The information we have is that for no set $E$ of size $k$ does $\mathcal{A}|_E$ contain all subsets of $E$. An obvious way to try to use this information is to assume that not all the $\lambda_A$ are zero, to define some set $E$ of size at least $k$ in a natural way, and to prove that every subset of $E$ is equal to $A\cap E$ for some $A\in\mathcal{A}$.

Let us consider what the statement $\sum_{A\in\mathcal{A}}\lambda_Af_A=0$ actually says. If $E$ is a set of size at most $k-1$, then $\sum_{A\in\mathcal{A}}\lambda_Af_A(E)$ can be rewritten as $\sum_{A\supset E}\lambda_A$. So we know that this quantity is 0 for every set $E$ of size at most $k-1$. So we have here a function of $E$ that vanishes for every set of size less than $k$, and we are looking for a set of size at least $k$. This suggests that we should look for a set $E$ such that $\sum_{A\supset E}\lambda_A$ does *not* vanish, since such a set is guaranteed to have size at least $k$. We could then hope that every subset of this set was of the form $A\cap E$ for some $A\in\mathcal{A}$.

Must there exist a set $E$ such that the expression in question does not vanish? Clearly yes: we can just take $E$ to be a maximal $A$ such that $\lambda_A\ne 0$. However, it is too much to hope that *every* set with this property will work (and easy to find examples of sets that do not work). If we want to make it as likely as possible that every subset of $E$ is of the form $A\cap E$, then obviously we would like $E$ to be as small as possible. So let us choose a *minimal* $E$ such that $\sum_{A\supset E}\lambda_A\ne 0$.

Let $F\subset E$. We would like to prove that there exists $A\in\mathcal{A}$ such that $A\cap E=F$. And a natural way of doing that, given what we have so far, would be to prove that $\sum_{A\cap E=F}\lambda_A\ne 0$. And a natural way of doing that, if we want to exploit the minimality of $E$, is to try to express this sum as a combination of sums of the form $\sum_{A\supset G}\lambda_A$ with $G\subset E$.

This can indeed be done. One can check (and this is basically the inclusion-exclusion formula) that $\sum_{A\cap E=F}\lambda_A=\sum_{F\subset G\subset E}(-1)^{|G\setminus F|}\sum_{A\supset G}\lambda_A$. By the minimality of $E$, every term on the right-hand side with $G$ a proper subset of $E$ vanishes, so the sum is equal to $(-1)^{|E\setminus F|}\sum_{A\supset E}\lambda_A$, which is non-zero, and we are done.
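As a sanity check on the whole lemma, one can verify it exhaustively for tiny parameters. The sketch below (illustrative only) confirms that for $n=4$ and $k=2$ every family larger than $\sum_{i<k}\binom{n}{i}=5$ shatters some set of size 2:

```python
from itertools import combinations

n, k = 4, 2
subsets = [frozenset(c) for r in range(n + 1)
           for c in combinations(range(1, n + 1), r)]
bound = sum(1 for A in subsets if len(A) < k)   # 1 + 4 = 5 sets of size < k

k_sets = [frozenset(E) for E in combinations(range(1, n + 1), k)]
# precompute the full power set of each candidate k-set
power = {E: {frozenset(s) for r in range(k + 1) for s in combinations(E, r)}
         for E in k_sets}

def shatters_some_k_set(family):
    # E is shattered when the traces A ∩ E realise every subset of E
    return any({A & E for A in family} == power[E] for E in k_sets)

# every family one larger than the bound shatters some set of size k
assert all(shatters_some_k_set(fam)
           for fam in combinations(subsets, bound + 1))
```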

### Recommended further reading

This singleton list needs to be extended, but as a start, one could look at Part III of "Extremal Combinatorics: with Applications in Computer Science" by Stasys Jukna. It can be looked at online here (just click on "Preview this book").

## Comments

## Frankl-Wilson Theorem

Thu, 21/05/2009 - 07:15 — Gil Kalai (not verified)

Here is a blog discussion of a dimension argument for this theorem: http://gilkalai.wordpress.com/2009/05/21/extremal-combinatorics-vi-the-frankl-wilson-theorem/ . The linear algebra proof of the Erdos-de Bruijn theorem (a great-grandfather of the Frankl-Wilson theorem) is described, among other things, in this post: http://gilkalai.wordpress.com/2008/05/01/extremal-combinatorics-i/

## Inline comments

The following comments were made inline in the article. You can click on 'view commented text' to see precisely where they were made.

## Shatter function lemma

Fri, 30/10/2009 - 11:35 — Nicolai H (not verified)

I would suggest using the less eponymous term "Shatter function lemma" (that Matousek uses in his Lectures on Discrete Geometry) in addition to "Sauer-Shelah lemma". I personally have known this lemma for a long time, but always under the more descriptive name "Shatter function lemma".

## choice of the field

Fri, 15/05/2015 - 13:32 — Fedor Petrov

Why do we need $\mathbb{R}$ instead of $\mathbb{F}_2$ in the proof of the Sauer-Shelah lemma? It looks as though any field would work.