
What can a lower bound say about potential proofs of the upper bound?

Quick description

There are certain kinds of arguments that always lead to certain kinds of bounds. Therefore, if you are trying to prove a bound, you can often be sure that certain techniques will not be sufficient. This makes the search for techniques that do work more efficient. This article is closely related to the article about which techniques lead to which kinds of bounds.

Three examples and how they differ

Consider the following three problems. We write \mathbb{Z}_n for the group of integers mod n and assume (for convenience of the discussion) that n is prime.

Problem 1 How large can a subset of \mathbb{Z}_n be if it contains no three elements x, y and z with x+y=z?
Problem 2 How large can a subset of \mathbb{Z}_n be if it contains no four distinct elements x, y, z and w with x+y=z+w?
Problem 3 How large can a subset of \mathbb{Z}_n be if it contains no three distinct elements x, y and z with x+y=2z?

This is another illustration of a phenomenon discussed in the extremal combinatorics front page: that problems that are superficially very similar can have extremely different solutions. This makes it a good idea to have ways of guessing in advance what techniques are likely to be successful. One way is the topic of this article: deducing from a lower bound that certain techniques will be unable to give good results for the upper bound.

To see this, let us come up with some lower bounds. We are now trying to find sets that are as large as possible subject to certain constraints. A good meta-technique for tackling problems of this kind is to look at the Tricki articles on existence proofs and extremal combinatorics. If you follow this meta-technique, then one of the ideas you will be led to is the probabilistic method. Upon reading Example 2 in that article, you will be aware of the simple technique of choosing elements of \mathbb{Z}_n randomly and independently with probability p, with p chosen in such a way that the expected number of structures you are trying to avoid is less than half the expected size of the set you will obtain; deleting one element from each unwanted structure then leaves a set with the desired property that is, on average, at least half as large as the random set you started with.

Applying that technique to the three problems above, one can easily prove lower bounds of cn^{1/2}, cn^{1/3}, and cn^{1/2}, respectively. (For Problems 1 and 3 there are about n^2 solutions of the forbidden equation, each surviving with probability p^3, so one takes p to be a small multiple of n^{-1/2} and obtains a set of size about pn=cn^{1/2}; for Problem 2 there are about n^3 solutions of x+y=z+w, each surviving with probability roughly p^4, which forces p down to about n^{-2/3}.) Can we improve on these bounds?
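Before moving on, here is a minimal Python sketch of the deletion argument for Problem 1 (the function name and the constant 0.5 in the choice of p are our own illustrative choices, not part of any standard method): it samples each element with probability about n^{-1/2}, removes one element from every solution of x+y=z that survives, and prints the size of the resulting sum-free set, which should be a small multiple of \sqrt{n}.

```python
import random

def sum_free_by_deletion(n, seed=0):
    """Deletion method for Problem 1: return a sum-free subset of Z_n.

    Keep each element independently with probability p of order n^(-1/2),
    then delete one element from every surviving solution of x + y = z,
    so that what remains contains no such solution at all.
    """
    random.seed(seed)
    p = 0.5 * n ** -0.5
    A = {x for x in range(n) if random.random() < p}
    for x in list(A):
        for y in list(A):
            if x in A and y in A and (x + y) % n in A:
                A.discard((x + y) % n)
    # Sanity check: no x, y in A (not necessarily distinct) with x + y in A.
    assert all((x + y) % n not in A for x in A for y in A)
    return A

for n in [1009, 10007]:  # small primes
    A = sum_free_by_deletion(n)
    print(n, len(A), round(n ** 0.5, 1))  # |A| is a small multiple of sqrt(n)
```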

As it happens, we can in all three cases. For Problem 1 we can take all elements of \mathbb{Z}_n that lie between n/3 and 2n/3 to obtain a lower bound of about n/3; for Problem 2 we can use a very nice algebraic construction of Erdős (which we have come across because we know that sometimes algebraic techniques can do better than random ones, and it happens that this one is discussed in the Tricki page on algebraic methods) that improves the bound to cn^{1/2}; and for Problem 3 there is a beautiful construction of Behrend that gives a bound of n\exp(-c\sqrt{\log n}). (An account of this construction can be found either in the article about how bounds of the form \exp(c\sqrt{\log n}) can ever arise or in the page of off-the-shelf examples in additive combinatorics.)
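To give some feeling for how the Behrend bound arises, here is a short Python sketch of the construction (our own rendering, with illustrative rather than optimized parameters). It takes the lattice points of \{0,\dots,d-1\}^k lying on a most popular sphere and reads them off as base-2d integers; since the digits of x+y never carry in base 2d, a solution of x+y=2z would force the midpoint of two distinct points of a sphere to lie on the same sphere, which is impossible. The resulting set of integers below N=(2d)^k contains no three-term progression, and it embeds into \mathbb{Z}_n for any prime n>2N, since then x+y=2z mod n implies x+y=2z in the integers.

```python
from collections import defaultdict
from itertools import product

def behrend_set(d, k):
    """A Behrend-style subset of {0, ..., (2d)^k - 1} with no 3-term AP.

    Group the vectors of {0,...,d-1}^k by the sphere |v|^2 = r on which
    they lie, keep a most popular sphere, and read each vector off as an
    integer in base 2d.  Adding two such integers never carries, so
    x + y = 2z forces v_x + v_y = 2*v_z; the midpoint of two distinct
    points of a sphere lies strictly inside it, so v_x = v_y and x = y.
    """
    spheres = defaultdict(list)
    for v in product(range(d), repeat=k):
        spheres[sum(c * c for c in v)].append(v)
    best = max(spheres.values(), key=len)
    base = 2 * d
    return sorted(sum(c * base ** i for i, c in enumerate(v)) for v in best)

d, k = 5, 4
A = behrend_set(d, k)
S = set(A)
# Verify: no distinct x, y in A whose average also lies in A.
assert not any(x != y and (x + y) % 2 == 0 and (x + y) // 2 in S
               for x in A for y in A)
print(len(A), "AP-free integers below", (2 * d) ** k)
```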

The existence of a lower bound of this type for Problem 1 completely changes our attitude to the problem. Suddenly, it seems that the extremal examples are highly structured sets rather than ones that look random, so the appropriate techniques for the problem are likely to be direct and combinatorial. What is more, if we look at this particular example, we rapidly see that it only just works, in the sense that every number not in the set can be written as the sum of two numbers that are in the set. That observation leads to the following question: how small can A+A be if A is a subset of \mathbb{Z}_n of size m? A theorem of Cauchy and Davenport (one of the more interesting mathematical collaborations, since their lives did not come close to overlapping) gives the answer 2m-1, which is attained when A is a set of the form \{a,a+d,\dots,a+(m-1)d\}. (If n is even, we could take all even numbers and this would be false. It is for this kind of reason that we are taking n to be prime.) Therefore, since a sum-free set A of size m must be disjoint from A+A, we have 2m-1\leq|A+A|\leq n-m, and so a sum-free set cannot have more than about n/3 elements.
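For very small primes one can confirm this picture directly; the following throwaway script (ours, purely for illustration) finds a largest sum-free subset of \mathbb{Z}_n by exhaustive search and compares its size with n/3.

```python
from itertools import combinations

def is_sum_free(A, n):
    S = set(A)
    return all((x + y) % n not in S for x in A for y in A)

def max_sum_free(n):
    """A largest sum-free subset of Z_n, by exhaustive search (tiny n only)."""
    for size in range(n, 0, -1):
        for A in combinations(range(n), size):
            if is_sum_free(A, n):
                return A
    return ()

for n in [5, 7, 11, 13]:
    A = max_sum_free(n)
    print(n, len(A), A, "n/3 =", round(n / 3, 2))
```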

The precise details of the above proof are less important to this discussion than the fact that the existence of a linear lower bound (and in this case the particular nature of the set that achieves the bound) has a substantial effect on how we go about solving the problem: we look for a direct combinatorial argument and we hope to obtain an exact best possible result.

Now let us look at Problem 2. It seems hard to improve on the bound of cn^{1/2}, and after a moment's thought it becomes obvious why. If a subset A of \mathbb{Z}_n contains no non-trivial solutions to the equation x+y=z+w, then all pairs (x,y) of distinct elements of A must have distinct sums, except that x+y is allowed to equal y+x (two unordered pairs with the same sum and a common element are automatically equal, so equal sums would yield four distinct elements). This means that if A has size m, then \binom m2 must be at most n, which gives a bound of the form Cn^{1/2}.
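One can see the \binom m2\leq n bound in action for tiny n with another brute-force script (again ours, illustrative only): it finds a largest subset of \mathbb{Z}_n with no non-trivial solution of x+y=z+w and compares its size with \sqrt{2n}.

```python
from itertools import combinations

def is_sidon(A, n):
    """True if no four distinct elements of A satisfy x + y = z + w (mod n).

    Equivalently, all sums of distinct pairs are distinct: two unordered
    pairs with equal sums and a common element are automatically equal.
    """
    sums = [(x + y) % n for x, y in combinations(A, 2)]
    return len(sums) == len(set(sums))

def max_sidon(n):
    for size in range(n, 0, -1):
        for A in combinations(range(n), size):
            if is_sidon(A, n):
                return A
    return ()

for n in [7, 11, 13, 17]:
    A = max_sidon(n)
    print(n, len(A), A, "sqrt(2n) =", round((2 * n) ** 0.5, 2))
```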

This approach doesn't seem to work for Problem 3, so let us see if we can think of any other approaches. One way of trying to get some inspiration is to generalize Problem 2 by asking the following question: if A is a subset of \mathbb{Z}_n, then how few quadruples (x,y,z,w) can there be in A such that x+y=z+w? This question is a good example of one that can be answered by the analytic method. We define a function f:\mathbb{Z}_n\rightarrow\mathbb{R} by setting f(u) to be the number of pairs (x,y) such that x and y are elements of A and x+y=u. Then the number of quadruples in A with x+y=z+w is \sum_u f(u)^2. It is easy to see that \sum_u f(u)=|A|^2 (since every pair (x,y) has a sum), so we find ourselves wanting to answer the following analytic question: if a_1,\dots,a_n are real numbers such that \sum_{i=1}^n a_i=t, then how small can \sum_{i=1}^n a_i^2 be? An important special case of the Cauchy-Schwarz inequality tells us that the minimum occurs when the a_i are all equal, and then it is t^2/n. Thus, we know that there must be at least |A|^4/n quadruples (x,y,z,w) in A with x+y=z+w.

Provided that A is reasonably dense, this lower bound can be approximately matched by an upper bound. This is a simple application of the probabilistic method: if one chooses the elements of A randomly with a probability p that gives the right sort of size for A, then, since there are about n^3 solutions of x+y=z+w and each one survives with probability roughly p^4, the expected number of quadruples is about p^4n^3=(pn)^4/n, which is |A|^4/n up to a constant.
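The count \sum_u f(u)^2 is easy to compute directly, so the following sketch (ours; the density 0.1 and the prime 1009 are arbitrary illustrative choices) checks both halves of this discussion for a random set: the count is always at least |A|^4/n, and for a random set it exceeds that minimum only by a small factor.

```python
import random

def quadruple_count(A, n):
    """Number of quadruples (x, y, z, w) in A^4 with x + y = z + w (mod n)."""
    f = [0] * n  # f[u] = number of ordered pairs (x, y) in A^2 with x + y = u
    for x in A:
        for y in A:
            f[(x + y) % n] += 1
    return sum(v * v for v in f)  # sum over u of f(u)^2

n = 1009
random.seed(1)
A = [x for x in range(n) if random.random() < 0.1]
m = len(A)
count = quadruple_count(A, n)
assert count >= m ** 4 / n   # the Cauchy-Schwarz lower bound
print(count / (m ** 4 / n))  # a random set nearly attains the minimum
```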

Finally we come to the main point of these three examples. It turns out that Problem 3 is still wide open, but we do at least have a good understanding of why it is hard. And a large part of that understanding comes from the existence of the Behrend lower bound. It has been essentially the best known bound for a long time now, so there is some reason to think that it might be the correct bound. But there is no hope of proving such a bound using simple applications of inequalities such as Hölder's inequality or the Cauchy-Schwarz inequality, because these inequalities always result in power-type bounds. [Can anyone give a precise justification for this assertion?] The best known upper bounds for Problem 3 are obtained by much more complicated arguments, but the existence of the Behrend bound shows that these arguments (or rather, the complicated aspects of them) are in some sense necessary. (There is another important distinction between Problems 2 and 3: while both problems can easily be reformulated as problems about functions, a certain positivity allows the reformulation of Problem 2 to work for functions taking values in [-1,1], whereas the reformulation of Problem 3 works only for functions taking values in [0,1]. Details about this will be added later.)