a repository of mathematical know-how

Using the fact that a continuous function on a compact set attains its bounds

Quick description

There are many circumstances in which it is very useful to be able to say that a continuous real-valued function attains its bounds. This article gives some examples, some simple and some less so.

Contributions wanted: this article could use additional contributions. More examples would be nice.

Example 1

Suppose one is dealing with a complicated function, for example the function f(x) = \sin^4 x + \cos^2(e^{-x^2 \sin^7 x}) + x^{10} on the interval [-1,1]. Due to the rather nasty nature of f, the prospect of attempting to find the minimum of f over this interval using calculus is not an inviting one. However f is certainly continuous and [-1,1] is compact, and so f attains its minimum. Manifestly f is non-negative, but it is also never zero: indeed, for it to vanish we would require that \sin x = 0, which on [-1,1] means that x = 0; but then the second term is \cos^2(1), and this is not zero. This means that there must be some absolute constant c > 0 such that f(x) \geq c for all x \in [-1,1].
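As a sanity check (not a substitute for the argument above), one can estimate the minimum numerically. Here is a small Python sketch; the grid size is an arbitrary choice:

```python
import numpy as np

# The function from the example; a numerical check, not a proof.
def f(x):
    return np.sin(x)**4 + np.cos(np.exp(-x**2 * np.sin(x)**7))**2 + x**10

# Evaluate on a fine grid over the compact interval [-1, 1].
xs = np.linspace(-1.0, 1.0, 400_001)
c = f(xs).min()
print(c)  # an estimate of the positive lower bound
```

The grid minimum lands near x = 0, where f(0) = \cos^2(1) \approx 0.29, consistent with the reasoning above.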

Example 2

The above example concerns a concrete function, but the principle that a function that takes positive values is, under appropriate circumstances, bounded below by a positive constant is very useful in more abstract arguments too. For example, let D be an open subset of the complex plane \mathbb{C} and let C be a closed curve in D. In complex analysis, it can be very helpful to know not just that C is a subset of D, but also that there is some \delta>0 such that for every point z in C, the ball of radius \delta about z is contained in D. (For example, one sometimes likes to cover the plane with a grid of squares in such a way that every square that intersects C is fully contained in D.)

This can be proved easily: define a function f(z) to be the distance from z to the complement of D. Since D is open, f(z)>0 for every z\in D. It is easy to check that f is continuous. Indeed, one can show that |f(z)-f(w)|\leq|z-w|. Moreover, C, being the continuous image of a closed bounded interval, is compact. Therefore, the restriction of f to C attains its lower bound, and since C is a subset of D this lower bound cannot be zero.
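A concrete instance can be checked numerically. In the Python sketch below, D is taken to be the open disc of radius 2 about 0 (so the distance from z \in D to the complement is exactly 2 - |z|) and C is the unit circle; these choices are illustrative, not from the argument above:

```python
import numpy as np

# For D the open disc of radius 2 about 0, the distance from z in D
# to the complement of D is exactly 2 - |z|.
def dist_to_complement(z):
    return 2.0 - np.abs(z)

# C: the unit circle, a closed curve inside D, sampled at many points.
t = np.linspace(0.0, 2.0 * np.pi, 10_000)
C = np.exp(1j * t)

delta = dist_to_complement(C).min()
print(delta)  # every point of C keeps distance at least delta from the boundary

# The Lipschitz bound |f(z) - f(w)| <= |z - w| on a pair of sample points.
z, w = 0.3 + 0.4j, -1.2 + 0.1j
assert abs(dist_to_complement(z) - dist_to_complement(w)) <= abs(z - w)
```

Here the lower bound is attained (delta = 1), exactly as the compactness argument predicts.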

Example 3

The first example was rather artificial, but the next (from the work of E. Breuillard, A Height Gap Theorem For Finite Subsets Of \mbox{SL}_d(\overline{\mathbb{Q}}) and Non Amenable Subgroups) is not. Suppose one has a finite set S of d \times d matrices over the complex numbers, where we think of d as fixed. We may associate to S two quantities. Firstly, we define \Vert S \Vert to be the maximum norm of any matrix in S (we define the norm of a matrix in a fairly standard way, by regarding matrices as operators from \ell^2 to \ell^2). Secondly, we define \Lambda(S) to be the largest modulus of any eigenvalue of any matrix in S.

Now if \Vert S \Vert is bounded (say at most 100) then so is \Lambda(S); this is a very easy exercise. The converse, however, is not true (consider for example a single matrix x with 1's on the diagonal and some enormous number in the top-right corner). Remarkably, the converse is nearly true, and this is Breuillard's result.
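The counterexample in parentheses is easy to verify numerically. A small Python sketch (the value 10^6 standing in for the "enormous number" is an arbitrary choice):

```python
import numpy as np

# A unipotent matrix: both eigenvalues equal 1, so Lambda({x}) = 1,
# yet the operator norm is enormous.
M = 1e6  # the "enormous number" in the top-right corner
x = np.array([[1.0, M],
              [0.0, 1.0]])

largest_eig_modulus = np.abs(np.linalg.eigvals(x)).max()
operator_norm = np.linalg.norm(x, 2)  # largest singular value

print(largest_eig_modulus)
print(operator_norm)
```

So \Lambda is bounded while \Vert \cdot \Vert is not, which is exactly the failure of the naive converse.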

Theorem 1 (Breuillard) Suppose that S is a finite set of d \times d matrices such that all of \Lambda(S),\dots,\Lambda(S^{d^2}) are bounded by K. Then there is an invertible matrix x such that \Vert x^{-1} S x\Vert \leq C_d K, where the constant C_d depends only on d.

Here S^m means the set consisting of products of m matrices in S. The theorem basically says, very roughly, that up to conjugation the notions of matrix norm and maximal eigenvalue are very similar.

The first step in the proof is the one relevant here, in that it involves rephrasing the problem as one about the minimum of a continuous function on a compact set. As a first observation, note that if one multiplies all the matrices in S by some scalar L then both (\Lambda(S^i))^{1/i} and \min_x\Vert x^{-1}Sx\Vert get multiplied by L. Thus our task is equivalent to showing that if \min_x \Vert x^{-1} S x\Vert = 1 then \Lambda(S^i) \geq c_d for some i = 1,\dots,d^2. Furthermore, conjugation does not change eigenvalues. Choosing a specific x_0 such that \Vert x_0^{-1} S x_0\Vert \leq 2 \min_x \Vert x^{-1} S x\Vert, we may conjugate S by x_0 and rescale again, and thereby assume that \Vert S \Vert = 1 and that \min_x \Vert x^{-1} S x\Vert \geq 1/2.
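The two facts used in this reduction, the scaling behaviour of (\Lambda(S^i))^{1/i} and the conjugation-invariance of the eigenvalues, can be checked numerically. A Python sketch (the set S, the scalar and the conjugating matrix are arbitrary illustrations, not from Breuillard's paper):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
d = 3
S = [rng.standard_normal((d, d)) for _ in range(2)]

def Lambda_power(S, i):
    # Lambda(S^i): the largest eigenvalue modulus over all products
    # of i matrices drawn from S.
    return max(
        np.abs(np.linalg.eigvals(np.linalg.multi_dot(p) if i > 1 else p[0])).max()
        for p in product(S, repeat=i)
    )

# Scaling every matrix in S by L multiplies (Lambda(S^i))^(1/i) by L.
L = 5.0
for i in (1, 2, 3):
    scaled = Lambda_power([L * m for m in S], i) ** (1.0 / i)
    assert np.isclose(scaled, L * Lambda_power(S, i) ** (1.0 / i))

# Conjugation leaves every Lambda(S^i) unchanged.
x = rng.standard_normal((d, d)) + 5.0 * np.eye(d)  # generically invertible
conj = [np.linalg.inv(x) @ m @ x for m in S]
assert np.isclose(Lambda_power(conj, 2), Lambda_power(S, 2))
```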

Now the set of d \times d matrices x with \Vert x \Vert = 1 is certainly compact as a subset of \mathbb{C}^{d^2}. Here, however, we are dealing not with single matrices but with sets of matrices. Fortunately, these sets of matrices are also members of a compact metric space.

There is, in fact, a quite general construction. Let X be a compact metric space and let P(X) be the collection of nonempty compact subsets of X. This may be endowed with the structure of a metric space by defining the Hausdorff distance between sets A and B to be \max\{\sup_{a \in A} \inf_{b \in B} d(a,b), \sup_{b \in B} \inf_{a \in A} d(a,b)\}. It is a lengthy but not particularly difficult exercise to show that this is indeed a distance, and that the metric space P(X) is compact.
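For finite point sets the Hausdorff distance is straightforward to compute directly from this definition. A minimal Python sketch (the point sets are arbitrary illustrations):

```python
import numpy as np

def hausdorff(A, B):
    # Hausdorff distance between two finite point sets A, B in the plane,
    # given as arrays of complex numbers: the max of the two one-sided sups.
    D = np.abs(A[:, None] - B[None, :])  # pairwise distance matrix
    return max(D.min(axis=1).max(), D.min(axis=0).max())

A = np.array([0.0 + 0j, 1.0 + 0j])
B = np.array([0.0 + 0j, 1.0 + 0j, 0.5 + 1j])
print(hausdorff(A, B))
```

Note that the one-sided expressions alone are not symmetric (every point of A above lies in B, but not conversely), which is why the definition takes the maximum of the two.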

Furthermore, each of the functions S \mapsto \Lambda(S^i), i = 1,\dots,d^2, is continuous on P(X). By the principle that continuous functions on compact spaces attain their bounds, the function S \mapsto \max_i \Lambda(S^i) attains its infimum on the compact set of admissible S, so all we must do is rule out the existence of a compact set S of matrices with \min_x \Vert x^{-1} S x \Vert\geq 1/2 and \Lambda(S) = \dots = \Lambda(S^{d^2}) = 0.

The fact that we have =0 here rather than < \epsilon is hugely helpful, and it puts us squarely in the domain of algebra rather than analysis. The algebra is a little deep, and I refer the reader to Breuillard's article for references. Here is a sketch: suppose that \Lambda(S) = \dots = \Lambda(S^{d^2}) = 0 and let \overline{S} be the \mathbb{C}-algebra generated by S. By simple linear algebra this is spanned as a vector space over \mathbb{C} by S, S^2, \dots, S^{d^2}, all of which consist entirely of nilpotent elements by assumption. By a theorem of Wedderburn, \overline{S}^M = 0 for some M depending on d. By a theorem of Engel, the matrices in \overline{S} may simultaneously be put in strictly upper triangular form, and hence a fortiori the same is true of the matrices in S. But one may check directly, by calculation, that any compact set of strictly upper-triangular matrices may be conjugated so as to have arbitrarily small norm, contrary to the assumption that \min_x \Vert x^{-1} S x \Vert \geq 1/2.
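The final calculation can be illustrated numerically: conjugating a strictly upper-triangular matrix N by D_t = \mathrm{diag}(t, t^2, \dots, t^d) multiplies the (i,j) entry by t^{j-i}, and every entry above the diagonal has j > i, so the norm of the conjugate tends to 0 with t. A Python sketch (the matrix N is a random illustration; the same D_t works uniformly over a compact set of such matrices):

```python
import numpy as np

d = 4
rng = np.random.default_rng(2)
N = np.triu(rng.standard_normal((d, d)), k=1)  # strictly upper triangular

def conjugate_norm(N, t):
    # Operator norm of D_t^{-1} N D_t, where D_t = diag(t, t^2, ..., t^d);
    # the (i, j) entry of the conjugate is t^(j-i) * N_ij.
    Dt = np.diag(t ** np.arange(1, d + 1))
    return np.linalg.norm(np.linalg.inv(Dt) @ N @ Dt, 2)

for t in (1.0, 0.1, 0.01):
    print(t, conjugate_norm(N, t))
```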

General discussion

Breuillard notes that there is a drawback with this argument: it does not yield an explicit value for the constant C_d. He says that he can obtain one, but only by finding "almost" versions of the results of Wedderburn and Engel. It is a common feature of compactness arguments that they do not give explicit bounds: obtaining such bounds is often an interesting problem.



Since it doesn't show up directly in the article history, I should mention that this article is basically written by Ben Green. I've moved it over from the page How to use compactness, which is now a navigation page, and added one example (the one on the complex plane).