a repository of mathematical know-how

Convergent subsequences and diagonalization

Quick description

A topological space is sequentially compact if every sequence has a convergent subsequence. One form of the Bolzano-Weierstrass theorem states that a closed bounded subset of \mathbb{R}^n is sequentially compact. More generally, compact metric spaces are sequentially compact. These facts have many applications. Also, some useful diagonalization techniques can be interpreted as saying that certain topological spaces are sequentially compact – as a result, one often hears the phrase "by compactness" when no topology has been specifically mentioned.

A common application of compactness is to use the compactness and contradiction method to derive finitary quantitative results from infinitary qualitative ones.


Basic notions of real analysis up to and including the definition of compactness.

Related article

The article on how to use the Bolzano-Weierstrass theorem contains further examples of the use of convergent subsequences.

Example 1

Let us define a binary word to be a finite sequence of 0s and 1s. Given two words W_1 and W_2, say that W_1 is an initial segment of W_2 if W_1 has length m, W_2 has length n\geq m, and the first m terms in the sequence W_2 give you W_1. For example, 001101011 is an initial segment of 00110101110010110.

Suppose now that you are given an infinite set \mathcal{W} of words. Is it always possible to find an infinite sequence of 0s and 1s such that every initial segment of that infinite sequence is an initial segment of a word in \mathcal{W}? The answer is yes, and the proof is typical of a certain kind of compactness/diagonalization argument.

To prove it, we argue as follows. In our set \mathcal{W} there must either be infinitely many words that start with a 0 or infinitely many words that start with a 1. If there are infinitely many words that start with a 0, then there must be infinitely many words that start 00 or else infinitely many words that start 01. Similarly, if there are infinitely many words that start with a 1, then there must be infinitely many words that start 10 or infinitely many words that start 11. We can continue in this way, building up inductively a 01-sequence \epsilon_1,\epsilon_2,\dots such that for every n there are infinitely many words in \mathcal{W} that begin \epsilon_1,\dots,\epsilon_n. This sequence has the desired property.

General discussion

If you are familiar with the proof of the Bolzano-Weierstrass theorem, then you will recognise that the above argument is essentially the same.

Example 2

Let T be an infinite tree such that every vertex has finite degree. Let x be a vertex of T. Must there be an infinite path that starts at x?

Again, the answer is yes, and this is a fact known as König's lemma. The proof is very similar to the proof in the first example. Let us write x_0=x. Then we can partition the other vertices of T according to which neighbour of x you have to go through to get to them. So there must be a neighbour x_1 of x such that x is connected to infinitely many vertices of T via a path that has to go through x_1. Let T_1 be the set of vertices that can be reached from x only by going through x_1. Then we can find x_2 such that there are infinitely many vertices in T_1 that can be reached from x_1 only by going through x_2, and so on. This gives us an inductive construction of an infinite path in T.

General discussion

This second example is a generalization of the first, which can be reinterpreted as a statement about binary trees. (To form a tree out of \mathcal{W}, take an infinite binary tree and label its vertices as finite 01-sequences in the obvious way. Then take all vertices that correspond to initial segments of words in \mathcal{W}.)

Example 3

Here is a typical argument that would often be summed up by experts with the phrase "by compactness". One way of stating Van der Waerden's theorem is to say that for any two positive integers k and r, if you colour the positive integers with r colours then there will always be an arithmetic progression of length k consisting of numbers of just one colour. (Such a progression is called monochromatic.) Another way of stating the theorem is "more finite": for every k and every r there exists N such that if you colour the integers 1,2,\dots,N with r colours then there will always be a monochromatic arithmetic progression of length k.

It is easy to see that the second formulation implies the first. But what is interesting is that the first implies the second – by compactness. What does this mean? To answer this question, let us first see the proof and then what it has to do with compactness.

First proof. Suppose that the second formulation is false. Then for every N we can colour the integers from 1 to N with r colours in such a way that there is no monochromatic arithmetic progression of length k. Let us be slightly formal, and call this colouring c_N, regarding c_N as a function from \{1,2,\dots,N\} to \{1,2,\dots,r\}. Thus, \{1,2,\dots,r\} is our set of "colours" and c_N assigns colours to the integers from 1 to N.

We now want to put together the colourings c_N to create a new colouring c, defined on the whole of \mathbb{N}, in such a way that there is no monochromatic arithmetic progression of length k. This will contradict the first statement of van der Waerden's theorem and thereby give us the (contrapositive of the) deduction we wanted.

To do this, we first find infinitely many N such that c_N(1) is the same. We set X_1 to be the set of all these N, and we define c(1) to be the value taken by c_N(1) whenever N\in X_1.

Next, we find an infinite subset X_2\subset X_1 such that c_N(2) is the same for every N\in X_2 and call the common value c(2). And we continue like this to define our colouring c.

We must check that there is no monochromatic arithmetic progression of length k. But suppose that P is an arithmetic progression of length k and that c(m)=j for every m\in P. Let M be the maximum element of P. Then for every n\leq M and every N\in X_M, our construction above gives us that c_N(n)=c(n). Therefore, there must be some N such that c_N(m)=j for every m\in P, which contradicts our choice of c_N as a colouring with no monochromatic progressions of length k.

Second proof. While we are at it, it is worth seeing that the deduction is also a consequence of König's lemma. Indeed, the finite colourings of \{1,2,\dots,N\} can be regarded as vertices of a tree, where the neighbours of a colouring \kappa of \{1,2,\dots,N\} are all colourings \lambda of \{1,2,\dots,N+1\} that agree with \kappa up to N, as well as the restriction of \kappa to \{1,2,\dots,N-1\}. Now form a subtree by taking the colourings c_N from the previous proof, as well as all their restrictions (which also do not contain any monochromatic arithmetic progressions. Then find an infinite path of such colourings, which naturally defines a colouring of \mathbb{N} with no monochromatic arithmetic progression of length k.

General discussion

Relationship with compactness

So where does compactness come into it? Well, a colouring \mathbb{N}\rightarrow\{1,2,\dots,r\} can be thought of as an element of the infinite product \{1,2,\dots,r\}^{\mathbb{N}}, and by Tychonoff's theorem this product is compact in the product topology. Recall that a basic open neighbourhood of a function \mathbb{N}\rightarrow\{1,2,\dots,r\} will be a set of the form \kappa(1)=c(1),\dots,\kappa(n)=c(n)\} for some n. That is two colourings are thought of as "close" if they agree on the first n integers for some large n. In fact, it is not just compact but sequentially compact, which is essentially what we were showing in the first proof above. We say "essentially" here because that proof constructs a limit colouring c from a collection of colourings c_N of finite sets, but we could if we wanted have defined c_N in some arbitrary way for integers greater than N (e.g. by defining c_N(n) to be 1 whenever n>N) and the proof would have been the same.

Why do we call this a compactness argument rather than a sequential-compactness argument? Simply because the product topology on \{1,2,\dots,r\}^{\mathbb{N}} is metrizable (this is a nice exercise) and compactness and sequential compactness are equivalent for metric spaces.

Relationship with ultrafilters

Arguments such as the ones we have just seen can often be proved with the help of ultrafilters. See the article on how to use ultrafilters for the necessary background to this remark. In the proof above, we could instead have defined a colouring c as follows. Let \mathcal{U} be a non-principal ultrafilter. Then for each n there is a unique j such that c_N(n)=j\} belongs to \mathcal{U}. We define c(n) to be this j. Then if c(m)=j for every m in some finite arithmetic progression P, the set of N such that c_N(m)=j for every m\in P belongs to \mathcal{U} and is therefore non-empty, which gives us the contradiction we had above in a slicker way.

Essentially what the ultrafilter does here is decide for us how to define c(n) when we have a choice. (However, note that although the other proof appeared to use the axiom of countable choice, it was in fact easy to make the infinitely many choices needed in an explicit way, such as choosing the smallest possible j at each stage and making each X_N as large as possible.) It tends to be easy to translate back and forth between ultrafilter arguments of this basic kind and diagonalization arguments. (However, it becomes less routine when one uses ultrafilters with special properties such as being idempotent.)

Lack of quantitative bounds

An interesting feature of compactness arguments is that they give no information about bounds. The next two examples give two rather dramatic illustrations of this. The first is an easy compactness argument that proves that a certain function exists, but the function is known to grow so fast that it cannot be proved to exist in Peano arithmetic. The second is another easy compactness argument that proves that a function exists, but finding any sort of bound for the function is an open problem.

Example 4

Let \mathbb{N}^{(k)} stand for the set of all sets of k integers. Let us colour \mathbb{N}^{(k)} with r colours. Then the infinite Ramsey theorem implies that there is an infinite set X\subset\mathbb{N} such that every subset of X of size k has the same colour. From this we can extract, for any t, a finite set Z\subset X with the property that the cardinality of Z is at least as large as the minimal element of Z, and also at least as large as t, such that every subset of Z of size k has the same colour.

This may seem a rather uninteresting observation, but we can now use a standard compactness argument to prove that for every k, r and t there must exist N such that for any r-colouring of the subsets of \{1,2,\dots,N\} of size k there is a subset Z\subset\{1,2,\dots,N\} with |Z|\geq t and |Z|\geq\min Z, such that every subset of Z of size k has the same colour. The proof is just like the first proof in Example 3: one takes a sequence of purported counterexamples and lets them converge pointwise to a limiting colouring that contradicts the infinitary result that we deduced from Ramsey's theorem.

The interest in this result is that the dependence of N on r,k and t is known to be so vast (because there is a clever way of constructing colourings that give rise to enormous lower bounds) that the resulting function cannot be described in Peano arithmetic. This means that the theorem itself cannot be proved in Peano arithmetic: in a certain sense we were forced to use the axiom of infinity to prove it. This fact, first shown by Paris and Harrington, was philosophically significant because it showed that there were unprovable statements that were far less artificial than the statement cooked up by Gödel in his incompleteness theorem.

Example 5

Let G be a finite graph with maximum degree d, and let r be an integer. The r-neighbourhood of a vertex x is the set of all points that can be reached from x by a path of length at most r. If you pick a random vertex of G, then you can look at its r-neighbourhood as a graph, considered up to isomorphism. Let's call the r-neighbourhood distribution of G the probability distribution associated with these r-neighbourhoods. In other words, given a graph H of radius r about some point z, one is asking what the probability is that a random vertex x of G has an r-neighbourhood that is isomorphic to H with an isomorphism that sends x to z.

Given any c>0 a routine compactness argument shows that there is a finite set of graphs G_1,...,G_N of maximum degree d such that the r-neighbourhood distribution of any finite graph G differs by at most c (pointwise, say, though since there are only finitely many different possible r-neighbourhoods you can take any measure of closeness you like) from the r-neighbourhood distribution of at least one of the G_i. The proof goes as follows: the set of possible r-neighbourhood distributions lives in a compact set, by Tychonoff's theorem (since it belongs to [0,1] raised to the power the set of all possible r-neighbourhoods). Therefore, it is totally bounded. This means that it can be covered by a finite number of open balls of any given radius, which follows instantly from the definition of compactness. And that's all there is to it. But this proof gives no clue about the sizes of the graphs G_i. And in fact, no quantitative bound is known, which is rather extraordinary. Even a huge estimate would be interesting.

General discussion

The last example above was not quite the same as the others: it was a direct application of compactness rather than of sequential compactness. However, it was similar enough in spirit to be contained in this article. (If, however, people can think of more compactness arguments that use total boundedness in this way, then it might be better to create a new sub-article of How to use compactness and transfer this example. The author learned of this open problem from Lászlo Lovász (which is discussed in Section 5.5 of this survey article).