Tricki
a repository of mathematical know-how

Creating real numbers using limiting arguments

Quick description

If you find yourself wanting to prove the existence of a real number with certain properties, then almost always you will need to do so by means of a limiting argument of some kind, though sometimes this can be disguised if you use other results in your proof that are themselves proved using limiting arguments. This article gives examples of the main general techniques that are used.

Prerequisites

Basic concepts and theorems of analysis. In particular, the notion of a real number, sequences, limits, the supremum of a set, the intermediate-value theorem, the nested-intervals property, the Bolzano-Weierstrass theorem. Some of these are briefly explained in the article but advance familiarity with them would definitely be helpful.

Basic techniques

Example 1

The nested-intervals property of the real numbers is the truth of the following theorem.

Theorem 1 Let [u_1,v_1],[u_2,v_2],\dots be a collection of non-empty closed intervals, and suppose that each one contains the next as a subset. Then the intersection of all the intervals is non-empty.

Just before we see how to prove this result, let us see why it is not a trivial statement by observing that the corresponding statement for open intervals is false: for example, each of the intervals (0,1),(0,1/2),(0,1/3),\dots contains the next one, each is non-empty, and yet no real number is contained in every single one of them.

General discussion

Note that we are trying to prove that a real number x exists with a certain property: the property of belonging to all the intervals [u_n,v_n]. Thus, the property we want to prove is naturally regarded as the simultaneous occurrence of a countable sequence of properties P_1,P_2,\dots, where P_n is the property of belonging to the interval [u_n,v_n]. In such a situation, we cannot normally hope to create x all in one go. Instead, we create some auxiliary and more complicated object such as a convergent sequence or a non-empty set that is bounded above, and define x in terms of that. (Given the sequence, one would take x to be the limit, and given the set one would take the supremum.)

This instantly makes the task easier. For instance, if we use a sequence, then normally we have to ensure that each term in the sequence satisfies finitely many properties rather than infinitely many, though in this example it is even easier than that since the sequence suggests itself instantly.

Example 1, continued

Here, then, is a proof of Theorem 1.

Proof. The sequence u_1,u_2,\dots is increasing (though not necessarily strictly increasing) and bounded above (by v_1). Therefore, it converges. Let x be the limit. Then x\geq u_n for every n, since u_m\geq u_n for every m\geq n. (This follows very easily from a principle that is proved as Example 1 in I have a problem to solve in real analysis and I do not believe that a fundamental idea is needed.) We can also show that x\leq v_n for every n, since u_m\leq v_m\leq v_n for every m\geq n. Therefore, x belongs to every interval [u_n,v_n], as required.

General discussion

Note that in order to create our real number we used a basic result of real analysis: that bounded monotone sequences converge. This is typical of the way one proves that real numbers exist, and the next few examples will do the same, but using different basic results.

From now on, we shall think of the nested-intervals property not as a theorem but as one of the basic results about real numbers that we are free to use. The next example illustrates not just that different basic results can be used, but that different basic results can be used for the same problem. In fact, we shall give three (not all hugely different) proofs of the intermediate value theorem. (A fourth proof can be found as Example 1 of Discretization followed by compactness arguments.)

Example 2

Theorem 2 Let f be a continuous function from a closed interval [a,b] to \R and suppose that f(a)<t and f(b)>t. Then there exists c such that a<c<b and f(c)=t.

Proof using nested intervals. In order to use the nested-intervals property, we must construct some nested intervals in such a way that each one has at least a chance of containing c such that f(c)=t. In particular, it would be disastrous if every element of one of the intervals was greater than t, or every element was less than t. But there is an easy way to ensure that this does not happen. What we do is build a sequence of intervals [a_n,b_n] as follows. We begin by setting a_0=a and b_0=b. And once we have [a_n,b_n], we let [a_{n+1},b_{n+1}] be either the left half of [a_n,b_n] or the right half of [a_n,b_n]. That is, we either let a_{n+1}=a_n and b_{n+1}=(a_n+b_n)/2 or we let a_{n+1}=(a_n+b_n)/2 and b_{n+1}=b_n. What governs our decision? Well, by induction we can always make the choice in such a way that f(a_n)\leq t and f(b_n)\geq t. If we've done this up to n, then we choose the left half of [a_n,b_n] if f((a_n+b_n)/2)\geq t and the right half if f((a_n+b_n)/2)<t. Finally, we let c be the intersection of all these intervals, which exists by the nested-intervals property.

Now we appeal to standard theorems about sequences, limits, and continuous functions (all of which are proved in I have a problem to solve in real analysis and I do not believe that a fundamental idea is needed). Since a_n\leq c\leq b_n for every n, and b_n-a_n=2^{-n}(b-a), we find that |a_n-c| and |b_n-c| are both at most 2^{-n}(b-a), so a_n\rightarrow c and b_n\rightarrow c. Since f is continuous, it follows that f(a_n)\rightarrow f(c) and f(b_n)\rightarrow f(c). Since f(a_n)\leq t for every n, it follows that f(c)\leq t. And since f(b_n)\geq t for every n, it follows that f(c)\geq t. So f(c)=t.

General discussion

It might be argued that we used a bit more than the nested intervals there: we used the sequences of their end points. However, we used just the nested intervals to find c, and using the endpoints was perhaps justified by the fact that the way we chose the intervals was in terms of their endpoints.

A purist might like to prove a general principle in advance about nested intervals and continuous functions. It could say the following. Let I_1\supset I_2\supset I_3\supset\dots be a collection of non-empty closed intervals with lengths tending to zero. Let f be a continuous function, and suppose that for each n there exists x_n\in I_n such that f(x_n)\leq t. Then f(x)\leq t, where x is the (unique) element of the intersection of the I_n.

Alternatively, one can deduce the following result from the Bolzano-Weierstrass theorem. Let I_1\supset I_2\supset I_3\supset\dots be a collection of non-empty closed intervals. Let f be a continuous function and suppose that for each n there exists x_n\in I_n such that f(x_n)\leq t. Then there exists x in the intersection of all the I_n such that f(x)\leq t. In fact, perhaps we can prove that later in this article ...

Example 2, continued

Suppose you wanted to work out the square root of 2 using a very unsophisticated algorithm. Then you could do the following: you first find the largest integer that squares to less than 2, then the largest multiple of 0.1 that squares to less than 2, then the largest multiple of 0.01 that squares to less than 2, and so on. What you would produce would be the sequence 1, 1.4, 1.41, 1.414, 1.4142,\dots, of longer and longer truncations of the decimal expansion of 2.

An argument of this kind can be turned into a rigorous proof of the intermediate value theorem. It is slightly odd to use decimal expansions for this (though one could) so let's use binary expansions instead. Also, for convenience let's assume that a=0 and b=1.

Proof using monotone sequences. Let us construct a sequence a_0,a_1,a_2,\dots by taking a_n to be the largest multiple 2^{-n}j of 2^{-n} such that f(2^{-n}j)\leq t. This sequence is monotone increasing, since a_n is the maximum over a larger set than a_{n-1}. It is also bounded above by 1. Therefore it converges to some limit c. For each n, f(a_n)\leq t, so by the continuity of f we have that f(c)\leq t. But the choice of a_n also guarantees that f(a_n+2^{-n})>t, and a_n+2^{-n} also converges to c, so f(c)\geq t. Therefore, f(c)=t.

Next, let us prove the result using the fact that every non-empty set that is bounded above has a supremum.

"Proof using the least upper bound principle.'' Let C be the set of all x such that f(x)<t. Then C is non-empty, since a\in C, and bounded above by b. Therefore, it has a supremum. Let c be this supremum. We prove that f(c)=t by showing that f(c) cannot be greater than t and cannot be less than t. Let us start with the first assertion. If f(c)>t, then let \epsilon=f(c)-t. By continuity we can find \delta>0 such that f(x)>f(c)-\epsilon=t whenever |x-c|<\delta. Since c is an upper bound for C, it follows that c-\delta is also an upper bound for C, which contradicts the fact that c is the least upper bound for C.

If f(c)<t, then let \epsilon=t-f(c). By continuity we can find \delta>0 such that f(x)<f(c)+\epsilon=t whenever |x-c|<\delta. But this implies that c is not after all an upper bound for C.

Example 3

Now let us prove the nested-intervals property but for arbitrary closed bounded sets.

Theorem 3 Let F_1\supset F_2\supset F_3\supset\dots be non-empty closed bounded sets. Then the intersection \bigcup_{n=1}^\infty F_n is non-empty.
[proof]

We are back in the situation of wanting to find a real number x that has every one of a countable sequence of properties. There is no obvious way of using the nested-intervals property, because we don't have any intervals. So let us try using sequences instead. An obvious approach would be to make sure that x_n\in F_n for every n. If we do that, then the fact that each F_n contains all the succeeding ones will imply that x_m\in F_n whenever m\geq n. And then the limit of the x_m will also be in F_n, since F_n is a closed set.

But hang on – who said that the sequence (x_m) converged? So far, we have done nothing to ensure that, and it is not completely obvious how we could.

It is in situations like these that the Bolzano-Weierstrass theorem is often useful. If the good properties you have made your sequence have are shared by all its subsequences, then you are free to pass to a subsequence. So all you need to know then is that your sequence has a convergent subsequence, which it does if it is bounded. But if x_{k_1},x_{k_2},\dots is a subsequence, then k_m\geq m, so x_{k_m}\in F_n whenever m\geq n. Therefore, the property we were interested in is indeed shared by all subsequences. Moreover, the sequence (x_n) belongs to the set F_1, which is bounded, by hypothesis. Therefore, we are done.

[proof]

General discussion

The point of this article is that there are several theorems in basic real analysis that end by asserting that a real number exists with a certain property (such as being the limit of a sequence, or the supremum of a set, or the limit of a subsequence of a sequence, or a point where a continuous function takes a given value). If you are trying to prove that a real number exists with a certain property, and if the existence of this real number is not completely obvious, then instead of trying to prove it directly, try instead to find a sequence, set, or continuous function to which you can apply one of these theorems.

Exercise 1 Let F_1\supset F_2\supset\dots be non-empty closed bounded sets, and let f be a continuous function defined on \R. Suppose that for every n there is a point x_n\in F_n such that f(x_n)\leq t. Prove that there is a point x\in\bigcap_{n=1}^\infty such that f(x)\leq t.

More advanced techniques

We will be much briefer in this section, because the techniques we discuss here are discussed more fully in the article entitled How to change "for all x there exists y" into "there exists y such that for all x".

Example 4

Let C_1,C_2,\dots be translates of the Cantor set. Is it possible for \bigcup_{n=1}^\infty C_n to equal the whole of \R?

The answer is no, and there are several ways that one might consider going about the proof. Note that proving that their union is not the whole of \R amounts to finding a real number for which countably many properties hold simultaneously: property P_n is the property of not belonging to C_n. Thus, we are on familiar territory.

A perfectly good way of going about the problem is to use the ideas of the previous section. We shall try to construct a nested sequence of non-empty closed intervals in such a way that the intersection does not belong to any C_n. And this is not too hard, once we make the following observation about the Cantor set itself: every interval of positive length in \R contains a non-empty closed subinterval that is disjoint from the Cantor set. This of course applies to every translate of the Cantor set as well. Click here if you want the proof. To prove this, let a<b and pick c such that a<b<c and c has two consecutive 1s somewhere in its ternary expansion. Then any number sufficiently close to c has at least one 1 in its ternary expansion, so does not belong to the Cantor set. This gives us an interval of positive length not contained in C, and by shortening it if necessary we can make sure that this interval is closed and contained in the interval (a,b).

So now all we do is use this fact to construct inductively a sequence I_1\supset I_2\supset I_3\supset\dots of closed intervals of positive length in such a way that I_n is disjoint from C_n. By the nested-intervals property, these intervals have a non-empty intersection, and any point in this intersection is disjoint from all the C_n.

Now the point of this section is that we can sometimes avoid going through this kind of argument over and over again by encapsulating it as a general principle and just applying that principle. And the principle that works here is the Baire category theorem. (Note: the Wikipedia article just linked to is a bit dry and formal, but in due course the non-existent Tricki article How to use the Baire category theorem should get written. Also, the article on reversing quantifiers mentioned just before this example contains some information about how to use the theorem.)

In one of its versions, the Baire category theorem is about a certain notion of smallness. The basic idea is that the sets C_n cannot cover all of \R because (i) they are all small, (ii) a countable union of small sets is small, (iii) \R is large. So what does "small" mean in this context? Well, it turns out to mean something like "what comes out of the proof just given". Here is a more formal definition ...

A subset X of \R is called nowhere dense if there is no non-empty open interval (a,b) for which X is dense in (a,b). That is, for every non-empty open interval (a,b) you can find a non-empty open subinterval (c,d)\subset (a,b) such that X\cap (c,d)=\emptyset. A proof very similar to the proof above can be used to show that no countable union of nowhere dense sets can equal the whole of \R, and this is what the Baire category theorem says. (However, it has various equivalent versions and can be generalized to any complete metrizable topological space.) But this shows that countable unions of nowhere dense sets form a class of sets that is closed under countable unions (since a countable union of countable unions of nowhere dense sets is a countable union of nowhere dense sets). Thus, if we define a set to be meagre if it is a countable union of nowhere dense sets, and interpret "small" to mean "meagre", we have our three properties above.

So if you know the Baire category theorem, then solving the problem above becomes easy: all you have to do is check that a translate of the Cantor set is meagre, which it is because it is nowhere dense.

Another proof, which is more advanced, is to use measure theory to give us a different notion of smallness that is in some ways more intuitive (though this impression is slightly misleading). Roughly speaking, the Cantor set has "length zero" and a countable union of sets of length zero also has length zero. But \R has infinite length. I will say little more about this, other than that it can be made to work. Once one has the theory of Lebesgue measure available, this proof is almost trivial. It follows quickly from the definition of Lebesgue measure that the Cantor set has measure zero, and a basic theorem of measure theory is that a countable union of sets of measure zero has measure zero.

Which of these two proofs is the "real" reason that you cannot cover \R with translates of the Cantor set? If you are inclined to say that the second is more natural, then observe that you can vary the construction of the Cantor set by removing smaller and smaller proportions of [0,1] at each stage, so that you end up with a nowhere dense set of positive measure. If you do that, then the second proof stops working but the first one carries on just fine.

Note iconContributions wanted This article could use additional contributions. It would be nice to have an example that works for measure but not category, but I'm struggling to think of a good one.

In fact, Erdős has shown that there is a bijection from \R to \R that takes sets of measure zero to meagre sets, such that its inverse takes meagre sets to sets of measure zero. This proves that there is a kind of equivalence between what you can prove using measure-zero arguments and what you can prove using Baire-category arguments.