a repository of mathematical know-how

Transfinite induction

Quick description

Transfinite induction is similar to induction but the well-ordered set \mathbb{N} is replaced by larger ordinals. This article tells you what you need to know about ordinals in order to be able to prove results by transfinite induction, gives examples of its use, and distinguishes between various different types of transfinite-induction argument.

General discussion

The following facts about ordinals are all you need to know about them in order to carry out proofs by transfinite induction.

  • There is a total ordering, <, defined on the class of all ordinals.

  • 0 is an ordinal and there is no smaller ordinal.

  • Given any ordinal \alpha, the set \beta<\alpha\} is a well-ordered set.

  • For every set X there exists a well-ordering of the elements of X. (This is called the well-ordering theorem.)

  • Given any well-ordered set X, there is exactly one ordinal \alpha such that X is order-isomorphic to the set \beta<\alpha\}.

  • Every ordinal \alpha is either a successor ordinal or a limit ordinal. It is a successor if there is some \beta such that \alpha is the smallest ordinal that is greater than \beta. In this case we write \alpha=\beta+1. It is a limit if it is not a successor.

  • (Optional, but convenient.) The way ordinals are conventionally constructed, \alpha is in fact equal to \beta<\alpha\} (which implies in particular that 0 is the empty set). Therefore, instead of talking about the cardinality of \beta<\alpha\} we can talk more simply about the cardinality of \alpha. We can also say that \alpha is a well-ordered set. The successor of \beta is then \beta\cup\{\beta\}. If \alpha is a limit ordinal, then \beta<\alpha\}.

  • (Not needed in proofs, but you ought to know it just to be safe.) Each ordinal is a set, but there is no such thing as the set O of all ordinals, because if there were then O would be an ordinal as well. This is a contradiction since then we would have O<O. However, it is all right to talk about the class of all ordinals.

  • Given any set Y, there is an ordinal \alpha with the same cardinality as Y, and hence (since \beta<\alpha\} is well-ordered) there is a least such ordinal \alpha. Therefore, every set Y can be well-ordered in such a way that the set of predecessors of any element of Y has strictly smaller cardinality than the cardinality of Y itself.

  • Suppose that we have put a set X into one-to-one correspondence with an ordinal \theta. For each \alpha<\theta, write x_\alpha for the element of X that corresponds to \alpha. Suppose that P is some statement about elements of X, that P(x_0) is true, and that P(x_\alpha) is true whenever P(x_\beta) is true for every \beta<\alpha. Then P(x) is true for every x\in X. (This is the principle of transfinite induction.)

Example 1

Let us prove that every vector space has a basis. This result is also proved in the article on how to use Zorn's lemma (in a better way).

Let V be a vector space. By the well-ordering theorem, there is a well-ordering of the elements of V. Therefore, there is an ordinal \alpha such that the elements of V can be put into one-to-one correspondence with \alpha. For each \beta<\alpha, let us write v_\beta for the element of V that corresponds to \beta. Now let B be the set of all v_\beta that do not belong to the linear span of their predecessors. It is clear that B is linearly independent, since if \sum_{i=1}^n\lambda_iv_{\beta_i}=0, \beta_1<\dots<\beta_n, and \lambda_n\ne 0, then we have expressed v_{\beta_n} as a linear combination of earlier v_\betas. It also spans, since every v_\beta either belongs to B or is a linear combination of earlier elements of B, and every element of V is v_\beta for some \beta.

General discussion

Example 1 might suggest that transfinite induction is just another way of carrying out Zorn's-lemma arguments. And indeed, it can be used for this purpose, but Zorn's lemma is neater and does not require the machinery of ordinals. So Example 1 is not a very good advertisement for the transfinite induction. However, there are situations where transfinite induction really is needed. Typically, these involve a cardinality condition on the initial segments of the well-ordering that one places on the set that one is examining. The next example illustrates this.

Example 2

A famous result of Sierpinski is the existence of a subset X of \mathbb{R}^2 that intersects every line exactly twice. Here is how the proof goes. First, well-order the set of all lines in \mathbb{R}^2, which has cardinality 2^{\aleph_0}, in such a way that each element has strictly fewer than 2^{\aleph_0} predecessors in the well-ordering. Let L_\alpha be the line associated with the ordinal \alpha. (If the continuum hypothesis is assumed, then \alpha must be countable, but we are not assuming the continuum hypothesis.)

We now build up our set X as follows. Suppose that we have found for each \beta<\alpha a set X_\beta such that for every \gamma\leq\beta we have |L_\gamma\cap X_\beta|=2 and such that for every line L we have |L\cap X_\beta|\leq 2. Suppose also that we have done this in such a way that if \gamma<\beta then X_\gamma\subset X_\beta. Finally, suppose that all the sets X_\beta have cardinality strictly less than 2^{\aleph_0}.

Let Y_\alpha=\bigcup_{\beta<\alpha}X_\beta and note that the cardinality of Y_\alpha is strictly less than 2^{\aleph_0} (since for any infinite cardinal \kappa, a union of \kappa many sets of size \kappa has size \kappa). Then |Y_\alpha\cap L|\leq 2 for every line L (since otherwise there would have to be some \beta such that |X_\beta\cap L|>2), and |Y_\alpha\cap L_\beta|=2 for every \beta<\alpha. If |Y_\alpha\cap L_\alpha|=2, then let X_\alpha=Y_\alpha, and the induction has continued to stage \alpha. Otherwise, the number of lines that go through two points of Y_\alpha is strictly less than 2^{\aleph_0}, and each of these lines intersects L_\alpha at most once, so we can find infinitely many points in L_\alpha that do not belong to any of those lines. (We want to avoid those intersections since we do not want to have three points of X_\alpha in the same line.) Add either one or two of these points to Y_\alpha to create X_\alpha, and again the induction has continued to stage \alpha.

This proves that we can construct sets X_\alpha for every \alpha. The union of all these sets X_\alpha has the property that we want of X.

General discussion

The big difference between the second argument and the first is that we insisted that each element of the well-ordering had fewer predecessors than the cardinality of the set of all lines, which happened to be the same as the cardinality of each individual line. A cardinality condition such as this gives rise to an argument that cannot be straightforwardly converted into a Zorn's-lemma argument.

Sometimes one needs a stronger cardinality assumption still: one wants the set of all predecessors of any given element to be countable. This requires the continuum hypothesis. The article on how to use the continuum hypothesis has some examples of arguments of this kind.

Example 3

Many interesting uses of transfinite induction use just the countable ordinals (without any assumption that the number of these is 2^{\aleph_0}). As a simple but representative example, let us prove that there is an uncountable collection of well-ordered subsets of \mathbb{R}, no two of which are order-isomorphic.

We shall prove this by constructing for each countable ordinal \alpha a well-ordered subset W_\alpha of \mathbb{R} that is order-isomorphic to \alpha. This we do using transfinite induction up to (but not including) the first uncountable ordinal. The induction starts with our setting W_0=\emptyset. Now let \alpha be a countable ordinal. As our inductive hypothesis we will take the slightly stronger statement that for every \beta<\alpha we can choose W_\beta to be a subset of [0,1). (But it isn't significantly stronger because [0,\infty) is order-isomorphic to [0,1).)

It is convenient to split the proof into the two cases: either \alpha is a successor or \alpha is a limit ordinal. This splitting is a common but not universal feature of transfinite induction arguments over the countable ordinals.

If \alpha=\beta+1, then the construction of W_\alpha is easy: we just squash W_\beta into [0,1/2) and add an extra point. To be more definite, we can set W_\alpha to be (W_\beta/2)\cup\{\frac 34\}.

We now need a small lemma, which is that for every countable limit ordinal \alpha there is an increasing sequence of ordinals \beta_1,\beta_2,\beta_3,\dots such that \bigcup_{n=1}^\infty\beta_n=\alpha. This is easy to prove. First, enumerate all the ordinals \beta<\alpha, of which there are countably many. Next, pick a subsequence that consists of every ordinal that is larger than all previous ordinals in the enumeration. This sequence cannot be bounded above by any \beta<\alpha, since there exists \gamma such that \beta<\gamma<\alpha (as \alpha is a limit ordinal), so we would have a contradiction when we got to \gamma in the enumeration. Therefore, its union is indeed \alpha (because it contains every \beta<\alpha).

Given such a sequence, construct W_\alpha as follows. For each n, let V_n be an order-isomorphic copy of W_{\beta_n} that lives in the interval [1-\frac 1n, 1-\frac 1{n+1}) (obtained as a linear image of W_{\beta_n}, say). Next, remove from V_n an initial segment that is order-isomorphic to \beta_{n-1} and call this set U_n. (To put this loosely, remove the first \beta_{n-1} elements of V_n.) Then U_n is order-isomorphic to \beta_n\setminus\beta_{n-1} and every element of U_n is larger than every element of U_{n-1}. The union of the sets U_n lives in [0,1) and is order-isomorphic to the union of the sets \beta_n\setminus\beta_{n-1}, which equals the union of the sets \beta_n, which equals \alpha. Thus, the inductive step is proved.

General discussion

Transfinite induction over the countable ordinals is often used to give a measure of complexity. For example, we could say that the complexity of a well-ordered subset of \mathbb{R} is the unique ordinal that is order-isomorphic to it. Another well-known example is the complexity of Borel sets. We can define open sets and closed sets to have complexity 0, and then say that a set X has complexity at most \alpha if it is a countable union or countable intersection of sets X_n such that each X_n has complexity at most \beta_n for some \beta_n<\alpha. The complexity of X is the least ordinal \alpha such that X has complexity at most \alpha. This is a definition by transfinite induction. It is possible to prove, again by transfinite induction, that for every countable ordinal \alpha there is a Borel set that has complexity \alpha. Thus, there are uncountably many different kinds of Borel set.


Symbol for total order

A total order R is supposed to be total, in the sense that for every pair a,b\in X, we have either aRb or bRa. Choosing the symbol < instead of \leq for the total order can be deceptive because it usually means "lesser but not equal". It's clear that this order isn't total, because we don't have a<a. So, either it is a mistake or it is a conscious choice of the symbol. I haven't changed the notation myself precisely because there could be a good explanation to do it like this. If I don't receive an answer in several days I will change it.

I always thought that a total

I always thought that a total order could be strict, and that the condition was that x<y, x=y or y<x. Is that a very nonstandard convention?

The relation < on the ordinals is more convenient to use than \leq, since one wants many statements to be true for every \beta<\alpha, for some given \alpha. I suppose one could add the word "strict" in brackets before the first occurrence of "total order", but that feels a bit strange, given that you can convert orders of the \leq type into orders of the < type and back again. Indeed, because of this I describe them both as total orders and think of them as different ways of describing the same basic underlying object.

You are right

That's true, if you ask for the trichotomy law axiom, then both definitions are equivalent. I don't know exactly how nonstandard is this, but since Wikipedia and Planetmath both mention it (Mathworld doesn't!), it should be safe. Maybe in the transfinite context it is more used? (I just checked my copy of Kamke's "Theory of Sets". He uses your convention!).

Excuse me for my comment, I thought the notation was somewhat confusing (but it was more like I was the one confused!)