Create an epsilon of room

Quick description

You want to prove some statement $S_0$ about some object $x_0$ (which could be a number, a point, a function, a set, etc.). To do so, pick a small $\varepsilon > 0$ , and first prove a weaker statement $S_\varepsilon$ (which allows for “losses” which go to zero as $\varepsilon \to 0$ ) about some perturbed object $x_\varepsilon$ . Then, take limits $\varepsilon \to 0$ . Provided that the dependency and continuity of the weaker conclusion $S_\varepsilon$ on $\varepsilon$ are sufficiently controlled, and $x_\varepsilon$ is converging to $x_0$ in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement $S_\infty$ about some object $X_\infty$ , by first proving a weaker statement $S_N$ on some approximation $X_N$ to $X_\infty$ for some large parameter $N$ , and then sending $N \to \infty$ at the end.

General discussion

The first, simplest instance of this trick would be to prove that an object $x_0$ satisfies a property $S$ , which we know commutes with a limit in some sense, by finding a sequence of objects $x_i$ such that $S$ holds for every $x_i$ and such that the sequence converges (in the same limit sense) to $x_0$ . The power of the method lies in the difficulty of proving $S$ directly: $S$ should be easy to prove for every $x_i$ by some standard techniques, but difficult for $x_0$ with the same techniques. Note that this instance is somewhat the "reciprocal" of a limiting process (where we replace a sequence by its limit).

But of course, we don't need $S$ to be invariant in the sequence: we just need to have a perturbed property $S_i$ for every $x_i$ such that $S_i$ converges to $S$ as $x_i$ converges to $x_0$ . Moreover, we don't need the index set to be countable: we can use a net; for example, we can index by the reals and use the variable $\varepsilon$ .

Here are some typical examples of a target statement $S_0$ , and the approximating statements $S_\varepsilon$ that would converge to $S_0$ :

$S_0$	$S_\varepsilon$
$f(x_0) = g(x_0)$	$f(x_\varepsilon) = g(x_\varepsilon) + o(1)$
$f(x_0) \leq g(x_0)$	$f(x_\varepsilon) \leq g(x_\varepsilon) + o(1)$
$f(x_0) > 0$	$f(x_\varepsilon) \geq c - o(1)$ for some $c>0$ independent of $\varepsilon$
$f(x_0)$ is finite	$f(x_\varepsilon)$ is bounded uniformly in $\varepsilon$
$f(x_0) \geq f(x)$ for all $x \in X$ (i.e. $x_0$ maximises f)	$f(x_\varepsilon) \geq f(x)-o(1)$ for all $x \in X$ (i.e. $x_\varepsilon$ nearly maximises f)
$f_n(x_0)$ converges as $n \to \infty$	$f_n(x_\varepsilon)$ fluctuates by at most o(1) for sufficiently large n
$f_0$ is a measurable function	$f_\varepsilon$ is a measurable function converging pointwise to $f_0$
$f_0$ is a continuous function	$f_\varepsilon$ is an equicontinuous family of functions converging pointwise to $f_0$ OR $f_\varepsilon$ is continuous and converges (locally) uniformly to $f_0$
The event $E_0$ holds almost surely	The event $E_\varepsilon$ holds with probability 1-o(1)
The statement $P_0(x)$ holds for almost every x	The statement $P_\varepsilon(x)$ holds for x outside of a set of measure o(1)

Of course, to justify the convergence of $S_\varepsilon$ to $S_0$ , it is necessary that $x_\varepsilon$ converge to $x_0$ (or $f_\varepsilon$ converge to $f_0$ , etc.) in a suitably strong sense. (But for the purposes of proving just upper bounds, such as $f(x_0) \leq M$ , one can often get by with quite weak forms of convergence, thanks to tools such as Fatou's lemma or the weak closure of the unit ball.) Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.

It is also necessary in many cases that the control $S_\varepsilon$ on the approximating object $x_\varepsilon$ is somehow "uniform in $\varepsilon$ ", although for " $\sigma$ -closed" conclusions, such as measurability, this is not required. [It is important to note that it is only the final conclusion $S_\varepsilon$ on $x_\varepsilon$ that needs to have this uniformity in $\varepsilon$ ; one is permitted to have some intermediate stages in the derivation of $S_\varepsilon$ that depend on $\varepsilon$ in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis. For instance, by replacing "rough", "infinite-complexity", "continuous", "global", or otherwise "infinitary" objects $x_0$ with "smooth", "finite-complexity", "discrete", "local", or otherwise "finitary" approximants $x_\varepsilon$ , one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals). [It is important to be aware, though, that any quantitative measure of how smooth, discrete, finite, etc. $x_\varepsilon$ is should be expected to degrade in the limit $\varepsilon \to 0$ , and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in $\varepsilon$ .] Similarly, issues such as whether the supremum $M := \sup \{ f(x): x \in X \}$ of a function on a set is actually attained by some maximiser $x_0$ become moot if one is willing to settle instead for an almost-maximiser $x_\varepsilon$ , e.g. one for which $f(x_\varepsilon)$ comes within an epsilon of that supremum M (or for which $f(x_\varepsilon)$ is larger than $1/\varepsilon$ , if M turns out to be infinite). Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.
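To make the almost-maximiser idea concrete, here is a minimal sketch. The function $f(x) = x/(1+x)$ on $[0,\infty)$ is a hypothetical choice, not taken from the discussion above: its supremum is 1 but is never attained, so no maximiser $x_0$ exists, yet an almost-maximiser $x_\varepsilon$ is easy to write down.

```python
def f(x: float) -> float:
    """A bounded function on [0, infinity) whose supremum 1 is never attained."""
    return x / (1.0 + x)


def almost_maximiser(eps: float) -> float:
    """Return a point x_eps with f(x_eps) >= 1 - eps (illustrative helper).

    Solving x / (1 + x) >= 1 - eps gives x >= 1/eps - 1, so x = 1/eps
    works with a little room to spare: f(1/eps) = 1 / (1 + eps).
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return 1.0 / eps
```

Any argument that only needs $f(x_\varepsilon) \geq M - \varepsilon$ can then proceed with `almost_maximiser(eps)` in place of the non-existent maximiser.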

In applying the epsilon regularization trick, one often needs to approximate rough functions by smooth ones. One useful way of doing so is to use the trick "To make a function nicer without changing it much, convolve it with an approximate delta function".
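As a small illustration of this convolution trick, the sketch below smooths the discontinuous step function by convolving it with a box kernel of width $2\varepsilon$ and integral 1. The choice of kernel and all function names are assumptions made for this example; a genuinely smooth bump would be used in practice if more regularity is wanted.

```python
def step(x: float) -> float:
    """Heaviside step function: discontinuous at 0."""
    return 1.0 if x >= 0.0 else 0.0


def mollified_step(x: float, eps: float, n: int = 1000) -> float:
    """Convolve `step` with the box kernel (1/(2*eps)) * 1_[-eps, eps].

    The integral (1/(2*eps)) * int_{-eps}^{eps} step(x - y) dy is
    approximated by the midpoint rule with n sample points.
    """
    h = 2.0 * eps / n
    total = 0.0
    for k in range(n):
        y = -eps + (k + 0.5) * h
        total += step(x - y) * h
    return total / (2.0 * eps)


def ramp(x: float, eps: float) -> float:
    """Closed form of the same convolution: a continuous ramp on [-eps, eps]."""
    return min(1.0, max(0.0, (x + eps) / (2.0 * eps)))
```

The mollified function is continuous and agrees with the step function outside $[-\varepsilon, \varepsilon]$, so it differs from the original only on a set of measure $2\varepsilon$.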

To summarise: one can view the epsilon regularisation argument as a "loan" in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to "pay back" the loan by establishing a final "hard analysis" estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).

A variant: It may seem that the epsilon regularisation trick is useless if one is already in "hard analysis" situations when all objects are already "finitary", and all formal computations easily justified. However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a sufficiently small (but not infinitesimally small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and obtain a useful bound at the end of the day. (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.) Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.
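A toy version of this variant, under assumptions not made in the text: suppose an argument produces a bound of the hypothetical shape $A\varepsilon + B/\varepsilon$ with $A, B > 0$ depending on the other parameters. Sending $\varepsilon \to 0$ is useless here, but choosing $\varepsilon = \sqrt{B/A}$ balances the two terms and gives $2\sqrt{AB}$, which by the arithmetic-geometric mean inequality is the best this shape of bound can deliver.

```python
import math


def bound(eps: float, a: float, b: float) -> float:
    """Hypothetical final bound: a main term a*eps plus an error term b/eps."""
    return a * eps + b / eps


def balancing_eps(a: float, b: float) -> float:
    """The choice of eps that equalises the two terms a*eps and b/eps."""
    return math.sqrt(b / a)
```

With `a = 4` and `b = 9`, for instance, `balancing_eps` returns a value of $\varepsilon$ that depends on the parameters but is not small in any absolute sense.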

Prerequisites

Graduate real analysis. (Actually, this isn't so much a prerequisite as it is a corequisite: the limiting argument plays a central role in many fundamental results in real analysis.) Some examples also require some exposure to PDE.

Example 1

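The "soft analysis" components of any real analysis textbook will contain a large number of examples of this trick in action. In particular, any argument which exploits Littlewood's three principles of real analysis (roughly: every measurable set is nearly a finite union of intervals, every measurable function is nearly continuous, and every pointwise convergent sequence of measurable functions is nearly uniformly convergent) is likely to utilise this trick.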

Example 2: The Riemann-Lebesgue lemma

Given any absolutely integrable function $f \in L^1({\Bbb R})$ , the Fourier transform $\hat f: {\Bbb R} \to {\Bbb C}$ is defined by the formula

$\hat f(\xi) := \int_{\Bbb R} f(x) e^{-2\pi i x \xi}\ dx.$

The Riemann-Lebesgue lemma asserts that $\hat f(\xi) \to 0$ as $\xi \to \infty$ . It is difficult to prove this estimate for f directly, because this function is too "rough": it is absolutely integrable (which is enough to ensure that $\hat f$ exists and is bounded), but need not be continuous, differentiable, compactly supported, bounded, or otherwise "nice". But suppose we give ourselves an epsilon of room. Then, as the space $C^\infty_0({\Bbb R})$ of test functions is dense in $L^1({\Bbb R})$ , we can approximate $f$ to any desired accuracy $\varepsilon > 0$ in the $L^1$ norm by a smooth, compactly supported function $f_\varepsilon: {\Bbb R} \to {\Bbb C}$ , thus

$\int_{\Bbb R} |f(x)-f_\varepsilon(x)|\ dx \leq \varepsilon$ (1)

The point is that $f_\varepsilon$ is much better behaved than f, and it is not difficult to show the analogue of the Riemann-Lebesgue lemma for $f_\varepsilon$ . Indeed, since $f_\varepsilon$ is smooth and compactly supported, we can now justifiably integrate by parts to obtain

$\hat f_\varepsilon(\xi) = \frac{1}{2\pi i \xi} \int_{\Bbb R} f'_\varepsilon(x) e^{-2\pi i x \xi}\ dx$

for any non-zero $\xi$ , and it is now clear (since $f'_\varepsilon$ is bounded and compactly supported) that $\hat f_\varepsilon(\xi) \to 0$ as $\xi \to \infty$ .

Now we need to take limits as $\varepsilon \to 0$ . It will be enough to have $\hat f_\varepsilon$ converge uniformly to $\hat f$ . But from (1) and the basic estimate

$\sup_\xi |\hat g(\xi)| \leq \int_{\Bbb R} |g(x)|\ dx$ (2)

(which is the single "hard analysis" ingredient in the proof of the lemma) applied to $g := f - f_\varepsilon$ , we see (by the linearity of the Fourier transform) that

$\sup_\xi |\hat f(\xi) - \hat f_\varepsilon(\xi)| \leq \varepsilon$

and we obtain the desired uniform convergence.
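The role of estimate (2) in this limiting step can be checked numerically on a toy case. The sketch below is only illustrative and rests on assumptions not in the text: $f$ is taken to be the indicator function of $[0,1]$, and $f_\varepsilon$ is a piecewise linear trapezoid (continuous and compactly supported, but not smooth, so only a stand-in for a genuine test function) chosen so that $\int |f - f_\varepsilon| = \varepsilon$ . The function name is hypothetical.

```python
import cmath


def fourier_of_difference(xi: float, eps: float, n: int = 2000) -> complex:
    """Midpoint-rule approximation of the Fourier transform of g = f - f_eps.

    Here f is the indicator of [0, 1] and f_eps is the trapezoid that rises
    linearly from 0 to 1 on [-eps, 0], equals 1 on [0, 1], and falls linearly
    back to 0 on [1, 1 + eps].  Thus g vanishes except on the two ramps,
    where f = 0 and g = -f_eps, and the L^1 norm of g is exactly eps.
    """
    h = eps / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h              # distance from the edge of [0, 1]
        g = -(1.0 - t / eps)           # value of g at x = -t and at x = 1 + t
        left = cmath.exp(-2j * cmath.pi * (-t) * xi)
        right = cmath.exp(-2j * cmath.pi * (1.0 + t) * xi)
        total += g * (left + right) * h
    return total
```

For every frequency $\xi$ the modulus of the returned value stays below $\varepsilon$ , which is exactly the uniform closeness of $\hat f_\varepsilon$ to $\hat f$ used above; at $\xi = 0$ the bound is attained.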

Remark The same argument also shows that $\hat f$ is continuous; we leave this as an exercise to the reader.

Remark This example is a model case of a much more general instance of the limiting argument: in order to prove a convergence or continuity theorem for all "rough" functions in a function space, it suffices to first prove convergence or continuity for a dense subclass of "smooth" functions, and combine that with some quantitative estimate in the function space (in this case, (2)) in order to justify the limiting argument.

Example 3

The limiting argument in the previous example relied on the linearity of the Fourier transform $f \mapsto \hat f$ . But, with more effort, it is also possible to extend this type of argument to nonlinear settings. We will sketch (omitting several technical details, which can be found for instance in this PDE book) a very typical instance. Consider a nonlinear PDE, e.g. the nonlinear wave equation

$- u_{tt} + u_{xx} = u^3$ (3)

where $u: {\Bbb R} \times {\Bbb R} \to {\Bbb R}$ is some scalar field, and the t and x subscripts denote differentiation of the field $u(t,x)$ . Formally, if u is sufficiently smooth, and sufficiently decaying at spatial infinity, one can show that the energy

$E(u)(t) := \int_{\Bbb R} \frac{1}{2} |u_t(t,x)|^2 + \frac{1}{2} |u_x(t,x)|^2 + \frac{1}{4} |u(t,x)|^4\ dx$ (4)

is conserved, thus $E(u)(t) = E(u)(0)$ for all t. This can be formally justified by computing the derivative $\partial_t E(u)(t)$ by differentiating under the integral sign, integrating by parts, and then applying the PDE (3); we leave this as an exercise for the reader. (There are also more fancy ways to see why the energy is conserved, using Hamiltonian or Lagrangian mechanics or by the more general theory of stress-energy tensors, but we will not discuss these here.) However, these justifications do require a fair amount of regularity on the solution u; for instance, requiring u to be three-times continuously differentiable in space and time, and compactly supported in space on each bounded time interval, would be sufficient to make the computations rigorous by applying "off the shelf" theorems about differentiation under the integration sign, etc.

But suppose one only has a much rougher solution, for instance an energy class solution which has finite energy (4), but for which higher derivatives of u need not exist in the classical sense. (There is a non-trivial issue regarding how to make sense of the PDE (3) when u is only in the energy class, since the terms $u_{tt}$ and $u_{xx}$ do not then make sense classically, but there are standard ways to deal with this, e.g. using weak derivatives, which we will not discuss further here.) Then it is difficult to justify the energy conservation law directly. However, it is still possible to obtain energy conservation by the limiting argument. Namely, one takes the energy class solution u at some initial time (e.g. t=0) and approximates that initial data (the initial position $u(0)$ and initial velocity $u_t(0)$ ) by a much smoother (and compactly supported) choice $(u^{(\varepsilon)}(0), u^{(\varepsilon)}_t(0))$ of initial data, which converges back to $(u(0), u_t(0))$ in a suitable "energy topology" related to (4) which we will not define here. It then turns out (from the existence theory of the PDE (3)) that one can extend the smooth initial data $(u^{(\varepsilon)}(0), u^{(\varepsilon)}_t(0))$ to other times t, providing a smooth solution $u^{(\varepsilon)}$ to that data. For this solution, the energy conservation law $E( u^{(\varepsilon)} )(t) = E( u^{(\varepsilon)} )(0)$ can be justified.

Now we take limits as $\varepsilon \to 0$ (keeping t fixed). Since $(u^{(\varepsilon)}(0), u^{(\varepsilon)}_t(0))$ converges in the energy topology to $(u(0), u_t(0))$ , and the energy functional E is continuous in this topology, $E( u^{(\varepsilon)} )(0)$ converges to $E( u )(0)$ . To conclude the argument, we will also need $E( u^{(\varepsilon)} )(t)$ to converge to $E( u )(t)$ , which will be possible if $(u^{(\varepsilon)}(t), u^{(\varepsilon)}_t(t))$ converges in the energy topology to $(u(t), u_t(t))$ . This in turn follows from a fundamental fact (which requires a certain amount of effort to prove) about the PDE (3), namely that it is well-posed in the energy class. This means not only that solutions exist and are unique for initial data in the energy class, but also that they depend continuously on the initial data in the energy topology: small perturbations in the data lead to small perturbations in the solution, or more formally, the map $(u(0),u_t(0)) \mapsto (u(t),u_t(t))$ from data to solution (say, at some fixed time t) is continuous in the energy topology. This final fact concludes the limiting argument and gives us the desired conservation law $E(u)(t) = E(u)(0)$ .

Remark It is important that one have a suitable well-posedness theory in order to make the limiting argument work for rough solutions to a PDE; without such a well-posedness theory, it is possible for quantities which are formally conserved to cease being conserved when the solutions become too rough or otherwise "weak"; energy, for instance, could disappear into a singularity and not come back.

Example 4: The maximum principle

The maximum principle is a fundamental tool in elliptic and parabolic PDE (for example, it is used heavily in the proof of the Poincaré conjecture, see e.g. my lecture notes on this topic). Here is a model example of this principle:

Proposition 1 (Maximum principle) Let $u: \overline{\Bbb D} \to {\Bbb R}$ be a smooth harmonic function on the closed unit disk $\overline{\Bbb D} := \{ (x,y): x^2+y^2 \leq 1\}$ . If M is a bound such that $u(x,y) \leq M$ on the boundary $\partial {\Bbb D} := \{ (x,y): x^2+y^2 = 1 \}$ , then $u(x,y) \leq M$ on the interior as well.

A naive attempt to prove proposition 1 comes very close to working, and goes like this: suppose for contradiction that the proposition failed, thus u exceeds M somewhere in the interior of the disk. Since u is continuous, and the disk is compact, there must then be a point $(x_0,y_0)$ in the interior of the disk where the maximum is attained. Undergraduate calculus then tells us that $u_{xx}(x_0,y_0)$ and $u_{yy}(x_0,y_0)$ are non-positive, which almost contradicts the harmonicity hypothesis $u_{xx} + u_{yy} = 0$ . However, it is still possible that $u_{xx}$ and $u_{yy}$ both vanish at $(x_0,y_0)$ , so we don't yet get a contradiction.

But we can finish the proof by giving ourselves an epsilon of room. The trick is to work not with the function u directly, but with the modified function $u^{(\varepsilon)}(x,y) := u(x,y) + \varepsilon (x^2+y^2)$ , to boost the harmonicity into subharmonicity. Indeed, we have $u^{(\varepsilon)}_{xx} + u^{(\varepsilon)}_{yy} = 4\varepsilon > 0$ . The preceding argument now shows that $u^{(\varepsilon)}$ cannot attain its maximum in the interior of the disk; since it is bounded by $M+\varepsilon$ on the boundary of the disk, we conclude that $u^{(\varepsilon)}$ is bounded by $M + \varepsilon$ on the interior of the disk as well. Sending $\varepsilon \to 0$ we obtain the claim.
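The effect of the perturbation can be seen numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it samples a function on a polar grid of the closed unit disk (grid sizes and function names are arbitrary choices made here) and reports where the largest sampled value occurs. The constant harmonic function $u = 0$ is the degenerate case in which the naive argument gives no contradiction, since its maximum is attained at interior points; after adding $\varepsilon(x^2+y^2)$ the maximum is forced onto the boundary.

```python
import math


def max_on_disk(func, n_r: int = 40, n_theta: int = 180):
    """Return (largest sampled value, radius where it first occurs) on a polar grid."""
    best_val, best_r = -math.inf, None
    for i in range(n_r + 1):
        r = i / n_r
        for j in range(n_theta):
            theta = 2.0 * math.pi * j / n_theta
            x, y = r * math.cos(theta), r * math.sin(theta)
            val = func(x, y)
            if val > best_val:          # strict: keep the first maximiser found
                best_val, best_r = val, r
    return best_val, best_r


def perturb(u, eps: float):
    """u^(eps)(x, y) = u(x, y) + eps * (x^2 + y^2), which has Laplacian 4 * eps."""
    return lambda x, y: u(x, y) + eps * (x * x + y * y)


def constant(x: float, y: float) -> float:
    """Degenerate harmonic function: u_xx = u_yy = 0 everywhere."""
    return 0.0


def saddle(x: float, y: float) -> float:
    """Harmonic function x^2 - y^2, bounded by M = 1 on the unit circle."""
    return x * x - y * y
```

Calling `max_on_disk(constant)` reports an interior radius, whereas `max_on_disk(perturb(constant, 0.01))` and `max_on_disk(perturb(saddle, 0.01))` both report radius 1, with the latter value not exceeding $M + \varepsilon = 1.01$ .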

Remark Of course, proposition 1 can also be proven by much more direct means, for instance via the Green's function for the disk. However, the argument given is extremely robust and applies to a large class of both linear and nonlinear elliptic and parabolic equations, including those with rough variable coefficients.

Exercise 1 Actually, we can prove more: we can ignore a "small" set from the boundary (in this exercise, a finite number of points) and the theorem still works. Use the maximum principle as already stated and create yet one more epsilon of room to prove that if $u: \overline{\Bbb D} \to {\Bbb R}$ is a smooth harmonic function on the closed unit disk and $M$ is a bound such that $u(x,y) \leq M$ on "nearly all" of the boundary, i.e., on $\partial{\Bbb D} \setminus \{\xi_1, \xi_2,...,\xi_n\}$ , then $u(x,y) \leq M$ on the interior as well. Hint: Create an epsilon of room with the functions $u_\varepsilon = u+\varepsilon\cdot v$ , where $v(z)=\sum_{j=1}^n \log\frac{|z-\xi_j|}{\alpha}$ and $\alpha$ is a constant greater than 2 (the diameter of ${\Bbb D}$ ).

Exercise 2 Use the maximum modulus principle to prove the Phragmén-Lindelöf principle: if f is complex analytic on the strip $\{ z: 0 \leq \hbox{Re}(z) \leq 1\}$ , is bounded in magnitude by 1 on the boundary of this strip, and obeys a growth condition $|f(z)| \leq C e^{|z|^C}$ on the interior of the strip, then f is bounded in magnitude by 1 throughout the strip. Hint: multiply f by $e^{-\varepsilon z^m}$ for some even integer m.

Example 5: Manipulating generalised functions

In PDE one is primarily interested in smooth (classical) solutions; but for a variety of reasons it is useful to also consider rougher solutions. Sometimes, these solutions are so rough that they are no longer functions, but are measures, distributions, or some other concept of "generalised function" or "generalised solution". For instance, the fundamental solution to a PDE is typically just a distribution or measure, rather than a classical function. A typical example: a (sufficiently smooth) solution to the three-dimensional wave equation $-u_{tt} + \Delta u = 0$ with initial position $u(0,x)=0$ and initial velocity $u_t(0,x) = g(x)$ is given by the classical formula

$u(t) = t g * \sigma_t$

where $\sigma_t$ is the unique rotation-invariant probability measure on the sphere $\{ (x,y,z) \in {\Bbb R}^3: x^2+y^2+z^2 = t^2 \}$ of radius t, or equivalently, the area element dS on that sphere divided by the surface area $4\pi t^2$ of that sphere. (The convolution $f*\mu$ of a smooth function $f$ and a (compactly supported) finite measure $\mu$ is defined by $f*\mu(x) := \int f(x-y)\ d\mu(y)$ .)

For this and many other reasons, it is important to manipulate measures and distributions in various ways. For instance, in addition to convolving functions with measures, it is also useful to convolve measures with measures; the convolution $\mu * \nu$ of two finite measures on ${\Bbb R}^n$ is defined as the measure which assigns to each measurable set E in ${\Bbb R}^n$ , the measure

$\mu*\nu(E) := \int \int 1_E(x+y)\ d\mu(x) d\nu(y).$ (5)

For the sake of concreteness, let's focus on a specific question, namely to compute (or at least estimate) the measure $\sigma * \sigma$ , where $\sigma$ is the normalised rotation-invariant measure on the unit circle $\{ x \in {\Bbb R}^2: |x|=1 \}$ . It turns out that while $\sigma$ is not absolutely continuous with respect to Lebesgue measure $m$ , the convolution is: $d(\sigma*\sigma) = f d m$ for some absolutely integrable function f on ${\Bbb R}^2$ . But what is this function f? It certainly is possible to compute it from the definition (5), or by other methods (e.g. the Fourier transform), but I would like to give one approach to computing these sorts of expressions involving measures (or other generalised functions) based on epsilon regularisation, which requires a certain amount of geometric computation but which I find to be rather visual and conceptual, compared to more algebraic approaches (e.g. based on Fourier transforms). The idea is to approximate a singular object, such as the singular measure $\sigma$ , by a smoother object $\sigma_\varepsilon$ , such as an absolutely continuous measure. For instance, one can approximate $\sigma$ by

$d\sigma_\varepsilon := \frac{1}{m(A_\varepsilon)} 1_{A_\varepsilon}\ dm$

where $A_\varepsilon := \{ x \in {\Bbb R}^2: 1-\varepsilon \leq |x| \leq 1+\varepsilon \}$ is a thin annulus. It is clear that $\sigma_\varepsilon$ converges to $\sigma$ in the vague topology, which implies that $\sigma_\varepsilon * \sigma_\varepsilon$ converges to $\sigma*\sigma$ in the vague topology also. Since

$\sigma_\varepsilon * \sigma_\varepsilon = \frac{1}{m(A_\varepsilon)^2} 1_{A_\varepsilon} * 1_{A_\varepsilon}\ dm,$

we will be able to understand the limit f by first considering the function

$f_\varepsilon(x) := \frac{1}{m(A_\varepsilon)^2} 1_{A_\varepsilon} * 1_{A_\varepsilon}(x) = \frac{m( A_\varepsilon \cap (x - A_\varepsilon))}{m(A_\varepsilon)^2}$

and then taking (weak) limits as $\varepsilon \to 0$ to recover f.

Up to constants, one can compute from elementary geometry that $m(A_\varepsilon)$ is comparable to $\varepsilon$ , and that $m( A_\varepsilon \cap (x - A_\varepsilon))$ vanishes for $|x| \geq 2 + 2 \varepsilon$ , is comparable to $\varepsilon^2 (2-|x|)^{-1/2}$ for $1 \leq |x| \leq 2 - 2 \varepsilon$ (and of size $O(\varepsilon^{3/2})$ in the transition region $|x| = 2 + O(\varepsilon)$ ), and is comparable to $\varepsilon^2 |x|^{-1}$ for $\varepsilon \leq |x| \leq 1$ (and of size about $O(\varepsilon)$ when $|x| \leq \varepsilon$ ). (This is a good exercise for anyone who wants practice in quickly computing the orders of magnitude of geometric quantities such as areas; for such order of magnitude calculations, quick and dirty geometric methods tend to work better here than the more algebraic calculus methods you would have learned as an undergraduate.) The bounds here are strong enough to allow one to take limits and conclude what f looks like: it is comparable to $|x|^{-1} (2-|x|)^{-1/2} 1_{|x| \leq 2}$ . And by being more careful with the computations of area, one can compute the exact formula for f(x) (more detail needed here).
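As a sanity check on the shape of f, one can test a candidate closed form numerically. The formula $f(x) = \frac{1}{\pi^2 |x| \sqrt{4-|x|^2}} 1_{|x| < 2}$ used below is not derived in this article and should be read as an assumption being tested; it is consistent with the order of magnitude $|x|^{-1}(2-|x|)^{-1/2}$ found above, since $4 - |x|^2 = (2-|x|)(2+|x|)$ . The sketch compares the mass that $\sigma * \sigma$ gives to the disk $\{|x| \leq s\}$ , computed by sampling the relative angle between two unit vectors, with the integral of the candidate density over the same disk, which works out to $\frac{2}{\pi} \arcsin(s/2)$ .

```python
import math


def mass_by_sampling(s: float, n: int = 100000) -> float:
    """Mass that sigma * sigma assigns to the disk of radius s.

    By rotation invariance only the relative angle theta between the two
    unit vectors matters, and |e^{i*0} + e^{i*theta}| = 2 * |cos(theta / 2)|.
    The angle is sampled on a uniform grid of n midpoints in [0, 2*pi).
    """
    count = 0
    for j in range(n):
        theta = 2.0 * math.pi * (j + 0.5) / n
        if 2.0 * abs(math.cos(theta / 2.0)) <= s:
            count += 1
    return count / n


def mass_from_density(s: float) -> float:
    """Integral of the candidate density over |x| <= s, for 0 <= s <= 2.

    In polar coordinates the integrand becomes 2 / (pi * sqrt(4 - r^2)),
    whose integral from 0 to s is (2 / pi) * asin(s / 2).
    """
    return (2.0 / math.pi) * math.asin(s / 2.0)
```

The two functions agree to within the grid resolution for radii between 0 and 2, which supports (but of course does not prove) the candidate formula.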

Remark Epsilon regularisation also sheds light on why certain operations on measures or distributions are not permissible. For instance, squaring the Dirac delta function $\delta$ will not give a measure or distribution, because if one looks at the squares $\delta_\varepsilon^2$ of some smoothed out approximations $\delta_\varepsilon$ to the Dirac function (i.e. approximations to the identity), one sees that their masses go to infinity in the limit $\varepsilon \to 0$ , and so cannot be integrated against test functions uniformly in $\varepsilon$ . On the other hand, derivatives of the delta function, while no longer measures (the total variation of derivatives of $\delta_\varepsilon$ become unbounded), are at least still distributions (the integrals of derivatives of $\delta_\varepsilon$ against test functions remain convergent).
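Both halves of this remark can be observed numerically. The sketch below makes several choices that are assumptions for illustration only: $\delta_\varepsilon$ is a triangular "hat" of height $1/\varepsilon$ supported on $[-\varepsilon, \varepsilon]$ , integrals are approximated by the midpoint rule, and $\sin$ stands in for a test function (it is not compactly supported, but only its values near 0 matter here). The pairing of $\delta_\varepsilon'$ with $\phi$ is computed as $-\int \delta_\varepsilon \phi'$ , i.e. after integrating by parts.

```python
import math


def hat(x: float, eps: float) -> float:
    """Triangular approximation to the identity: integral 1, support [-eps, eps]."""
    return max(0.0, 1.0 - abs(x) / eps) / eps


def integrate(func, a: float, b: float, n: int = 20000) -> float:
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def mass_of_square(eps: float) -> float:
    """Integral of hat^2; the exact value is 2 / (3 * eps), which blows up as eps -> 0."""
    return integrate(lambda x: hat(x, eps) ** 2, -eps, eps)


def pair_with_derivative(eps: float, dphi) -> float:
    """Pairing of hat' with phi, computed as -integral of hat * phi'."""
    return -integrate(lambda x: hat(x, eps) * dphi(x), -eps, eps)
```

As $\varepsilon$ shrinks, `mass_of_square(eps)` grows like $1/\varepsilon$ , while `pair_with_derivative(eps, math.cos)` settles near $-\phi'(0) = -1$ for $\phi = \sin$ .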

Comments

Inline comments

The following comments were made inline in the article.

Tue, 05/05/2009 - 23:50 — gowers

I find this rather hard to understand. I also find it difficult to see what it adds to the quick description. Perhaps instead one could have an easy example, such as approximating a function in $L_2(\mathbb{T})$ by a trigonometric polynomial by first approximating by a step function, then approximating the step function by a continuous function, and finally approximating the continuous function uniformly by a trigonometric polynomial. The first two steps tell you that you can find a sequence of continuous functions that approximate your function in $L_2$ , and the property of being approximable in $L_2$ is obviously closed under limits. (Incidentally, I think "closed under limits" would be clearer than "commutes with limits in some sense".)

First paragraph

Wed, 06/05/2009 - 00:36 — JoseBrox

I'll try to answer:

1. Why exactly do you find it hard to understand? Is it caused by a word? Should it be wholly rephrased? (English is not my native language, so my writing can be convoluted or even plainly wrong!)

2. What does it add to the quick description? I wanted to make the article more "friendly". I wanted to write one paragraph that an undergraduate could fully understand; I think the quick description is too technical for that level, because understanding it implies being fairly used to "trick thinking" already - and then, at this level the trick is rather obvious!
With this in mind (not everyone who comes here should be used enough to "tricks jargon"!), I included the simplest example because it is conceptually near to the "limiting argument" technique, which I think is well-known for undergraduates.

3. Why don't just have an easy example? Because an example, being as good as any other tool, is not an explanation. Actually, to properly show a general technique, you must give a metaexample (a collection of similar examples or an abstract example), and that's what I tried to do: give this example where the approximation is done just by a sequence and the approximating properties are not just converging to the desired property - they are the very same during all the process.
In my humble opinion, examples should be confined to the "examples" section and the general discussion and quick description should give abstract definitions and metaexamples.

4. Does "closed under limits" really sound clearer in English or is it just "more rigorous"? (I ask because I don't really know!). To me, the idea that the "operators" 'limit' and 'property' commute looked like the easiest picture to grasp, and I just wanted the general idea to be understood. If an English student would understand "closed under limits" better, then change it, please!

Wed, 06/05/2009 - 10:50 — gowers

Let me try to explain in detail.

The first, simplest instance of this trick

The word "instance" leads the reader to expect an example – indeed, from what you say, the simplest non-trivial example.

would be

Do you mean "is"?

to prove that an object $x_0$ satisfies a property $S$ ,

Ah – it's not an example after all but a very abstract situation (since the concepts of "object" and "property" are about as general as you can get).

which we know that conmutes with a limit in some sense,

An undergraduate would not be comfortable with this notion of "commutes", and "in some sense" is unnecessarily vague: one should say something more precise, such as "A limit of objects that satisfy $S$ also satisfies $S$ ."

by finding a sequence of objects $x_i$ such that $S$ is invariant on it

The word "invariant" is unnecessarily off-putting. Does this mean just that $S$ holds for every $x_i$ ? Or do you mean that $S$ is actually a function that is constant on the $x_i$ ?

and such that it converges (in the same limit sense) to $x_0$ .

Ah, now I start to wonder whether "in some sense" above was referring to the limit rather than the commutation. In any case, better to declare right at the start that we think of $x_0$ as a limit of the $x_i$ .

The power of the method goes in the difficulty of proving the satisfaction of $S$ : $S$ should be easy to prove for every $x_i$ by some standard techniques, but difficult for $x_0$ with the same techniques.

I'm afraid I completely disagree with you about the status of examples. One example makes this clear, whereas with no examples one might think, "Surely if $x_0$ is the limit of the $x_i$ , then the only way of proving that $x_0$ satisfies $S$ will be along the lines you suggest. So what's the trick here?" Of course, the answer is that you actually construct the sequence, and do so in such a way that it consists of simpler objects (something you don't say here).

Note that this instance is somewhat the "reciprocal" of a limiting process (where we replace a sequence by its limit).

I don't really see what you're getting at here.

But of course,

If you're talking to undergraduates, then you shouldn't say "of course" about something like this.

we don't need $S$ to be invariant in the sequence: we just need to have a perturbed property $S_i$ for every $x_i$ such that $S_i$ converges to $S$

Without an example, what undergraduate will know what you mean by a sequence of properties converging to another property?

as $x_i$ converges to $x_0$ . Moreover, we don't need the index set to be countable: we can use a net; for example, we can indize with the reals and use the variable $\varepsilon$ .

This is also not a suitable remark for a "friendly" introduction!

Incidentally, I agree that the existing quick description is a little bit sophisticated for an undergraduate. My solution would probably be to start with a simple example, then continue with a more abstract definition of that example, then give the table, and then give the existing examples.

Wed, 06/05/2009 - 12:08 — gowers

Any chance of telling us what these principles are? I probably ought to know, but I don't, and maybe others don't either.

Article has been truncated

Wed, 05/01/2011 - 16:44 — tao

For some strange reason, all revisions of this article have been truncated at the first paragraph, and I cannot find a way to retrieve any subsequent portion of the text. Has this also occurred for other articles? Is there any way to recover some copy of the full article?

Article restored

Mon, 10/01/2011 - 09:08 — olof

A few other articles had also been mysteriously truncated, but I have now restored them (and this one). If anyone spots another page that does not look right then please let us know, for example via the forums.
