### Quick description

Suppose you have a function and want to prove that it can be approximated by a smoother function. A method that often works is convolution. If you convolve a function with the delta function, you don't change it; if you convolve it with a smooth approximation to the delta function, then you may well change it only very slightly and end up with a smooth function. Indeed, a general principle that often applies is that a convolution of two functions inherits the nice properties of each function.

### Prerequisites

Basic real analysis

### Example 1

Weierstrass's approximation theorem states that every continuous function $f:[0,1]\to\mathbb{R}$ can be uniformly approximated by polynomials. That is, for every $\epsilon>0$ there exists a polynomial $P$ such that $|f(x)-P(x)|\le\epsilon$ for every $x\in[0,1]$.

There are several ways of proving this result (though they are not always as distinct as they at first appear). Let us prove it here by starting with the observation, which we shall not try to state rigorously, that if you convolve $f$ with the delta-function, then you get $f$.

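What does this mean? Well, the convolution $f*g$ of two functions $f$ and $g$ is the function defined by the formula

$$(f*g)(x)=\int_{-\infty}^{\infty}f(x-y)g(y)\,dy.$$

The delta-function $\delta$ is not really a function, but whatever it is, it has the property that $\int_{-\infty}^{\infty}\delta(y)\,dy=1$, and we can think of it as taking the value $\infty$ "with mass 1" at $0$ and the value $0$ everywhere else. Therefore,

$$(f*\delta)(x)=\int_{-\infty}^{\infty}f(x-y)\delta(y)\,dy=f(x).$$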

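Now let us do another calculation. What is $\int_{-\infty}^{\infty}f(x-y)y^n\,dy$? We shall assume that $f$ decays sufficiently rapidly for this integral to be finite. (Indeed, soon we shall assume that $f$ vanishes outside some interval.) Substituting $x-y$ for $y$ we can rewrite the integral as $\int_{-\infty}^{\infty}f(y)(x-y)^n\,dy$. If we now differentiate under the integral sign $n+1$ times with respect to $x$, we keep differentiating the term $(x-y)^n$ and end up with $0$. A function whose $(n+1)$st derivative vanishes identically is a polynomial of degree at most $n$. Therefore, the convolution of $f$ with $x^n$ is a polynomial of degree at most $n$.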
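The degree bound can also be seen without differentiating. The following is only a sketch, under the assumption that the moments $m_k=\int_{-\infty}^{\infty}f(y)y^k\,dy$ are finite for $0\le k\le n$ (the name $m_k$ is introduced here purely for illustration). Expanding $(x-y)^n$ by the binomial theorem gives

$$\int_{-\infty}^{\infty}f(y)(x-y)^n\,dy=\sum_{k=0}^{n}\binom{n}{k}(-1)^k x^{n-k}\int_{-\infty}^{\infty}f(y)y^k\,dy=\sum_{k=0}^{n}(-1)^k\binom{n}{k}m_k\,x^{n-k},$$

which is visibly a polynomial in $x$ of degree at most $n$, with leading coefficient $m_0=\int_{-\infty}^{\infty}f(y)\,dy$.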

We have just shown that convolving by a delta-function has no effect, and (by linearity) that convolving by a polynomial of degree $n$ gives a polynomial of degree at most $n$. This suggests a possible way of approximating $f$ by a polynomial: convolve it by a polynomial approximation of the delta-function. And this, give or take one small technicality, works.

The small technicality is that we would like $f$ to be continuous and defined on all of $\mathbb{R}$ rather than just on $[0,1]$. We would also like it to have good decay, so all we do is replace it by a continuous function that equals $f$ on $[0,1]$ and vanishes outside $[-1,2]$. Let us call this new function $g$. Our aim will be to approximate $g$ uniformly by a polynomial on $[0,1]$ even though both $g$ and the polynomial are defined on $\mathbb{R}$. (Obviously we can't hope to approximate $g$ uniformly on all of $\mathbb{R}$, since any non-constant polynomial will tend to $\pm\infty$.)

So now let us try to approximate the delta-function by a polynomial. What we mean by this is that we would like a polynomial $P$ that "looks like the delta-function" for all values that have any chance of being involved when we convolve $P$ with $g$. Since $g$ vanishes outside a bounded interval, this just means that our polynomial should look like the delta-function inside some appropriate bounded interval. The interval $[-3,3]$ will do fine (and is in fact bigger than necessary). For a polynomial $P$ to look like the delta-function on this interval, we would like $P$ to take non-negative values there (this is not essential, but it is nice), and for $\int_{-3}^{3}P(y)\,dy$ to be almost equal to $\int_{-\delta}^{\delta}P(y)\,dy$ for some small $\delta>0$, and for both of these integrals to be approximately $1$. Thus, we take the delta-function and replace "mass $1$ on an interval of width zero and zero everywhere else" by "mass approximately $1$ on a very narrow interval and almost zero everywhere else".

It is easy to construct such a polynomial. First we take a polynomial that has a unique maximum at the origin and is non-negative on $[-3,3]$. The simplest example is $1-x^2/9$, so let us take that, but the precise choice is not too important. Next, we raise this polynomial to some large power $n$, obtaining a polynomial $(1-x^2/9)^n$. This function is $1$ at the origin, and becomes very small as soon as you get any distance from the origin. (To be more precise about this, we can think of $(1-x^2/9)^n$ as being something like $e^{-nx^2/9}$, which becomes small when $|x|$ is a large multiple of $n^{-1/2}$. The appearance of a Gaussian function here is not a total coincidence ...)

We want our delta-function imitation to integrate to approximately $1$ over the interval $[-3,3]$, so let us define $P_n(x)$ to be $c_n(1-x^2/9)^n$, where $c_n^{-1}=\int_{-3}^{3}(1-x^2/9)^n\,dx$. An easy but slightly tiresome calculation, which we shall omit here, shows that even after we have multiplied by the constant $c_n$, which will be quite large, $P_n$ is very small outside an interval about $0$ whose width tends to zero as $n$ tends to infinity.
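The omitted calculation can at least be checked numerically. The sketch below is illustrative only: it assumes the kernel $(1-y^2/9)^n$ on $[-3,3]$, and the helper names `kernel_mass` and `inner_fraction`, the midpoint rule, and the step count are all choices made here rather than part of the argument. It estimates what fraction of the mass of the normalised kernel lies inside $[-\delta,\delta]$.

```python
def kernel_mass(n, delta, half_width=3.0, steps=20000):
    """Midpoint-rule integrals of (1 - (y/half_width)^2)^n.

    Returns (total, inner): the integral over [-half_width, half_width]
    and the part of it coming from |y| <= delta.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    inner = 0.0
    for i in range(steps):
        y = -half_width + (i + 0.5) * h
        value = (1.0 - (y / half_width) ** 2) ** n
        total += value * h
        if abs(y) <= delta:
            inner += value * h
    return total, inner


def inner_fraction(n, delta):
    """Fraction of the mass of the normalised kernel lying in [-delta, delta]."""
    total, inner = kernel_mass(n, delta)
    return inner / total


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, inner_fraction(n, 0.5))
```

For a fixed $\delta$ the printed fraction should increase towards $1$ as $n$ grows, which is the concentration property the proof relies on.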

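What happens when we convolve $P_n$ with our function $g$? Let $x\in[0,1]$. Then

$$(g*P_n)(x)=\int_{-\infty}^{\infty}g(x-y)P_n(y)\,dy=\int_{-3}^{3}g(x-y)P_n(y)\,dy\approx\int_{-\delta}^{\delta}g(x-y)P_n(y)\,dy.$$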

The first equality is the definition of convolution. The second uses the fact that $g$ vanishes outside $[-1,2]$ and that $x\in[0,1]$. For any fixed $\delta>0$, the approximation is valid provided $n$ is large enough, since $g$, being continuous, is bounded in modulus by some constant $M$, and we can ensure with our choice of $n$ that $P_n(y)$ is much smaller than $1/M$ whenever $\delta\le|y|\le3$.

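This still leaves us free to choose $\delta$. We do that as follows. If we want to approximate $g(x)$ to within $\epsilon$ for every $x\in[0,1]$, then we choose $\delta$ such that $|y|\le\delta$ implies that $|g(x-y)-g(x)|$ is less than $\epsilon/2$. This we can do because $g$ is uniformly continuous. This step is using the standard result that a continuous function on a closed bounded interval is uniformly continuous, which is like being continuous except that we can choose the same $\delta$ for every single $x$ rather than having to let $\delta$ depend on $x$. Then, since $P_n$ is non-negative on $[-3,3]$,

$$\left|\int_{-\delta}^{\delta}g(x-y)P_n(y)\,dy-g(x)\int_{-\delta}^{\delta}P_n(y)\,dy\right|\le\frac{\epsilon}{2}\int_{-\delta}^{\delta}P_n(y)\,dy\le\frac{\epsilon}{2}.$$

But

$$\int_{-\delta}^{\delta}P_n(y)\,dy\approx1,$$

so we are done.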
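The whole argument can be tried out numerically. What follows is a rough sketch under several assumptions that are not part of the proof: the target is $f(t)=|t-\tfrac12|$ on $[0,1]$, it is extended linearly to a function vanishing outside $[-1,2]$, the kernel is the normalised $(1-y^2/9)^n$ on $[-3,3]$, and the integral is approximated by the midpoint rule. The names `g`, `convolve_with_kernel` and `sup_error` are hypothetical. The code evaluates the convolution pointwise; it does not compute the coefficients of the approximating polynomial.

```python
def g(t):
    """Continuous extension of f(t) = |t - 0.5| from [0, 1] to the real line.

    Equals f on [0, 1], is linear on [-1, 0] and [1, 2], and is zero elsewhere.
    """
    if 0.0 <= t <= 1.0:
        return abs(t - 0.5)
    if -1.0 < t < 0.0:
        return 0.5 * (t + 1.0)
    if 1.0 < t < 2.0:
        return 0.5 * (2.0 - t)
    return 0.0


def convolve_with_kernel(x, n, steps=3000):
    """Midpoint-rule value of the integral of g(x - y) P_n(y) over [-3, 3].

    P_n is (1 - y^2/9)^n divided by its integral; the normalisation is done
    by dividing two sums that use the same grid.
    """
    h = 6.0 / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        y = -3.0 + (i + 0.5) * h
        weight = (1.0 - y * y / 9.0) ** n
        numerator += g(x - y) * weight
        denominator += weight
    return numerator / denominator


def sup_error(n, samples=21):
    """Largest error |(g * P_n)(x) - g(x)| over equally spaced x in [0, 1]."""
    xs = [i / (samples - 1) for i in range(samples)]
    return max(abs(convolve_with_kernel(x, n) - g(x)) for x in xs)


if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, sup_error(n))
```

The error should shrink as $n$ grows, though only slowly for this example, because the kinks of $g$ are smoothed at a scale of roughly $n^{-1/2}$.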

### General discussion

To summarize, the strategy of the proof above was as follows.

1. Observe that the delta-function is an identity for the binary operation of convolution.
2. Observe that convolving with a polynomial gives you a polynomial.
3. Approximate the delta-function by a polynomial $P$, in some appropriate sense.
4. Then one can expect that convolving $f$ with $P$ ought to give a polynomial that approximates $f$.
5. Work out the details.

### Example 2

Let $f$ be a uniformly continuous function defined on the whole of $\mathbb{R}$. (As an example, one could take a continuous piecewise linear function that zigzags up and down, always with gradient $\pm1$.) We would like to approximate $f$ uniformly by an infinitely differentiable function. How can we do so?

Let us follow a very similar strategy. We shall take an infinitely differentiable approximation to the delta-function and convolve $f$ with that. We describe the proof only very briefly, since it is similar to the proof in the first example. First of all, let us suppose that we have managed to find a non-negative function $\phi$ that is infinitely differentiable, integrates to $1$, and vanishes outside a very small interval about $0$. As before, differentiation under the integral sign allows us to prove that $f*\phi$ is also infinitely differentiable, and the proof that it uniformly approximates $f$ is basically the same as the proof in the previous example. So all that is left is to construct $\phi$.

To do this we use the well-known function $\psi(x)=e^{-1/x}$ when $x>0$ and $\psi(x)=0$ when $x\le0$. This function is infinitely differentiable, non-negative, and zero when $x\le0$. We then let $\theta(x)=\psi(1-x)\psi(1+x)$. This function is obviously still infinitely differentiable (since it is a product of infinitely differentiable functions), and vanishes when $x\ge1$ or $x\le-1$. And just for good measure it is an even function with maximum at $0$. The next step in the building process is to adjust the height and width of $\theta$ to taste, by defining $\phi(x)=\lambda\theta(\mu x)$ for positive constants $\lambda$ and $\mu$ of our choice. The constant $\mu$ allows us to choose the interval outside which $\phi$ vanishes (which will be $[-\mu^{-1},\mu^{-1}]$) and $\lambda$ allows us to ensure that $\int_{-\infty}^{\infty}\phi(x)\,dx=1$.
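As an illustration, the construction can be sketched in a few lines of code. The sketch assumes the concrete choice $\psi(x)=e^{-1/x}$ for $x>0$ and $\psi(x)=0$ otherwise, and it picks the height constant numerically with a midpoint rule rather than in closed form. The helper names `integrate` and `make_phi` are hypothetical.

```python
import math


def psi(x):
    """exp(-1/x) for x > 0 and 0 for x <= 0 (smooth, flat at the origin)."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def theta(x):
    """Even bump psi(1 - x) * psi(1 + x); zero whenever |x| >= 1."""
    return psi(1.0 - x) * psi(1.0 + x)


def integrate(func, a, b, steps=20000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def make_phi(mu):
    """Return phi(x) = lam * theta(mu * x), supported on [-1/mu, 1/mu].

    lam is chosen numerically so that phi integrates to (approximately) 1.
    """
    lam = 1.0 / integrate(lambda x: theta(mu * x), -1.0 / mu, 1.0 / mu)
    return lambda x: lam * theta(mu * x)


if __name__ == "__main__":
    phi = make_phi(10.0)
    print(integrate(phi, -0.1, 0.1))  # close to 1 by construction
    print(phi(0.0) > 0.0, phi(0.2))   # positive at the centre, zero outside the support
```

Increasing `mu` narrows the support while the normalisation keeps the total mass at $1$, so `make_phi(mu)` behaves more and more like the delta-function.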

### Example 3

The previous example also implies that the space $C_c^\infty(\mathbb{R})$ of smooth, compactly supported functions (or "test functions") is dense in $L^p(\mathbb{R})$ for any $1\le p<\infty$. Indeed, any function in $L^p(\mathbb{R})$ can be approximated to arbitrary accuracy in $L^p$ norm by a continuous, compactly supported function (this can be seen for instance by truncating the function to be compactly supported and then applying Lusin's theorem), and by convolution with a smooth approximation to the identity, the latter function can in turn be approximated uniformly (and hence in $L^p$, thanks to the compact support) by a smooth, compactly supported function.

A variant of this argument also shows that if $f\in L^p(\mathbb{R})$ with $1\le p<\infty$ and $\phi_n$ is a sequence of approximations to the identity, then $f*\phi_n$ converges to $f$ in $L^p$ (since this is true for the dense subclass of test functions, and one can take limits using Young's inequality).
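A small numerical experiment illustrates the difference between uniform and $L^p$ convergence here. It is only a sketch, with assumptions chosen for convenience: $p=1$, the function is the indicator of $[0,1]$, the approximation to the identity is a rescaled bump $\exp(-2/(1-x^2))$ on $[-1,1]$, and all integrals are midpoint-rule estimates. The names `mollified_indicator` and `l1_distance` are hypothetical.

```python
import math


def theta(x):
    """Smooth bump supported on [-1, 1]: exp(-2 / (1 - x^2)) inside, 0 outside."""
    return math.exp(-2.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def theta_mass(steps=100):
    """Midpoint-rule integral of theta over [-1, 1]."""
    h = 2.0 / steps
    return h * sum(theta(-1.0 + (i + 0.5) * h) for i in range(steps))


THETA_MASS = theta_mass()


def mollified_indicator(x, width, steps=100):
    """Value at x of f * phi, where f is the indicator of [0, 1] and
    phi(y) = theta(y / width) / (width * THETA_MASS) is supported on [-width, width]."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps):
        y = -width + (i + 0.5) * h
        if 0.0 <= x - y <= 1.0:
            total += theta(y / width) * h
    return total / (width * THETA_MASS)


def l1_distance(width, grid=1500):
    """Midpoint-rule estimate of the L^1 norm of f * phi - f over [-1, 2]."""
    h = 3.0 / grid
    total = 0.0
    for j in range(grid):
        x = -1.0 + (j + 0.5) * h
        fx = 1.0 if 0.0 <= x <= 1.0 else 0.0
        total += abs(mollified_indicator(x, width) - fx) * h
    return total


if __name__ == "__main__":
    for width in (0.5, 0.25, 0.125):
        print(width, l1_distance(width))
```

The $L^1$ distance should shrink roughly in proportion to the width of the kernel. The uniform distance cannot tend to zero in this example, since $f*\phi_n$ is continuous while $f$ jumps by $1$.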

### General discussion

The ability to approximate rough functions by smoother ones is often employed in the trick "Create an epsilon of room".

### Example 4

A fact that is sometimes of use in convex geometry is that if you have a norm on $\mathbb{R}^n$, then you can approximate it arbitrarily closely by a norm that is infinitely differentiable except at $0$. It would be good to have a detailed sketch proof of this fact, which I think would be very tedious to prove directly.

Let $\|\cdot\|$ be a norm on a finite-dimensional vector space $V$. Take a non-negative smooth function $\phi$ on the space $GL(V)$ of linear isomorphisms of $V$, with support a compact neighborhood of the identity. Then integrating with respect to the Haar measure $\mu$ on $GL(V)$, the function

$$x\mapsto\int_{GL(V)}\phi(g)\,\|gx\|\,d\mu(g)$$

is smooth away from $0$, nonnegative, and convex, being a positive linear combination of the convex functions $x\mapsto\|gx\|$. It likewise scales linearly, as each $\|gx\|$ does; hence it is a norm that is smooth away from $0$.

In order to better approximate $\|\cdot\|$ as above, it might be handy to use the exponential map from the Lie algebra $\mathfrak{gl}(V)$, which is again a vector space and naturally supports re-scaling of parameters.

## Comments

## Showing that the Schwartz space is dense in $L^p$

Thu, 07/05/2009 - 03:00, tao

...is another good example of this trick, which I just covered in my class actually. I might put it in here later (and also interlink with "create an epsilon of room".)

## On Example 4

Fri, 15/05/2009 - 14:58, mckeown_j.c

Also a good example for uses of duality is the case: consider a sublevel set for the norm, and take its Minkowski sum with a smooth and suitably symmetric convex region; its size doesn't matter! The result is an approximating symmetric convex region, dual to a norm.