Quick description
Suppose you have a function and want to prove that it can be approximated by a smoother function. A method that often works is convolution. If you convolve a function with the delta function, you don't change it; if you convolve it with a smooth approximation to the delta function, then you may well change it only very slightly and end up with a smooth function. Indeed, a general principle that often applies is that a convolution of two functions inherits the nice properties of each function.
Prerequisites
Basic real analysis
Example 1
Weierstrass's approximation theorem states that every continuous function $f:[0,1]\to\mathbb{R}$ can be uniformly approximated by polynomials. That is, for every $\epsilon>0$ there exists a polynomial $P$ such that $|f(x)-P(x)|\le\epsilon$ for every $x\in[0,1]$.
There are several ways of proving this result (though they are not always as distinct as they at first appear). Let us prove it here by starting with the observation, which we shall not try to state rigorously, that if you convolve $f$ with the delta-function, then you get $f$.
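What does this mean? Well, the convolution of two functions $f$ and $g$ is the function $f*g$ defined by the formula
$$f*g(x)=\int_{-\infty}^{\infty}f(x-y)g(y)\,dy.$$
The delta-function $\delta$ is not really a function, but whatever it is, it has the property that $\int_{-\infty}^{\infty}\delta(y)\,dy=1$, and we can think of it as taking the value $\infty$ "with mass 1" at $0$ and the value $0$ everywhere else. Therefore,
$$f*\delta(x)=\int_{-\infty}^{\infty}f(x-y)\delta(y)\,dy=f(x).$$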
Now let us do another calculation. What is $f*x^n$, that is, $\int_{-\infty}^{\infty}f(x-y)y^n\,dy$? We shall assume that $f$ decays sufficiently rapidly for this integral to be finite. (Indeed, soon we shall assume that $f$ vanishes outside some interval.) Substituting $x-y$ for $y$ we can rewrite the integral as $\int_{-\infty}^{\infty}f(y)(x-y)^n\,dy$. If we now differentiate under the integral sign $n+1$ times with respect to $x$, we keep differentiating the $(x-y)^n$ term and end up with $0$. Therefore, $f*x^n$ is a polynomial of degree at most $n$.
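As a rough numerical sanity check of this claim (a sketch only, not part of the proof), one can discretise the integral $\int f(y)(x-y)^n\,dy$ and verify that its $(n+1)$-th finite differences in $x$ vanish. The tent function `f`, the helper names and the use of the midpoint rule are illustrative assumptions, not anything fixed by the argument.

```python
# Sketch: F(x) = integral of f(y) * (x - y)**n dy should be a polynomial
# of degree <= n in x, so its (n+1)-th finite differences should vanish.

def f(y):
    """Hypothetical test function: a tent supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(y))

def conv_with_power(x, n, a=-1.0, b=1.0, m=2000):
    """Midpoint-rule approximation of the integral of f(y) * (x - y)**n over [a, b]."""
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        y = a + (i + 0.5) * h
        total += f(y) * (x - y) ** n
    return h * total

def finite_difference(values, order):
    """Apply the forward difference operator `order` times."""
    for _ in range(order):
        values = [v2 - v1 for v1, v2 in zip(values, values[1:])]
    return values

n = 3
xs = [0.25 * k for k in range(8)]               # equally spaced sample points
F = [conv_with_power(x, n) for x in xs]
diffs_n = finite_difference(F, n)               # constant for a cubic
diffs_n_plus_1 = finite_difference(F, n + 1)    # zero up to rounding
print(diffs_n, diffs_n_plus_1)
```

The $n$-th differences come out constant and the $(n+1)$-th differences are zero up to rounding, which is exactly what one expects from a polynomial of degree at most $n$.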
We have just shown that convolving by a delta-function has no effect, and, by linearity, that convolving by a polynomial of degree $n$ gives a polynomial of degree at most $n$. This suggests a possible way of approximating $f$ by a polynomial: convolve it by a polynomial approximation of the delta-function. And this, give or take one small technicality, works.
The small technicality is that we would like $f$ to be continuous and defined on all of $\mathbb{R}$ rather than just on $[0,1]$. We would also like it to have good decay, so all we do is replace it by a continuous function from $\mathbb{R}$ to $\mathbb{R}$ that equals $f$ on $[0,1]$ and vanishes outside $[-1,2]$. Let us call this new function $g$. Our aim will be to approximate $g$ uniformly by a polynomial on $[0,1]$ even though both $g$ and the polynomial are defined on $\mathbb{R}$. (Obviously we can't hope to approximate $g$ uniformly on all of $\mathbb{R}$, since any non-constant polynomial will tend to $\pm\infty$.)
So now let us try to approximate the delta-function by a polynomial. What we mean by this is that we would like a polynomial that "looks like the delta-function" for all values that have any chance of being involved when we convolve with $g$. Since $g$ vanishes outside a bounded interval, this just means that our polynomial should look like the delta-function inside some appropriate bounded interval. The interval $[-3,3]$ will do fine (and is in fact bigger than necessary). For a polynomial $Q$ to look like the delta-function on this interval, we would like $Q$ to take non-negative values there (this is not essential, but it is nice), and for $\int_{-3}^{3}Q(x)\,dx$ to be almost equal to $\int_{-\delta}^{\delta}Q(x)\,dx$ for some small $\delta>0$, and for both of these integrals to be approximately 1. Thus, we take the delta-function and replace "mass $1$ on an interval of width zero and zero everywhere else" by "mass approximately $1$ on a very narrow interval and almost zero everywhere else".
It is easy to construct such a polynomial. First we take a polynomial that has a unique maximum at the origin and is non-negative on $[-3,3]$. The simplest example is $1-x^2/9$, so let us take that, but the precise choice is not too important. Next, we raise this polynomial to some large power $n$, obtaining a polynomial $(1-x^2/9)^n$. This function is $1$ at the origin, and becomes very small as soon as you get any distance from the origin. (To be more precise about this, we can think of $(1-x^2/9)^n$ as being something like $e^{-nx^2/9}$, which becomes small when $x$ is a large multiple of $n^{-1/2}$. The appearance of a Gaussian function here is not a total coincidence ...)
We want our delta-function imitation to integrate to approximately $1$ over the interval $[-3,3]$, so let us define $Q_n(x)$ to be $c_n(1-x^2/9)^n$, where $c_n^{-1}=\int_{-3}^{3}(1-x^2/9)^n\,dx$. An easy but slightly tiresome calculation, which we shall omit here, shows that even after we have multiplied by the constant $c_n$, which will be quite large, $Q_n$ is very small outside an interval of width that tends to zero as $n\to\infty$.
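The omitted calculation can at least be checked numerically. The following sketch assumes the concrete choice $Q_n(x)=c_n(1-x^2/9)^n$ on $[-3,3]$ and uses a midpoint rule; the function names (`p_power`, `midpoint`, `tail_mass`, `sup_outside`) are hypothetical helpers introduced only for this illustration.

```python
# Sketch: Q_n(x) = c_n * (1 - x^2/9)^n on [-3, 3], normalised so that its
# integral over [-3, 3] is 1 (up to quadrature error).

def p_power(x, n):
    """The unnormalised kernel (1 - x^2/9)^n."""
    return (1.0 - x * x / 9.0) ** n

def midpoint(func, a, b, m=6000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / m
    return h * sum(func(a + (i + 0.5) * h) for i in range(m))

def c(n):
    """Normalising constant c_n = 1 / integral of (1 - x^2/9)^n over [-3, 3]."""
    return 1.0 / midpoint(lambda x: p_power(x, n), -3.0, 3.0)

def tail_mass(n, delta):
    """Fraction of the mass of Q_n lying outside [-delta, delta]."""
    inside = midpoint(lambda x: p_power(x, n), -delta, delta)
    return 1.0 - c(n) * inside

def sup_outside(n, delta):
    """Largest value of Q_n on delta <= |x| <= 3 (attained at |x| = delta,
    because Q_n decreases as |x| increases)."""
    return c(n) * p_power(delta, n)

delta = 0.5
for n in (10, 100, 1000):
    print(n, c(n), tail_mass(n, delta), sup_outside(n, delta))
```

For a fixed `delta`, the constant `c(n)` grows slowly (roughly like $\sqrt{n}$) while both the mass and the supremum of $Q_n$ outside $[-\delta,\delta]$ shrink rapidly, which is the behaviour the text relies on.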
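What happens when we convolve $Q_n$ with our extended function $g$? Let $x\in[0,1]$. Then
$$g*Q_n(x)=\int_{-\infty}^{\infty}g(x-y)Q_n(y)\,dy=\int_{-3}^{3}g(x-y)Q_n(y)\,dy\approx\int_{-\delta}^{\delta}g(x-y)Q_n(y)\,dy.$$
The first equality is the definition of convolution. The second uses the fact that $g$ vanishes outside $[-1,2]$ and that $x\in[0,1]$, so that $g(x-y)=0$ unless $|y|\le 2$. For any fixed $\delta$, the approximation is valid provided $n$ is large enough, since $g$, being continuous, is bounded in modulus by some constant $M$, and we can ensure with our choice of $n$ that $Q_n(y)$ is much smaller than $\epsilon/M$ whenever $\delta\le|y|\le 3$.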
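This still leaves us free to choose $\delta$. We do that as follows. If we want $g*Q_n(x)$ to approximate $g(x)$ to within $\epsilon$ for every $x\in[0,1]$, then we choose $\delta$ such that $|y|\le\delta$ implies that $|g(x-y)-g(x)|$ is less than $\epsilon$. This we can do because $g$ is uniformly continuous. This step is using the standard result that a continuous function on a closed bounded interval is uniformly continuous, which is like being continuous except that we can choose the same $\delta$ for every single $x$ rather than having to let $\delta$ depend on $x$. Then
$$\left|\int_{-\delta}^{\delta}g(x-y)Q_n(y)\,dy-g(x)\int_{-\delta}^{\delta}Q_n(y)\,dy\right|\le\epsilon\int_{-\delta}^{\delta}Q_n(y)\,dy.$$
But $\int_{-\delta}^{\delta}Q_n(y)\,dy\approx 1$, so we are done.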
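To see the whole argument in action, here is a small numerical sketch. It assumes the concrete choices $g$ supported in $[-1,2]$ and the kernel proportional to $(1-y^2/9)^n$ on $[-3,3]$; the target function `f`, the linear extension used in `g`, and the quadrature parameters are hypothetical choices made only for illustration. The code evaluates $g*Q_n$ numerically at sample points rather than computing polynomial coefficients.

```python
# Sketch: measure the sup-norm error of g * Q_n against f on [0, 1].

def f(x):
    """Hypothetical target on [0, 1]: continuous but not differentiable at 1/2."""
    return abs(x - 0.5)

def g(x):
    """Continuous extension of f: equal to f on [0, 1], linear down to 0
    on [-1, 0] and on [1, 2], and zero elsewhere."""
    if 0.0 <= x <= 1.0:
        return f(x)
    if -1.0 < x < 0.0:
        return f(0.0) * (x + 1.0)
    if 1.0 < x < 2.0:
        return f(1.0) * (2.0 - x)
    return 0.0

def sup_error(n, m=3000, samples=21):
    """Max over a grid in [0, 1] of |g*Q_n(x) - f(x)|, using the midpoint
    rule on [-3, 3] for the convolution integral."""
    h = 6.0 / m
    ys = [-3.0 + (i + 0.5) * h for i in range(m)]
    weights = [(1.0 - y * y / 9.0) ** n for y in ys]
    total = sum(weights)  # dividing by this plays the role of multiplying by c_n
    worst = 0.0
    for k in range(samples):
        x = k / (samples - 1)
        approx = sum(g(x - y) * w for y, w in zip(ys, weights)) / total
        worst = max(worst, abs(approx - f(x)))
    return worst

errors = {n: sup_error(n) for n in (50, 400, 1600)}
print(errors)
```

The error decreases as $n$ grows, though only slowly for a function with a corner: the kernel has width of order $n^{-1/2}$, so the error near a kink is of the same order.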
General discussion
To summarize, the strategy of the proof above was as follows.
Observe that the delta-function is an identity for the binary operation of convolution.
Observe that convolving with a polynomial gives you a polynomial.
Approximate the delta-function by a polynomial $Q$, in some appropriate sense.
Then one can expect that convolving $f$ with $Q$ ought to give a polynomial that approximates $f$.
Work out the details.
Example 2
Let $f$ be a uniformly continuous function defined on the whole of $\mathbb{R}$. (As an example, one could take a continuous piecewise linear function that zigzags up and down, always with gradient $\pm 1$.) We would like to approximate $f$ uniformly by an infinitely differentiable function. How can we do so?
Let us follow a very similar strategy. We shall take an infinitely differentiable approximation to the delta-function and convolve with that. We describe the proof only very briefly, since it is similar to the proof in the first example. First of all, let us suppose that we have managed to find a non-negative function $\phi$ that is infinitely differentiable, integrates to $1$, and vanishes outside a very small interval about $0$. As before, differentiation under the integral sign allows us to prove that $f*\phi$ is also infinitely differentiable, and the proof that it uniformly approximates $f$ is basically the same as the proof in the previous example. So all that is left is to construct $\phi$.
To do this we use the well-known function $h(x)=e^{-1/x}$ when $x>0$ and $h(x)=0$ when $x\le 0$. This function is infinitely differentiable, non-negative, and zero when $x\le 0$. We then let $k(x)=h(1-x)h(1+x)$. This function is obviously still infinitely differentiable (since it is a product of infinitely differentiable functions), and vanishes when $x\le -1$ or $x\ge 1$. And just for good measure it is an even function with maximum at $0$. The next step in the building process is to adjust the height and width of $k$ to taste, by defining $\phi(x)=\lambda k(\mu x)$ for constants $\lambda>0$ and $\mu>0$ of our choice. The constant $\mu$ allows us to choose the interval outside which $\phi$ vanishes (which will be $[-1/\mu,1/\mu]$) and $\lambda$ allows us to ensure that $\int_{-\infty}^{\infty}\phi(x)\,dx=1$.
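The construction is easy to try out numerically. The sketch below assumes the particular choice $h(x)=e^{-1/x}$ for $x>0$, computes $\lambda$ by quadrature rather than in closed form, and uses a zigzag with gradient $\pm 1$ as the function to be smoothed; all function names and numerical parameters are illustrative assumptions.

```python
import math

def h(x):
    """exp(-1/x) for x > 0 and 0 otherwise: smooth, non-negative, zero on x <= 0."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0

def k(x):
    """Even smooth bump h(1 - x) * h(1 + x), vanishing outside (-1, 1)."""
    return h(1.0 - x) * h(1.0 + x)

def make_phi(mu, m=2000):
    """Return phi(x) = lam * k(mu * x), with lam chosen numerically so that
    phi integrates to approximately 1."""
    step = 2.0 / m
    mass_k = step * sum(k(-1.0 + (i + 0.5) * step) for i in range(m))
    lam = mu / mass_k  # the integral of k(mu * x) dx equals mass_k / mu
    return lambda x: lam * k(mu * x)

def zigzag(x):
    """Distance from x to the nearest even integer: gradient +1 or -1 everywhere."""
    return abs(x - 2.0 * round(x / 2.0))

def smoothed(func, phi, mu, x, m=400):
    """Midpoint-rule approximation of (func * phi)(x), integrating over the
    support [-1/mu, 1/mu] of phi."""
    r = 1.0 / mu
    step = 2.0 * r / m
    total = 0.0
    for i in range(m):
        y = -r + (i + 0.5) * step
        total += func(x - y) * phi(y)
    return step * total

mu = 10.0
phi = make_phi(mu)
worst = max(abs(smoothed(zigzag, phi, mu, x) - zigzag(x))
            for x in (-3.0 + 0.1 * j for j in range(61)))
print(worst)
```

Because the zigzag has gradient $\pm 1$ and $\phi$ is supported in $[-1/\mu,1/\mu]$, the uniform error is at most $1/\mu$ (up to quadrature error), so shrinking the support of $\phi$ improves the approximation directly.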
Example 3
The previous example also implies that the space $C_c^\infty(\mathbb{R})$ of smooth, compactly supported functions (or "test functions") is dense in $L^p(\mathbb{R})$ for any $1\le p<\infty$. Indeed, any function in $L^p(\mathbb{R})$ can be approximated to arbitrary accuracy in $L^p$ norm by a continuous, compactly supported function (this can be seen for instance by truncating the function to be compactly supported and then applying Lusin's theorem), and by convolution with a smooth approximation to the identity, the latter function can in turn be approximated uniformly (and hence in $L^p$, thanks to the compact support) by a smooth, compactly supported function.
A variant of this argument also shows that if $f\in L^p(\mathbb{R})$ with $1\le p<\infty$ and $\phi_n$ is a sequence of approximations to the identity, then $f*\phi_n$ converges to $f$ in $L^p$ (since this is true for the dense subclass of test functions, and one can take limits using Young's inequality).
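As an illustration of this convergence (not a proof), the sketch below takes $p=1$, lets $f$ be the indicator function of $[0,1]$, and uses rescaled copies of a smooth bump as the approximations to the identity. The particular bump, the grid sizes and the helper names are assumptions made for this example only.

```python
import math

def bump(x):
    """Smooth even bump exp(-2 / (1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-2.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

# Numerical mass of the bump, used to normalise phi_n.
_M = 2000
MASS = (2.0 / _M) * sum(bump(-1.0 + (i + 0.5) * 2.0 / _M) for i in range(_M))

def phi(n, y):
    """Approximation to the identity: phi_n(y) = n * bump(n * y) / MASS,
    supported in [-1/n, 1/n]."""
    return n * bump(n * y) / MASS

def f(x):
    """Indicator of [0, 1]: in L^1 but discontinuous."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def l1_distance(n, grid_step=0.002, m=200):
    """Riemann-sum estimate of the L^1 norm of f * phi_n - f over [-1, 2]
    (which contains the support of the difference when n >= 1)."""
    r = 1.0 / n
    h = 2.0 * r / m
    ys = [-r + (i + 0.5) * h for i in range(m)]
    ws = [h * phi(n, y) for y in ys]
    steps = int(round(3.0 / grid_step))
    total = 0.0
    for j in range(steps):
        x = -1.0 + (j + 0.5) * grid_step
        conv = sum(f(x - y) * w for y, w in zip(ys, ws))
        total += abs(conv - f(x)) * grid_step
    return total

distances = {n: l1_distance(n) for n in (2, 8, 32)}
print(distances)
```

Note that $f*\phi_n$ cannot converge to $f$ uniformly here, since $f$ is discontinuous, yet the $L^1$ distance still shrinks, roughly in proportion to $1/n$.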
General discussion
The ability to approximate rough functions by smoother ones is often employed in the trick "Create an epsilon of room".
Example 4
A fact that is sometimes of use in convex geometry is that if you have a norm on $\mathbb{R}^n$, then you can approximate it arbitrarily closely by a norm that is infinitely differentiable except at $0$. It would be good to have a detailed sketch proof of this fact, which would be very tedious to prove directly I think.
Let $\|\cdot\|$ be a norm on $V$, a finite-dimensional vector space. Take a non-negative smooth function $\phi$ on the space $GL(V)$ of linear isomorphisms of $V$, with support a compact neighborhood of the identity. Then integrating with respect to the Haar measure $\mu$ on $GL(V)$, the function
$$x\mapsto\int_{GL(V)}\phi(A)\,\|Ax\|\,d\mu(A)$$
is smooth away from $0$, nonnegative, and convex, being a non-negative linear combination of the positive convex functions $x\mapsto\|Ax\|$. It is similarly homogeneous under scaling; hence, a smooth norm.
In order to better approximate $\|\cdot\|$ as above, it might be handy to use the exponential map from the Lie algebra $\mathfrak{gl}(V)$, which is again a vector space, and naturally supports parameter re-scaling.
Comments
Showing that the Schwartz space dense in L^p
Thu, 07/05/2009 - 03:00 - tao
...is another good example of this trick, which I just covered in my class actually. I might put it in here later (and also interlink with "create an epsilon of room".)
On Example 4
Fri, 15/05/2009 - 14:58 - mckeown_j.c
Also a good example for uses of duality is the
case: consider a sublevel set for the norm, and Minkowski-sum with a smooth and suitably symmetric convex region; its size doesn't matter! The result is an approximating
symmetric convex region, dual to a
norm.