Tricki
a repository of mathematical know-how

Revision of How to use the implicit function theorem to prove smoothness from Sat, 25/04/2009 - 00:14

Quick description

Suppose you have a function f:{\mathbb R}^n\rightarrow {\mathbb R}^m and you would like to show that f is C^r. One way to do this is to find a C^r function F:{\mathbb R}^n \times {\mathbb R}^m \rightarrow {\mathbb R}^m such that \det(D_y F) \neq 0 at every point (x,f(x)) and

 F(x,f(x)) = c.

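Here c \in {\mathbb R}^m is a constant. The implicit function theorem then implies that f is C^r, provided f is continuous, so that it agrees locally with the C^r function that the theorem produces.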
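As a minimal illustration (the particular function below is an assumption chosen for this sketch and does not appear in the original article), take n = m = 1 and let f(x) be the unique real solution y of y^3 + y = x. Without writing down any formula for f, the recipe above gives its smoothness:

```latex
% Hypothetical example: f(x) is the unique real root y of y^3 + y = x
% (unique because y \mapsto y^3 + y is strictly increasing and onto {\mathbb R}).
F(x,y) = y^3 + y - x, \qquad F(x,f(x)) = 0 \quad (\text{so } c = 0),
\\
D_y F(x,y) = 3y^2 + 1 \ge 1 > 0 \quad \text{for all } (x,y).
\\
% F is a polynomial, hence C^r for every r, so f is C^r for every r, and
f'(x) = -\frac{D_x F(x,f(x))}{D_y F(x,f(x))} = \frac{1}{3 f(x)^2 + 1}.
```

In this sketch the uniqueness of the root is what identifies f with the function produced by the implicit function theorem near each point.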

Prerequisites

Calculus

Example 1

Deterministic and Stochastic Optimal Control, Wendell H. Fleming, Raymond W. Rishel, page 8. Take a C^r cost function L:{\mathbb R}^3 \rightarrow {\mathbb R} and consider the calculus of variations problem of minimizing

 J(x) = \int_{t_0}^{t_1} L(t,x(t),\dot{x}(t))dt, (1)

over piecewise C^1 functions x with the given fixed end points x(t_0)= x_0 and x(t_1) = x_1. A piecewise C^1 function x is called an extremal of (1) if it satisfies

 -\int_{t_0}^t L_x(s,x(s), \dot{x}(s)) ds + L_{y}(t,x(t),\dot{x}(t)) = c (2)

for t_0 \le t \le t_1 where c is a constant and where y denotes the variable of L for which \dot{x} is substituted in (1) (the third variable).

Now let us assume that L_{yy} > 0 and consider an extremal x of (1) that is also C^1, i.e., \dot{x} is continuous. We will show using the implicit function theorem that x must be C^r, i.e., if the extremal is C^1 then it has to be as smooth as the cost function L.

Remark. Before proceeding, a quick remark about the assumption L_{yy} > 0. This assumption prevents \dot{x} from jumping and forces \dot{x} to be continuous (that is, x to be C^1) even when this is not assumed. We assume that x is C^1 from the outset so as not to distract from the argument based on the implicit function theorem. The assumption L_{yy} > 0 will also come into play when the implicit function theorem is invoked.

Now let us continue with our argument that x must be as smooth as L. The argument proceeds by induction. We already have the base case that x is C^1. Now let us suppose that it is C^k for a k \le r-1. Then

 P(t) = \int_{t_0}^t L_x(s,x(s),\dot{x}(s) ) ds

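is C^k as well, because its integrand is C^{k-1}: L_x is C^{r-1} with r-1 \ge k, and t \mapsto (t,x(t),\dot{x}(t)) is C^{k-1}. By (2),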

 -P(t) + L_y(t,x(t),\dot{x}(t)) =c (3)

for some constant c. Define

 \Phi(t,y) = -P(t) + L_y(t,x(t),y).

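Since L is C^r, L_y is C^{r-1}; together with k \le r-1 and the fact that x and P are C^k, it follows that \Phi is C^k. We can rewrite (3) as \Phi(t,\dot{x}(t)) = c. Moreover, \Phi_y = L_{yy}, which is strictly positive by assumption. Since \dot{x} is continuous, the implicit function theorem implies that \dot{x} is at least as smooth as \Phi, so \dot{x} is C^k. It follows that x is C^{k+1}, which completes the induction step.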
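To make the induction step concrete, here is a sketch of the case k = 1, assuming r \ge 2 so that the second derivatives of L exist. It is obtained by differentiating \Phi(t,\dot{x}(t)) = c with respect to t. The subscript notation L_{yt}, L_{yx} for the partial derivatives of L_y, and the potential V in the last line, are assumptions introduced for this illustration.

```latex
% All partial derivatives of L are evaluated at (t, x(t), \dot{x}(t)).
\Phi_t(t,\dot{x}(t)) = -L_x + L_{yt} + L_{yx}\,\dot{x}(t), \qquad \Phi_y(t,\dot{x}(t)) = L_{yy} > 0,
\\
\ddot{x}(t) = -\frac{\Phi_t(t,\dot{x}(t))}{\Phi_y(t,\dot{x}(t))} = \frac{L_x - L_{yt} - L_{yx}\,\dot{x}(t)}{L_{yy}}.
\\
% Hypothetical instance: L(t,x,y) = y^2/2 - V(x) with V of class C^r gives L_{yy} = 1 and
\ddot{x}(t) = -V'(x(t)).
```

The second line is the Euler-Lagrange equation \frac{d}{dt} L_y = L_x written out and solved for \ddot{x}; dividing by L_{yy} is exactly where the assumption L_{yy} > 0 enters.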

Comments

Are there other examples of the argument in Example 1?

The above example uses a more sophisticated method than the one I tried to describe in the quick description section. In the generalization of the method of Example 1, there would be a function f and the goal would be to improve our understanding of its smoothness. To that end, we use f itself to define another function

 \Phi:{\mathbb R}^n \times {\mathbb R}^{m\times n} \rightarrow {\mathbb R}^{m\times n}

such that \Phi(x,Df(x)) = c, \det(D_y\Phi) \neq 0 and \Phi is as smooth as f. Now the implicit function theorem says that Df is as smooth as \Phi, and hence as smooth as f. The argument is repeated for as many steps as possible. Does anyone know of another application of this argument, or of an argument similar to it?

Notation

The notation "D_y\,F" isn't very clear. My first thought was: "partial derivative with respect to y". After this, I figured you meant the operator on {\mathbb R}^m given by the decomposition of D\, F (the Jacobian of F), since this is the condition required by the theorem. Still, the notation gives rise to the possibility D_y\,\Phi=\Phi_y.

What does the notation x^* mean in equation 2?

Instead of using L_y, why not use L_{\dot x}?

Notation

Thanks for the comment. I like the D_y notation and frequently use it because it concisely states everything related to the operation (we are taking a derivative with respect to a variable). The one you suggest (\Phi_y) is also good, I think. As for L_{\dot{x}}, this is common notation in many books, including Fleming and Rishel. I don't prefer it because I think it is confusing to use the same symbol to mean two entirely different things on the same page. In this case, if we use the L_{\dot{x}} notation, \dot{x} will mean both the derivative of the function x with respect to time and the name of a free variable.

x^* is a free variable representing a function satisfying the Euler-Lagrange equation. It is probably not a good choice of notation because, as you point out, upon reading it one thinks it must be related to the x in its context. I changed it to x. This, I think, causes an abuse of notation, but hopefully not a confusing one.