
How to use the Cauchy-Schwarz inequality

This article is a stub. This means that it cannot be considered to contain or lead to any mathematically interesting information.

Quick description

The Cauchy-Schwarz inequality asserts that for any measure space (X,\mu) and any measurable functions f, g: X \to \C, one has

 |\int_X f(x) g(x)\ d\mu| \leq (\int_X |f(x)|^2\ d\mu)^{1/2}  (\int_X |g(x)|^2\ d\mu)^{1/2}.

Thus for instance one has

 |\sum_{n=1}^N a_n b_n| \leq  (\sum_{n=1}^N |a_n|^2)^{1/2} (\sum_{n=1}^N |b_n|^2)^{1/2}.
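The discrete form above is easy to check numerically. The following is a minimal sketch, not part of the original article; the helper name cauchy_schwarz_sides and the sample sequences are illustrative assumptions.

```python
import math


def cauchy_schwarz_sides(a, b):
    """Return (lhs, rhs) of the discrete Cauchy-Schwarz inequality.

    lhs = |sum a_n b_n|, rhs = (sum |a_n|^2)^{1/2} (sum |b_n|^2)^{1/2}.
    The name and interface are hypothetical, chosen for illustration.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    lhs = abs(sum(x * y for x, y in zip(a, b)))
    rhs = math.sqrt(sum(abs(x) ** 2 for x in a)) * math.sqrt(
        sum(abs(y) ** 2 for y in b)
    )
    return lhs, rhs


# Arbitrary complex sample data (an assumption, not from the article).
a = [1 + 2j, -3, 0.5j]
b = [2, 1 - 1j, 4]
lhs, rhs = cauchy_schwarz_sides(a, b)
print(lhs, rhs, lhs <= rhs)
```

For these sample values the sum \sum a_n b_n equals -1 + 9i, so the left-hand side is \sqrt{82}, while the right-hand side is \sqrt{14.25 \cdot 22}.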

This inequality is useful for decoupling an expression involving two functions f, g, replacing that expression with one involving just f, and one involving just g. Only one of these latter expressions needs to be small (and the other one bounded) in order for the original expression to be small. Thus, one can focus on estimating expressions involving just g (say), effectively eliminating f from view.


Prerequisites

Undergraduate analysis.

Example 1

Consider the following two ways of measuring the "size" of vectors in R^n. The 1-norm of a vector x \in R^n is defined as ||x||_1 = \sum_{i=1}^n |x_i|, and the 2-norm is defined as ||x||_2 = (\sum_{i=1}^n x_i^2)^{1/2}.

What is the relationship between these two norms? It follows from the triangle inequality that the 1-norm is always at least as large as the 2-norm. How much larger can it be?

The answer can be found through an application of the Cauchy-Schwarz inequality to the sequences a_i=1 and b_i=|x_i|:

 ||x||_1 = \sum_{i=1}^n a_i b_i \leq (\sum_{i=1}^n a_i^2 )^{1/2} (\sum_{i=1}^n b_i^2)^{1/2} = \sqrt{n} ||x||_2.

Moreover, the \sqrt{n} factor in the above bound is the best possible: to see this, plug in the vector x=(1/n, 1/n, \ldots, 1/n), for which ||x||_1 = 1 and ||x||_2 = 1/\sqrt{n}, so that equality holds.
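As a quick sanity check of Example 1, the sketch below compares the two norms on a sparse vector and on the constant vector. The helper names norm1 and norm2, the dimension n = 16 and the test vectors are illustrative assumptions.

```python
import math


def norm1(x):
    """1-norm: sum of absolute values of the coordinates."""
    return sum(abs(t) for t in x)


def norm2(x):
    """2-norm: square root of the sum of squared coordinates."""
    return math.sqrt(sum(t * t for t in x))


n = 16

# A sparse vector: the bound ||x||_1 <= sqrt(n) ||x||_2 holds with room to spare.
sparse = [3.0, -1.0] + [0.0] * (n - 2)
print(norm2(sparse) <= norm1(sparse) <= math.sqrt(n) * norm2(sparse))

# The constant vector (1/n, ..., 1/n): here the sqrt(n) bound is attained.
constant = [1.0 / n] * n
print(norm1(constant), math.sqrt(n) * norm2(constant))
```

The constant vector is exactly the case where b_i = |x_i| is proportional to a_i = 1, which is when Cauchy-Schwarz holds with equality.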

Example 2

(Counting 4-cycles in graphs)

Example 3

(Some instance of the large sieve inequality)

Example 4

(Suggestions welcome!)

General discussion

The Cauchy-Schwarz inequality is efficient as long as you do expect f and g to behave in a roughly "parallel" manner. If instead they are behaving in an "orthogonal" manner then the inequality is quite lossy.
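One way to make "efficient" versus "lossy" concrete is to look at the ratio of the left-hand side to the right-hand side of the inequality. The sketch below does this for real vectors; the function name cs_efficiency and the sample vectors are illustrative assumptions.

```python
import math


def cs_efficiency(f, g):
    """Ratio |<f, g>| / (||f||_2 ||g||_2) for real vectors f, g.

    A value of 1 means Cauchy-Schwarz loses nothing; a value near 0
    means the inequality is very lossy. Hypothetical helper name.
    """
    lhs = abs(sum(x * y for x, y in zip(f, g)))
    rhs = math.sqrt(sum(x * x for x in f)) * math.sqrt(sum(y * y for y in g))
    if rhs == 0:
        raise ValueError("both vectors must be nonzero")
    return lhs / rhs


# "Parallel" behaviour: g is a scalar multiple of f, so nothing is lost.
parallel_ratio = cs_efficiency([1, 2, 3], [2, 4, 6])

# "Orthogonal" behaviour: the signs cancel, so the true value is 0
# while the Cauchy-Schwarz bound is 4.
orthogonal_ratio = cs_efficiency([1, 1, 1, 1], [1, -1, 1, -1])

print(parallel_ratio, orthogonal_ratio)
```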

Another useful tool for decoupling is the arithmetic-geometric mean inequality

 |ab| \leq \frac{1}{2} |a|^2 + \frac{1}{2} |b|^2

or (slightly more generally)

 |ab| \leq \frac{\varepsilon}{2} |a|^2 + \frac{1}{2\varepsilon} |b|^2

for any complex numbers a,b, where \varepsilon > 0 is a parameter that one can optimize later.
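To illustrate what optimizing in \varepsilon buys, the sketch below evaluates the right-hand side of the weighted inequality for several values of \varepsilon. For nonzero a, b the choice \varepsilon = |b|/|a| makes the two terms equal and turns the inequality into an equality. The function name weighted_amgm_bound and the sample values of a, b are illustrative assumptions.

```python
import math


def weighted_amgm_bound(a, b, eps):
    """Right-hand side (eps/2)|a|^2 + (1/(2 eps))|b|^2 of the weighted inequality."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return 0.5 * eps * abs(a) ** 2 + abs(b) ** 2 / (2.0 * eps)


# Sample values with very different sizes: |a| = 5, |b| = 0.5.
a, b = 3 + 4j, 0.5j
product = abs(a * b)

# The unweighted choice eps = 1 is valid but far from sharp here.
for eps in (0.01, 0.1, 1.0, 10.0):
    print(eps, weighted_amgm_bound(a, b, eps), ">=", product)

# The balancing choice eps = |b|/|a| recovers |ab| exactly (up to rounding).
best_eps = abs(b) / abs(a)
best_bound = weighted_amgm_bound(a, b, best_eps)
print(best_eps, best_bound)
```

With these values the bound at \varepsilon = 1 is 12.625, compared with the true value |ab| = 2.5 attained at \varepsilon = 0.1.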

(Discussion of converse Cauchy-Schwarz).

See also "A tiny remark about the Cauchy-Schwarz inequality" by Tim Gowers.



In this article, "decoupling" sounds like an interesting general strategy. We could add a parent article about this more general metatechnique (I don't have the knowledge to elaborate on it). Why is it important? In which cases should we try to obtain a decoupling? Etc.

Another "decoupling" technique that comes to mind now is that of "separating variables" in ODEs and PDEs.

I agree, a lot more work is needed here.

(Actually I am only now just beginning to realise the sheer scope of this project - it may end up making the Princeton Companion to Mathematics seem like a short story!) My strategy at this point is to lay down a large number of stubs and let them grow, develop, merge, etc. in unexpected ways, on the theory that the more articles already exist, the more likely it is that each potential contributor can find a niche.

I also want to write down a number of proofs of the Cauchy-Schwarz inequality here, as some of them emphasise some cute tricks (e.g. optimising in a parameter to be chosen later). I suppose initially we can put "How to use X" and "Proof of X" on the same page, though as discussed in the "different kind of article" thread we may eventually want to split them into two (interlinked) pages.