Tricki
a repository of mathematical know-how

Turn sets into functions

Quick description

Let X be a set. Then there is a one-to-one correspondence between subsets of X and functions from X to \{0,1\}, which takes the subset A to its characteristic function (or indicator function) \chi_A, which is defined by \chi_A(x)=1 if x\in A and \chi_A(x)=0 otherwise. It can often be very fruitful to regard \chi_A as a function from X to a bigger set such as the closed interval [0,1], or \mathbb{R}, or \mathbb{C}, or to a set with more structure such as the field \mathbb{F}_2 with two elements. The basic idea is to prove more general theorems about functions and to derive from them theorems about sets. A great advantage of doing this is that it allows one to import methods from other areas such as analysis and linear algebra.

This page could do with several more examples. (Even examples of reformulated problems, without accompanying accounts of the solutions, would be better than nothing.)

Prerequisites

A basic familiarity with the first half of a typical undergraduate course in mathematics.

Example 1

Let n be an even positive integer and let [n]^{(n/2)} be the collection of all subsets of \{1,2,\dots,n\} of size n/2. Let \delta>0 and let \mathcal{A} be a subset of [n]^{(n/2)} such that |A\cap B|\leq n/4-\delta n for every A,B\in\mathcal{A}. Then the size of \mathcal{A} is bounded above by a constant that depends on \delta only.

Here is a way of looking at this fact. If you choose a typical pair of subsets of \{1,2,\dots,n\} of size n/2 then their intersection will be very likely to have size about n/4. If you want to find many sets of size n/2 with larger intersections than this, then you can choose subsets of a smaller set such as \{1,2,\dots,2n/3\} (which has exponentially many subsets of size n/2, any two of which intersect in at least n/3). But if you want intersections that are smaller than the expected size, then you are in trouble: the number of sets you can choose does not even tend to infinity with n.

This rather surprising fact (if you are not familiar with it) has an unforgettably beautiful proof: one that is not only short but also one that tells you exactly what is really going on. The idea is to associate with each set in \mathcal{A} a vector. The idea is to do this in such a way that if A\cap B has size n/4 then the inner product of the corresponding vectors is zero. This one does by taking not the characteristic function of a set A but rather the function f_A defined by f_A(m)=1 if m\in A and f_A(m)=-1 otherwise.

We now make this vector a unit vector in a Hilbert space by taking as inner product the bilinear form \langle f,g\rangle=\E_mf(m)g(m)=n^{-1}\sum_{m=1}^nf(m)g(m). Clearly, \langle f_A,f_A\rangle=1 for every set A. If A and B are two sets of size n/2, then f_Af_B is 1 on A\cap B, 1 on A^c\cap B^c, and -1 everywhere else. It is easy to prove that |A^c\cap B^c|=|A\cap B|, so we find that \langle f_A,f_B\rangle=n^{-1}(2|A\cap B|-(n-2|A\cap B|))=4n^{-1}|A\cap B|-1. Thus, the inner product ranges from -1 (if A and B are disjoint) to 1 (if A and B are equal). Also, and this is what matters to us, if |A\cap B|<n/4-\delta n, then \langle f_A,f_B\rangle<-4\delta, and conversely.

We are thus led to ask a more general question: how many unit vectors can one find in a Hilbert space, subject to the constraint that the inner product of any two of them is less than -\gamma, for some fixed positive \gamma?

To answer this question we use another trick: averaging: if every inner product is at most -\gamma then the average inner product is certainly at most -\gamma. If we call the vectors v_1,\dots,v_k, then one way of writing this is \sum_{i\ne j}\langle v_i,v_j\rangle\leq -\gamma k(k-1).

Now this will give us an upper bound for k if we can show that it is not possible for the left-hand side to be too large and negative. It is not immediately obvious how to prove this, but the following simple observation does it: the closely related sum \sum_{i,j}\langle v_i,v_j\rangle is equal to \langle v_1+\dots+v_k,v_1+\dots+v_k\rangle and is therefore non-negative. Moreover, what we have added to get to this sum from the one that we are interested in is \sum_i \langle v_i,v_i\rangle, which equals k (since the v_i are unit vectors). Therefore, \sum_{i\ne j}\langle v_i,v_j\rangle\geq -k. From this it follows that -k\leq-k(k-1)\gamma, which implies that k\leq\gamma^{-1}+1.

This inequality turns out to be sharp: if you arrange k unit vectors as vertices of a simplex of dimension k-1, then the inner product of any two vectors is -1/(k-1). For example, if k=3, then this is true because the cosine of 120^o is -1/2. In general, one can prove it by making the simple observation that the sum of these unit vectors is 0, and the inner products are all the same (both following from the symmetry of a simplex), so that all the inequalities in the above argument are sharp.

Since we have bounded k in terms of \gamma only, the number of sets one can choose in the first problem is bounded above by a constant that depends on \delta only, as claimed.

Example 2

Note iconIncomplete This article is incomplete. Need to expand on this example.

One basic example of this in the passage from measures (which are functions on a \sigma-algebra of sets) to integrals (which are functionals on a space of functions). See the discussion of Lebesgue integral and the Riesz representation theorem for explanations of this.

Example 3

One particularly powerful tool which is available for functions (and hence, by this trick, for sets), is Fourier analysis. When applied to sets, this technique is sometimes known as the Hardy-Littlewood circle method, especially in number theory.

For instance, if one wants to count the number of ways a positive integer N can be expressed as the sum N=p_1+p_2+p_3 of three numbers p_1,p_2,p_3 in some set A (e.g. the set of primes), one can turn the set A into a function 1_A and write thus quantity as

 n_1+n_2+n_3 = N} 1_A(n_1) 1_A(n_2) 1_A(n_3).(1)

One can recognize this as a discrete convolution 1_A * 1_A * 1_A(N) evaluated at N. Using Fourier analysis, one can rewrite this expression as

 \int_0^1 \widehat{1_A}(\theta)^3 e^{2\pi i N \theta}\ d\theta

where = \sum_n 1_A(n) e^{-2\pi i n \theta} is the Fourier transform of 1_A. So the task is now a Fourier-analytic one, namely to compute the exponential sum \sum_n 1_A(n) e^{-2\pi i n \theta}. This is by no means a non-trivial task, but it is still a simpler one than the original problem (basically because it only involves one instance the complicated set A, while the original problem involved three such instances).

General discussion

It is also possible to partially invert this trick and turn functions back into sets, see "Control level sets" for more discussion.