Quick description
If you are hoping to use the probabilistic method as part of a long and complicated proof, you will probably want to begin by producing a plausible sketch of the argument before you go into the technical details. For this purpose it is very useful to be able to guess upper (and sometimes lower) bounds for the probabilities of various complicated events. This article discusses heuristic principles that can help one do this, and illustrates them with examples.
Principle 1: pretend your variables are independent
This is the single most useful method for guessing probabilistic bounds: if you have some variables that exhibit a reasonable degree of independence, then they will probably give estimates that are very similar to the ones that would apply if they actually were independent. Of course, one needs to be clear about what "reasonably independent" might mean, so let us look at a few examples.
Example 1
Let
be a random graph with
vertices in which each edge is chosen independently with probability
. Let
be the number of triangles in
. Then
is a random variable: how should we expect
to be distributed?
Given any triangle
in the complete graph on the
vertices of
, then
will belong to
with probability
. The number of such triangles is
, which is about
, so the expectation of
is about
. Also, if
are disjoint triangles, then the events "
belongs to
" are independent; and a typical pair of triangles will be disjoint.
It is clear then that we have a lot of independence about. What would happen if the events "triangle
belongs to
" were all independent? Then
would be a sum of
Bernoulli random variables, each with probability
. In other words, it would be counting the number of occurrences when you have lots of independent unlikely events. The probability distribution that's appropriate for this is the Poisson distribution, so we might guess that
is distributed roughly like a Poisson random variable of mean
.
This is not a proof of course, and it turns out to be a hard problem to determine the distribution of
. Nevertheless, the exercise of pretending that the events are independent is a helpful one: it gives us some idea what to expect, and it also gives us a starting point if we want to prove it. (The starting point would be the thought that we could look very carefully at the proof that
is approximately Poisson when the events are independent and try to relax the assumptions we make, allowing a small amount of dependence. And there are indeed important proofs like this in probabilistic combinatorics: see Janson's inequality, for instance.)
Example 2
A certain amount of care is needed even when one is just guessing. For instance, suppose we tried to use the same reasoning to count the number of copies of a graph
that consists of five vertices
with the first four all joined to each other and the fifth one joined just to the fourth (so
has seven edges). And suppose that the edges of
have been chosen with probability
. The expected number
of copies of
is about
(the factor of
is there because
has a symmetry between three of its vertices). So do we expect the distribution of
to be approximately Poisson with mean
?
To see that we don't, note that the expected number of copies of
in
is
, which is about
. In other words, it is far less than 1 (for fixed
and large
). What this implies is that although the expected number of copies of
is
, the norm is to get no copies at all, and just occasionally you get a K_ with a huge number of extra edges joined to its vertices forming copies of
.
Example 3
Let us return to triangles, but this time let us take a larger value of
.
Tricki