Tricki
a repository of mathematical know-how

Revision of Greedy algorithms from Mon, 20/04/2009 - 22:01

Quick description

Suppose that you have a handful of coins in your pocket and want to choose some of them to add up to a given total. The following would be an obvious method: keep selecting the coin of highest value that does not cause the total to become too big. If you have a large supply of coins of all denominations, this will result in an efficient selection. Because at each stage of the procedure, you try to take a step that gets you as close as possible to the target, an algorithm like this is called a greedy algorithm. Although greedy algorithms can seem rather unsophisticated, the concept is surprisingly important in many parts of mathematics.
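
The coin procedure can be sketched in a few lines of Python. This is only an illustration: the function name `greedy_coins` and its return convention (the chosen coins together with the amount still missing) are assumptions made for this sketch, not part of the original description.

```python
def greedy_coins(coins, target):
    """Repeatedly take the largest available coin that does not overshoot.

    `coins` is the handful in your pocket (repeats allowed).
    Returns (chosen, remaining); remaining == 0 means the target was hit.
    """
    remaining = target
    chosen = []
    # Scanning from largest to smallest is enough: a coin that is too big
    # now stays too big later, because `remaining` only ever decreases.
    for coin in sorted(coins, reverse=True):
        if coin <= remaining:
            chosen.append(coin)
            remaining -= coin
    return chosen, remaining


# A varied handful: the greedy choice reaches 88 exactly.
print(greedy_coins([50, 20, 20, 10, 5, 2, 2, 1], 88))
# A limited handful: greedy takes 25 and gets stuck, although 10+10+10 = 30.
print(greedy_coins([25, 10, 10, 10], 30))
```

The second call shows why the supply of coins matters: with only a few coins available, taking the largest one first can block a solution that exists.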

Prerequisites

They vary from example to example, but this article is written at a fairly elementary level.

Example 1

A proper vertex colouring of a graph G is a function \kappa defined on the vertices of G such that if v and w are any two neighbouring vertices, then \kappa(v) does not equal \kappa(w). A simple result in graph theory is the following: let G be a finite graph such that each vertex has degree at most d. Then there is a proper colouring of G that uses at most d+1 colours. (This means that the function \kappa takes at most d+1 distinct values.)

To prove this, we choose some ordering of the vertices: v_1,v_2,\dots,v_n and colour them inductively as follows. Having coloured v_1,\dots,v_{k-1}, we would like to colour v_k. By hypothesis, v_k has at most d neighbours, so there are at most d neighbours of v_k amongst the vertices v_1,\dots,v_{k-1}. Therefore, there are at most d colours that we must avoid when choosing a colour for v_k. Since there are d+1 colours available to us, we can arbitrarily choose some colour that is not one of the ones we have to avoid. And thus the induction continues.
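
The inductive argument translates directly into a procedure. The following Python sketch is one possible rendering: the name `greedy_colouring`, the dictionary representation of the graph, and the rule of always picking the smallest permitted colour are assumptions of the sketch (the proof allows any permitted colour).

```python
def greedy_colouring(adjacency, order=None):
    """Colour vertices in the given order, never looking ahead.

    `adjacency` maps each vertex to a list of its neighbours.
    Colours are the integers 0, 1, 2, ...
    """
    if order is None:
        order = list(adjacency)
    colour = {}
    for v in order:
        # Colours already used by neighbours that were coloured earlier.
        forbidden = {colour[w] for w in adjacency[v] if w in colour}
        # At most d colours are forbidden, so some colour in 0..d is free.
        c = 0
        while c in forbidden:
            c += 1
        colour[v] = c
    return colour


# A 5-cycle: every vertex has degree d = 2, so at most 3 colours are used.
cycle = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
print(greedy_colouring(cycle))
```

Since the forbidden set never has more than d elements, the loop always stops at a colour in the range 0 to d, which is exactly the d+1 bound of the result.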

Exercise 1 Use transfinite induction and the well-ordering theorem to extend the above result to infinite graphs. (You do not need transfinite induction if the graph has countably many vertices, but if it is big enough then you do.) Now do it using Zorn's lemma.

General discussion

Why do we call that a greedy algorithm? The reason is that we do not worry about the future when we choose the colour for v_k. For some problems, it turns out that we do not need to worry about the future, while for others we very definitely do. To give an example of a problem of the latter type, suppose that we have a bipartite graph G with two vertex sets X and Y, both of size n, and suppose that we want to find a perfect matching, meaning a bijection \phi:X\rightarrow Y such that \phi(x) is joined to x for every x in X. A greedy approach would be to choose an ordering x_1,\dots,x_n of the vertices of X and then choose distinct neighbours for each vertex in turn. However, it may well happen that even if a perfect matching exists, we get stuck during this process, and then we find ourselves wanting to backtrack. (It turns out to be possible to devise a systematic way of backtracking that leads to a polynomial-time algorithm for this problem.)
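
A very small example already shows the greedy matching getting stuck. In the Python sketch below, the function name `greedy_matching`, the vertex labels, and the rule "take the first free neighbour in the list" are all assumptions chosen for illustration.

```python
def greedy_matching(neighbours, order):
    """Give each vertex of X, in turn, its first neighbour not yet used.

    `neighbours` maps each x in X to a list of its neighbours in Y.
    Returns (matching, stuck_at), where stuck_at is None on success.
    """
    used = set()
    matching = {}
    for x in order:
        for y in neighbours[x]:
            if y not in used:
                matching[x] = y
                used.add(y)
                break
        else:
            # Every neighbour of x is already taken: the greedy choice fails.
            return matching, x
    return matching, None


graph = {"x1": ["y1", "y2"], "x2": ["y1"]}
# x1 grabs y1, leaving nothing for x2, although x1-y2, x2-y1 is perfect.
print(greedy_matching(graph, ["x1", "x2"]))
# With the other ordering the same greedy rule happens to succeed.
print(greedy_matching(graph, ["x2", "x1"]))
```

Here the early choice of y1 for x1 is exactly the kind of decision that ignores the future: it is locally fine but rules out the only perfect matching.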

It might be more accurate to call Example 1 a just-do-it proof, since there is another feature that many greedy algorithms have, which is that instead of making an arbitrary choice at each stage, one makes a choice that is extreme in one way or another. The coins example in the quick description has this property: we chose the coin of largest value at each stage, subject to the condition that we didn't go over our target. Another example that illustrates this type of greed is the game of Othello. Here, a greedy strategy for playing the game would be always to place your counter somewhere that causes as many as possible of your opponent's pieces to be turned over. Interestingly, this is virtually the worst strategy you can play, at least until near the end of the game (when obviously it becomes more sensible, though even in the endgame a greedy strategy is far from optimal).

Incomplete This article is incomplete. More examples are planned, and I'm sure others have their favourites too. One important example is finding structures in sparse random graphs by using greedy algorithms and showing that with high probability they don't get stuck.

Comments

real analysis

Would the following example be too trivial?

One sees that any open set O in {\mathbb R} is a union of disjoint open intervals using a greedy algorithm. For an element x in O let {\mathcal C} = \{(a,b) : a < x < b, (a,b) \subset O\} be the collection of all open intervals lying in O and containing x. Let a^*(x) = \inf_{(a,b) \in {\mathcal C}} a and b^*(x) = \sup_{(a,b) \in {\mathcal C}} b. The intervals in {\mathcal C} are open and connected, and any two of them intersect because they all contain x. It follows that their union is the interval (a^*(x),b^*(x)), which is therefore also a member of {\mathcal C} and, by its definition, the maximal element of {\mathcal C}.

We now enumerate the rationals in O and iterate over them as follows. Suppose I_1, I_2,\dots,I_k are the intervals chosen upon iterating over the first n-1 rationals. For the n^{th} rational q_n we check whether I=(a^*(q_n),b^*(q_n)) intersects any of I_1, I_2,\dots,I_k. If I has nonempty intersection with some I_j, then I=I_j by maximality, and in this case we complete the step and proceed to the next rational. If I is disjoint from I_1, I_2,\dots,I_k then we let I_{k+1} = I.

Let N be the largest number the index k reaches in the above iteration (it may be \infty). Because the rationals are dense in {\mathbb R} it follows that \cup_{k=1}^N I_k = O.

I've thought about this, and

I've thought about this, and in the end I think it isn't really an example, because its algorithmic nature isn't playing a genuine role. The "real" proof that underlies the argument you give is this: take all maximal open subintervals of your set, and check that no two of them can intersect.

Minimum spanning trees?

I'm not really sure about the etiquette for this site, so I hope this comment isn't too low-level.

In computer science, at least, pretty much the first example of greedy algorithms is for the problem of finding a minimum weight spanning tree in a graph (and then the generalization to the characterization of matroids by greedy algorithms).

Also, sometimes when greedy algorithms don't work optimally, you can still get something useful, like finding a linear-sized independent set in a planar graph or approximating set cover. (There are many examples of this nature.)
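
For readers who have not seen the spanning-tree example, here is a minimal Python sketch of one standard greedy method, Kruskal's algorithm: repeatedly add the cheapest edge that does not close a cycle. The function name `kruskal`, the edge format `(weight, u, v)`, and the vertex labels 0..n-1 are assumptions of this sketch.

```python
def kruskal(n, edges):
    """Greedy minimum spanning forest on vertices 0..n-1.

    `edges` is a list of (weight, u, v) triples.
    Returns (total_weight, chosen_edges) with chosen edges as (u, v, weight).
    """
    parent = list(range(n))  # union-find forest over the vertices

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    total = 0
    # Greedy step: look at the edges from cheapest to most expensive.
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two different components: keep it
            parent[ru] = rv
            chosen.append((u, v, w))
            total += w
    return total, chosen


edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))
```

Unlike the matching example, here the greedy choice never needs to be undone: the cheapest edge joining two components can always be extended to a minimum weight spanning tree.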