Markov Chains in Theoretical Computer Science
Instructor:
Dorit Aharonov
Overview of the course:
Markov chains appear everywhere. Why? Because the model is so general: you are interested in a large set of objects, and you can handle local manipulations on these objects, e.g., modifying an object slightly to get a new one. You want to study global properties of this set of objects, such as its approximate size. This very natural problem, which is an abstraction of problems in many fields, can be modeled by a random walk on a graph, the nodes of which are the objects of interest. Starting from an arbitrary node, an elementary step is to move to a neighboring node with a certain (say, uniform over all neighbors) probability. This very simple elementary step, applied many times, results in a wonderful phenomenon: the distribution over the nodes converges to a limiting distribution that depends only on the structure of the graph, and not on the arbitrary node we started with. This limiting distribution can tell us a lot about the global properties of the set of objects of interest, such as the size or structure of the graph. The time it takes for convergence, or the "mixing time", is thus of crucial importance for algorithmic and modeling applications.
One of the more active fields in probability and theoretical computer science is the analysis of Markov chains for various applications. Many clever and beautiful mathematical tools have been developed to analyze the mixing time in the various cases. My purpose in this course is to describe the different ways in which Markov chains can be used in algorithms and in the modeling of natural systems, and mainly to teach the wide variety of possible ways to analyze and bound the "mixing time" of these Markov chains (such as coupling from the past). The tools we will study include conductance, Fourier analysis, multicommodity flow, electrical networks, strong stopping rules, coupling, and more. I will try to give interesting applications of these tools to various fields. Below is a tentative syllabus, the requirements for the course, and a list of interesting Markov chain links.
Summary of lectures so far:
Lec 2. Mixing time, Cover time, Spanning tree algorithm
Definitions of hitting time, mixing time, cover time, commute time. Proof that the expected return time satisfies $T_{i,i}=1/\pi(i)$. Definition of reversibility. Proof that the cover time for undirected graphs is polynomial. Broder and Aldous's algorithm for sampling an exactly uniform spanning tree. Analysis using coupling.
[Ref: for the various mixing time definitions, either the Aldous and Fill book on Aldous's home page, or any other basic book. For the algorithm, see Aldous's paper, SIAM Journal on Discrete Mathematics, Vol. 3, pp. 450-465, 1990, or Lovasz's random walk survey on his home page, page 39.]
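As a rough illustration of the spanning tree algorithm, here is a minimal Python sketch (the graph representation and function names are my own choices, not from the lecture notes): run a random walk until every vertex has been visited, and for each vertex other than the start keep the edge through which it was first entered.

    import random

    def aldous_broder(adj, start=0):
        # Sample a uniformly random spanning tree of a connected undirected graph.
        # adj: dict mapping each vertex to the list of its neighbors.
        visited = {start}
        tree_edges = []
        current = start
        while len(visited) < len(adj):
            nxt = random.choice(adj[current])      # uniform step to a neighbor
            if nxt not in visited:                 # first visit: keep the entry edge
                visited.add(nxt)
                tree_edges.append((current, nxt))
            current = nxt
        return tree_edges

    # Example: a uniform spanning tree of the 4-cycle 0-1-2-3-0
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(aldous_broder(cycle))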
Lec 3. Coupling, Spanning tree algorithm, Stopping rules
Definition of coupling. The coupling lemma. Simple examples. Sampling random spanning trees approximately (rather than exactly): an easy proof with coupling. Strong stopping rules, and simple examples. Using strong stopping times to bound mixing times. [Ref: Coupling lemma: the Aldous and Fill book.]
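To make the coupling lemma concrete, here is a standard toy example (my own code, not taken from class): couple two copies of the "pick a random coordinate, set it to a fresh random bit" walk on the hypercube $\{0,1\}^n$ by giving both copies the same coordinate and the same bit. By the coupling lemma, the distance from stationarity after $t$ steps is at most the probability that the copies have not yet met, which here is a coupon-collector quantity of order $n\log n$.

    import random

    def coupled_step(n, x, y):
        # One coupled step: both copies refresh the same coordinate with the same bit,
        # so once a coordinate has been refreshed the two copies agree on it forever.
        i = random.randrange(n)
        b = random.randint(0, 1)
        x, y = list(x), list(y)
        x[i] = b
        y[i] = b
        return tuple(x), tuple(y)

    def coalescence_time(n):
        # Steps until two coupled walks, started at opposite corners, coincide.
        x, y = (0,) * n, (1,) * n
        t = 0
        while x != y:
            x, y = coupled_step(n, x, y)
            t += 1
        return t

    # Average coalescence time for n = 10 (roughly n * ln n, by coupon collecting)
    print(sum(coalescence_time(10) for _ in range(200)) / 200)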
Lec 4. Random walk on $k$-colorings
Definition of the chain, analysis showing that it converges rapidly for large enough $k$, and an algorithm for approximating the number of $k$-colorings of a graph.
[Ref: Jerrum 95]
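For concreteness, here is a minimal Python sketch of a single-site recoloring chain of the kind analyzed in Jerrum 95 (a Metropolis-style variant; whether it matches exactly the version presented in class is my assumption): pick a vertex and a color uniformly at random, and recolor the vertex if the result is still a proper coloring.

    import random

    def recoloring_step(adj, coloring, k):
        # One step of the chain on proper k-colorings: choose a vertex v and a
        # color c uniformly at random; recolor v with c if no neighbor uses c.
        v = random.choice(list(adj))
        c = random.randrange(k)
        if all(coloring[u] != c for u in adj[v]):
            coloring = dict(coloring)
            coloring[v] = c
        return coloring

    # Example: run the chain on a triangle with k = 7 colors
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    col = {0: 0, 1: 1, 2: 2}
    for _ in range(100):
        col = recoloring_step(triangle, col, k=7)
    print(col)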
Lec 5. Phase transitions, Path coupling
Definitions of the Ising model, the Potts model, ferromagnetic and antiferromagnetic interactions, definition of a phase transition. Using the rapid mixing of $k$-colorings to show non-existence of a phase transition at zero temperature in the antiferromagnetic Potts model. Markov chain on linear extensions of a partial order, path coupling argument.
[Ref: for path coupling: Bubley and Dyer 97.]
Lec 6. Coupling from the past, Monotonicity in Markov chains
Optimality of CFTP for monotone chains.
[Ref: the original paper by Propp and Wilson, on Wilson's homepage]
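As a small illustration of coupling from the past for a monotone chain (a sketch with my own choice of example chain, the lazy walk on $\{0,\dots,10\}$, whose stationary distribution is uniform): run the top and bottom trajectories from further and further back in the past, reusing the same random inputs; once they coalesce by time zero, the common value is an exact sample.

    import random

    def cftp_monotone(n, update):
        # Coupling from the past on the ordered state space {0,...,n}.
        # update(state, u) must be monotone in state for every fixed random input u.
        T = 1
        randomness = []                  # random inputs for times -1, -2, ... (reused)
        while True:
            while len(randomness) < T:
                randomness.append(random.random())
            lo, hi = 0, n
            for t in range(T, 0, -1):    # run from time -T up to time 0
                u = randomness[t - 1]
                lo, hi = update(lo, u), update(hi, u)
            if lo == hi:                 # coalescence by time 0: exact sample
                return lo
            T *= 2                       # otherwise start further back in the past

    def lazy_walk_update(x, u):
        # Monotone update rule for the lazy simple random walk on {0,...,10}.
        if u < 0.25:
            return max(x - 1, 0)
        if u < 0.5:
            return min(x + 1, 10)
        return x

    print(cftp_monotone(10, lazy_walk_update))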
Lec 7. Spectral gap, Conductance, expansion and expanders
Eigenvectors, the symmetric version of a Markov chain, the connection between the spectral gap and rapid mixing, expansion, expanders, using expanders to reduce the number of random bits needed to amplify the success probability of a BPP algorithm, the notion of the conductance of a graph. Connection between the spectral gap and conductance: beginning of the proof.
[Ref: for recycling randomness using expanders: Impagliazzo and Zuckerman, How to Recycle Random Bits. For the proof of the connection between the spectral gap and conductance: see the survey Volume Estimates and Rapid Mixing by Bollobas, 1997.]
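A small numerical illustration of the symmetrization trick and the spectral gap (my own sketch; the example chain is the lazy walk on the 3-cube, not necessarily the one used in class). For a reversible chain, $D^{1/2} P D^{-1/2}$ with $D=\mathrm{diag}(\pi)$ is symmetric and has the same eigenvalues as $P$, so the gap $1-\lambda_2$ can be read off from a symmetric eigenvalue computation; for a lazy reversible chain the mixing time is then at most of order $(1/\mathrm{gap})\log(1/\pi_{\min})$.

    import numpy as np

    def spectral_gap(P, pi):
        # Spectral gap 1 - lambda_2 of a reversible chain with stationary distribution pi,
        # computed via the symmetric matrix D^{1/2} P D^{-1/2} (same eigenvalues as P).
        d = np.sqrt(pi)
        A = d[:, None] * P / d[None, :]
        eigs = np.sort(np.linalg.eigvalsh(A))[::-1]
        return 1.0 - eigs[1]

    # Lazy walk on the 3-dimensional hypercube: pick a coordinate uniformly at random
    # and set it to a fresh random bit (so the chain holds in place with probability 1/2).
    n = 3
    N = 2 ** n
    P = np.zeros((N, N))
    for x in range(N):
        P[x, x] += 0.5
        for i in range(n):
            P[x, x ^ (1 << i)] += 0.5 / n
    pi = np.full(N, 1.0 / N)             # uniform stationary distribution
    print(spectral_gap(P, pi))           # equals 1/n = 1/3 for this chain (up to rounding)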
Lec 8. Conductance continued, Multicommodity flow
Finished the proof of the relation between the spectral gap and conductance. Calculation of the conductance of the hypercube. Definition of multicommodity flow, the cost of a flow, its relation to conductance, and simple examples of flows.
[Ref for multicommodity flow: Lovasz's survey.]
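As a worked example of the hypercube calculation (using the non-lazy walk with $P(x,y)=1/n$ for neighbors and the normalization $\Phi(S)=Q(S,\bar S)/\pi(S)$; the constants in class may differ if a lazy walk or another normalization was used), take the dimension cut $S=\{x : x_1=0\}$:

$$\pi(S)=\frac12,\qquad Q(S,\bar S)=\sum_{x\in S,\,y\notin S}\pi(x)P(x,y)=2^{n-1}\cdot\frac{1}{2^{n}}\cdot\frac{1}{n}=\frac{1}{2n},\qquad \Phi(S)=\frac{1/(2n)}{1/2}=\frac{1}{n}.$$

By the edge-isoperimetric inequality on the cube, no other cut does better by more than a constant factor, so $\Phi=\Theta(1/n)$.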
Lec 9. Canonical paths, matchings
Canonical paths defined, simple examples, a chain on all matchings in a graph with weight $\lambda^{|M|}$ (depending on the number of edges in the matching $M$); proof that it converges in time polynomial in $\lambda$, $1/\epsilon$ and $n$.
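For concreteness, here is a stripped-down version of such a chain in Python (my own simplified add/delete Metropolis variant for the distribution $\pi(M)\propto\lambda^{|M|}$; the chain from class may also use additional "slide" moves):

    import random

    def matching_step(edges, M, lam):
        # One Metropolis step for pi(M) proportional to lam^{|M|} over matchings:
        # propose toggling a uniformly random edge, accept with the Metropolis rule.
        e = random.choice(edges)
        u, v = e
        matched = {x for a, b in M for x in (a, b)}
        if e in M:
            proposal, accept = M - {e}, min(1.0, 1.0 / lam)   # delete an edge
        elif u not in matched and v not in matched:
            proposal, accept = M | {e}, min(1.0, lam)         # add an edge
        else:
            return M                          # toggling e would not leave a matching
        return proposal if random.random() < accept else M

    # Example: run the chain on the 4-cycle with lam = 2
    edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
    M = set()
    for _ in range(1000):
        M = matching_step(edges, M, 2.0)
    print(M)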
Lec 10. Permanent
Using the chain designed last week to approximate the number of matchings of size $k$ when $m_k/m_{k-1}$ is not too small, using the log-concavity of $m_k$. Beginning of the algorithm for approximating the permanent, or the number of perfect matchings in a bipartite graph, a la Jerrum, Sinclair and Vigoda.
[Ref for the permanent result: the paper by Jerrum, Sinclair and Vigoda, STOC 2001.]
Lec 11. Permanent (contd), Fourier transforms
In this lecture we finished the permanent algorithm. A new topic: the Diaconis-Shahshahani method for bounding mixing times for walks on Cayley graphs of groups.
[Ref: the excellent book by Diaconis, Group Representations in Probability and Statistics]
Lec 12. Fourier transforms (contd), Convex bodies
The Applegate-Kannan Markov chain algorithm for approximating the volume of a convex body and for integrating log-concave functions. The Lovasz-Simonovits isoperimetric inequality.
[Ref: the Applegate-Kannan paper, STOC 1991, and the Lovasz-Simonovits paper, FOCS 1990 (or 1991; almost surely 1990).]
Lec 13. Simulated annealing and tempering, Genetic algorithms
We will survey two other topics in the area of rapidly mixing Markov chains: simulated annealing and tempering (a random walk on temperatures), and genetic algorithms: the "go with the winners" algorithm (Aldous and Vazirani).
[Refs: the "go with the winners" paper, on Vazirani's homepage. Sorkin's paper on a fractal example for simulated annealing, Algorithmica, Vol. 6, No. 1-3, 1991, pages 367-418. For simulated tempering: Madras and Zheng, the swapping algorithm, to be published.]
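As a generic illustration of simulated annealing (my own toy sketch, not the fractal example from the reference): a Metropolis chain whose temperature is slowly lowered, so that for small temperatures the stationary distribution concentrates on low-cost states.

    import math
    import random

    def simulated_annealing(cost, neighbor, x0, schedule):
        # Metropolis chain with a decreasing temperature schedule: a move that
        # increases the cost by d is accepted with probability exp(-d / T).
        x = best = x0
        for T in schedule:
            y = neighbor(x)
            d = cost(y) - cost(x)
            if d <= 0 or random.random() < math.exp(-d / T):
                x = y
            if cost(x) < cost(best):
                best = x
        return best

    # Toy example: minimize the number of ones in a bit string via single-bit flips.
    def flip_one_bit(x):
        i = random.randrange(len(x))
        return x[:i] + (1 - x[i],) + x[i + 1:]

    schedule = [10.0 * 0.99 ** t for t in range(2000)]     # geometric cooling
    x0 = tuple(random.randint(0, 1) for _ in range(20))
    print(simulated_annealing(lambda x: sum(x), flip_one_bit, x0, schedule))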
Requirements:
The course is intended for graduate students in computer science. Strong third-year students are also welcome. The course might interest math and physics students as well; please see me before class to check that you have the right background. The grade will be given according to biweekly exercises and a final exam/home exam.
Exercises:
Exam related issues:
The rehearsal will take place in the usual location (Shprintsak 102) at 10:00 on Thursday, 4/7. It will save a lot of time if you e-mail me ahead of time the questions you would like me to go through.
Some Markov chain related links, with relevant material inside indicated:
Spring 2002, Hebrew University, Jerusalem, Israel
Office: Ross 72, Hebrew University, Jerusalem, 02-6584611
Markov chains, or random walks on graphs, are probably one of the most important concepts in computer science in particular, and in the exact and natural sciences in general. They seem to appear everywhere: in statistical physics, biology, ecology, economics and the stock market, and the study of the web, and they have been immeasurably useful in combinatorial applications such as approximation, optimization and counting algorithms. For example, a Google search yields 85,600 pages with the words "Markov chain", and 157,000 pages with the words "Random Walk"...
The following colorful applet from David Wilson's page demonstrates one of the more beautiful ideas in the study of the mixing of Markov chains, namely "coupling from the past". The applet demonstrates a Markov chain on the set of all possible domino tilings of a square. Starting from an arbitrary tiling, we move to a different tiling by changing only two tiles at a time; the question is, after how many elementary steps can we say that we have a random tiling of the square?
To answer this question, Jim Propp and David Wilson suggested the ingenious idea of "coupling from the past": roughly speaking, two coupled copies of the chain, started from two extreme states, are run from further and further back in the past using the same randomness; once they have coalesced to the same state by time zero, you are guaranteed that this state is distributed exactly (not approximately!) uniformly.
Lec 1. Introduction
Overview of the course. Definitions of Markov chains, ergodicity, proof of the existence of the limiting distribution. Coupling card trick... [Ref for the proof of the limiting distribution: Norris, page 41.]
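A tiny numerical illustration of the limiting distribution theorem (the transition matrix is an arbitrary example of my own): for an ergodic chain, the distribution $\mu P^t$ converges to the same limit $\pi$ no matter which starting distribution $\mu$ we use.

    import numpy as np

    # An irreducible, aperiodic transition matrix (rows sum to 1).
    P = np.array([[0.5, 0.5, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.2, 0.3, 0.5]])

    # Two very different starting distributions converge to the same limit.
    for mu in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])):
        for _ in range(100):
            mu = mu @ P          # one step of the chain: mu <- mu P
        print(mu)                # both prints agree: the limiting distribution pi

    # pi is the left eigenvector of P for eigenvalue 1, normalized to sum to 1.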
I will assume basic knowledge of linear algebra, probability,
complexity theory, and a very basic knowledge of finite groups.
The problem sets are to be submitted in pairs or triples. The intention is not that you divide the work between you, but that you discuss all the solutions together. It is also greatly encouraged that each of you writes part of each problem set, so that you get to think about the formal aspects of the concepts every two weeks, and not every month. The pairs or triples may vary between exercises, if you wish. These problems are intended to help you understand the concepts learned in class, and discussion within larger groups is greatly encouraged (looking at the written solutions of others is not considered discussion, though). The difficulty ranges from routine to slightly tricky, with bonus exercises, marked with a star, which can add quite a bit to the final grade. Feel free to talk to me about any points you are having trouble with.
There will be 5 exercises during the semester, one every two weeks.
The weight of the exercises in the grade will be about 40-50 percent.
There will be 6 questions, out of which you will have to choose 4. I will not publish a list of exercises; instead, here is a list of topics you should concentrate on. Understanding all the exercises should be very good preparation for the exam.