Markov Chains in Theoretical Computer Science
Instructor:
Dorit Aharonov
Overview of the course:
Markov chains appear everywhere. Why? Because the model is so general: you are interested in a large set of objects, and you can handle local manipulations on these objects, e.g., modifying an object slightly to get a new one. You want to study global properties of this set of objects, such as its approximate size. This very natural problem, which is an abstraction of problems in many fields, can be modeled by a random walk on a graph, the nodes of which are the objects of interest. Starting from an arbitrary node, an elementary step is to move to a neighboring node with a certain (say, uniform over all neighbors) probability. This very simple elementary step, applied many times, results in a wonderful phenomenon: the distribution over the nodes converges to a limiting distribution that depends only on the structure of the graph, and not on the arbitrary node we started with. This limiting distribution can tell us a lot about the global properties of the set of objects of interest, such as the size or structure of the graph. The time it takes for convergence, or the "mixing time", is thus of crucial importance for algorithmic and modeling applications.
One of the more active fields in probability and theoretical computer science is the analysis of Markov chains for various applications. Many clever and beautiful mathematical tools have been developed to analyze the mixing time in the various cases. My purpose in this course is to describe the different ways in which Markov chains can be used in algorithms and in the modeling of natural systems, and mainly to teach the wide variety of possible ways to analyze and bound the "mixing time" of these Markov chains (such as coupling from the past). The tools we will study include conductance, Fourier analysis, multicommodity flow, electrical networks, strong stopping rules, coupling, and more. I will try to give interesting applications of these tools to various fields. Below is a tentative syllabus, the requirements for the course, and a list of interesting Markov chain links.
Summary of lectures so far:
Lec 2. Mixing time, Cover time, Spanning tree algorithm
Definitions of hitting time, mixing time, cover time, commute time. Proof that the expected return time satisfies $T_{i,i}=1/\pi(i)$. Definition of reversibility. Proof that the cover time for undirected graphs is polynomial. Broder and Aldous's algorithm for sampling an exactly uniform spanning tree. Analysis using coupling.
[Ref: for the various mixing time definitions, either the Aldous and Fill book on Aldous's home page, or any other basic book. For the algorithm, see Aldous's paper, SIAM Journal on Discrete Mathematics, Vol. 3, pp. 450-465, 1990, or Lovasz's random walk survey on his home page, page 39.]
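As a rough illustration of the spanning tree algorithm, here is a minimal Python sketch (the graph representation and function names are my own choices, not from the lecture notes): run a random walk until every vertex has been visited, and for each vertex other than the start keep the edge through which it was first entered.

    import random

    def aldous_broder(adj, start=0):
        # Sample a uniformly random spanning tree of a connected undirected graph.
        # adj: dict mapping each vertex to the list of its neighbors.
        visited = {start}
        tree_edges = []
        current = start
        while len(visited) < len(adj):
            nxt = random.choice(adj[current])      # uniform step to a neighbor
            if nxt not in visited:                 # first visit: keep the entry edge
                visited.add(nxt)
                tree_edges.append((current, nxt))
            current = nxt
        return tree_edges

    # Example: a uniform spanning tree of the 4-cycle 0-1-2-3-0
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(aldous_broder(cycle))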
Lec 3. Coupling, Spanning tree algorithm, Stopping rules
Definition of coupling. The coupling lemma. Simple examples. Sampling random spanning trees approximately (rather than exactly): an easy proof with coupling. Strong stopping rules, and simple examples. Using strong stopping times to bound mixing times. [Ref: Coupling lemma: the Aldous and Fill book.]
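To make the coupling lemma concrete, here is a standard toy example (my own code, not taken from class): couple two copies of the "pick a random coordinate, set it to a fresh random bit" walk on the hypercube $\{0,1\}^n$ by giving both copies the same coordinate and the same bit. By the coupling lemma, the distance from stationarity after $t$ steps is at most the probability that the copies have not yet met, which here is a coupon-collector quantity of order $n\log n$.

    import random

    def coupled_step(n, x, y):
        # One coupled step: both copies refresh the same coordinate with the same bit,
        # so once a coordinate has been refreshed the two copies agree on it forever.
        i = random.randrange(n)
        b = random.randint(0, 1)
        x, y = list(x), list(y)
        x[i] = b
        y[i] = b
        return tuple(x), tuple(y)

    def coalescence_time(n):
        # Steps until two coupled walks, started at opposite corners, coincide.
        x, y = (0,) * n, (1,) * n
        t = 0
        while x != y:
            x, y = coupled_step(n, x, y)
            t += 1
        return t

    # Average coalescence time for n = 10 (roughly n * ln n, by coupon collecting)
    print(sum(coalescence_time(10) for _ in range(200)) / 200)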
Lec 4. Random walk on $k$-colorings
Definition of the chain, analysis showing that it converges rapidly for large enough $k$, and an algorithm for approximating the number of $k$-colorings of a graph.
[Ref: Jerrum 95]
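For concreteness, here is a minimal Python sketch of a single-site recoloring chain of the kind analyzed in Jerrum 95 (a Metropolis-style variant; whether it matches exactly the version presented in class is my assumption): pick a vertex and a color uniformly at random, and recolor the vertex if the result is still a proper coloring.

    import random

    def recoloring_step(adj, coloring, k):
        # One step of the chain on proper k-colorings: choose a vertex v and a
        # color c uniformly at random; recolor v with c if no neighbor uses c.
        v = random.choice(list(adj))
        c = random.randrange(k)
        if all(coloring[u] != c for u in adj[v]):
            coloring = dict(coloring)
            coloring[v] = c
        return coloring

    # Example: run the chain on a triangle with k = 7 colors
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    col = {0: 0, 1: 1, 2: 2}
    for _ in range(100):
        col = recoloring_step(triangle, col, k=7)
    print(col)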
Lec 5. Phase transitions, Path coupling
Definitions of the Ising model, the Potts model, ferromagnetic and antiferromagnetic interactions, definition of a phase transition. Using the rapid mixing of $k$-colorings to show non-existence of a phase transition at zero temperature in the antiferromagnetic Potts model. Markov chain on linear extensions of a partial order, path coupling argument.
[Ref: for path coupling: Bubley and Dyer 97.]
Lec 6. Coupling from the past, Monotonicity in Markov chains
Optimality of CFTP for monotone chains.
[Ref: the original paper by Propp and Wilson, on Wilson's homepage]
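As a small illustration of coupling from the past for a monotone chain (a sketch with my own choice of example chain, the lazy walk on $\{0,\dots,10\}$, whose stationary distribution is uniform): run the top and bottom trajectories from further and further back in the past, reusing the same random inputs; once they coalesce by time zero, the common value is an exact sample.

    import random

    def cftp_monotone(n, update):
        # Coupling from the past on the ordered state space {0,...,n}.
        # update(state, u) must be monotone in state for every fixed random input u.
        T = 1
        randomness = []                  # random inputs for times -1, -2, ... (reused)
        while True:
            while len(randomness) < T:
                randomness.append(random.random())
            lo, hi = 0, n
            for t in range(T, 0, -1):    # run from time -T up to time 0
                u = randomness[t - 1]
                lo, hi = update(lo, u), update(hi, u)
            if lo == hi:                 # coalescence by time 0: exact sample
                return lo
            T *= 2                       # otherwise start further back in the past

    def lazy_walk_update(x, u):
        # Monotone update rule for the lazy simple random walk on {0,...,10}.
        if u < 0.25:
            return max(x - 1, 0)
        if u < 0.5:
            return min(x + 1, 10)
        return x

    print(cftp_monotone(10, lazy_walk_update))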
Lec 7. Spectral gap, Conductance, expansion and expanders
Eigenvectors, the symmetric version of a Markov chain, the connection between the spectral gap and rapid mixing, expansion, expanders, using expanders to reduce the number of random bits needed to amplify the success probability of a BPP algorithm, the notion of the conductance of a graph. Connection between the spectral gap and conductance: beginning of the proof.
[Ref: for recycling randomness using expanders: Impagliazzo and Zuckerman, How to Recycle Random Bits. For the proof of the connection between the spectral gap and conductance: see the survey Volume Estimates and Rapid Mixing by Bollobas, 1997.]
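A small numerical illustration of the symmetrization trick and the spectral gap (my own sketch; the example chain is the lazy walk on the 3-cube, not necessarily the one used in class). For a reversible chain, $D^{1/2} P D^{-1/2}$ with $D=\mathrm{diag}(\pi)$ is symmetric and has the same eigenvalues as $P$, so the gap $1-\lambda_2$ can be read off from a symmetric eigenvalue computation; for a lazy reversible chain the mixing time is then at most of order $(1/\mathrm{gap})\log(1/\pi_{\min})$.

    import numpy as np

    def spectral_gap(P, pi):
        # Spectral gap 1 - lambda_2 of a reversible chain with stationary distribution pi,
        # computed via the symmetric matrix D^{1/2} P D^{-1/2} (same eigenvalues as P).
        d = np.sqrt(pi)
        A = d[:, None] * P / d[None, :]
        eigs = np.sort(np.linalg.eigvalsh(A))[::-1]
        return 1.0 - eigs[1]

    # Lazy walk on the 3-dimensional hypercube: pick a coordinate uniformly at random
    # and set it to a fresh random bit (so the chain holds in place with probability 1/2).
    n = 3
    N = 2 ** n
    P = np.zeros((N, N))
    for x in range(N):
        P[x, x] += 0.5
        for i in range(n):
            P[x, x ^ (1 << i)] += 0.5 / n
    pi = np.full(N, 1.0 / N)             # uniform stationary distribution
    print(spectral_gap(P, pi))           # equals 1/n = 1/3 for this chain (up to rounding)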
Lec 8. Conductance continued, Multicommodity flow
Finished the proof of the relation between the spectral gap and conductance. Calculation of the conductance of the hypercube. Definition of multicommodity flow, the cost of a flow, its relation to conductance, and simple examples of flows.
[Ref for multicommodity flow: Lovasz's survey.]
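As a worked example of the hypercube calculation (using the non-lazy walk with $P(x,y)=1/n$ for neighbors and the normalization $\Phi(S)=Q(S,\bar S)/\pi(S)$; the constants in class may differ if a lazy walk or another normalization was used), take the dimension cut $S=\{x : x_1=0\}$:

$$\pi(S)=\frac12,\qquad Q(S,\bar S)=\sum_{x\in S,\,y\notin S}\pi(x)P(x,y)=2^{n-1}\cdot\frac{1}{2^{n}}\cdot\frac{1}{n}=\frac{1}{2n},\qquad \Phi(S)=\frac{1/(2n)}{1/2}=\frac{1}{n}.$$

By the edge-isoperimetric inequality on the cube, no other cut does better by more than a constant factor, so $\Phi=\Theta(1/n)$.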
Lec 9. Canonical paths, matchings
Canonical paths defined, simple examples, a chain on all matchings in a graph with weight $\lambda^{|M|}$ (depending on the number of edges in the matching $M$); proof that it converges in time polynomial in $\lambda$, $1/\epsilon$ and $n$.
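For concreteness, here is a stripped-down version of such a chain in Python (my own simplified add/delete Metropolis variant for the distribution $\pi(M)\propto\lambda^{|M|}$; the chain from class may also use additional "slide" moves):

    import random

    def matching_step(edges, M, lam):
        # One Metropolis step for pi(M) proportional to lam^{|M|} over matchings:
        # propose toggling a uniformly random edge, accept with the Metropolis rule.
        e = random.choice(edges)
        u, v = e
        matched = {x for a, b in M for x in (a, b)}
        if e in M:
            proposal, accept = M - {e}, min(1.0, 1.0 / lam)   # delete an edge
        elif u not in matched and v not in matched:
            proposal, accept = M | {e}, min(1.0, lam)         # add an edge
        else:
            return M                          # toggling e would not leave a matching
        return proposal if random.random() < accept else M

    # Example: run the chain on the 4-cycle with lam = 2
    edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
    M = set()
    for _ in range(1000):
        M = matching_step(edges, M, 2.0)
    print(M)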
Lec 10. Permanent
Using the chain designed last week to approximate the number of matchings of size $k$ when $m_k/m_{k-1}$ is not too small, using the log-concavity of $m_k$. Beginning of the algorithm for approximating the permanent, or the number of perfect matchings in a bipartite graph, a la Jerrum, Sinclair and Vigoda.
[Ref for the permanent result: the paper by Jerrum, Sinclair and Vigoda, STOC 2001.]
Lec 11. Permanent (contd), Fourier transforms
In this lecture we finished the permanent algorithm. A new topic: the Diaconis-Shahshahani method for bounding mixing times for walks on Cayley graphs of groups.
[Ref: the excellent book by Diaconis, Group Representations in Probability and Statistics]
Lec 12. Fourier transforms (contd), Convex bodies
The Applegate-Kannan Markov chain algorithm for approximating the volume of a convex body and for integrating log-concave functions. The Lovasz-Simonovits isoperimetric inequality.
[Ref: the Applegate-Kannan paper, STOC 1991, and the Lovasz-Simonovits paper, FOCS 1990 (or 1991; almost surely 1990).]
Lec 13. Simulated annealing and tempering, Genetic algorithms
We will survey two other topics in the area of rapidly mixing Markov chains: simulated annealing and tempering (a random walk on temperatures), and genetic algorithms: the "go with the winners" algorithm (Aldous and Vazirani).
[Refs: the "go with the winners" paper, on Vazirani's homepage. Sorkin's paper on a fractal example for simulated annealing, Algorithmica, Vol. 6, No. 1-3, 1991, pages 367-418. For simulated tempering: Madras and Zheng, the swapping algorithm, to be published.]
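As a generic illustration of simulated annealing (my own toy sketch, not the fractal example from the reference): a Metropolis chain whose temperature is slowly lowered, so that for small temperatures the stationary distribution concentrates on low-cost states.

    import math
    import random

    def simulated_annealing(cost, neighbor, x0, schedule):
        # Metropolis chain with a decreasing temperature schedule: a move that
        # increases the cost by d is accepted with probability exp(-d / T).
        x = best = x0
        for T in schedule:
            y = neighbor(x)
            d = cost(y) - cost(x)
            if d <= 0 or random.random() < math.exp(-d / T):
                x = y
            if cost(x) < cost(best):
                best = x
        return best

    # Toy example: minimize the number of ones in a bit string via single-bit flips.
    def flip_one_bit(x):
        i = random.randrange(len(x))
        return x[:i] + (1 - x[i],) + x[i + 1:]

    schedule = [10.0 * 0.99 ** t for t in range(2000)]     # geometric cooling
    x0 = tuple(random.randint(0, 1) for _ in range(20))
    print(simulated_annealing(lambda x: sum(x), flip_one_bit, x0, schedule))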
Requirements:
The course is intended for graduate students in computer science. Strong third-year students are also welcome. The course might interest math and physics students as well; please see me before class to check that you have the right background. The grade will be given according to biweekly exercises and a final exam/home exam.
Exercises:
Exam related issues:
The rehearsal will take place in the usual location (Shprintsak 102) at 10:00 on Thursday, 4/7. It will save a lot of time if you e-mail me ahead of time the questions you would like me to go through.
Some Markov chain related links, with relevant material inside indicated:
Spring 2002, Hebrew University, Jerusalem, Israel
Office: Ross 72, Hebrew University, Jerusalem, 02-6584611
Markov chains, or random walks on graphs, are probably one of the most important concepts in computer science in particular, and in the exact and natural sciences in general. They seem to appear everywhere: in statistical physics, biology, ecology, economics and the stock market, and the study of the web, and they have been immeasurably useful in combinatorial applications such as approximation, optimization and counting algorithms. For example, a Google search yields 85,600 pages with the words "Markov chain", and 157,000 pages with the words "Random Walk"...
The following colorful applet from David Wilson's page demonstrates one of the more beautiful ideas in the study of the mixing of Markov chains, namely "coupling from the past". The applet demonstrates a Markov chain on the set of all possible domino tilings of a square. Starting from an arbitrary tiling, we move to a different tiling by changing only two tiles at a time; the question is, after how many elementary steps can we say that we have a random tiling of the square?
To answer this question, Jim Propp and David Wilson suggested the ingenious idea of "coupling from the past": roughly speaking, two coupled copies of the chain, started from two extreme states, are run from further and further back in the past using the same randomness; once they have coalesced to the same state by time zero, you are guaranteed that this state is distributed exactly (not approximately!) uniformly.
Lec 1. Introduction
Overview of the course. Definitions of Markov chains, ergodicity, proof of the existence of the limiting distribution. Coupling card trick... [Ref for the proof of the limiting distribution: Norris, page 41.]
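A tiny numerical illustration of the limiting distribution theorem (the transition matrix is an arbitrary example of my own): for an ergodic chain, the distribution $\mu P^t$ converges to the same limit $\pi$ no matter which starting distribution $\mu$ we use.

    import numpy as np

    # An irreducible, aperiodic transition matrix (rows sum to 1).
    P = np.array([[0.5, 0.5, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.2, 0.3, 0.5]])

    # Two very different starting distributions converge to the same limit.
    for mu in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])):
        for _ in range(100):
            mu = mu @ P          # one step of the chain: mu <- mu P
        print(mu)                # both prints agree: the limiting distribution pi

    # pi is the left eigenvector of P for eigenvalue 1, normalized to sum to 1.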
I will assume basic knowledge of linear algebra, probability,
complexity theory, and a very basic knowledge of finite groups.
The problem sets are to be submitted in pairs or triples. The intention is not that you divide the work between you, but that you discuss all the solutions together. It is also greatly encouraged that each of you writes part of each problem set, so that you get to think about the formal aspects of the concepts every two weeks, and not every month. The pairs or triples may vary between exercises, if you wish. These problems are intended to help you understand the concepts learned in class, and discussion within larger groups is greatly encouraged (looking at the written solutions of others is not considered discussion, though). The difficulty ranges from routine to slightly tricky, with bonus exercises, marked with a star, which can add quite a bit to the final grade. Feel free to talk to me about any points you are having trouble with.
There will be 5 exercises during the semester, one every two weeks.
The weight of the exercises in the grade will be about 40-50 percent.
There will be 6 questions, out of which you will have to choose 4. I will not publish a list of exercises; instead, here is a list of topics you should concentrate on. Understanding all the exercises should be very good preparation for the exam.