Markov Chains in Theoretical Computer Science
Instructor: Dorit Aharonov
Overview of the course:
Markov chains appear everywhere. Why? Because the model is so general: you are interested in a large set of objects, and you can handle local manipulations on these objects, e.g., modifying an object slightly to get a new one. You want to study global properties of this set of objects, such as its approximate size. This very natural problem, which is an abstraction of problems in many fields, can be modeled by a random walk on a graph whose nodes are the objects of interest. Starting from an arbitrary node, an elementary step is to move to a neighboring node with a certain (say, uniform over all neighbors) probability. This very simple elementary step, applied many times, results in a wonderful phenomenon: the distribution over the nodes converges to a limiting distribution that depends only on the structure of the graph, and not on the arbitrary node we started with. This limiting distribution can tell us a lot about the global properties of the set of objects of interest, such as the size or structure of the graph. The time it takes for convergence, the "mixing time", is thus of crucial importance for algorithmic and modeling applications. One of the more active fields in probability and theoretical computer science is the analysis of Markov chains for various applications. Many clever and beautiful mathematical tools have been developed to analyze the mixing time in the various cases.
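The elementary step and the convergence phenomenon described above can be sketched in a few lines. This is an illustrative simulation only (the graph, step counts, and trial counts are made up for the example): a simple random walk on a small connected, non-bipartite undirected graph, whose limiting distribution is pi(v) = deg(v) / 2|E| regardless of the starting node.

```python
import random
from collections import Counter

# A small undirected graph as an adjacency list (illustrative example).
graph = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [1, 2],
}

def walk_distribution(graph, start, steps, trials=20000):
    """Empirical distribution of a simple random walk after `steps` steps."""
    counts = Counter()
    for _ in range(trials):
        node = start
        for _ in range(steps):
            node = random.choice(graph[node])  # uniform over neighbors
        counts[node] += 1
    return {v: counts[v] / trials for v in graph}

# For this graph, pi = (deg(v)/2|E|) = (0.2, 0.3, 0.3, 0.2); both runs
# should be close to pi, independent of where the walk started.
random.seed(0)
print(walk_distribution(graph, start=0, steps=50))
print(walk_distribution(graph, start=3, steps=50))
```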
My purpose in this course is to describe the different ways in which Markov chains can be used in algorithms and in modeling of natural systems, and mainly to teach the wide variety of possible ways to analyze and bound the "mixing time" of these Markov chains (like coupling from the past). The tools we will study include conductance, Fourier analysis, multicommodity flow, electrical networks, strong stopping rules, coupling, and more. I will try to give interesting applications of these tools to various fields. Below is a tentative syllabus, the requirements for the course, and a list of interesting Markov chain links.
Summary of lectures so far:
Lec 2. Mixing time, Cover time, Spanning tree algorithm. Definitions of hitting time, mixing time, cover time, and commute time. Proof that the expected return time satisfies $E[T_{i,i}]=1/\pi(i)$. Definition of reversibility. Proof that the cover time for undirected graphs is polynomial. Broder and Aldous's algorithm for sampling an exactly uniform spanning tree. Analysis using coupling. [Ref: for the various mixing time definitions, either the Aldous and Fill book on Aldous's home page, or any other basic book. For the algorithm, see Aldous's paper, SIAM Journal on Discrete Mathematics, Vol. 3, 450-465, 1990, or Lovasz's random walk survey on his home page, page 39.]
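The Aldous-Broder spanning tree sampler mentioned above is short enough to sketch. The code below is a minimal illustration, assuming the graph is given as an adjacency list of a connected undirected graph (the graph representation and function name are mine, not from the lecture notes): run the walk until every vertex has been visited, and keep, for each vertex, the edge through which it was first entered.

```python
import random

def aldous_broder(graph, root=0):
    """Aldous-Broder sketch: walk from `root` until all vertices are covered;
    the first-entry edges form a uniformly random spanning tree."""
    tree = []          # edges (parent, child) of the spanning tree
    visited = {root}
    node = root
    while len(visited) < len(graph):
        nxt = random.choice(graph[node])   # uniform step of the walk
        if nxt not in visited:
            visited.add(nxt)
            tree.append((node, nxt))       # edge used to enter nxt first
        node = nxt
    return tree
```

Note that the output always has exactly n-1 edges and touches every vertex; the nontrivial claim from the lecture is that the distribution over spanning trees is exactly uniform.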
Lec 3. Coupling, Spanning tree algorithm, Stopping rules. Definition of coupling. The coupling lemma. Simple examples. Sampling random spanning trees approximately: an easy proof using coupling. Strong stopping rules, and simple examples. Using strong stopping times to bound mixing times. [Ref: Coupling lemma: the Aldous and Fill book.]
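A classic simple example of the coupling technique is the lazy walk on the $n$-dimensional hypercube. The sketch below (illustrative code; the function name and the specific starting vertices are mine) runs two copies of the chain with shared randomness: at each step both copies set the same random coordinate to the same random bit. Each copy individually performs the lazy walk, and the two copies coincide once every coordinate has been chosen at least once, a coupon-collector event, so by the coupling lemma the mixing time is $O(n \log n)$.

```python
import random

def coupled_hypercube_walk(n, seed=None):
    """Run two coupled lazy walks on the n-cube until they meet.

    At each step pick a coordinate i and a bit b uniformly, and set
    coordinate i to b in BOTH copies.  Returns the coupling time.
    """
    rng = random.Random(seed)
    x = [0] * n           # copy 1 starts at the all-zeros vertex
    y = [1] * n           # copy 2 starts at the all-ones vertex
    steps = 0
    while x != y:
        i = rng.randrange(n)
        b = rng.randrange(2)
        x[i] = b          # shared update: same coordinate, same bit
        y[i] = b
        steps += 1
    return steps
```

Averaged over many runs, the coupling time concentrates around $n H_n \approx n \ln n$, matching the coupon-collector bound.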
Lec 4. Random walk on $k$-colorings. Definition of the chain, analysis showing that it converges rapidly for large enough $k$, and an algorithm for approximating the number of $k$-colorings of a graph. [Ref: Jerrum 95]
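One step of the chain on colorings can be sketched as follows. This is an illustrative Glauber-dynamics step (the function name and graph representation are mine): pick a vertex and a color uniformly at random, and recolor the vertex if no neighbor already has that color. The chain keeps proper colorings proper; the content of the lecture is that for $k$ sufficiently large relative to the maximum degree it mixes rapidly.

```python
import random

def glauber_coloring_step(graph, coloring, k, rng):
    """One Glauber step on proper k-colorings: pick a uniform vertex v and
    color c, and recolor v with c if c is unused among v's neighbors."""
    v = rng.randrange(len(graph))
    c = rng.randrange(k)
    if all(coloring[u] != c for u in graph[v]):
        coloring[v] = c   # the move keeps the coloring proper
    return coloring
```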
Lec 5. Phase transitions, Path coupling. Definitions of the Ising model, the Potts model, ferromagnetic and antiferromagnetic interactions, and phase transitions. Using the rapid mixing of $k$-colorings to show nonexistence of a phase transition at zero temperature in the antiferromagnetic Potts model. A Markov chain on linear extensions of a partial order, with a path coupling argument. [Ref: for path coupling: Bubley and Dyer 97.]
Lec 6. Coupling from the past. Monotonicity in Markov chains. Optimality of CFTP for monotone chains. [Ref: the original paper by Propp and Wilson, on Wilson's home page]
Lec 7. Spectral gap, Conductance, Expansion and expanders. Eigenvectors, the symmetrized version of a Markov chain, the connection between the spectral gap and rapid mixing, expansion, expanders, using expanders to reduce the number of random bits needed to amplify the success probability of a BPP algorithm, and the notion of the conductance of a graph. Connection between spectral gap and conductance: beginning of the proof. [Ref: for recycling randomness using expanders: Impagliazzo and Zuckerman, How to Recycle Random Bits. For the proof of the connection between spectral gap and conductance: see the survey Volume Estimates and Rapid Mixing by Bollobas, 1997.]
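For a small symmetric chain the spectral gap can be estimated directly. The sketch below (illustrative; the function name and the deflation-by-the-uniform-vector trick are my choices for the example) estimates $|\lambda_2|$ of a symmetric transition matrix by power iteration on the component orthogonal to the top eigenvector, which for a symmetric doubly stochastic matrix is the uniform vector. The gap $1 - |\lambda_2|$ then controls the mixing time, as in the lecture.

```python
import math

def second_eigenvalue(P, iters=2000):
    """Estimate |lambda_2| of a symmetric transition matrix P (list of rows)
    by power iteration deflated against the uniform top eigenvector."""
    n = len(P)
    x = [1.0] + [0.0] * (n - 1)
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # project out the uniform eigenvector
        x = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm == 0.0:
            return 0.0
        x = [xi / norm for xi in x]
    y = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(yi * yi for yi in y))  # ||Px|| for unit x
```

For example, the lazy walk on the 4-cycle has eigenvalues 1, 1/2, 1/2, 0, so the routine should return 1/2.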
Lec 8. Conductance continued, Multicommodity flow. Finished the proof of the relations between spectral gap and conductance. Calculation of the conductance of the hypercube. Definition of multicommodity flow, the cost of a flow, its relation to conductance, and simple examples of flows. [Ref for multicommodity flow: Lovasz's survey.]
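The hypercube conductance calculation from the lecture can be checked by brute force for small dimension. In the sketch below (illustrative code, with my own function names), for the simple random walk on the regular $n$-cube with uniform stationary distribution, the conductance is $\Phi = \min_{|S| \le 2^{n-1}} |E(S,\bar S)| / (n|S|)$; the minimizing cut is a "dimension cut" (a subcube), giving $\Phi = 1/n$.

```python
from itertools import combinations

def hypercube_edges(n):
    """Edges of the n-cube: vertices 0..2^n-1, neighbors differ in one bit."""
    return [(v, v ^ (1 << i))
            for v in range(2 ** n) for i in range(n)
            if v < v ^ (1 << i)]          # list each edge once

def conductance(n):
    """Brute-force conductance of the simple random walk on the n-cube:
    min over S with |S| <= 2^(n-1) of |E(S, S-complement)| / (n * |S|)."""
    edges = hypercube_edges(n)
    best = float("inf")
    for size in range(1, 2 ** (n - 1) + 1):
        for S in combinations(range(2 ** n), size):
            s = set(S)
            cut = sum(1 for u, v in edges if (u in s) != (v in s))
            best = min(best, cut / (n * len(s)))
    return best
```

Only small $n$ is feasible, since the search is over all subsets; for $n = 3$ the minimum is attained by a face of the cube and equals $1/3$.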
Lec 9. Canonical paths, Matchings. Canonical paths defined, with simple examples; a chain on all matchings in a graph, with weights depending on the number of edges in the matching. Proof that it converges in time polynomial in $\lambda$, $1/\epsilon$, and $n$.
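One step of a chain of this flavor can be sketched as follows. This is a simplified add/remove Metropolis variant with stationary weight proportional to $\lambda^{|M|}$ (the chain from the lecture also uses "slide" moves, which are omitted here; the function name and representation are mine): pick an edge uniformly and toggle its membership in the matching, accepting with the Metropolis probability for the weight change.

```python
import random

def matching_chain_step(edges, M, lam, rng):
    """One Metropolis step on matchings, stationary weight ~ lam^|M|:
    pick an edge uniformly and try to toggle it in the matching M (a set)."""
    u, v = e = rng.choice(edges)
    if e in M:
        # removing e changes the weight by a factor 1/lam
        if rng.random() < min(1.0, 1.0 / lam):
            M = M - {e}
    elif all(u not in (a, b) and v not in (a, b) for (a, b) in M):
        # e can be added without violating the matching property;
        # adding it changes the weight by a factor lam
        if rng.random() < min(1.0, lam):
            M = M | {e}
    return M
```

Every reachable state is a matching by construction; bounding the mixing time is where the canonical-paths machinery of the lecture comes in.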
Lec 10. Permanent. Using the chain designed last week to approximate the number of matchings of size $k$ when $m_k/m_{k-1}$ is not too small, using the log-concavity of $m_k$. Beginning of the algorithm for approximating the permanent, i.e., the number of perfect matchings in a bipartite graph, a la Jerrum, Sinclair, and Vigoda. [Ref for the permanent result: the paper by Jerrum, Sinclair, and Vigoda, STOC 2001.]
Lec 11. Permanent (contd), Fourier transforms. In this lecture we finished the permanent algorithm. A new topic: the Diaconis-Shahshahani method for bounding mixing times of walks on Cayley graphs of groups. [Ref: the excellent book by Diaconis, Group Representations in Probability and Statistics]
Lec 12. Fourier transforms (contd), Convex bodies. The Applegate-Kannan Markov chain algorithm for approximating the volume of a convex body and integrating log-concave functions. The Lovasz-Simonovits isoperimetric inequality. [Ref: the Applegate-Kannan paper, STOC 1991, and the Lovasz-Simonovits paper, FOCS 1990.]
Lec 13. Simulated annealing and tempering, Genetic algorithms. We will survey two other topics in the area of rapidly mixing Markov chains: simulated annealing and tempering (a random walk on temperatures), and genetic algorithms: the "go with the winners" algorithm (Aldous and Vazirani). [Refs: the "go with the winners" algorithm, on Vazirani's home page. Sorkin's paper on a fractal example for simulated annealing, Algorithmica, 6, 1-3, 1991, pages 367-418. For simulated tempering: Madras and Zheng, the swapping algorithm, to be published.]
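Simulated annealing itself is just a Metropolis chain whose temperature decreases over time. The sketch below is a generic illustration (the function name, the parameter names, and the toy cooling schedule in the test are mine; it is not the "go with the winners" algorithm, which maintains a population of walkers rather than a single one).

```python
import math
import random

def simulated_annealing(energy, neighbors, x0, schedule, rng):
    """Metropolis walk with decreasing temperature: at temperature T,
    a move with energy increase dE is accepted with probability exp(-dE/T)."""
    x = x0
    for T in schedule:
        y = rng.choice(neighbors(x))
        dE = energy(y) - energy(x)
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            x = y
    return x
```

At high temperature the walk explores freely; as T shrinks it behaves more and more like greedy descent, which is why slow cooling schedules matter.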
Requirements:
The course is intended for graduate students in computer science. Strong third-year students are also welcome. The course might interest math and physics students as well; please see me before class to check that you have the right background.
The grade will be based on biweekly exercises and a final exam or home exam.
Exercises:
Exam-related issues:
The rehearsal will take place in the usual location (Shprintsak 102), at 10:00 on Thursday, 4/7. It will save a lot of time if you e-mail me, ahead of time, the questions you want me to go through.
Some Markov chain related links, with relevant material inside indicated: