Non-asymptotic convergence analysis for the Unadjusted Langevin Algorithm

This is an important arXival by Alain Durmus and Eric Moulines. The title is slightly optimized for effect, as the paper actually contains non-asymptotic and asymptotic analysis.

The basic theme of the paper is in getting upper bounds on total variation (and more general distribution distances) between an uncorrected discretized Langevin diffusion wrt some target \pi and \pi itself. The discretization used is the common scheme with the scary name Euler-Maruyama:

X_{k} \sim \mathcal{N}(\cdot|X_{k-1} + \gamma_{k}\nabla \log \pi(X_{k-1}), \sqrt{2\gamma_{k}I} = R_{\gamma_{k}}(\cdot |X_{k-1})

Under a Foster-Lyapunov condition, R_{\gamma_{k}} is a Markov kernel that admits a unique stationary distribution \pi_{\gamma_{k}} that is close to the desired \pi in total variation distance and gets closer when \gamma_{k} decreases.

Now in the non-asymptotic case with fixed \gamma = \gamma_k, the authors provide bounds that explicitly depend on dimensionality of the support of the target, the number of samples drawn and the chosen step size \gamma . Unfortunately, these bounds contain some unknowns as well, such as the Lipschitz constant L of the gradient of the logpdf \log \pi(\cdot) and some suprema that I am unsure how to get explicitly.

Durmus and Moulines particularly take a look at scaling with dimension under increasingly strong conditions on \pi, getting exponential (in dimension) constants for the convergence when \pi is superexponential outside a ball. Better convergence can be achieved when assuming \pi to be log-concave or strongly log-concave. This is not surprising, nevertheless the theoretical importance of the results is clear from the fact that together with Arnak Dalalyan this is the first time that results are given for the ULA after the Roberts & Tweedie papers from 1996.

As a practitioner, I would have wished for very explicit guidance in picking  \gamma or the series \{\gamma_k\}_{k=1}^\infty. But hopefully with Alains paper as a foundation that can be the next step. As a non-mathematician, I had some problems in following the paper and at some point I completely lost it. This surely is in part due to the quite involved content. However, one might still manage to give intuition even in this case, as Sam Livingstones recent paper on HMC shows. I hope Alain goes over it again with readability and presentation in mind so that it will get the attention it deserves. Yet another task for something that already took a lot of work…

(photo: the Lipschitz Grill diner in Berlin – I don’t
know about their food, but the name is remarkable)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s