# Importance Sampling Session at MCQMC 2016

On behalf of the mcqmc 2016 organizing committee I am pleased to accept your proposal.
-Art Owen
I got this nice message from Art last night: my proposal for a session on Advances in Importance Sampling at MCQMC 2016 was accepted. That is great, as I think the session is made up of strong papers (obviously). The session will almost surely be moderated by Nicolas Chopin.

### MCQMC session on Advances in Importance Sampling

#### The sample size required in Importance Sampling

S. Chatterjee, P. Diaconis

The goal of importance sampling is to estimate the expected value of a given function with respect to a probability measure ν using a random sample of size n drawn from a different probability measure μ. If the two measures μ and ν are nearly singular with respect to each other, which is often the case in practice, the sample size required for accurate estimation is large. In this article it is shown that in a fairly general setting, a sample of size approximately exp(D(ν||μ)) is necessary and sufficient for accurate estimation by importance sampling, where D(ν||μ) is the Kullback–Leibler divergence of μ from ν. In particular, the required sample size exhibits a kind of cut-off in the logarithmic scale. The theory is applied to obtain a fairly general formula for the sample size required in importance sampling for exponential families (Gibbs measures). We also show that the standard variance-based diagnostic for convergence of importance sampling is fundamentally problematic. An alternative diagnostic that provably works in certain situations is suggested.
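To make the setup in this abstract concrete for myself, here is a minimal sketch of plain importance sampling. The Gaussian target ν = N(0, 1), the proposal μ = N(1, 2²), the test function x² and all function names are my own illustrative assumptions, not taken from the paper. The closed-form KL divergence between two univariate normals is used to print the exp(D(ν||μ)) scale that the abstract refers to.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def kl_normal(mean_p, std_p, mean_q, std_q):
    """Closed-form D(p || q) for two univariate normals p and q."""
    return (math.log(std_q / std_p)
            + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2.0 * std_q ** 2)
            - 0.5)


def importance_sampling(f, n, target, proposal, rng):
    """Estimate E_target[f(X)] from n draws of the proposal.

    target and proposal are (mean, std) pairs of normal distributions.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(proposal[0], proposal[1])
        weight = normal_pdf(x, *target) / normal_pdf(x, *proposal)
        total += weight * f(x)
    return total / n


rng = random.Random(0)
target = (0.0, 1.0)    # nu, assumed for illustration
proposal = (1.0, 2.0)  # mu, assumed for illustration

estimate = importance_sampling(lambda x: x * x, 50_000, target, proposal, rng)
kl = kl_normal(*target, *proposal)

print("estimate of E_nu[X^2] (true value 1):", estimate)
print("D(nu || mu):", kl, " exp(D):", math.exp(kl))
```

In this toy case the two measures overlap well, so exp(D(ν||μ)) is tiny and almost any sample size works. Moving the proposal mean far away from the target makes the divergence, and with it the suggested sample size, grow quickly.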

#### Generalized Multiple Importance Sampling

V. Elvira, L. Martino, D. Luengo, M. Bugallo

Importance Sampling methods are broadly used to approximate posterior distributions or some of their moments. In its standard approach, samples are drawn from a single proposal distribution and weighted properly. However, since the performance depends on the mismatch between the targeted and the proposal distributions, several proposal densities are often employed for the generation of samples. Under this Multiple Importance Sampling (MIS) scenario, many works have addressed the selection or adaptation of the proposal distributions, interpreting the sampling and the weighting steps in different ways. In this paper, we establish a general framework for sampling and weighting procedures when more than one proposal is available. The most relevant MIS schemes in the literature are encompassed within the new framework, and, moreover, novel valid schemes appear naturally. All the MIS schemes are compared and ranked in terms of the variance of the associated estimators. Finally, we provide illustrative examples which reveal that, even with a good choice of the proposal densities, a careful interpretation of the sampling and weighting procedures can make a significant difference in the performance of the method.
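As an illustration of how much the weighting step can matter, here is a small sketch comparing two common ways to weight samples drawn from several proposals: dividing by the density of the proposal that generated the sample ("standard"), or dividing by the equally weighted mixture of all proposals ("mixture"). The target N(0, 1), the two proposals N(±1.5, 1), the test function and the function names are assumptions of mine; this is not the general framework of the paper, only two simple special cases.

```python
import math
import random
import statistics


def normal_pdf(x, mean, std=1.0):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


PROPOSAL_MEANS = (-1.5, 1.5)  # assumed proposals N(-1.5, 1) and N(1.5, 1)


def target_pdf(x):
    return normal_pdf(x, 0.0)  # assumed target N(0, 1)


def mis_estimate(f, n_per_proposal, scheme, rng):
    """Estimate E_target[f(X)] with n_per_proposal draws from each proposal."""
    total, count = 0.0, 0
    for mean in PROPOSAL_MEANS:
        for _ in range(n_per_proposal):
            x = rng.gauss(mean, 1.0)
            if scheme == "standard":
                # weight against the proposal that generated the sample
                denom = normal_pdf(x, mean)
            else:
                # "mixture": weight against the average of all proposals
                denom = sum(normal_pdf(x, m) for m in PROPOSAL_MEANS)
                denom /= len(PROPOSAL_MEANS)
            total += f(x) * target_pdf(x) / denom
            count += 1
    return total / count


rng = random.Random(0)
f = lambda x: x * x  # E[f(X)] = 1 under the N(0, 1) target

# Repeat each scheme many times to compare the spread of the estimators.
runs = {scheme: [mis_estimate(f, 50, scheme, rng) for _ in range(300)]
        for scheme in ("standard", "mixture")}
variances = {scheme: statistics.variance(values)
             for scheme, values in runs.items()}

for scheme in runs:
    print(scheme, "mean:", statistics.mean(runs[scheme]),
          "variance:", variances[scheme])
```

Both estimators are unbiased in this setup, but the samples are identical in distribution and only the denominator of the weight differs, which is enough to change the variance considerably.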

#### Continuous-Time Importance Sampling

K. Łatuszyński, G. Roberts, G. Sermaidis, P. Fearnhead

We will introduce a new framework for sequential Monte Carlo, based on evolving a set of weighted particles in continuous time. This framework can lead to novel versions of existing algorithms, such as Annealed Importance Sampling and the Exact Algorithm for diffusions, and can be used as an alternative to MALA for sampling from a target distribution of interest. These methods are amenable to the use of sub-sampling, which can greatly increase their computational efficiency for big-data applications; and can enable unbiased sampling from a much wider range of target distributions than existing approaches.

# Quasi-Monte Carlo Methods in Finance

The second reference Mathieu Gerber gave me in my quest to educate myself about QMC is a paper by Pierre L’Ecuyer from the Winter Simulation Conference in 2004. As a tutorial it was much clearer (for me) than the Art Owen paper. Maybe because it didn’t contain so much ANOVA. Or maybe because I was more used to ANOVA from Art’s paper.

The paper treats different constructions for low-discrepancy point sets quite transparently, in particular digital nets and their special cases. It also discusses randomization procedures, which sometimes seem to be very specialized to the sequence used. One seemingly general transform applied after randomization, the baker transformation, results in a surprisingly strong variance reduction, with the variance of order $O(n^{-4+\epsilon})$. The transformation replaces the uniform coordinate $u \in [0,1)$ by $2u$ for $u \leq 0.5$ and by $2(1-u)$ otherwise.
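The baker transformation itself is a one-liner. In the sketch below I apply it to a one-dimensional, randomly shifted equispaced grid, which is only my simplest stand-in for a randomized point set and not one of the constructions from the paper. The integrand u² and the helper names are also my own choices, and a single run like this does not demonstrate the $O(n^{-4+\epsilon})$ rate.

```python
import random


def baker(u):
    """Baker transformation of a coordinate u in [0, 1)."""
    return 2.0 * u if u <= 0.5 else 2.0 * (1.0 - u)


def shifted_grid(n, rng):
    """n equispaced points in [0, 1) with one common uniform random shift."""
    shift = rng.random()
    return [((i / n) + shift) % 1.0 for i in range(n)]


def integrate(f, points, transform=None):
    """Average f over the points, optionally transforming each point first."""
    if transform is not None:
        points = [transform(u) for u in points]
    return sum(f(u) for u in points) / len(points)


rng = random.Random(1)
points = shifted_grid(64, rng)
f = lambda u: u * u  # integral over [0, 1) is 1/3

plain = integrate(f, points)
with_baker = integrate(f, points, transform=baker)

print("error without baker:", abs(plain - 1.0 / 3.0))
print("error with baker:   ", abs(with_baker - 1.0 / 3.0))
```

The transformed integrand is continuous when wrapped around the unit interval, which is why an equispaced rule tends to do better on it than on the original, non-periodic function.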

In the examples, L’Ecuyer mentions that using an eigendecomposition of covariance matrices (i.e. PCA) results in much higher variance reductions than using Cholesky factors. He attributes this to dimension reduction, a naming I find odd, as the complete information is retained (as opposed to, e.g., tossing the components with the lowest eigenvalues). My intuition is that the strong empirical gains with PCA might rather be attributed to the fact that eigenvectors are orthogonal, making this decomposition as close as possible to QMC’s beloved unit hypercube.
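To look at the two factorizations side by side, here is a small NumPy sketch. The covariance matrix of a Brownian motion observed at 16 equally spaced times is my own choice of example, as is the `variance_share` helper. Both factors $A$ satisfy $AA^T = \Sigma$, so no information is lost either way. What differs is how the total variance is distributed over the input coordinates.

```python
import numpy as np

d = 16
times = np.arange(1, d + 1) / d
cov = np.minimum.outer(times, times)  # Cov(B_s, B_t) = min(s, t)

# Cholesky factor: lower triangular, cov = chol @ chol.T
chol = np.linalg.cholesky(cov)

# PCA factor: eigenvectors scaled by the square roots of the eigenvalues,
# sorted so that the largest eigenvalue comes first.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
pca = eigvecs * np.sqrt(eigvals)


def variance_share(factor):
    """Share of the total variance carried by each input coordinate."""
    per_column = np.sum(factor ** 2, axis=0)
    return per_column / per_column.sum()


print("Cholesky, first 3 coordinates:", variance_share(chol)[:3])
print("PCA,      first 3 coordinates:", variance_share(pca)[:3])
```

With the PCA factor most of the variance sits in the first few coordinates, which is presumably what is meant by dimension reduction here, even though all components are kept.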