murbard2 and glial have pointed out some caveats. Another thing that holds Monte Carlo back at times is that generating the required samples is not always easy. A common route is to use a Markov chain sampler, hence MCMC, but that's far from a solved problem. It's very hard to reason about, or to test, whether the Markov chain mixes fast, and if it doesn't, it can take exponential time to generate samples from its stationary distribution. Coming up with schemes better than a naïve Markov chain is an intellectual industry in itself.
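To make that concrete, here's a minimal random-walk Metropolis sampler in Python; the bimodal target and the step size are illustrative choices, not anything from the discussion above. The mixing problem shows up immediately in a sampler like this: push the two modes far apart and the chain can sit in one of them for a very long time before it ever crosses over.

```python
import numpy as np

def log_target(x):
    # Illustrative target: an unnormalized 1-D mixture of two Gaussians.
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def random_walk_metropolis(log_p, x0, n_steps, step=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x, lp = x0, log_p(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.standard_normal()   # Gaussian proposal
        lp_prop = log_p(prop)
        # Accept with probability min(1, p(prop) / p(x)).
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

draws = random_walk_metropolis(log_target, x0=0.0, n_steps=50_000)
```

How many of those 50,000 draws are actually informative depends entirely on the mixing time, which is exactly the quantity that's so hard to reason about in advance.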
HMC has better asymptotics than random-walk MCMC. If your probability surface is differentiable you should always use it (a bare-bones sketch follows the list below). Riemannian MC takes curvature into account, but you need to start storing Hessians, which isn't practical. Two missing pieces would be:
- An SGD equivalent for sampling "tall" data. The generalized Poisson sampler can do that, but the variance is crappy.
- An L-BFGS-like version of Riemannian MC, to get rid of the Hessians.
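For the HMC point above, here's a sketch of a single HMC transition, assuming you can supply the log-density and its gradient for a 1-D numpy state, and using the standard Gaussian momentum with a leapfrog integrator; production samplers (Stan, PyMC) layer automatic tuning of the step size and trajectory length on top of this.

```python
import numpy as np

def hmc_step(log_p, grad_log_p, x, step=0.1, n_leapfrog=20, rng=None):
    """One Hamiltonian Monte Carlo transition: resample a Gaussian
    momentum, simulate the dynamics with leapfrog, then apply a
    Metropolis correction for the discretization error."""
    rng = np.random.default_rng() if rng is None else rng
    p = rng.standard_normal(x.shape)            # fresh momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * step * grad_log_p(x_new)     # initial half step
    for _ in range(n_leapfrog):
        x_new += step * p_new                   # full position step
        p_new += step * grad_log_p(x_new)       # full momentum step
    p_new -= 0.5 * step * grad_log_p(x_new)     # trim to a half step
    # Accept/reject on the joint (position, momentum) energy.
    h_old = -log_p(x) + 0.5 * (p @ p)
    h_new = -log_p(x_new) + 0.5 * (p_new @ p_new)
    return x_new if np.log(rng.random()) < h_old - h_new else x
```

The gradient steers proposals along the surface, which is why HMC makes long, rarely-rejected moves where random-walk Metropolis shuffles locally; the Riemannian variant additionally preconditions the momentum with local curvature, which is where the Hessians come in.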
Parallel tempering should also be exploited more.
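A sketch of what exploiting parallel tempering looks like: several replicas run against tempered versions of the target at different inverse temperatures, and neighbouring replicas occasionally swap states under a detailed-balance-preserving acceptance rule. The `mcmc_step` kernel here is a hypothetical stand-in for any single-chain update, e.g. the Metropolis or HMC moves sketched above.

```python
import numpy as np

def tempering_sweep(log_p, states, betas, mcmc_step, rng=None):
    """One parallel-tempering sweep. Hot chains (small beta) see a
    flattened target and explore freely; swaps let their discoveries
    percolate down to the cold chain (beta = 1), whose samples are
    the ones you keep."""
    rng = np.random.default_rng() if rng is None else rng
    # Within-replica moves against the tempered target beta * log_p.
    states = [mcmc_step(lambda x, b=b: b * log_p(x), x, rng)
              for x, b in zip(states, betas)]
    # Neighbour swaps, accepted with the detailed-balance ratio.
    for i in range(len(states) - 1):
        log_ratio = (betas[i] - betas[i + 1]) * (
            log_p(states[i + 1]) - log_p(states[i]))
        if np.log(rng.random()) < log_ratio:
            states[i], states[i + 1] = states[i + 1], states[i]
    return states
```

This directly attacks the multimodal slow-mixing failure: a cold chain that would never cross between distant modes on its own can inherit mode jumps from its hotter neighbours.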
That said, at the end of the day the ideal sampler would be able to reason about the distribution as a program, not just as a mathematical black box. It should build tractable approximations intelligently and use those to bootstrap exact sampling schemes.
I think we're going to see a wealth of better samplers come out in the next decade, following the path combinatorial optimization took towards preserving the structure of the programs.