HMC has better asymptotics than random-walk MCMC. If your log-density is differentiable you should almost always use it (a minimal HMC step is sketched after the list below). Riemannian MC takes curvature into account as well, but you need to start storing Hessians, which isn't practical in high dimensions. Two missing pieces would be:
- An SGD equivalent for sampling "tall" data (many observations). The generalized Poisson sampler can do that, but the variance is poor; see the SGLD sketch after this list for a related attempt.
- An L-BFGS-like version of Riemannian MC, to get rid of the explicit Hessians.
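For reference, here's a minimal sketch of a single plain HMC step with a leapfrog integrator. `logp` and `grad_logp` are hypothetical user-supplied callables, and the step size `eps` and step count `n_leapfrog` would need real tuning in practice:

```python
import numpy as np

def hmc_step(x, logp, grad_logp, eps=0.1, n_leapfrog=20, rng=np.random):
    """One HMC step: resample momentum, integrate, accept/reject."""
    p = rng.standard_normal(x.shape)            # fresh Gaussian momentum
    x_new, p_new = x.copy(), p.copy()
    # Leapfrog integration of the Hamiltonian dynamics
    p_new += 0.5 * eps * grad_logp(x_new)
    for _ in range(n_leapfrog - 1):
        x_new += eps * p_new
        p_new += eps * grad_logp(x_new)
    x_new += eps * p_new
    p_new += 0.5 * eps * grad_logp(x_new)
    # Metropolis correction on the joint (position, momentum) energy
    h_old = -logp(x) + 0.5 * (p @ p)
    h_new = -logp(x_new) + 0.5 * (p_new @ p_new)
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new
    return x
```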
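On the first point: stochastic gradient Langevin dynamics (SGLD) is probably the best-known SGD-like sampler, though it trades exactness for minibatching rather than fixing the variance problem. A minimal sketch, assuming a NumPy array `data` and hypothetical `grad_log_prior` / `grad_log_lik` callables:

```python
import numpy as np

def sgld_epoch(theta, data, grad_log_prior, grad_log_lik,
               eps=1e-4, batch_size=256, rng=np.random):
    """One epoch of SGLD over minibatches of the data."""
    n = len(data)
    idx = rng.permutation(n)
    for start in range(0, n, batch_size):
        batch = data[idx[start:start + batch_size]]
        # Rescale the minibatch gradient by n / |batch| so it is an
        # unbiased estimate of the full-data likelihood gradient.
        scale = n / len(batch)
        grad = grad_log_prior(theta) + scale * grad_log_lik(theta, batch)
        # Injected Gaussian noise with variance eps makes the dynamics
        # target the posterior (exactly only in the eps -> 0 limit).
        noise = np.sqrt(eps) * rng.standard_normal(theta.shape)
        theta = theta + 0.5 * eps * grad + noise
    return theta
```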
Parallel tempering should also be exploited more.
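For concreteness, a minimal sketch of the parallel tempering swap move, assuming each chain at inverse temperature `betas[i]` has already been advanced by its own local sampler and `energies[i]` holds the negative log-density of its current state:

```python
import numpy as np

def tempering_swap(states, energies, betas, rng=np.random):
    """Propose swaps between adjacent temperature levels.

    A swap between levels i and i+1 is accepted with probability
    min(1, exp((beta_i - beta_{i+1}) * (E_i - E_{i+1}))), which
    preserves the joint stationary distribution of all the chains.
    """
    for i in range(len(betas) - 1):
        log_alpha = (betas[i] - betas[i + 1]) * (energies[i] - energies[i + 1])
        if np.log(rng.uniform()) < log_alpha:
            states[i], states[i + 1] = states[i + 1], states[i]
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
    return states, energies
```

The hot chains mix across modes cheaply, and the swaps let those moves propagate down to the cold chain that actually targets the distribution of interest.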
That said, at the end of the day, the ideal sampler would be able to reason about the distribution as a program, not just as a mathematical black box. It should intelligently build tractable approximations and use those to bootstrap exact sampling schemes.
I think we're going to see a wealth of better samplers come out in the next decade, following the path combinatorial optimization took: exploiting the structure of the underlying programs instead of discarding it.