murbard2 and glial have pointed out some caveats. Another thing that holds Monte Carlo back at times is that generating the required samples is not always easy. A common route is to use a Markov chain sampler, hence MCMC, but that's far from a solved problem. It's very hard to reason about, or to test, whether the Markov chain mixes fast, and if it doesn't, it can take exponential time to generate samples from its stationary distribution. Coming up with schemes better than a naïve Markov chain is an intellectual industry in itself.
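To make that concrete, here's a minimal random-walk Metropolis sampler in Python; the bimodal target and the step size are illustrative choices, not anything from the discussion above. The mixing problem shows up immediately in a sampler like this: push the two modes far apart and the chain can sit in one of them for a very long time before it ever crosses over.

```python
import numpy as np

def log_target(x):
    # Illustrative target: an unnormalized 1-D mixture of two Gaussians.
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def random_walk_metropolis(log_p, x0, n_steps, step=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x, lp = x0, log_p(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.standard_normal()   # Gaussian proposal
        lp_prop = log_p(prop)
        # Accept with probability min(1, p(prop) / p(x)).
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

draws = random_walk_metropolis(log_target, x0=0.0, n_steps=50_000)
```

How many of those 50,000 draws are actually informative depends entirely on the mixing time, which is exactly the quantity that's so hard to reason about in advance.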
HMC has better asymptotics than random-walk MCMC. If your probability surface is differentiable you should always use it (a bare-bones sketch follows the list below). Riemannian MC takes curvature into account, but you need to start storing Hessians, which isn't practical. Two missing pieces would be:
- An SGD equivalent for sampling "tall" data. The generalized Poisson sampler can do that, but the variance is crappy.
- An L-BFGS-like version of Riemannian MC, to get rid of the Hessians.
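For the HMC point above, here's a sketch of a single HMC transition, assuming you can supply the log-density and its gradient for a 1-D numpy state, and using the standard Gaussian momentum with a leapfrog integrator; production samplers (Stan, PyMC) layer automatic tuning of the step size and trajectory length on top of this.

```python
import numpy as np

def hmc_step(log_p, grad_log_p, x, step=0.1, n_leapfrog=20, rng=None):
    """One Hamiltonian Monte Carlo transition: resample a Gaussian
    momentum, simulate the dynamics with leapfrog, then apply a
    Metropolis correction for the discretization error."""
    rng = np.random.default_rng() if rng is None else rng
    p = rng.standard_normal(x.shape)            # fresh momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * step * grad_log_p(x_new)     # initial half step
    for _ in range(n_leapfrog):
        x_new += step * p_new                   # full position step
        p_new += step * grad_log_p(x_new)       # full momentum step
    p_new -= 0.5 * step * grad_log_p(x_new)     # trim to a half step
    # Accept/reject on the joint (position, momentum) energy.
    h_old = -log_p(x) + 0.5 * (p @ p)
    h_new = -log_p(x_new) + 0.5 * (p_new @ p_new)
    return x_new if np.log(rng.random()) < h_old - h_new else x
```

The gradient steers proposals along the surface, which is why HMC makes long, rarely-rejected moves where random-walk Metropolis shuffles locally; the Riemannian variant additionally preconditions the momentum with local curvature, which is where the Hessians come in.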
Parallel tempering should also be exploited more.
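A sketch of what exploiting parallel tempering looks like: several replicas run against tempered versions of the target at different inverse temperatures, and neighbouring replicas occasionally swap states under a detailed-balance-preserving acceptance rule. The `mcmc_step` kernel here is a hypothetical stand-in for any single-chain update, e.g. the Metropolis or HMC moves sketched above.

```python
import numpy as np

def tempering_sweep(log_p, states, betas, mcmc_step, rng=None):
    """One parallel-tempering sweep. Hot chains (small beta) see a
    flattened target and explore freely; swaps let their discoveries
    percolate down to the cold chain (beta = 1), whose samples are
    the ones you keep."""
    rng = np.random.default_rng() if rng is None else rng
    # Within-replica moves against the tempered target beta * log_p.
    states = [mcmc_step(lambda x, b=b: b * log_p(x), x, rng)
              for x, b in zip(states, betas)]
    # Neighbour swaps, accepted with the detailed-balance ratio.
    for i in range(len(states) - 1):
        log_ratio = (betas[i] - betas[i + 1]) * (
            log_p(states[i + 1]) - log_p(states[i]))
        if np.log(rng.random()) < log_ratio:
            states[i], states[i + 1] = states[i + 1], states[i]
    return states
```

This directly attacks the multimodal slow-mixing failure: a cold chain that would never cross between distant modes on its own can inherit mode jumps from its hotter neighbours.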
That said, at the end of the day the ideal sampler would be able to reason about the distribution as a program, not just as a mathematical black box. It should build tractable approximations intelligently and use those to bootstrap exact sampling schemes.
I think we're going to see a wealth of better samplers come out in the next decade, following the path combinatorial optimization took towards preserving the structure of the programs.