RNG quality is probably the least of your worries.

The curse of dimensionality means you often need a huge number of samples to guarantee a useful level of precision. Worse, many quantities you'd like to know come up inside Monte Carlo simulations themselves: if each of N outer samples needs its own N-sample inner simulation to estimate that quantity, the cost is O(N^2), which quickly becomes intractable.

Here's an example that I encountered at work yesterday:

You want to know P(x>=a and y>=b) where (x,y) is a standard bivariate normal with correlation R.

You can:

1. Simulate a bunch of (x,y) and take the mean of the indicator function 1[x>=a, y>=b]. This will take whole seconds to compute.

2. Use an adaptive quadrature method like scipy.integrate.nquad. This will take dozens to hundreds of milliseconds.

3. Write a fixed-grid Gauss-Legendre quadrature routine in C. This will take less than a microsecond. (All three options are sketched in Python below.)
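
For concreteness, here's a minimal sketch of all three options in Python; the C routine is stood in for by a NumPy fixed-grid version, and a, b, R, the sample count, node count, and tail cutoff are all illustrative assumptions:

    import numpy as np
    from scipy import integrate
    from scipy.stats import multivariate_normal

    a, b, R = 0.5, 1.0, 0.3   # illustrative thresholds and correlation
    density = multivariate_normal([0.0, 0.0], [[1.0, R], [R, 1.0]]).pdf

    # Option 1: Monte Carlo -- mean of the indicator 1[x >= a, y >= b].
    rng = np.random.default_rng(0)
    xy = rng.multivariate_normal([0.0, 0.0], [[1.0, R], [R, 1.0]],
                                 size=1_000_000)
    p_mc = np.mean((xy[:, 0] >= a) & (xy[:, 1] >= b))

    # Option 2: adaptive quadrature over the bivariate normal density.
    p_nquad, _ = integrate.nquad(lambda x, y: density([x, y]),
                                 [[a, np.inf], [b, np.inf]])

    # Option 3 (NumPy stand-in for the C routine): fixed-grid Gauss-Legendre
    # quadrature on a truncated box [a, cutoff] x [b, cutoff].
    n, cutoff = 32, 8.0
    nodes, weights = np.polynomial.legendre.leggauss(n)
    x = 0.5 * (cutoff - a) * (nodes + 1) + a   # map [-1, 1] onto [a, cutoff]
    y = 0.5 * (cutoff - b) * (nodes + 1) + b
    wx = 0.5 * (cutoff - a) * weights
    wy = 0.5 * (cutoff - b) * weights
    X, Y = np.meshgrid(x, y, indexing="ij")
    p_gl = wx @ density(np.dstack([X, Y])) @ wy

    print(p_mc, p_nquad, p_gl)   # all three should roughly agree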

I needed this value inside a Monte Carlo simulation. If I had to simulate the solution, the problem would be intractable.




Wouldn’t it be even faster to solve that problem conditionally? P(x >= a)P(y >= b | x >= a), since the conditional distribution is known and normal?


You still need to integrate over x. y|x=c is Gaussian for each fixed c, but there are infinitely many such c's in [a, inf): P(x >= a, y >= b) is the integral of phi(c) * P(y >= b | x = c) over that interval.

If you're curious:

https://www.ams.org/journals/mcom/1978-32-141/S0025-5718-197...
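
In case it's useful, a minimal sketch of that reduction (SciPy's norm assumed; a, b, R, the node count, and the tail cutoff are illustrative): since y|x=c is N(R*c, 1 - R^2), the whole thing collapses to one 1-D integral that a fixed Gauss-Legendre grid handles well.

    import numpy as np
    from scipy.stats import norm

    a, b, R = 0.5, 1.0, 0.3
    n, cutoff = 64, 8.0   # truncate [a, inf) at an assumed cutoff

    nodes, weights = np.polynomial.legendre.leggauss(n)
    c = 0.5 * (cutoff - a) * (nodes + 1) + a   # map [-1, 1] onto [a, cutoff]
    w = 0.5 * (cutoff - a) * weights

    # P(x >= a, y >= b) = integral over c in [a, inf) of
    #   phi(c) * P(y >= b | x = c),
    # where y | x = c is N(R*c, 1 - R^2).
    s = np.sqrt(1.0 - R * R)
    p = np.sum(w * norm.pdf(c) * norm.sf((b - R * c) / s))
    print(p)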


Ah, duh. Not enough coffee.



