In summary, the Rust code in the direct source translation is about 3.5–4× faste...

kibwen · on June 10, 2014

In addition, "I made no effort to remove/reduce/streamline allocations", which is a bit of a cruel tease. I want to know how much faster it could still be! :)

dbaupp · on June 10, 2014

I just ran perf on it: the slowest allocation function takes 0.04% CPU time in total, meaning there's not much time to gain from just removing the allocations directly. There may still be a benefit from better data locality from fewer allocations.

kibwen · on June 10, 2014

In the future I'd recommend perf for benchmarking as well. `perf stat -r 3 ./foo` will do the repeated runs for you, and give you output like "1.002251432 seconds time elapsed ( +- 0.025% )", where that latter number appears to be the coefficient of variation.
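
For anyone unfamiliar with the term, here is a minimal sketch of how a coefficient of variation could be computed from a handful of run times. It assumes the usual definition (sample standard deviation divided by the mean, as a percentage). Whether perf computes its "+-" figure exactly this way is not confirmed here, and the function names are made up for illustration.

```rust
/// Arithmetic mean of the samples; `None` for an empty slice.
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Sample standard deviation (n - 1 denominator); needs at least two samples.
fn sample_std_dev(samples: &[f64]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let m = mean(samples)?;
    let sum_sq: f64 = samples.iter().map(|x| (x - m).powi(2)).sum();
    Some((sum_sq / (samples.len() - 1) as f64).sqrt())
}

/// Coefficient of variation as a percentage: 100 * std_dev / mean.
/// Returns `None` when there are fewer than two samples or the mean is zero.
fn coefficient_of_variation_pct(samples: &[f64]) -> Option<f64> {
    let m = mean(samples)?;
    if m == 0.0 {
        return None;
    }
    Some(100.0 * sample_std_dev(samples)? / m)
}
```

Feeding it the elapsed times of a few repeated runs (in seconds) gives a rough idea of how noisy the benchmark is; a small percentage means the runs agree closely.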

dbaupp · on June 10, 2014

Oh that's nice, thanks.