Exploiting Vector Instructions with Generalized Stream Fusion [pdf]

harpocrates · on Dec 23, 2017

This is a cool (if not particularly new) paper. Interestingly enough, the [vector library][0] mentioned in the paper isn't just a research project - it is one of the most widely used Haskell libraries for arrays.

That said, I am saddened by the state of vectorization in GHC. AFAICT it seems all but abandoned [1]. I think there is some hope on this front in the recent progress that has been made on the LLVM backend (producing good input code for LLVM, and choosing the most fruitful LLVM optimization passes) [2].

  [0]: https://hackage.haskell.org/package/vector
  [1]: https://www.reddit.com/r/haskell/comments/51gvxl/whatever_happened_to_automatic_vectorization_in/
  [2]: https://ghc.haskell.org/trac/ghc/wiki/ImprovedLLVMBackend

nh2 · on Dec 22, 2017

I don't get it, isn't this from 2013?

It also says 2013 in the paper. And papers from 2013 cite this paper according to Google Scholar. And I'm relatively sure I read this paper years ago.

im3w1l · on Dec 22, 2017

> Benchmarks show that high-level Haskell code written using our compiler and libraries can produce code that is faster than both compiler- and hand-vectorized C.

Big, if it holds up.

Veedrac · on Dec 22, 2017

We're talking <10% difference on one particular toy example exclusively on arrays at least ¼MiB large.

Given how similar the code produced is from both compilers (literally just movapd-mulpd-addpd), the only relevant difference seems to be usage of prefetching, which C compilers don't do much because the common wisdom is that it hurts in most scenarios.
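To make the prefetching point concrete, here is a rough sketch in C, assuming the toy example is a multiply-and-accumulate loop such as a dot product (a guess based on the mulpd/addpd pair). The function names and `PREFETCH_DISTANCE` are made up for illustration and the distance is not tuned; `__builtin_prefetch` is the GCC/Clang builtin. This is not the code from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch.
   The value is an assumption for illustration, not a measured optimum. */
#define PREFETCH_DISTANCE 64

/* Plain dot product. A vectorizing compiler can turn this loop into packed
   load / multiply / add instructions on x86, typically only with flags that
   permit reassociating the floating-point sum. */
double dot_plain(const double *a, const double *b, size_t n) {
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Same arithmetic, plus explicit software prefetch hints for data that
   will be read PREFETCH_DISTANCE elements later. */
double dot_prefetch(const double *a, const double *b, size_t n) {
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0 = prefetch for read; third argument 0 =
               low temporal locality (each element is used only once). */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 0);
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE], 0, 0);
        }
        sum += a[i] * b[i];
    }
    return sum;
}
```

Both functions return the same result; the only difference is the hint to the memory system. Whether the hint helps or hurts depends on the array size and the hardware prefetcher, which is why a small gap that only shows up on large arrays is plausible.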