Aren't many bash programs written in C? So this is implying Julia is somehow fas...

socialdemocrat · on Feb 15, 2022

I was talking about bash scripts. But outperforming C with Julia is perfectly possible. Julia's JIT compilation means it can remove the overhead of many function calls, which C cannot do. A simple example is a sort that takes a function pointer to do the object comparison.
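To make the sort example concrete, here is a minimal C sketch of the two situations. The function names (`sort_generic`, `sort_specialized`, `is_sorted`) are hypothetical and only for illustration. The first version goes through `qsort`, which calls the comparator via a function pointer on every comparison; the second is written by hand for `double`, so the comparison is a plain `<` that the compiler can inline. The insertion sort is chosen only for brevity and is not a claim about which algorithm any library uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator handed to qsort through a function pointer. */
static int cmp_double(const void *pa, const void *pb)
{
    double a = *(const double *)pa;
    double b = *(const double *)pb;
    return (a > b) - (a < b);
}

/* Generic path: qsort calls cmp_double indirectly for each comparison. */
void sort_generic(double *a, size_t n)
{
    qsort(a, n, sizeof a[0], cmp_double);
}

/* Specialized path: written for double only, so the comparison is a
   plain `<` visible to the compiler (insertion sort, for brevity). */
void sort_specialized(double *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        double key = a[i];
        size_t j = i;
        while (j > 0 && key < a[j - 1]) {
            a[j] = a[j - 1];   /* shift larger elements right */
            j--;
        }
        a[j] = key;
    }
}

/* Helper: returns 1 if a[0..n) is in non-decreasing order, else 0. */
int is_sorted(const double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i] < a[i - 1])
            return 0;
    return 1;
}
```

Both functions give the same ordering; the difference is only in whether the comparison is reached through a pointer. As I understand it, Julia compiles a specialized version of the sort for each comparison function it is given, which is roughly like getting the second version without writing it by hand.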

High level functional style code with things like map and filter can frequently be JIT compiled to optimal machine code.
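As a rough picture of what "optimal machine code" means here, this is the kind of single fused loop that a filter-then-map-then-sum pipeline would ideally compile down to, written out by hand in C. The function `sum_even_squares` is a made-up example, not something from any library, and whether a given compiler actually produces this for a given pipeline is not guaranteed.

```c
#include <stddef.h>

/* Hypothetical example: sum of the squares of the even values in xs.
   Conceptually filter(even) -> map(square) -> sum, but done in one
   pass with no intermediate arrays. */
long sum_even_squares(const int *xs, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (xs[i] % 2 == 0) {      /* filter step */
            long v = xs[i];
            total += v * v;        /* map step fused with the reduction */
        }
    }
    return total;
}
```

The point is that the high-level version does not have to allocate a filtered array and then a mapped array; everything can happen in one pass.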

Fortran is considered faster than C for numerical code, and well-polished Fortran libraries like BLAS are already being outperformed by Julia.

For typical systems programming that needs tight control of memory, and for real-time systems, C will still have the edge. But for anything crunching lots of numbers, like data analysis or machine learning, Julia will likely outperform everybody else.

KolenCh · on Feb 15, 2022

Any reference for how Julia outperforms Fortran in BLAS? I thought it is/was calling existing BLAS libraries?

adgjlsfhk1 · on Feb 15, 2022

Julia currently ships with a Fortran-based BLAS, but Octavian.jl is a pure-Julia matmul that is faster. (It's nowhere near finished, though.)

cbkeller · on Feb 15, 2022

And the "how" behind Octavian.jl is basically LoopVectorization.jl [1], which helps make optimal use of your CPU's SIMD instructions.

Currently there can be some nontrivial compilation latency with this approach, but since LV (LoopVectorization.jl) ultimately emits custom LLVM, it's actually perfectly compatible with StaticCompiler.jl [2] following Mason's rewrite, so stay tuned on that front.

[1] https://github.com/JuliaSIMD/LoopVectorization.jl

[2] https://github.com/tshort/StaticCompiler.jl

KolenCh · on Feb 15, 2022

Thanks. But how is LoopVectorization.jl helping here, say compared to C/Fortran optimized for the CPU? Is this mentioned somewhere in their docs?

adgjlsfhk1 · on Feb 15, 2022

The basic answer is that LLVM doesn't do as good a job with some types of vectorization because it is working on a lower-level representation. There are several causes of this. One is that LoopVectorization has permission to replace elementary functions with hand-written vectorized equivalents; another is that LoopVectorization does a better job of using gather/scatter instructions.
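For anyone unfamiliar with the gather/scatter terminology, these are the loop shapes in question, sketched in C. The function names `gather_add` and `scatter_store` are hypothetical. Whether a compiler turns these into hardware gather/scatter instructions depends on the target CPU and flags, so treat this as an illustration of the access pattern only.

```c
#include <stddef.h>

/* Gather: each iteration loads from a data-dependent address x[idx[i]],
   so the loads are not contiguous in memory. */
void gather_add(double *y, const double *x, const size_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += x[idx[i]];
}

/* Scatter: each iteration stores to a data-dependent address out[idx[i]].
   Assumes every idx[i] is a valid index into out. */
void scatter_store(double *out, const double *src, const size_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[idx[i]] = src[i];
}
```

A plain contiguous loop like `y[i] += x[i]` is the easy case for an auto-vectorizer; the indexed versions above are the ones where the choice of instructions matters more.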

KolenCh · on Feb 15, 2022

Thanks! Very interesting.

Link here for others: https://octavian.julialinearalgebra.org/stable/