Julia has a lot of rough edges. For example, dataframe support is primitive. Type issues show up all over the place: if you extract a column from a dataframe and then combine it with other columns via [], it sometimes fails when the extracted column comes first and succeeds otherwise. Saving matrices is still really primitive, nowhere near as good as R or MATLAB; in julia you still have to remember what you put in the file and how big it is.
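The failing pattern is roughly the following (a hypothetical sketch, written in current DataFrames.jl syntax; the report above concerns the 2013-era library):

using DataFrames
df = DataFrame(a = [1.0, 2.0], b = [3.0, 4.0])
col = df.a       # extract a column as a plain vector
M = [col df.b]   # recombine via []; reportedly brittle back then,
                 # depending on whether the extracted column came first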
Nonetheless, julia is awesome and if you haven't tried it you're missing out. It's pleasant to work in, in a way that numpy + python isn't. It's a higher-performance R, one that has a hope of scaling to the sizes of data I want to use. You can almost always write in matrix notation and then drop down to loops when necessary.
A toy IRLS (iteratively reweighted least squares) implementation looks like:
# X is the n-by-p design matrix, Y the length-n vector of 0/1 labels.
# On current Julia these stdlib imports are needed (in 2013 they were in Base).
using LinearAlgebra, Printf

function irls(X, Y; max_iter = 50, tol = 1e-4)
    sigma(beta) = 1 ./ (1 .+ exp.(-X * beta))   # elementwise logistic
    beta = zeros(size(X, 2))
    iter = 0
    converged = false
    while iter < max_iter && !converged
        s = sigma(beta)
        cost = sum(Y .* log.(s) + (1 .- Y) .* log.(1 .- s))   # log likelihood
        grad = X' * (s - Y)
        H = X' * diagm(s .* (1 .- s)) * X   # exact Hessian
        d = -(H \ grad)                     # Newton direction
        # print([beta d])
        beta_new = beta + d
        delta = norm(beta_new - beta)
        beta = beta_new   # NB: exact hessian so no line search
        @printf("iter: %d; beta delta %f log like %f\n", iter, delta, cost)
        iter += 1
        if delta < tol
            converged = true
        end
    end
    return beta
end
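To try it out, here's a minimal driver with synthetic data (the sizes and true coefficients are made up for illustration):

n = 1000
X = [ones(n) randn(n, 2)]   # intercept plus two random features
beta_true = [0.5, -1.0, 2.0]
Y = Float64.(rand(n) .< 1 ./ (1 .+ exp.(-X * beta_true)))   # Bernoulli draws
beta_hat = irls(X, Y)   # should land near beta_true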
If you haven't tried it and you're an R or numpy user, you should give it a try immediately. Lots of people seem to use ipython or ipython in the browser to work efficiently with the repl and code. I prefer tmux with 2 windows: vim on the left, julia on the right, and vim-slime to move code over.
In most cases I've come across (signal processing), a trivial Python (NumPy) implementation was much faster than a trivial Julia implementation, because of the vector syntax. This, IMHO, is the Achilles heel of the Julia language.
Generally speaking, any language for numerical computing must feature vector syntax. Most numerical algorithms work on arrays (vectors, matrices, and so on) and can be easily translated into a language which supports vector notation: MATLAB, Fortran, and Python (NumPy).
Asking people, in 2013, to explicitly write loops in numerical code is a definite step backwards.
My understanding was that Julia gives you the option to write loops explicitly but doesn't stop you from using vector and matrix operations when appropriate, so I'm curious whether there are any particular operations its array implementation is missing.
I think this comment is referencing the fact that as of right now, slicing a julia array creates a copy and not a lightweight "view" of the sliced data. This makes vectorized operations on slices of arrays sometimes faster in numpy than in julia. See https://github.com/JuliaLang/julia/issues/3701.
Note that you already can make view slices, it just isn't the default. As per the issue you linked to, the default will change to views in version 0.3.
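For reference, the two behaviors look like this (on current Julia, where the function is `view`; around the time of this thread it was `sub`):

A = randn(4, 4)
c = A[:, 1]         # slicing copies: c is an independent Vector
v = view(A, :, 1)   # lightweight view: no copy, shares memory with A
v[1] = 0.0          # writes through to A
A[1, 1] == 0.0      # true; the copy c is unchanged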
This is a bizarrely misinformed comment. You can write vectorized code in Julia as much as you want and it's generally as fast as it is in Matlab or NumPy, although there are a few cases where it can be up to 2x slower; we're working on those. The difference is that you can write for loops and those will be as fast as for loops in C. Julia's vectorized code only looks bad in comparison to its for loops – because those are even faster.
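For example (a sketch; the function names are mine), the two styles are a one-liner apart:

# vectorized: concise broadcast one-liner
saxpy_vec(a, x, y) = a .* x .+ y

# devectorized: the explicit loop compiles to C-like machine code
function saxpy_loop(a, x, y)
    out = similar(x)
    for i in eachindex(x, y)
        out[i] = a * x[i] + y[i]
    end
    return out
end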
No matter which language you use, you will not beat Intel MKL, because that code is optimized by Intel specifically for each architecture, precisely because automatic optimization by compilers is not good enough. These comparisons are completely irrelevant. Incidentally, this also shows why pursuing the goal of a "fast" numerical computing language is a waste of time: the speed derives from using the appropriate hardware vendor libraries, not from the language runtime.
I'm confused: Matlab uses MKL, Armadillo can be configured to work with MKL if you have it, and Julia and NumPy support MKL too. So if everything uses the same hardware vendor libraries, I'm not sure that's where your actual speedup will come from. Also, raw MKL syntax is not particularly fun compared with Matlab/NumPy/Julia/etc.
Once your language allows you to use MKL and its equivalents, your code will use every arithmetic unit of the CPU in almost every cycle and therefore performance-wise there is no difference which language it is. So it's all down to which language offers you better library support and nicer syntax.
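In Julia, for instance, plain * on matrices already dispatches to whatever BLAS is loaded (OpenBLAS by default; MKL if you swap it in, e.g. via the MKL.jl package on current versions), and you can call the kernel explicitly if you want. A sketch:

using LinearAlgebra
A = randn(500, 500); B = randn(500, 500)
C1 = A * B                            # dispatches to dgemm in the loaded BLAS
C2 = BLAS.gemm('N', 'N', 1.0, A, B)   # the same kernel, called explicitly
C1 ≈ C2                               # true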
Hmm. If it were true that speed is independent of language once MKL is installed, wouldn't you expect there to be no observed performance differences? And yet there are large measurable ones for reasonable tasks, even with MKL.
While it's great having fast kernels, there is a huge amount of computing that can't be crammed into a matrix multiply or any other high-performance kernel. If your assertion were even remotely true, there wouldn't be so many Python, Matlab and R extensions written in C, and things like Cython [1] and Numba [2] wouldn't exist. Sometimes you just want to write a for loop or use recursion and not have it be dog slow. Being "allowed" to use iteration and recursion without a performance penalty is one of the things that Julia provides.
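The canonical example: naive doubly-recursive Fibonacci, which is painfully slow in pure Python, Matlab, or R, but compiles to native code in Julia with no extension module:

fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)   # naive recursion, no memoization
fib(30)   # 832040, computed in a few milliseconds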
As to not being able to beat MKL, it isn't even necessarily hands down the best BLAS around. For example, OpenBLAS [3] is about as good as MKL, depending on what you're doing. They each have their strengths and weaknesses:
1. OpenBLAS is faster than MKL in all the level-1 tests for small numbers of threads (1-4). The difference is larger for smaller problems. In small level-2 and level-3 instances, however, MKL does better. Specifically in the case of matrix-vector products, MKL seems to do much better.
2. On various linear algebra tests with LAPACK, MKL is faster for smaller problem sizes, whereas OpenBLAS is faster for larger problems.
3. In general, MKL seems to have better tuning for threads. OpenBLAS, on the other hand, has optimized kernels for LU and Cholesky factorizations, which is what GotoBLAS [4] – on which OpenBLAS is based – did too.
Blake Johnson, who is a regular Julia contributor, did an excellent analysis of this, complete with pretty Gadfly-generated graphs, which can be found in the discussion of this issue: https://github.com/JuliaLang/julia/issues/3965. Interestingly, I believe that Kazushige Goto, who originally created GotoBLAS, now works on MKL at Intel.
MKL is not magic. It doesn't use special secret instructions that no one else can use. The Intel engineers who work on it are not genetically-designed evil super optimization geniuses. Hand-optimizing math libraries does not require information that isn't available in Intel's optimization manuals or determinable from simple experiments. It is absolutely possible to beat MKL. There's just not much reason for most people to bother doing so, when MKL (and other competing libraries) are already available for them to use.
All of the "modern" numerical computing environments can use whatever BLAS and FFT (and ...) libraries are available on the host system, including MKL. What new languages offer isn't (usually) better execution speed (though there's still plenty that can be done with optimizing evaluation of linear algebra expressions at a high level), it's faster and more pleasant development of numerical codes.
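A small instance of that kind of high-level optimization: with the same BLAS underneath, evaluation order alone changes the asymptotic cost, which no kernel library can fix for you. A sketch:

n = 2000
A = randn(n, n); B = randn(n, n); x = randn(n)
y1 = (A * B) * x   # forms an n-by-n product first: O(n^3) work
y2 = A * (B * x)   # two matrix-vector products: O(n^2) work
y1 ≈ y2            # true, up to floating-point rounding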
MKL is only optimised for Intel architecture and intentionally pessimised for any non-Intel architecture.
Oh, sure, they say, "well, we don't know the other architecture, so we can't optimise for it," but you know this is full of lies: as if AMD were some obscure architecture and Intel couldn't possibly know whether it supports SSE or not.