How to Make Python Run as Fast as Julia

jordigh · on Dec 16, 2015

I don't think this is accurately representing Julia's aims. Of course the Julia team wrote the Python code in a way that makes it run slowly. But it looks like perfectly natural Python code! It's almost a literal translation of the Julia code. Julia's benchmark is even far worse with Octave, which they almost deliberately wrote in the worst way possible for Octave, with lots of loops and recursion.

We have written some documentation for Octave in order to guide people towards writing faster Octave code:

https://www.gnu.org/software/octave/doc/interpreter/Vectoriz...

But look at how much we have to explain, and look at all the hoops we have to jump through in Python and Octave in order to write faster code. Matlab used to have similar guides telling people don't write this, write that instead.

What Matlab eventually did was look at what people were writing and making that fast. Julia did the same and made it even faster.

This is a lesson from C and C++ compilers that seems to be taking a long time to trickle to other programming languages: as long as you're writing reasonable code, speeding up your code is your compiler's job, not yours. Your compiler usually knows better than you how to unroll loops, how to cache results, how to use multiple cores, how to elide unnecessary intermediate results, how to completely remove dead code. You should focus on writing easy to understand, maintainable, high-level code. That's why you're using a programming language and not machine code.

jacobolus · on Dec 16, 2015

> What Matlab eventually did was look at what people were writing and making that fast.

Except in practice, to write fast Matlab code you need very deep understanding of Matlab’s internals and years of experience. Seemingly trivial patterns end up slowing your code down by multiple orders of magnitude, and there’s more “guess and check” involved when trying to write code that can be effectively JIT compiled by Matlab than careful reasoning. Even worse, the JIT and the profiler don’t get along, so it’s often impossible to get any insight into the reasons for JIT-related performance differences.

In general, Matlab is an extremely unpleasant and frustrating environment compared to almost any other language I’ve worked with. The only thing Matlab has on Python/Numpy is a nice quantity of publicly available code for various technical functions. Most of this code is hacky academic prototype stuff, but that’s much better than nothing if you’re trying to follow someone’s algorithm written up in a paper.

The Matlab GUI and tooling is a buggy and unpolished Java turd from the 90s which fits in poorly with any modern operating system.

dagw · on Dec 16, 2015

The Matlab GUI and tooling is a buggy and unpolished Java turd from the 90s which fits in poorly with any modern operating system.

And yet every time I try to get Matlab users to use something else (Python or Julia) the one thing they almost immediately complain about is the lack of a GUI IDE as good as the Matlab one.

dr_zoidberg · on Dec 17, 2015

> And yet every time I try to get Matlab users to use something else (Python or Julia) the one thing they almost immediately complain about is the lack of a GUI IDE as good as the Matlab one.

Pycharm[0] has become my prefered Python IDE about 2 years ago (when they went free commmunity/pay for pro) and it's a way better environment than MATLAB. It now has "Scientific tools" too[1], and I understand some good features, though I haven't used them yet.

The only good thing I still recognize of MATLAB is that most toolboxes work together without (too much?) hacking. Working with Python/NumPy you eventually find stuff that you need to adjust to work with that other library you just found implements that thing you absolutely need.

[0] https://www.jetbrains.com/pycharm/

[1] https://www.jetbrains.com/pycharm/features/scientific_tools....

skierscott · on Dec 16, 2015

> And yet every time I try to get Matlab users to use something else (Python or Julia) the one thing they almost immediately complain about is the lack of a GUI IDE as good as the Matlab one.

Spyder offers a similar environment. It has a variable explorer, script editor, console, etc all in one window (which I think is what matlab users are looking for).

MagnumOpus · on Dec 16, 2015

It's not just that - it is also the extremely extensive and comprehensive documentation of every function (integrated in the GUI), and the powerful profiler (integrated in the GUI), and the charting (integrated in the GUI)...

pkofod · on Dec 16, 2015

Well, they're probably not Linux (Ubuntu only problems perhaps?) users then. Matlab crashes again and again, locking up my entire computer, so I have to hard reset it. It happens so frequently that I simply had to install Windows on my computer, and boot up in that when I need to work on a project where my collaborators voted for Matlab.

jarvic · on Dec 16, 2015

I run MATLAB on Ubuntu 14.04 every day and have never had any of the issues you're describing. Are you sure there isn't something else going on there?

pkofod · on Dec 16, 2015

It must be then. I had a student doing a term paper with the exact same problem. Often plotting related, but not necessarily. Havn't been able to find a solution myself, or online.

x0x0 · on Dec 16, 2015

And that's the pitch for julia. Oh, and mostly fast enough to write in julia the stuff that, for python, you'd write as C extensions, crucially enabling lots of optimizations that are terribly hard to do for python. And avoiding the whole pypy / C api plugin problem.

It's still pretty immature, but promising.

That said, matlab's linear algebra syntax is still better than python's.

chm · on Dec 16, 2015

Your first quip also applies to Mathematica. It's very easy to write quite inefficient code in Mathematica. One classic example of this is a newcomer not using N[] and evaluating every expression symbolically. Seems trivial, but it really isn't for someone new. I still could get speedups on linalg code two years into grad school.

TheOtherHobbes · on Dec 16, 2015

It applies to most languages.

In practice you can't rely on the compiler, because you don't know what the compiler is doing.

You certainly can't assume it's going to make the smartest possible decision for all of your code. There's no standardisation for optimisations, and they're often a context-dependent trade-off anyway.

The only way to write fast code is to learn the quirks of the tool chain and profile production binaries to find the bottlenecks.

gh02t · on Dec 16, 2015

Mathematica also has the double disadvantage that it's syntax and semantics are pretty different from what most people are used to. I find it quite pleasant as it's quite consistent, but it's not the most straightforward thing to learn coming from e.g. [naive] MATLAB. When I show people how to use it, the first thing I warn them is "no For[] loops and only use Table[] sparingly."

reikonomusha · on Dec 16, 2015

This is a really good comment and covers what a lot of people seem to miss in benchmarks. The fact I can install 2 different compilers (with varying degrees of support) and some hyperoptimized linear algebra libraries don't really tell me about how fast the system is in general when I need performance.

andrepd · on Dec 16, 2015

Exactly! The point is to compare standard, idiomatic and straightforward Julia code with standard, idiomatic and straightforward Python code, not with a jumble of workarounds, experimental JITs, and cumbersome syntax. Because if you are writing statically typed python, then you are not really writing python anymore, are you?

JupiterMoon · on Dec 16, 2015

Looping over lists/arrays is not idiomatic Python.

tavert · on Dec 16, 2015

The point of that benchmark is looping over lists or arrays. The benchmark verifies the well-known fact that straight Python is a lousy choice for those situations when you absolutely need to do this. Dropping into C extensions or some-of-the-language Numba would be a different comparison, and show a different result.

JupiterMoon · on Dec 16, 2015

Did they at least attempt list comprehensions rather than loops?

EDIT Also in the scientific fields I've encountered not using the various libraries like numpy is non-idomatic. I personally think that python should absorb numpy into the standard library but effectively people I know treat it as if it were already there.

You can argue that numpy etc are implemented in C. But is the Julia interpretor not implemented in C?

tavert · on Dec 16, 2015

Don't know whether list comprehensions were tried. Refactoring suggestions could be considered, if they still capture the main point of the benchmark.

Most of the Julia standard library is implemented in Julia.

JupiterMoon · on Dec 16, 2015

There is no point benchmarking badly written language A vs well written language B. I actually find the Julia benchmarks misleading and that kind of thing leaves a bad taste.

Arguments about liking to loop over arrays to operate on them are a matter of taste. I like whole array operations, it's just what I learned first and best (Fortran 9x+ and numpy). Someone else may prefer to write loops (maybe they started doing numerical work with F77 or C). There is no universal correct way to express things -- but there is sometimes a language specific best way and to benchmark ignoring this is not good.

grayclhn · on Dec 16, 2015

An astonishing amount of Julia is written in Julia.

JupiterMoon · on Dec 16, 2015

Are they throwing away 40-50 years of work on things like LAPACK/BLAS? Or do they still build on the basic foundations and implement higher level things in Julia?

grayclhn · on Dec 16, 2015

No --- I thought about adding that caveat but didn't. Most of the non-Julia code that I'm aware of is scientific libraries: linear algebra (LAPACK/BLAS), random number generators, etc. But lots of very basic foundations and data types are straight Julia --- array addition (as a very trivial example) is implemented by looping over the elements of the arrays in Julia, not by passing the loop to a C library.[1]

Reimplementing LAPACK or BLAS would be a lot of unnecessary work, but I think that one goal of the language is to be fast enough that a reimplementation would not be worse than the existing versions. (I'm not affiliated w/ the project, though, so I'm just guessing.)

[1]: https://github.com/JuliaLang/julia/blob/master/base/arraymat...

JupiterMoon · on Dec 17, 2015

Interesting. Sensible. How possible/easy is it to use Julia libraries from e.g. python or R or C++?

tavert · on Dec 17, 2015

There's an embedding API that's not too bad to work with. There are pyjulia and rjulia projects, but they're not quite as actively developed as the other directions of PyCall.jl or RCall.jl. Making standalone libraries out of Julia code isn't easy to do yet, but should eventually be better-supported.

co_dh · on Dec 16, 2015

andrepd · on Dec 19, 2015

You can use the up vote for that.

jfpuget · on Dec 16, 2015

In light of your comment, it is interesting to see that Swift team just agreed to remove C style for loop.

https://lists.swift.org/pipermail/swift-evolution-announce/2...

thebooktocome · on Dec 16, 2015

What lessons have you learned from developing Octave that you think Julia could benefit from?

jordigh · on Dec 16, 2015

That's a bit of a tough one. Julia is much newer software than Octave, and it also has a lot of knowledge already of what does and doesn't work. They're already well-staffed by programming language experts. I have learned a lot about how the Matlab language works, and also a lot about how Octave is put together.

I don't have much to offer to Julia on technical terms. Octave's codebase started as pre-standard C++, and it's been a long and arduous job to very slowly modernise it. Julia has the advantage of starting modern.

I think the biggest lesson I've learned is, don't underestimate Matlab. It may be old, it may be an ugly language, but it also is gigantic, very well-funded, and everywhere. Julia and Octave are approaching Matlab from different angles, Julia by providing a better language and Octave by providing a free drop-in replacement, but ultimately we are both trying to replace Matlab.

Matlab users greatly value having a complete package with an editor, debugger, visualiser, profiler, and system modeller all in one. They don't generally see Simulink or the symbolic toolbox as being separate components, or indeed, as the completely separate programs that they really are. For the devout Matlab users, these conveniences greatly outweigh a better programming language. Going further, many Matlab users don't even consider their activity "programming" at all, and in recent years Matlab even writes the code for you if you just push a few buttons.

We're seeing some academics slowly switching over to Julia. This is great. The Mathworks knows very well that captivating young, malleable minds in university is how you get loyal paying customers, namely, the future bosses of these newly-molded minds. Julia is doing a good job appealing to profs. I am optimistic that within five years or so, new generations of Julia-trained graduates will slowly spread the love of the new language.

thebooktocome · on Dec 16, 2015

> Matlab users greatly value having a complete package with an editor, debugger, visualiser, profiler, and system modeller all in one.

That's a very good point. Julia's debugger and profiler are both horrifyingly bad.

ViralBShah · on Dec 16, 2015

The debugger is one of Julia's top priorities, and here's a debugger roadmap that Keno recently posted.

https://groups.google.com/forum/#!topic/julia-dev/-LTsBVRv1d...

jordigh · on Dec 16, 2015

After many years, we've finally had something that very roughly approaches that in the Octave GUI. I've heard lots of good things about our GUI, but I know we're still very far from the polish of Matlab in this department.

pkofod · on Dec 16, 2015

Do you really think profiling is so horrifyingly bad in Julia with ProfileView? Imho, I cannot see what Matlab offers on top of that, but that may just be me.

thebooktocome · on Dec 16, 2015

I think ProfileView is great at displaying flame graphs, but there's more to profiling than can be interpreted from a single graph. For example, it doesn't separate execution time from number of executions.

_Codemonkeyism · on Dec 16, 2015

If you don't understand other people, wonder then dig deeper. Don't make it an argument against those people.

pkofod · on Dec 16, 2015

A: "X is stupid, Y is much better!" B: "Are you sure X is stupid? How is Y better?" C: "Don't argue, B! Dig deeper."

You figure out who's who.

I was merely trying to understand why thebooktocome thought the profiler was so bad.

_Codemonkeyism · on Dec 16, 2015

You're obviously the clever one.

pkofod · on Dec 16, 2015

I meant no harm, and as you can see, thebooktocome actually understood my comment, and has given me a useful answer.

Have a nice day.

_Codemonkeyism · on Dec 16, 2015

Sorry then it was my bad English, as I'm no native speaker, as "Do you really think profiling is so ..." I did not understand as a honest inquiry, mainly because of your usage of "really" and your unconventional - I might say defensive - use of "Imho" in "Imho, I cannot see what Matlab ..." where I understood "Imho" as "In my humble opinion" expanded to "In my humble opinion, I cannot see what ...".

Also your words with the easy usage of "stupid" led me - wrongly! - into the direction that you might not be interested in understanding why users prefer Matlab, but more in expressing your opinion about the topic.

And to be frank, again perhaps due to my weak understanding of English, your usage of "You figure out who's who." and "Have a nice day" to me do not sound as the intention of trying to understand things but more on making clever remarks. But as I've pointed out, this might be due to the fact that English is not my native tongue.

sgt101 · on Dec 16, 2015

errm - if someone makes a point respond to the point rather than attacking the fact that they made it? What are the features that the Julia profiler should have? If you can say then I think that everyone will have made some progress.

dbcooper · on Dec 16, 2015

Does Matlab have something like iPython notebooks built in? That is something that would be really nice for academic work.

jawilson2 · on Dec 16, 2015

Matlab scripts have a cell-mode, where you can execute modules of code. Cells are delimited by %%, and you can run a cell by pressing ctrl-enter, run and progress, etc.

In my experience (15 years), Matlab has been the best for exploratory data analysis. If you change a script, you don't need to worry about re-importing issues like in python; you can even change code DURING a debugging session, EVEN with GUIs. I have been using python+numpy+pandas for a few years now, and python just feels like there is an impedance there that I don't get from Matlab, primarily with graphics/matplotlib. But, deploying and sharing python is easier. I admit it could just be my experience with Matlab that biases me. I usually find that people bitching about Matlab don't like the language, and that is usually because they are viewing Matlab AS the language (like python), and not as a data exploration and analysis environment.

sundarurfriend · on Dec 16, 2015

There's an IDE kind of environment as x0x0 mentions, but I think you're referring to the ability to mix code and text and figures all in one document? ~~MATLAB doesn't have that to my knowledge, at least not yet.~~ Strike that, there's MuPad, like the other comment mentions, though it is indeed rarely used in my experience.

Relevant to the topic at hand, Julia does have that functionality already, and it in fact (afaik) uses the same engine that's behind iPython, the Jupyter environment. There's even a web-based community version for people to try stuff out: https://www.juliabox.org/

x0x0 · on Dec 16, 2015

I'm not sure what you mean by that exactly, but matlab has a very usable gui / visual editor for mac and windows (and almost certainly linux, but I'm not sure).

If you mean a packaged script more like a mathematica notebook, I don't believe so.

tavert · on Dec 16, 2015

They've had the mupad notebook (http://www.mathworks.com/help/symbolic/mupad.html) ever since they replaced the backend of symbolic math toolbox from maple to mupad, around 2008 or 2009. But I don't think it's too popular with Matlab users, there's a certain simplicity to just having a plot come up in a new standalone window that can actually be surprisingly hard to accomplish cross platform. Often have to pick your poison of depending on Tk, Gtk, Qt, Electron, or running in a browser (which still doesn't feel right to a lot of people, and for the non-Python languages supported by jupyter the Python server backend is a non-trivial dependency that can be messy to deal with).

qpsk88 · on Dec 16, 2015

well, R2016a claims to have with the Live Editor something like this.

jfpuget · on Dec 16, 2015

Thank you, great comment, it spurred a great discussion.

Let me just clear one thing: I am not trying to represent Julia in any way. I wouldn't be legitimate for that at all. I tried to not write anything negative about the Julia language. Let me know if you think i did, in which case I'll modify my text.

To your point about compilers in charge of speeding up code, I see Numba doing more and more. I hope it will cover all of Python soon.

Lanzaa · on Dec 16, 2015

I thought the article was good. It started with a basic premise, "Should we ditch Python and other languages in favor of Julia for technical computing?" Then the article shows several methods for speeding up python code to be faster than using julia. I was a little disappointed that pypy was not mentioned, but it is a good introduction to speeding up python code.

I think the flaw in the article is when it switches to answering, "did the Julia team [write] Python benchmarks the best way for Python?" Then it rewrites the fib implementation to use a cache, which makes the comparison to the julia version completely ridiculous. It also does all sorts of optimization which clearly deviated from the spirit of the benchmark, naive python vs naive julia.

I wish the conclusion had been written in a way that clearly answered the original premise, should we ditch python for julia? The article clearly showed that there are a lot of good ways to speed up python code. Looking for algorithmic complexity wins (like in the fib example), using cpython, using numba, and profiling all can be used to speed up python code to the level of naive julia code. Which leads to the conclusion, if all you want is faster code there is no need to ditch python for julia.

jfpuget · on Dec 16, 2015

I agree that the first question is a very interesting one. I would be interested in the answer myself.

I made it clear however that I was going to answer a different question: did the Julia team wrote Python benchmarks the best way for Python?

That's all what the post is about: how to run Python code faster.

I will need way more Julia experience to be able to even think about answering the first question. The only thing I am 100% sure is that these micro benchmarks do not help answering that question.

co_dh · on Dec 16, 2015

I don't agree that speeding up your code is your compiler's job. If that is true, there is no point to learn data structure and algorithm.

the_af · on Dec 16, 2015

I don't think you and the parent post are talking about the same kind of speedups. Of course choosing the right algorithm and data structure is your job. How you tackle the problem is definitely your job, not the compiler's.

But in general, unrolling loops, caching (some) results, and eliding redundant code rightfully is the compiler's job. That's a different kind of speedup.

vvanders · on Dec 16, 2015

I would argue that data layout is even more important the the data structure you use(see radix sort) which is also something the compiler can't control.

I pretty strongly disagree that with the meta-parent that it's the "compiler's job to speed up your code". It just doesn't have enough context to be able to do it effectively. I would actually prefer a predictable compiler to one that tries to do all sorts of magic tricks that can be perturbed by seemingly trivial code changes.

scott_s · on Dec 16, 2015

Picking the right data structure and algorithms gets your code in the right algorithmic complexity range - did you implement a naive linear algorithm, or did you figure out a way to make it logarithmic? Or maybe even average-case constant? Compilers are generally not good at making optimizations that change complexity class. (Possible, yes, but not commonly.)

That's not what jordigh is talking about. Let's say you figured out a clever way to get your problem to O(log n). That's definitely your job. But in practice, constants matter - that is, 10 log n is way, way worse in practice than 2 log n, even though they are both O(log n). And that's the kind of optimization that compilers may be better at than you. Once you have convinced yourself that the performance of some component matters, a good principle is to figure out the best complexity class you can (using the "right" data structures and algorithms), then write the most idiomatic code possible.

grayclhn · on Dec 16, 2015

Regardless of whether or not your main point is true in general, most scientists don't know and will never have time to learn data structures and algorithms. So, for this domain at least (scientific computing), the compiler is getting the job anyway.

JupiterMoon · on Dec 16, 2015

> It's almost a literal translation of the Julia code.

Therein lies your problem. If I took some of my python code and literally translated it to Julia would it run like a pig as well? Could I even do a literal translation?

Veedrac · on Dec 16, 2015

It'll probably translate, and it'll probably run fine.

Julia's like scientific Python, but where the naïve implementation also runs fast. "Fast by default.", perhaps.

Nrpf · on Dec 16, 2015

Actually no. Naive Julia is often slower than python. Vectorized Array expressions are on par with numpy and slower than numba.

"type unstable", IO and text processing code is also slower.

Veedrac · on Dec 16, 2015

Although Numba might beat Julia, that's far from saying Julia would "run like a pig".

Nrpf · on Dec 17, 2015

Type unstable Julia is often slower than python.

thebooktocome · on Dec 16, 2015

The point of the Julia benchmarks was to show compiler performance.

You can do something clever in any language. There are plenty of really, really smart people that spend a lot of time writing incomprehensible (to me) Haskell that outperforms C.

The question is, do you have to do something clever to get performant code in the language of your choice?

In Julia -- not often. I've written around 50kloc of Julia; almost all of it is first-pass prototype code that manages to be performant despite itself. The most polished code I've written in Julia is about 100x faster than the MATLAB it replaced.

IMO, the main advantage of Python is its massive library of modules. As a prototyping language, on the other hand, it just seems to me that Julia is more flexible.

yati · on Dec 16, 2015

I agree with your main idea, but if you think about it, the author isn't doing anything clever per se with Python. I mean, using numpy and associated libraries is fairly common when Python is used for numerical computing (remembering that Python is a _general purpose_ language, not particularly aimed at a particular domain).

I have tried Julia, and really like it, but am still using Python for most of my prototyping. Maybe I should force myself to use Julia for some time :)

lqdc13 · on Dec 16, 2015

I agree, but this is only true as long as you stick with numerical applications, which is kind of the point of Julia.

It is currently more of a domain specific language, kind of like Matlab, because it's not really optimized for other things and has no libraries in the other domains.

On the other hand, I implemented a prototype neural network in it, and it went very smoothly. Will have to eventually rewrite it in Python though.

niutech · on Dec 28, 2015

Julia is a general purpose language, not a DSL. You can create web services, desktop apps, file tools with Julia. And it has access to a lot of external libraries using CCall.jl and PyCall.jl

Nrpf · on Dec 16, 2015

Right now, numba is a better compiler than Julia has. You can write array expressions and loops in numba and it will do fusion/deforestation to eliminate temporaries giving you faster vectorized in loop code than Julia currently does.

Of course, this is currently being worked on in julia as well, but I don't see your point.

lottin · on Dec 16, 2015

But to be fair, you need to compare the same exact program. A program that handles arbitrary precision isn't the same program as one that doesn't. The same goes for support for missing values and other stuff.

thebooktocome · on Dec 16, 2015

The OP was misleading. The benchmark is to calculate fib(20) = 6765, which does not overflow.

You can verify this for yourself at https://github.com/JuliaLang/julia/blob/master/test/perf/mic...

jfpuget · on Dec 16, 2015

The issue is that the Python code uses arbitrary precision while Julia code uses 64 bits. Python may be slower because of that. One way to get rid of this discrepancy is to compile with typed Cython, or with Numba, which is what I did. The other way would be to benchmark Julia with BigInt. Either way would be fair IMHO.

thebooktocome · on Dec 17, 2015

I was unaware that Python 3 removed the distinction between int (which used to be C-style int) and long (which was arbitrary precision).

If it's really idiomatic Python 3 to always use arbitrary precision integers for everything, then it's not really Julia's fault that Python 3 makes it more difficult to use performant arithmetic.

saurabhjha · on Dec 16, 2015

I have worked with python in two domains- scientific computing and web applications

- In scientific computing, we can either use better algorithms (yes, that makes a lot of difference) and dropping to C as necessary. The canonical example of the second alternative is Numpy.

In my humble opinion, the expressive power of python is what makes it an excellent language for scientific computing. For these kind of problems you cannot afford to worry about buffer overflows and memory allocations. You need a free mind to think about mathematical algorithms.

My own approach is to use python whenever I can do it and then use cProfile to determine whether to port some parts to C.

- If you are using python in application server, most of the time is spent waiting for data. Mostly, the job of application server is to collect data and make some kind of response which is not CPU intensive.

What you should optimize for in this case is data access patterns and creation of data objects. On the other hand, if you have any CPU intensive work, write it as a separate service outside of your application.

pathsjs · on Dec 16, 2015

They really lost me at "Caching computations". It should be clear that the benchmark is NOT the quickest way to compute Fibonacci numbers. The reason why it is included at all is that - without caching - computing Fibonacci numbers this way involves an exponential amount of computation, so it is easy to get long running times. Using a cache invalidates the point of doing the benchmark at all!

tokai · on Dec 16, 2015

You can easily make Python run nearly as fast as Julia while only using the standard library.

  import subprocess
  with subprocess.Popen(["julia", "tongue-in-cheek.jl"], stdout=subprocess.PIPE) as proc:
          print(proc.stdout.read())

mahouse · on Dec 16, 2015

Nearly? That's understimating how much it will take for the Python interpreter to wake up!

jankiel · on Dec 16, 2015

"Making it fast" wad not a goal of that benchmark. The whole point of it is measuring core features of the language - looping, recursion and so on. That's why they have this horrible Fibonacci impelentation: to meadure how language hanfles recursion.

Author misunderstands this. If you say that you're optimizing python by calling C, then something's wrong here.

Animats · on Dec 16, 2015

Suprisingly, they didn't try PyPy, which is about two orders of magnitude faster than CPython on simple loops. PyPy needs to become the main production version of Python. CPython should be viewed as obsolete technology, like the original non-compiling Java interpreter in Netscape 1.

zitterbewegung · on Dec 16, 2015

The reason people don't use pypy is the lack of c libraries.

jfpuget · on Dec 16, 2015

I did not try Pypy because last time I checked, it wasn't supporting Numpy. It means that Pypy would not been able to run these micro benchmarks as Numpy is used in some of them.

Please let me know if Numpy is supported now in Pypy. I'd be happy to add Pypy in the mix in that case.

itsadok · on Dec 16, 2015

I'm pretty sure everything you used is already supported: http://buildbot.pypy.org/numpy-status/latest.html

jfpuget · on Dec 16, 2015

Thanks. I did look at that page recently, and felt Numpy support was still experimental. But I'll give it a fair try.

What would motivate me would be that Pypy supports the packages I need for my day job, including Pandas, Scipy, and Scikit-learn. Do you know if there are plans to have these on top of Pypy?

poooogles · on Dec 16, 2015

http://lostinjit.blogspot.com/2015/11/python-c-api-pypy-and-...

tl;dr, Yes there are plans. But funding is needed.

jbssm · on Dec 16, 2015

Does PyPy work with Python 3?

Python 3 has been out for 7 years and I refuse to use anything that doesn't work in Python 3, it's just ridiculous to keep building stuff for Python 2, it's hindering the language and keeping it back in the past.

heinrich5991 · on Dec 16, 2015

Yes, it does: https://en.wikipedia.org/wiki/PyPy#Project_status.

canjobear · on Dec 16, 2015

Only Python 3.2 (so no "yield from" or async stuff).

Veedrac · on Dec 16, 2015

Note also jitpy for those times when they're not:

http://jitpy.readthedocs.org/en/latest/

stonewhite · on Dec 16, 2015

Google doesn't return a lot about any "non-compiling Java interpreter in Netscape 1". Anybody has some source?

Also, thinking like that makes more sense about the naming of JavaScript.

pronoiac · on Dec 16, 2015

I think the focus is "pre-JIT Java," rather than the exact browser version, Netscape Navigator 2.

chrispeel · on Dec 16, 2015

The benchmarks seen at [1] which compare Julia with other languages were written to be idiomatic in those languages. I.e., they weren't supposed to be the fastest you could be in that language (which often would mean call a C library), but rather something which represents the language well. So the right criticism is not just if the benchmarks could be made faster, but also whether the speedup is too tricky or hard or requires calling wrapped C libraries.

[1] http://julialang.org/benchmarks/

RyanHamilton · on Dec 16, 2015

Actually Chris it says on the website the benchmarks were "written to test the performance of specific algorithms, expressed in a reasonable idiom". I took issue with how they had written some of the java, e.g. They wrote their own quicksort which was slower than just using Arrays.sort, the much more idiomatic way in java. I even submitted a PR which went nowhere: https://github.com/JuliaLang/julia/pull/14229 I then broke the code improvements into smaller PRs and am still waiting after 2 weeks for the first PR to be merged.

ViralBShah · on Dec 16, 2015

Hi Ryan, the point was not to use Java's built-in sort, but to implement a textbook quicksort implementation in all languages to see how the compiler performs. That is why the original PR was not merged.

On the smaller PR, I had requested fixing the mandel benchmark that in Java is doing lesser work than the Julia and Lua benchmarks, which gives it an unfair advantage. That should be easy enough to fix too - but I didn't get a reply.

Let's get it merged though, and continue the discussion on the PR.

pjmlp · on Dec 16, 2015

The difference is that in Julia one stays within the language, doesn't need to use subsets of the language or optimized C libraries where the language plays a glue role.

I would agree with the article if Cython and Numba would support 100% Python or Numpy wasn't used to achieve similar execution speed.

Nrpf · on Dec 16, 2015

Julia also requires subsets of the language. Try writing rolled array expressions (in a loop or otherwise) in Julia and in Numba and see which one is faster.

Also try IO or text processing in Julia. Python is known to be faster right now.

pjmlp · on Dec 16, 2015

> Also try IO or text processing in Julia. Python is known to be faster right now.

Back when I used Python (early 2000's), that was actually C code, if I remember correctly.

The question is with IO and text processing implemented in pure Julia and pure Python, except for the OS FFI, which JIT compiler provides the best implementation?

tavert · on Dec 16, 2015

Counterpoint - try user-defined types (classes) in Numba.

Nrpf · on Dec 16, 2015

User defined types/classes are currently being worked on in an open PR. Excellent counterpoint for the time being, though.

Aside- Do you know if multiple inheritance/traits will happen at some point? I need this for modeling, even though it can be worked around for general software architecture.

tavert · on Dec 16, 2015

Likely. There's some recent discussion at https://github.com/JuliaLang/julia/issues/6975#issuecomment-... regarding taking inspiration from Clojure's protocols. There are a few different Traits and Interfaces packages floating around showing proof-of-concept implementations. Couldn't give you a timeline on when it'll make it into master, but it should happen.

Nrpf · on Dec 16, 2015

Thanks. The link seemed still to be single inheritance focus (sorry, I don't know the right terminology I come from a OO background.) but I could be wrong. Though I did get the sense that its just a interim step towards multi inheritance.

What about the dataframe and stats infrastructure? its currently in shambles. Any Idea when this can be expected to be fixed?

tavert · on Dec 16, 2015

Stats and dataframes now has someone working full time on it. It will get appreciably better soon.

Nrpf · on Dec 16, 2015

Very glad to hear that. Anywhere roadmap/ central place where I can contribute and follow progress?

tavert · on Dec 17, 2015

Hopefully there will be a blog post soon with some details and plans.

Nrpf · on Dec 17, 2015

Looking forward. I was about to embark with python on a new long term project, but I might delay that pending the new blog post. If possible do you have a guestimate on what sort of time window we are looking at for this blogpost? Days, weeks, month?

tavert · on Dec 17, 2015

Probably not days. I won't be the one to write it so I can't make any especially reliable predictions here. If enough people ask for this, probably some time in January.

gaze · on Dec 16, 2015

Turns out that if you bend over backwards and use some libraries and do a bunch of awkward stuff you can make most languages fast.

kriro · on Dec 16, 2015

I don't know but

  def benchmark_sort_numpy():
      lst = np.random.rand(5000)
      np.sort(lst)

isn't awkward at all and I'd argue that numpy (and pandas etc.) are very natural choices for anyone working on the problems they solve well. That's precisely the beauty of Python. There are very good libraries for almost everything. Usually there's also great communities around those libraries and it's usually not very hard to identify the "state of the art" library for any given problem.

For me the main question is not "will the compiler optimize well in the general case" but rather "will you naturally reach for the right libraries which are optimized well". For me/Python I'd say more often than not the answer is yes. I understand that that's not the point the Julia team is trying to make but it's a decent practical approach (imo)

tavert · on Dec 16, 2015

The motivating factor for using Julia in a lot of cases is: what do you do when the problem you're trying to solve hasn't been exactly solved already by someone else's C extension? Can an average person (scientist, grad student, etc) who knows the math behind the problem they're trying to solve, but doesn't want to jump through hoops of awkward extension compilation (where you have to know not only the high level language and the low level language, but also how to use the interface layer API's that sit between them), write a high-performance implementation from scratch without it taking too much time or effort? If libraries do exist, do they work in parallel? And the standout features of the language like multiple dispatch and metaprogramming also allow some new, very natural ways of approaching a lot of problems in technical computing.

dr_zoidberg · on Dec 17, 2015

That motivating factor can be achieved in Python using Numpy and Cython effectively. Check any of Ian Oszvald's High Performance Python talks.

tavert · on Dec 17, 2015

Numpy is great for dense multidimensional arrays of (edit: fixed precision) floating point numbers. Most problems I face need to deal with richer, more complicated, less uniform data structures than that. Similarly Cython is way better than writing a C extension by hand, but it feels very tacked-on (why are you writing libraries in a different sub-language than you use them from?), what you can do in nogil mode is pretty limited, and the choice of supported compilers is depressingly limited for when you need C++11, inline assembly, Fortran, linking to libraries that build with autotools, etc all to work cross-platform. If absolutely everything in the Python ecosystem were written using Cython then Python would have less of a performance problem, but there's a productivity, distribution, and difficulty barrier there.

princeb · on Dec 16, 2015

i noticed that running arithmetic loops tend to be the popular method of benchmarking these three languages.

I don't know how many folks who use MATLAB care that much about loop performance that they will be inclined to look at Python or Julia just because someone found an amazing improvement there.

The one thing that made me finally go over to Scipy from MATLAB was that I changed focus and no longer have to do analysis involving nonlinear/stochastic optimization. Several years ago (maybe 2009?), Octave was really struggling with speed here, and numpy still felt too new. I vaguely remember needing to wrangle with the mathematics a lot more (like approximating the Jacobians or Hessians) in order to get Octave to work. On the other hand I can only remember a handful of times where I needed to get MATLAB to do loops like these benchmarks (like maybe Runge Kutta or FD ODE solutions). Loops are really quite unnatural/unidiomatic in MATLAB, if you can keep your algorithm as close to linear systems as possible it's quite fast.

Has it changed much since? I know right now most of the important nonlin opt algorithms are available in scipy and if you want more there are external packages, but you still have to tinker a bit for the best solution. There's nothing like mindlessly using fmincon for every single problem in the world. I am only a layperson at nonlinear optimization, so I can't tell you why MATLAB is so much better out of the box.

tavert · on Dec 16, 2015

"Vectorizing" (interpreter-out-of-the-way vectorization, which isn't the same and doesn't necessarily give you SIMD vectorization) your code to get Matlab or NumPy to run it efficiently always seems like an unnecessary burden to put on the programmer, especially for algorithms that don't lend themselves to expressing in a vectorized way. Sometimes you just need to write a for loop, and it's great when the language gets out of your way and lets you do so without slowing your code down by an order of magnitude.

If you care about constrained optimization, Julia has leaps-and-bounds more sophisticated tools than anything Matlab or Python have to offer. Check out http://www.juliaopt.org and especially JuMP.jl. http://www.optimization-online.org/DB_FILE/2015/04/4891.pdf has some detailed comparisons. Macros and fast generic programming make Julia a very well-suited language for doing automatic differentiation (https://en.wikipedia.org/wiki/Automatic_differentiation).

porker · on Dec 16, 2015

While it's great to make code run as fast as possible, not every scientific problem is bounded by code execution.

I've recently been writing satellite image processing code, and profiled it thinking the algorithm was the problem. It turned out that even on a SSD nearly 90% of the program time (~30 minutes) was reading from & writing to disk intermediate image files.

More could be kept in RAM, but high-memory cloud machines aren't cheap, and our local development machines only have 16-32GB.

sgt101 · on Dec 16, 2015

You can get some amazing rack mounted machines now, ok not for cheap, but for relatively cheap given the value that they bring. I'm buying an analytics "box" at the moment with 20k gpu cores, 386GB ram and a 45k iops ssd, ok its $40k but shared across 20 engineers the productivity boost more than costs in really fast. It sounds to me that you need to take an investment case to your boss.

porker · on Dec 16, 2015

> It sounds to me that you need to take an investment case to your boss.

Welcome to academia...

sgt101 · on Dec 16, 2015

well, if you want the publications you need the kit.

dagw · on Dec 16, 2015

Or if it's just the occasional batch job, use AWS EC2 spot instances. A machine with 122 GB of RAM can often be had for $0.25-0.50 pr hour and sometimes for a lot less.

miahi · on Dec 16, 2015

It still is an algorithm problem if you don't compress the intermediate files (or you compress them with an algorithm that does not help your processing).

porker · on Dec 16, 2015

That's true. We've tried enabling GDAL's compression (written in C++ IIRC) on our TIFF files, but the combined compress+write/read+decompress is slower than writing/reading the uncompressed file to/from disk.

mahouse · on Dec 16, 2015

By rewriting Python code into something that does not resemble Python not even remotely, you can make Python code run faster.

endrebak · on Dec 16, 2015

Actually, it looks almost exactly the same; you just add a cdef int or long to the fibonacci for example. And using the Python scientific stack (which would likely give you all the speedups you will ever need) _is_ Python and looks like Python.

coldtea · on Dec 16, 2015

>not resemble Python not even remotely

So, I don't see where you got that from.

What he did was VERY MINIMAL changes. In some cases a mere annotation.

IndianAstronaut · on Dec 16, 2015

True. It's almost the equivalent of using Rcpp with R and saying it runs really fast.

Mikeb85 · on Dec 16, 2015

But at the end of the day, it does run really fast. And that's really all that matters.

Rcpp is really nice too, and definitely is a win for R.

Nrpf · on Dec 16, 2015

no its not. Rcpp is cpp syntax and semantics. Look at the Numba example..its pure python with one decorator.

awqrre · on Dec 16, 2015

Python is really slow but performance can be improved by just switching to a different interpreter.

jfpuget · on Dec 17, 2015

I updated the post with Julia running times on the same machine as the one used for Python.

hltt · on Dec 16, 2015

use Pyston

stefantalpalaru · on Dec 16, 2015

> I am not using an alternate implementation of Python here

What does he think Cython and Numba are?

> I am not writing any C code either

No, he's just using libraries written in C and Fortran in a ridiculous attempt to praise Python.

> Writing better Python code to avoid unnecessary computation

You don't improve a language benchmark by changing the algorithm. If you don't understand why, you have a lot more to learn before teaching people how to "make Python fast".