"Sometimes, and you have options for those cases."
The OP found a "sometimes", and he's using one of those options. In this case, he's got Python for prototyping and glue, with Haskell improving performance. This is as it should be.
I don't know of any Python advocates who say it's the right tool for every part of every job. What we will say is it's usually a good "first" tool for every job. Building a system out in Python allows you to get something representative fairly quickly, which helps identify if there are areas where Python alone is not enough.
I would argue that performance always matters, and that Python is never the right tool for the job in an absolute sense.
Python may be the right tool for the job given the options we have today, but there is no reason we cannot have a language exactly as nice to use as Python that also provides good performance. Languages like Nim or F# approximate that ideal, for instance. And while I realize there are high-performance variants of Python, those should be the standard and only path. The slow path shouldn't exist.
It is a failure of our community that we allow languages to proliferate while remaining slow. This is bad because allowing slow tools to become popular means people create slow things, which wastes other people's time and energy.
Electron becoming a standard way to make cross platform desktop apps is another example. Someone is not wrong to choose electron for that job, but we the software community are wrong to have let something that inefficient become the easiest way to do that job.
You cannot simply dismiss this issue by saying "Well don't use the slow software you don't like then" as many of these things become de-facto standards that you cannot avoid. Your place of work may require Microsoft Teams as the chat software, and now you are using a huge % of your laptop's RAM and battery for simple text transmission. Atom becomes the popular target for language plugins and ends up the only usable way to get good IDE features for your language, and you suffer the performance hit for it.
> It is a failure of our community that we allow languages to proliferate while remaining slow. This is bad because allowing slow tools to become popular means people create slow things, which wastes other people's time and energy.
Make it work, then make it work right, then make it work fast. I mean yes, a lot of things are slower than they should be, but the level of outright correctness bugs in software today is mindblowing. So while replacing our tools with faster tools should be something we do in the long term, I'd put a higher priority on lowering defect rates and making it easier to produce working software.
This "make it work" adage makes no sense at all when scrutinized.
Does anyone believe that other industries think like that? First let's invent a washing machine that washes, but occasionally sets clothes on fire and takes 24h for a washing cycle. Then we'll redesign it so it doesn't set things on fire, and finally redesign it yet again to finish in 2h.
It's incredibly wasteful. For any complex project, making it work and fast only at the end will either result in massive cost overruns or an outright canceled project.
Yeah. Speed is the dominant cost in software, but other industries treat other costs similarly. To pick another darling, look at Tesla: the Roadster is expensive, flammable, suitable only for enthusiasts. The S is generally useful, but too expensive. Still a narrow market. The E is their first general-purpose product.
That’s normal. Same deal with Apple II, Mac, iPhone. Same deal with ether, coarse general anesthetic, modern mixes.
Yes, this is completely normal, and software based products should expect to evolve similarly.
All of the things you mentioned worked and had good enough performance. There was no crappy version of the iPhone which ran out of battery in 1h and took seconds to refresh when scrolling.
So obviously they thought about the performance aspect from the start.
There was no crappy version of the iPhone which ran out of battery in 1h and took seconds to refresh when scrolling.
I'm pretty sure there probably was. They were just smart enough to make sure the only people that saw that version were a handful of engineers in their R&D lab.
"Make it work, make it right, make it fast" should primarily be applied in the context of individual functions and modules not whole complex projects. Don't start micro-optimizing your function for speed before you've gotten it to return the correct result. Also with the additional caveat that you should stop at make it work right until you're at least fairly sure that that function is actually important for overall performance.
Yes. It's the prototyping phase. Medicine, hardware, even food. Cliff made his energy bars "work" first in his kitchen, then started to figure out how to make them scalable and optimized for taste, etc.
Any hardware startup does the same. Get a giant, hulking prototype going, with shoddy wiring and way too much weight, and get it working. The "make it work first" adage exists because you don't know what the final product/project will need to be (e.g. do it in python and then figure out how to make it better).
Ah food. I believe Soylent was experimenting with this idea of make it work and then make it work right, which resulted in a lot of entertainment for those that read the toilet stories of Soylent customers.
I don't see why there has to be a hard cut between make it work and make it work right or fast in prototyping. With more thought, perhaps the product can already be largely right and fast enough and also easily improvable.
I can't help but look at this saying as an excuse to get something done fast with the hope it can be improved later. This is not a general truth.
Software is different, but it doesn't bend space and time. Redesign costs can be significantly cheaper, but they are still present, and if the project is complex enough can be just as high as in physical world project catastrophes. I assume you have heard of various software projects which cost millions and overshot their budget by 50%, 100%, 200%, etc...
Now regarding the "make it work -> make it right -> make it fast" cliché:
I tend to have architectural discussions at the beginning of the project, which include among other things performance.
Depending on the project it might be a surface discussion, or go deeper.
Then based on the performance requirements and design decisions, the system is implemented, performance is tracked and adjustments are made. So performance should not be put first, middle or last, it's an architectural level concern which is controlled throughout the project.
How do those of us who like to complain about "premature optimisation" whenever performance is discussed actually design software? Because based on HN discussions it looks like code & pray.
> Redesign costs can be significantly cheaper, but they are still present, and if the project is complex enough can be just as high as in physical world project catastrophes. I assume you have heard of various software projects which cost millions and overshot their budget by 50%, 100%, 200%, etc...
Indeed; in my experience such failures tend to be caused by too much rather than too little design and architecture up front.
> How do those of us who like to complain about "premature optimisation" whenever performance is discussed actually design software? Because based on HN discussions it looks like code & pray.
Don't try to design at the start when you don't know anything; let the design emerge. Do the simplest thing that might possibly work; 90% of the time it does work, the other 10% of the time you learn about another constraint. Get the simplest use case working, then iteratively expand to the full functionality. Refactor continuously and fearlessly (and adopt whatever testing/verification practices you need to make it fearless), driven by the changes that you need to make.
It's easy to mock, but it works much better than trying to design as a separate activity.
Type systems and languages that make them easy to use. I mean, other approaches are possible, but type systems are something we already all use and understand, and they seem to be enough.
You're conflating Python the language and Python the default runtime implementation (CPython). PyPy, a Python JIT compiler, has shown you can have an incredibly fast Python implementation. In some cases, it's faster than C.
I get that, and I agree, but my point was that if you're writing Python and finding that your code is too slow, it's far easier to just drop in a faster runtime than it is to rewrite your entire existing codebase in a new language.
As a guy who just spent the past summer doing (his 19th year of) Python, (6th year of) Cython(/Pyrex), and (8th year of) C, poking around PyPy, and writing a couple of transpilers: the tools are incredible. And I think you're talking out your butt here.
Speed is a matter of choice and a bit of manual tooling. You seem like a Golang advocate; great. I hope it continues to work for you. Python solves 95% of problems today, and with the continued improvement of tooling (via PyPy, Cython, Pandas, SciPy, PyTorch, etc.), Python and its dozens of active interest groups (each with 100k+ users globally) keep working independently on performance improvements in every niche until it takes the market. We embrace, extend, wrap, and improve.
I remember when Unladen Swallow was laid upon the laps of the CPython developers. That was an LLVM-backend JIT for CPython (CPython had another JIT at the time, Psyco), combined with a collection of other improvements to the CPython runtime, all courtesy of Google. It was a patchset against an old (and unmaintained) version of CPython, and might have taken years of further development to merge back in. Meanwhile PyPy, Pyrex->Cython, etc., were all pretty solid, so the CPython devs left Unladen Swallow behind, because it didn't seem worth it (I agree with them). See PEP 3146 for details.
But that doesn't mean CPython devs haven't been at it; what the heck do you think all of those "optional" type annotations are for in Python 3.x? Type checkers for one, compilers for two. Oh... yeah, we've been moving towards optional static type checkers and static compilation in the Python community for years; and as a community, you can basically piece it all together. Is it 100% yet? No, but it's a solid 95% for most use-cases.
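To make that concrete, here is a minimal sketch of what those annotations buy you (plain CPython ignores them at runtime; a checker like mypy verifies them, and ahead-of-time compilers such as mypyc, and increasingly Cython, can use them to emit faster code):

    from typing import List

    def mean(values: List[float]) -> float:
        # the interpreter itself ignores these hints; static tools check
        # them, and compilers can exploit them for specialization
        total: float = 0.0
        for v in values:
            total += v
        return total / len(values)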
Watching Golang over the years; it feels a lot like if it wasn't in Golang, it wasn't worth using with Go. I say this as a user of cgo from 2012-2013 in a failed attempt to make something in Golang + sockets + goroutines faster than the equivalent in Python + sockets + threads + C. But when I was seeing better performance in Python, better error reporting in Python, better C library wrapping in Python, and better tooling in Python - I went back to Python (funny how 20+ years of tooling will do that).
Don't get me wrong, I love the speed of the Golang compiler. It's just mostly everything else in the ecosystem I don't like; including the lack of a viable C/Golang interface for anything nontrivial (like leaving C threads running with references to Golang objects/structures), the hilariously short official documentation for cgo, the basic need to re-implement the world in Golang to get good performance, and still being subject to the whims of Rob Pike - whose bad decisions (in the form of Sawzall) already wasted a week of my life when I was at Google.
You want performance in Python? Okay. Where do you want performance? If there isn't already a library there to help you, I'd be very surprised.
I don't buy these arguments. You hit on a pathological case in Go where Python managed to be faster. That's not remotely typical, and even the "just use C/Cython/Numpy" is a huge oversell--there are some problems that are amenable to dropping into a lower level language, but many more are not. At best, it's just very, very hard to make Python as fast as unoptimized Go.
Besides performance, CGo and tooling seem to make up the bulk of your criticisms. I agree that Python's C interop story is strictly better than Go's, but this seems like throwing the baby out with the bathwater for a huge suite of applications. I also strongly disagree that Python's tooling is better; I think your 20 years of experience with it has blinded you to the frustrations the rest of us have when we try to pick it up. By contrast, I've found Go's tooling to be superb--profiling, testing, benchmarking, documentation, etc all included out of the box. There are still holes (like debugging), but I daresay Go's tooling is categorically better than Python's.
"You want performance in Python? Okay. Where do you want performance? If there isn't already a library there to help you, I'd be very surprised."
I assume you haven't read the article, but give it a try. The author explains where they wanted performance and that there was no library, so they had to write some ugly code instead.
I agree that we can do better. But I don't think you'll ever be able to avoid some trade-off between CPU performance and ease-of-use. We might shrink the gap, but there will always be a market for languages that allow easy prototyping and new developer onboarding, and those languages will always be slower than C++.
It's a timely article, because a number of things have changed in recent years to make the tradeoffs around using Python quite different from what they once were. Ten years ago, Python was slower than the alternatives by a small constant factor, datasets weren't big enough for python performance to be an issue, Python had a world-class tooling/library ecosystem and higher-performance languages at a similar level of conciseness/productivity were basically unknown.
Today, as the article says, things are different: core counts are rising, so practical Python performance is falling further and further behind; datasets have gotten large enough for Python performance to be an issue; Javascript has proven that it's possible to get much higher performance out of a scripting language; and languages like Haskell have gone mainstream, offering a comparable-to-Python (better, in fact, given what a mess Python's packaging situation is) tool/ecosystem experience and comparable levels of productivity with much higher performance.
Every tool is a "sometimes", but good engineering is knowing when a given approach moves from being the right one 90% of the time to being the right one 10% of the time.
I think the article makes a lot more sense if you consider it in the context of "Python for data science".
In the last few years, there's been a lot of hype about replacing other number crunching solutions (R, SPSS, even Matlab) with the Python ecosystem of tools (Pandas, SciPy, etc.).
I don't seem to follow. If you are doing data science, all the bottlenecked stuff will be running in numpy or pyspark. Choosing Python over R, SPSS, or Matlab usually doesn't come down to which one is faster, and R, as far as I know, is at least not vastly superior in speed.
This was explicitly addressed in the article: as soon as you have to do anything which isn't a trivial numpy operation, performance goes off a cliff, and that can be a problem.
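A toy illustration of that cliff (made up here, not from the article): the same arithmetic is fast while it stays inside numpy and slow the moment a Python-level loop takes over:

    import numpy as np

    a = np.random.rand(10_000_000)

    # stays in compiled code end to end
    fast = np.sqrt(a).sum()

    # same arithmetic, but every element is boxed into a Python object
    # and dispatched dynamically; typically orders of magnitude slower
    slow = 0.0
    for x in a:
        slow += x ** 0.5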
This also isn't necessarily true. Take the example of TensorFlow. You build a representation of the computation you want to run, and then you can run nearly the whole thing end-to-end in native C++ using Eigen data structures, with occasional shuttling of data back into PyObjects (rare) or numpy (common) for metrics tracking.
Cython is a much more powerful tool than I think the author of this article realizes.
I don't understand what you mean about "hype about replacing ... with Python". How is it hype when the majority of people already use Python (see link)?
This is a pattern I've had some success with several times now. Create a quick Python implementation for parts of data pipelines and then go back and re-write in Java/Go/C/whatever the best tool is for that bit of the job later when we know where the bottlenecks are.
The quoted argument that "easy to write but slow languages are better because programmer time is far more costly than CPU speed" was pretty common, and I honestly think correct, 10-15 years ago. But things have changed.
CPU performance long ago hit physical limits, and more and more we are scaling out applications across hundreds, thousands, or millions of servers. We've passed the inflection point where CPU speed really is more expensive than programmer time, if you are running that code at a big enough scale.
Add in containerization and cloud VM platforms where the tradeoffs of space and performance versus money start to become very clear. Add in better, safer languages like Rust and Go for writing high-performance code. And today if you can spend three times the programming time writing in a faster and more efficient language, and it runs 100x as fast in a tenth the memory footprint, you are talking about massive overall cost savings.
> CPU performance long ago hit physical limits, and more and more we are scaling out applications across hundreds, thousands, or millions of servers. We've passed the inflection point where CPU speed really is more expensive than programmer time, if you are running that code at a big enough scale.
When you're starting a startup, scaling out your application to hundreds, thousands, or millions of servers isn't something you're going to do right off the bat, and more likely, that's never going to happen, no matter how successful your company is. The number of companies operating at that sort of scale can be counted on one hand. Most startups can run on a few boxes.
For those situations (which probably pertain to most new projects outside of big companies), programmer time is indeed still the most important and costly input. If you're a startup and your 100x more performant Go code takes an extra 3-6 months to write, and a competitor beats you to market, no one is going to care how much faster your runtime performance is on the CPU, especially if you're a web app, where CPU time should probably be last on the list of items that could lead to slow application performance for end users.
100x difference in CPU time is nothing compared to the 1000x loss from a cache miss or a 10000x disk read. I'd love to see an example where the CPU difference outweighs any influence from disk or memory.
I was with you up until the last paragraph. Taken literally you seem to suggest there is no such thing as a CPU-bound workload. That's obviously not the case (cryptography is just one such example), but I would agree that many people think they are CPU-bound when they are really constrained by something else.
Secondly, Python and the patterns its expressiveness encourages are terrible for cache performance. In a simple C program it's easy to do something non-trivial in the space provided by L1 cache — in Python it's quite difficult even to reason about what's going to be in L1 if you're using any of the fancy features.
Good point. I see what you're saying and I was definitely not suggesting that. I was speaking mainly from a web application perspective, where program execution on the server is very low on the list of items affecting application speed as perceived by the end-user.
Scientific computing, on the other hand, is a completely different animal.
I haven't looked at it in a while, so I could be wrong, but I think with small enough programs you can still squeeze some payload into L1 in long tight loops where you're not jumping up and down the Python stack a lot.
But your overall point stands: if you're writing non-trivial Python programs your L1 is usually spent on language/runtime overhead.
> If you're a startup and your 100x more performant Go code takes an extra 3-6 months to write
I can write Go code nearly as fast as Python, sometimes faster if I need to refactor. Obviously this depends on familiarity with languages, but I think most of the difference is probably experience with the language more than anything else.
> 100x difference in CPU time is nothing compared to the 1000x loss from a cache miss or a 10000x disk read. I'd love to see an example where the CPU difference outweighs any influence from disk or memory.
This is a little confusing. A language like Python doesn't just use more/slower instructions to do things: it has worse cache locality too.
You always deal with references to objects instead of the data. So indirection is everywhere, with no way to do anything about it. Primitives use more memory. On my machine, an int requires 28(!) bytes instead of 8. Then there's the fact that you have to fit the interpreter itself in the cache instead of just the code you wrote.
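Easy to check from a REPL (numbers from 64-bit CPython 3.x; they vary slightly by version):

    import sys

    sys.getsizeof(1)          # 28: a boxed int, vs 8 bytes for a C int64
    sys.getsizeof(1.0)        # 24: a boxed float
    sys.getsizeof([1, 2, 3])  # the list stores pointers to those boxes, not the values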
If you care about performance, don't use Python.
Unless of course you're using numpy or something similar
To me the last sentence nullifies the entire argument. People doing a lot of numeric computations aren't (typically) doing it all in Python. They write critical sections in C (or some other "fast" language), or use existing libraries that already have done so. If they aren't writing those sections in C already, perhaps because they don't know how, or it would take too long to do right, then why would they ever choose to do the whole thing in C?
Firstly, not everyone doing performance-sensitive work is doing numeric work (that seems to have been a motivator for writing this article), so numpy isn't always practical. Secondly, I think the "if it's slow just rewrite the hard parts in C" is generally out of step with more modern options. Python gets you a really nice environment that's super pleasant to work with... until suddenly it doesn't and you're backed into a performance corner, and then you potentially need to conquer a major learning curve, take a huge usability hit, sacrifice memory safety, etc. There are increasingly common options now that let you get near-C performance for many applications while also totally avoiding that cliff. For applications that have even a decent chance of eventually needing that kind of work, I think it's reasonable to ask why you'd want to chance it when you could just write it in something both reasonably fast and much more pleasant than C from the get-go.
> Secondly, I think the "if it's slow just rewrite the hard parts in C" is generally out of step with more modern options.
Right, including modern options like “use Cython”, which opens up C-like power and performance within the Python ecosystem while maintaining Python ergonomics (because Cython is both a Python language superset and has tooling integrated with Python's distutils, etc.).
I write Python for my day job, but I often prototype in Go because it's so much easier. Also, regarding I/O, Python does nothing during I/O unless you take care to write async code and only call async libraries. Go does async by default and truly outclasses Python at IO bound tasks. For CPU bound tasks, Go is only 2 orders of magnitude faster. For I/O bound tasks, Go is about 4-6 orders of magnitude faster unless Python is carefully optimized.
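To make the "take care to write async code" point concrete, here is a rough sketch (the hostnames are just placeholders); one accidental blocking call inside a coroutine stalls the whole event loop, whereas Go schedules goroutines around blocking I/O for you:

    import asyncio

    async def head_request(host):
        # only awaitable, non-blocking I/O yields to the event loop;
        # a blocking call like requests.get() here would freeze every other task
        reader, writer = await asyncio.open_connection(host, 80)
        writer.write(b"HEAD / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
        await writer.drain()
        status = await reader.readline()
        writer.close()
        return status

    async def main():
        hosts = ["example.com", "example.org"]
        return await asyncio.gather(*(head_request(h) for h in hosts))

    print(asyncio.run(main()))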
I've been "premature optimization is the root of all evil"ed in more code reviews than I can count - yet when people complain about software, they first complain that it's ugly and then they complain that it's slow. Users don't care what it's written in.
There is fast (optimization) and slow (the most straightforward thing with no consideration for perf) and then not-slow. I think not-slow is a happy medium. Fast code should be, and probably is, totally unreadable, looking like AlphaGo optimized your program.
In theory I think you're correct - you _can_ go crazy overoptimizing code to the point where it's completely unreadable. However, in my experience, it takes a lot of effort to get to this point, especially in modern languages. It seems to me that most developers err in the other direction: whatever works as long as it took the least amount of time to program (although this might well be partly driven by the management "agile" obsession with accounting for how every 15-minute increment of time was spent).
>Fast code should be, and probably is, totally unreadable, looking like AlphaGo optimized your program.
Yes and no; multithreading and optimizing to remove branches can do a number on readability, but I find optimizing for cache locality and compile-time evaluation often makes code more readable. It depends on the language obviously.
How many people are there who actually need to scale an application across hundreds, thousands, or millions of servers vs the ones that think they are going to need it?
I agree that at some point CPU speed is more expensive than programming time, but how many applications in the wild are actually at or beyond that point? I would be surprised if it is more than a couple of percent.
Sure, if I know that a product has to serve millions from the get go, I'd choose a more performance-focused language than Python. But if it is about building an MVP for a new startup, it seems quite unreasonable to me to spend 3x the time to get it out of the door just for the extremely slim chance that the product takes off faster than Python is able to keep up with.
I'd quite like to be able to write some plain python (not PIL or numpy or Cython or whatever) to change all the pixels in an image at a similar speed to say Java.
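For reference, this is the kind of plain-Python loop being wished for here (a hypothetical in-memory image as nested lists); CPython runs it orders of magnitude slower than the equivalent Java loop:

    # invert a (hypothetical) 1000x1000 RGB image stored as nested lists of tuples
    width, height = 1000, 1000
    image = [[(0, 0, 0) for _ in range(width)] for _ in range(height)]

    for y in range(height):
        for x in range(width):
            r, g, b = image[y][x]
            # every pixel allocates a fresh tuple and three boxed ints
            image[y][x] = (255 - r, 255 - g, 255 - b)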
> The quoted argument that "easy to write but slow languages are better because programmer time is far more costly than CPU speed" was pretty common, and I honestly think correct, 10-15 years ago. But things have changed.
I don't think things have fundamentally changed in the programmer time is cheaper than cpu time calculation. What has changed is:
1. Classic dynamic languages (ruby, python, etc) all heavily assume the world is single threaded and that blocking io can be scaled well enough. Unfortunately, these maxims from the late 90s do not hold for modern CPUs. Languages that embrace non-blocking io, such as node.js, have been incredibly successful as a result.
2. Software projects have gotten bigger. Extremely anecdotally, projects in dynamic languages get uncomfortable at about 10,000 lines and unworkable at about 100,000 lines. A lot of the popularity of these dynamic languages was that you didn't need to type out your types all the time. Newer languages have shown you can get the best of both worlds with type inference and structural typing. Typescript's meteoric rise is because of exactly this.
Go and to a lesser extent Rust, have recognized these two issues facing last generation dynamic languages and have been successful because they address these issues head on. You can start a new Go application and have a similar time to market as python or ruby, but have it scale with your company, both in developer time and performance, for a long time. Why wouldn't you choose Go or some other new language?
Absolutely, but it's arguably 10 years too late. Node has already eaten python's lunch in the server space. To get your existing python code working on async you may basically need a full rewrite, including any libraries that you pulled in. At that point most companies ask themselves if python is really the right language to conduct a rewrite in.
I also firmly believe that "easy to write but slow languages" sets up a false dilemma.
Haskell, JavaScript and C# are all fairly instructive examples. Haskell for demonstrating the level of flexibility you can achieve with a really well-thought-out static type system. JavaScript for showing how much of the "dynamic tax" you can avoid at run-time with an aggressive enough JIT compiler. C# for showing that you can get a pretty nice final result out of pragmatically blending a little bit of dynamic typing and a little bit of the ML-style static typing experience into a language that started out with a Java-style type system.
When people say cpython is slow they are generally pointing to two things
1) The interpreter is _slow_
2) You can't achieve real thread based parallelism
I think in general people have a high-level concept of (1): static vs. dynamic and interpreted vs. compiled is an understandable trade-off. CPython as an implementation generally gets typecast as slow for its default dynamism, which has understandable negative performance implications[1]. However, CPython also gives you lots of ergonomic ways to push your program towards the static/compiled end of the spectrum with things like pandas/numpy/numba/extensions. In general, using these correctly puts you within the ballpark of _faster_ languages. Could you write faster assembly by hand? Sure! Is optimizing in another language worth your time? I don't know.
I've never really understood (2), the lack of threading, as a problem: multi-process parallelism can be accomplished fairly easily if you are CPU bound, and projects like uvloop[2] make async tasks fast enough to compete with any web framework out there. Furthermore, even though your CPU-cycle cost may be multiplied considerably when operating over hundreds or thousands of servers, doing distributed computing right is still hard, and developing something like celery/airflow/luigi/dask from scratch is not cheap either. Leaning on CPython's massive ecosystem can massively lower the barrier to entry on a lot of big problems.
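For the CPU-bound case, the standard-library route really is short; a minimal sketch (the crunch function and chunk sizes are just placeholders):

    from multiprocessing import Pool

    def crunch(chunk):
        # CPU-bound work runs in a separate process, so the GIL is not a factor
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        chunks = [range(i, i + 1_000_000) for i in range(0, 8_000_000, 1_000_000)]
        with Pool() as pool:
            print(sum(pool.map(crunch, chunks)))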
I think there are plenty of examples of re-writes in go[3]/rust[4] that have worked great for people, I have no doubt that python is _not_ the end all be all language, but I think the "python is slow" worry is generally overplayed.
Specific problems require specific solutions. I'm glad Haskell seems to work for the author in genetic analysis, but I think this could have been a more interesting article with some specific Python-Haskell comparisons rather than the generic "Python is slow" argument.
I love Python. The language is a joy, the ecosystem is fantastic. But yes, let’s be honest, if you cannot vectorise your code it is slow, and I think that will be its downfall eventually.
I’m excited about Julia, and I hope it gains popularity and the ecosystem grows. Until then, and in particular until the data frames story can compete with pandas, it’s Python with Cython for me, but I’d rather skip the Cython if it were not necessary for performance.
Any early adopters running Julia in production with stories to share?
I feel a bit ambivalent about Python as it's a nice language for prototyping and quickly hacking things together. Yet I'm always baffled when I read Numpy's or Matplotlib's documentation and try to make sense of it, as it can be (or at least feel) so complex and highly ambiguous. E.g. sometimes there are no examples (or only very brief ones) on Numpy's documentation pages showing how the method works, and most results from Google are only about advanced usage, not about the basics of the method itself. In Matplotlib I still don't understand what the right way of initializing a pyplot is; there seem to be a million ways to do it and a million parameters you can give. API changes and inconsistencies pain me at times too (Pandas comes to mind). While not a fault of Python as a language, I think these greatly contribute to the experience of using Python.
Also, I don't feel like the culture of Python programming focuses much on documenting things, which makes reading code at times feel like transcribing ancient Latin manuscripts. Maybe a good analogy would be JS back in the days of global jQuery scripts. Too unrestricted and free-form, maybe. I'd wish Python became more like Kotlin, with very clear patterns and great IDE support (in addition to PyCharm).
Well those are at least my experiences and feel free to disagree with me.
> In Matplotlib I still don't understand what the right way of initializing a pyplot is; there seem to be a million ways to do it and a million parameters you can give. API changes and inconsistencies pain me at times too
Matplotlib has the worst API of all Python libraries I have used over the years!
If there were a fork of it that got rid of Matlab-way of doing things (keeping only OOP style) and with consistent names (no more `twowords` and `two_words`), I would gladly switch in a heartbeat.
This is maybe more hearsay; I only briefly tried to use Julia.
It is true, Julia has great features for performant code. It is, however, focussing too much on being a matlab competitor in my opinion. It will not be a language that you use to write a "normal" (i.e. non-numeric or CRUD) dynamic website in. My general observation, however, is that you need to attract this crowd if you want to have an ecosystem with a variety of tooling. And that is the neat thing about Python (and Haskell).
I'm using Julia and loving it. I've built a bunch of differential equation solvers which routinely outperform the classic C++/Fortran codes. I started out without "software development" experience but Julia and its community got me up to speed and helped me build something quite unique. Now Julia is the only language that has the numerical libraries I need to do my research.
In fact, the whole library story in Python/MATLAB is quite overblown. If you're doing something which is actually new, like PhD methods research, you need to be writing a lot of stuff from scratch. And in that case, you usually cannot get by with vectorizing everything... and vectorization always has the issue with temporary arrays too. Meanwhile, Julia's type system makes everything fast (which is a plus when trying to publish a paper on it!) but also get cool extra features for free like GPU support and arbitrary precision. For people developing and testing new methods, Julia is the best tool right now.
>If you're doing something which is actually new, like PhD methods research, you need to be writing a lot of stuff from scratch.
Sometimes you also reuse a lot of stuff, it depends. For machine learning in particular, almost everything uses some sort of gradient-based optimisation algorithm. In these cases, it is very useful to have an automatic differentiation library. My coworker said Julia has about 3, IIRC, and until an amalgamation of them is merged into the standard library Julia isn't completely ready.
> until an amalgamation of them is merged into the standard library Julia isn't completely ready
That doesn't quite make sense. I'm sure there are a few more autodiff libraries in Julia than 3. You would just use the one that fits your use case.
PS. Ohh re-reading your comment, you want an autodiff library in Julia's standard library. That is very unlikely to happen since it's not (very) hard to cook up an autodiff library and autodiff is not widely used. Julia is not like Matlab, where you have to have everything in the standard library.
Well, my co-worker says people are working on it. I asked him whether I should try to adopt Julia and "not until autodiff is in the standard library" was his response.
>autodiff is not widely used
If you look only at the machine learning community, that is false. Autodiff is used all the time, to optimise loss functions using a variant of gradient descent. For neural nets, gaussian processes, SVMs... Not decision trees/forests, but these are not more popular than all the others combined.
Maybe there was some small miscommunication here – we are indeed working on "one AD to rule them all" (Capstan, [1]), but it won't go in the standard library as such (there's no need as it will be just as good as a package).
That said, Capstan relies on very new compiler technology and still requires some deep changes there, so will only work on future Julia versions (which may be what he was referring to).
Until then, Flux [2] has an AD that's well-suited to ML and works on current Julia versions.
> It is true that programmer time is more valuable than computer time, but waiting for results to finish computing is also a waste of my time (I suppose I could do something else in the meanwhile, but context switches are such a killer of my performance that I often just wait).
In film CG production, we had a rule of thumb. If running an interactive program takes longer than about ten seconds, the artist (user) becomes more likely than not to get up and go get some coffee or talk to someone else. We consciously made an effort to keep anything someone needed to wait for to under ten seconds, and save anything longer than that for nightly farm renders. We were writing the code in C, btw.
Somewhat tangential to this, but I remember reading an article on HN about a group creating some web app that was constrained by some size limit (50 KB or so). Groups ended up putting 'dead code' in their projects to guard against other groups taking their space while they were working on their feature. I think this made it so the app never got less than 50...
Did you ever see something similar where devs put in some code as filler to make room for future features they were working on?
Oh yeah, this is very common in game development, and I'm pretty sure in other embedded dev too. There's a famous story about a game programmer who saved an entire production when it started crashing something like two weeks before shipping by commenting one line of code. Turned out the line of code was a malloc (of like a megabyte) that he'd added a year earlier, in anticipation of the game running out of memory. My details are probably wrong, but I'm pretty sure I've seen this story linked on HN.
I was in game dev for a decade, and I saw this happen where I worked: the studio technical director adopted the practice of saving some space, because we always started running out of memory near the deadline as artists threw in all their content.
In my experience of vfx, usually the same author has ownership over the whole tool, so there's no incentive to pad time. We have joked about adding in sleeps and then removing them to look impressive to users, though.
Games probably have a situation more similar to what you're describing. Many games target 30fps, which is 33.3 milliseconds per frame. Each department usually gets a "budget" of how long they can spend. I've heard similar stories about padding memory usage and time, but it's hard to tell if they were serious and it's not common.
It's typical in mil & aviation software to get specs with a required 50% CPU & memory reserve, in anticipation of updates over the maintenance window (which can span decades for many products).
If you would like 10-100x faster performance than Python, but would like to keep the easy-to-read code, give Nim [0] a try.
I do all my work in Python, and I've been using Nim in last couple of months - it took me a week or two until I was able to be productive in Nim.
Don't expect Python's large ecosystem, nor some Python goodies, but if you're looking for a readable, writable, high-performance post-Python language - Nim is the way to go!
Exactly, the ecosystem is the only reason IMHO why Python is "easy". I find I'm much quicker at developing C#/F# if the library is available on NuGet (which is often, but not always the case). They're both reasonably fast, too. Python only has a very large community and thus ecosystem to offer.
It can't be the only reason, or how would it ever have attracted such an ecosystem in the first place? Especially with its performance characteristics, and lack of corporate backing.
No, Python was invented at a time when its closest competitor was Perl - and you need only compare typical Perl with typical Python to appreciate that Python really was a usability revelation.
But that was nearly 30 years ago. I do think we can do better now.
• the language is easy, very attractive for non-CS people
• the language is consistent (everything is an object, or a pointer to an object rather, you can only pass by pointer, scope and namespace that make sense all the time, etc...)
• you can learn it gradually, you can start using it even if you know only 10% of the language
• Amazing documentation. The official tutorial is easy to read, and by the time you are through with it you are fairly proficient with python
• error messages that tell you exactly what the error is and where it is happening. A lot of other languages do that now, but that wasn't the case 25 years ago.
• inline help / magical docstring (`a=4 ; help(a)`)
• The REPL, especially being able to get into the REPL with your apps (`python -i`)
• batteries included: again, 25 years ago no language came with such a huge standard library
• PEP 8, the grandfather to gofmt and rustfmt. Makes a big difference when working on a team
• the Zen (`import this`)
And of course as the language grew in popularity:
• the ecosystem (pypi.org)
• the amount of resources, from books to websites, the answer to every question you can think of on SO, very active and helpful mailing lists/usenet groups, and an active and helpful /r/Python
Yes, my point is that function scoping can hardly be described as "makes sense all the time". Especially given that variable declarations in python are _very_ implicit, without even the "var" that will indicate to you that you're messing with your function scope in JavaScript.
> Yes, my point is that function scoping can hardly be described as "makes sense all the time".
Block and function scoping both "make sense all the time."
Whether or not you personally are accustomed to a language's design decisions does not have any bearing on whether or not those decisions "make sense."
In the particular case you cited, of course it is natural that `foo` is in scope outside of the loop; it appears in a line outside of the loop. This is trivially evident in the indentation.
The context is that we're talking about people picking up the language for the first time, coming from other languages. What matters is whether the scoping rules make sense to such people.
For that, it matters whether they're coming from another function-scoped language (or are familiar with one), and whether they're used to having their variable declarations being obvious or not.
For me, when I was first learning Python, this scoping issue was a significant pain point. This is obviously an anecdote, not data, but again for purposes of "people first coming to the language" it's relevant.
Obviously once one has worked in Python for a while one internalizes things like this, just like one internalizes various things in other languages that are obvious pain points for beginners.
Python is and always has been designed to be an introductory programming language, so your criticism does not apply in the case of this design intent. More people learn Python first than learn another language first. Having seen beginning programmers learn Python, I can say this is an utter nonissue to them.
For the case of experienced programmers, the behavior is consistent and simple. In what way does this not "make sense?" Does Haskell's normal order evaluation not "make sense?" Does Ruby's optional parentheses for method calls not "make sense?" Does a regular expression literal syntax not "make sense?" Does JSX not "make sense?" Do public/protected/private modifiers in Java not "make sense?"
Stop confusing a match to your personal comfort zone for actual quality or fitness-for-purpose.
    for foo in bar:
        pass
    # Why is "foo" in scope here???
Well, the answer could be "it's not". It depends on whether bar actually produced anything to assign to foo. Doing a "print(foo)" after that loop might print something, or it might throw a NameError.
Now I understand why that happens (in terms of what the loop actually desugars into), and you understand why it happens. But it's not as consistent and simple as you're trying to make it out to be. You have to really understand what a for-in loop is doing under the hood to explain the behavior.
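A minimal demonstration of both outcomes:

    xs = [1, 2, 3]
    for x in xs:
        pass
    print(x)   # prints 3: the loop variable leaks into the enclosing scope

    for y in []:
        pass
    print(y)   # NameError: the loop body never ran, so y was never bound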
You seem to feel that I'm attacking Python or something. I'm not. It's a nice language to work in, with a lot to recommend it. But it's not a language I'd choose as a poster child for scope and namespace making sense all the time unless you really dig into what's going on "under the hood".
> Does Haskell's normal order evaluation not "make sense?"
No opinion, really; not enough intimate familiarity with the problem space to have one.
> Does Ruby's optional parentheses for method calls not "make sense?"
Again, no opinion.
> Does a regular expression literal syntax not "make sense?"
It depends on the regexp. If your regexp is simple enough, it's fine. In far too many cases you end up with a write-only monstrosity. Also, you say "syntax" as if there were only one; there are multiple and some make more sense than others.
> Does JSX not "make sense?"
Again, no opinion.
> Do public/protected/private modifiers in Java not "make sense?"
It depends on how they're used.
> Stop confusing a match to your personal comfort zone for actual quality or fitness-for-purpose.
Stop confusing "doesn't always make sense" for "doesn't make sense" (totally different statements, there), and the former statement for a statement about quality of fitness-for-purpose. Lots of things are considered high-quality and fit for purpose while still having flaws. Possibly flaws that could not have been avoided without sacrificing other goals. But determining that requires first admitting that the flaws exist and then evaluating them. If we either pretend that the flaws don't exist, or that flaws existing is somehow an indicator that the entire system is unfit for purpose, it's hard to think productively about the design of the next system.
> It can't be the only reason ... that was nearly 30 years ago. I do think we can do better now.
Maybe the ecosystem is the only reason left today. The reasons for initial adoption 25 years ago and the reasons for widespread usage today probably aren't the same reasons. Numpy, Anaconda & Jupyter notebooks didn't exist then, and now they're a huge reason for Python usage. I can't even think of a language that comes with a standard library that rivals Python's, let alone the ecosystem.
Fair point, I was importing the diapers module 30 years ago. But I think in today's world the advantage is mostly in the ecosystem. How the ecosystem came to be is another story.
Why Nim rather than e.g. Haskell (mentioned in the article) or OCaml, which are much more mature and have much bigger, more established tool/library ecosystems?
Because Nim syntax will be familiar to a Python developer. Sometimes all you need to do is add variable declarations and rename `def` to `proc`.
Haskell has a much steeper learning curve. Been there, struggled with that. If I were to recommend a functional language to a Python developer, I would go with F#.
Which is error prone and often actually slower depending on calling patterns between C and Python. There are many languages today that offer better ergonomics than Python/C, and a few (like Go) which offer better ergonomics than Python by itself, all while besting it in performance by one or more orders of magnitude.
I like Python; I just wish I could say the same for its developers...
> There are many languages today that offer better ergonomics than Python/C
Including Cython. I mean, if you've got a C library, sure, interface it with Python; but unless you want something to be called from something else in addition to Python, dropping to C for performance needn't be the default choice in Python; that's the whole reason Cython exists.
> and a few (like Go) which offer better ergonomics than Python by itself
I find Go’s ergonomics to be worse than Python but better, mostly, than Java.
> all while besting it in performance by one or more orders of magnitude.
I would consider Cython if the particular bottlenecks were amenable to calling into a lower level language and if the alternative were porting a large application from Python to something else wholesale, but I would probably never start a new application in Python/Cython if there was any chance that performance would ever matter. The alternatives are simply too good.
> I find Go’s ergonomics to be worse than Python but better, mostly, than Java.
This is surprising. I'm a Python developer, and I still prototype new features in Go.
> The result is that I find myself doing more and more things in Haskell, which lets me write high-level code with decent performance (still slower than what I get if I go all the way down to C++, but with very good libraries).
This strikes me as an odd conclusion to come to if speed was the main motivator.
Speed is the main motivation, but total time is TimeToWriteCode + TimeToRunCode.
Python has the lowest TimeToWriteCode, but very high TimeToRunCode. C++ has the lowest TimeToRunCode, but high TimeToWriteCode. Haskell is often a good compromise for me.
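A made-up illustration of that trade-off (all numbers invented, just to show the shape of the calculation): suppose an analysis is written once and then run 50 times.

    # hypothetical numbers, purely to illustrate TimeToWriteCode + TimeToRunCode (hours)
    runs = 50
    python_total  = 2 * 8  + runs * 3.0   # 2 days to write,  3 h per run   -> 166 h
    cpp_total     = 10 * 8 + runs * 0.1   # 10 days to write, 6 min per run ->  85 h
    haskell_total = 5 * 8  + runs * 0.3   # 5 days to write, 18 min per run ->  55 h

Shift the number of runs or the per-run times and the ranking flips, which is why the answer is so workload-dependent.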
Also, with Haskell, it can be very easy to take advantage of 20 CPU cores, while I don't have as much familiarity with high-level C++ threading libraries.
@ the OP - not to sound hostile, but you write code (like in the example here [1]) that is bound to be slow, just from a glance at it: vstacking, munging with pandas indices (and pandas in general), etc. In order for it to be fast, you want pure numpy, with as few allocations happening as possible. I help my coworkers “make things faster” with snippets like this all the time.
If you provide me with a self-contained code example (with data required to run it) that is “too slow”, I’d be willing to try and optimise it to support my point above.
Also, have you tried Numba? It may be a matter of just applying a “@jit” decorator and restructuring your code a bit, in which case it may get magically boosted a few hundred times in speed.
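For completeness, this is roughly what the Numba route looks like on a numeric kernel (a made-up example, not the OP's code); as noted elsewhere in the thread, it only helps when the hot loop is over arrays and scalars rather than dicts and DataFrames:

    import numpy as np
    from numba import jit

    @jit(nopython=True)   # compiled to machine code on first call
    def squared_distance(a, b):
        total = 0.0
        for i in range(a.shape[0]):
            d = a[i] - b[i]
            total += d * d
        return total

    a = np.random.rand(1_000_000)
    b = np.random.rand(1_000_000)
    print(squared_distance(a, b))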
It's not so easy to post the data to reproduce a real use-case as it's a few Terabytes :)
Here's a simple piece of code that is incredibly slow in Python:
    interesting = set(line.strip() for line in open('interesting.txt'))
    total = 0
    for line in open('data.txt'):
        id, val = line.split('\t')
        if id in interesting:
            total += int(val)
This is not unlike a lot of code I write, actually.
I've also found that loops with dictionary (or set) lookups are a pain point in python performance. However, this example strikes me as a pretty-obvious pandas use-case:
    import pandas as pd

    interesting = set(line.strip() for line in open('interesting.txt'))
    total = 0
    for c in chunks:  # I'm too lazy to actually write the chunking
        df = pd.read_csv('data.txt', sep='\t', skiprows=c.start, nrows=c.length, names=['id', 'val'])
        total += df['val'][df['id'].isin(interesting)].sum()
I'm not exactly sure, but pretty sure that isin() doesn't use python set lookups, but some kind of internal implementation, and is thus really fast. I'd be quite surprised if disk IO wasn't the bottleneck in the above example.
`isin` is worse in terms of performance as it does linear iteration of the array.
Reading in chunks is not bad (and you can just use `chunksize=...` as a parameter to `read_csv`), but pandas `read_csv` is not so efficient either. Furthermore, even replacing `isin` with something like `df['id'].map(interesting.__contains__)` is still pretty slow.
Btw, deleting `interesting` (when it goes out of scope) might take hours(!) and there is no way around that. That's a bona fide performance bug.
In my experience, disk IO (even when using network disks) is not the bottleneck for the above example.
Ok, I said I wasn't sure about the implementation, so I looked it up. In fact `isin` uses either hash tables or np.in1d (for larger sets, since according to pandas authors it is faster after a certain threshold). See https://github.com/pandas-dev/pandas/blob/master/pandas/core...
Could you give a hint of how the data ("sample1", "sample2") looks like, or how to randomly generate it in order to benchmark it sensibly? I guess these are similarly-indexed float64 series where the index may contain duplicates? Maybe you could share a chunk of data (as input to genetic_distance() function) as an example if it's not too proprietary and if it's sufficient to run a micro benchmark.
There's also code in genetic_distance() function that IIUC is meant to handle the case when sample1 and sample2 are not similarly-indexed, however (a) you essentially never use it, since you only pass sample1 and sample2 that are columns of the same dataframe (what's the point then?), and (b) your code would actually throw an exception if you tried doing that.
P.S. I like the part where you've removed the comment "note that this is a slow computation" :)
The speed could possibly be improved by using map. Also, not related to speed if this is all of the code, but it might affect larger programs: you should make sure your file pointers are closed. Something like:
    with open('interesting.txt') as interesting_file:
        interesting = {line.strip() for line in interesting_file}
    with open('data.txt') as data_file:
        total = sum(int(val) for id, val in map(lambda line: line.split('\t'), data_file) if id in interesting)
Have you tried using Cython to compile code like the above? Python's sets / maps / reading data etc should be fairly optimised, so Cython might let you bypass boxing counter variables instead using native C ints or whatever.
Also, if the data you're reading is numeric only - or at least non-unicode / character data - you might be able to get a speed boost reading the data as binary not as python text strings.
Numba does not support dictionaries and has limited support for pandas dataframes (only underlying arrays, when convertible to NumPy buffers, if I understand correctly). This limits usefulness for many non-array situations, as well as some existing code-bases (the dictionary is fundamental in Python and typically used everywhere -- often for performance).
Interesting assertion re: TimeToWriteCode, but I think there's TimeToWriteCode vs. TimeToWriteGoodCode.
I'm working on my first serious Python project right now, and I find it's super easy to throw together some code that more or less works; but for solid, readable, documented, properly unit-tested code I hope is production-ready, it's not any faster than Perl or Golang.
(Sure, if you're a Python expert it's faster for you than for me, but if it's about TimeForExpertsToWriteGoodCode I'm not any more convinced.)
Production-ready is so complex, it's hard to make any comparison. E.g. for a library, writing good documentation (with diagrams and decent technical writing) takes me way longer than the coding anyway - probably by an order of magnitude.
Proper unit-testing is also going to take roughly the same time in any language, just because you have to think hard about sensible tests (although I still love mocking/patching in Python, so I'd give it an edge, plus pdb/ipdb for debugging tests is cool). Production-ready also includes deployment, which for anything non-trivial I'd say Golang > Python > Perl.
Finally, if we're talking "serious project", IMO tooling and how that tooling integrates into a CI pipeline are more important than development speed, because as a team or project goes, terrible CI will slow developers more than any language. Although again here I think Python does quite well with decent linting, unit test frameworks, and code coverage options, Golang's opinionated tools are simpler in this respect.
(I enjoyed C# for similar reasons, although I don't think it's kept up w.r.t. tooling - been ages since I used it though.)
Good points. So far I find I really like Python's mocking, "with self.some_useful_patch()" is really nice, and I like the idea of side effects especially with boto. Of course in some cases it's really difficult, but every language has its tricky unit-testing problems.
One big point I would give to Golang, about which lots of people disagree with me, is the "opinionatedness" of it. It seems to me that Python, like Perl, has a "There's More Than One Way To Do It" mentality, and after many years of that I really appreciated Golang's emphasis on the "idiomatic." That goes for the tooling too.
I have also noticed that the Python ecosystem doesn't have a strong documentation culture, which I find annoying as a relative newbie. But that presumably matters less over time, and it seems to be part of the Python Way to use libraries that "just work" and not worry about the details.
>Interesting assertion re: TimeToWriteCode, but I think there's TimeToWriteCode vs. TimeToWriteGoodCode.
In lots of areas, "good code" doesn't matter much, if at all.
Scientific computing is full of those cases -- you write code to run a few times, and don't care for maintaining it and running it ever again (as long as the results are correct).
Sometimes, even for a one-shot job, you dive in and write passable code; then, as you start to tackle the complexities of the problem at hand, you realise that the pile of ropy code has tied your hands, and it gets increasingly harder to wrap your head around your implementation and finally complete the one-shot job.
> In lots of areas, "good code" doesn't matter much, if at all.
This is the received wisdom in biological science but I’m convinced that it’s trivially wrong. I’ve seen a lot of research code, most of it bad. I have no idea how many bugs are in this code, and I know for a fact that the original authors also don’t know. And it would be truly exceptional if these pieces of code were bug-free (in fact, there’s enough software engineering know-how to categorically conclude that a very high percentage of such code has bugs). How many of these bugs affect the correctness of the results?
… since the code quality is so bad, this is impossible to quantify. So, yes, code quality does matter in science, since it affects the probability of publishing wrong results.
Incidentally, there are cases of retractions of high-impact papers due to errors in code. Of course this will also happen with better code quality; but if conventional software engineering wisdom is right then it will happen substantially less.
That’s easy with Python, too, in a lot of number crunching cases. NumPy with MKL will use all your cores, as will e.g. dask and other libraries built on NumPy. Farming out embarrassingly parallel work to threads or processes is also easy.
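A minimal sketch of the "farm out embarrassingly parallel work" point; the chunking and the work function here are placeholders, but the pattern is just a process pool, which sidesteps the GIL for CPU-bound work.

    import numpy as np
    from multiprocessing import Pool

    def summarize(chunk):
        # any independent, CPU-bound work on one chunk
        return chunk.mean(), chunk.std()

    if __name__ == "__main__":
        data = np.random.rand(8, 1_000_000)   # 8 independent chunks
        with Pool(processes=4) as pool:
            results = pool.map(summarize, list(data))
        print(results[:2])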
If you write more C++ than python, it will have a lower TimeToWriteCode. Despite having spent years writing python I don't find it any more productive than C++.
C++11 has all the nice features you might expect from python with the only drawback being the lack of a REPL.
The lack of a REPL compounds with long compilation times, which are practically a feature of C++ and not going to go away anytime soon. The effect is that, when you explore a new API or need to tune parameters to some function call deep in the call stack, you're an order of magnitude slower than with Python (or Lisp, Scala, F#, Haskell, or even Nim or plain C, thanks to their short compile times).
If you know exactly what you need to write, you're just as quick in C++ as in Python, that's true. Programming is mostly about learning what to write, though, and here C++ loses.
No, it will help with lack of REPL but not with long compilation times. Long compilation times are bad across the board. Go advertises "fast compilation" as one of its key features for a reason.
EDIT: Not to mention, if you write your code as a lot of tiny functions you could just as well write it in C. Once you go for classes and templates, that's where C++ power is visible, but that's also where its compile times suck.
They're great and incredibly useful. And one should not forget that you can easily use them in a Python extension written in C++14 and exported using Cython or SWIG.
Is total time really that interesting as a metric? Factor in cost, both in terms of, say, what the employer pays you and what they pay for CPU time, sprinkle it with costs in terms of externalities (e.g. the cost of millions of clients executing poorly performing code vs the cost of millions of clients paying for the additional development overhead of well performing code) and the equation is a lot more complex and application-dependent.
Then weigh in the hard realities of some engineering problems. It won't matter that it takes 1% of the time to implement a video decoder in python if it can't deliver decoded frames in a timely manner. It won't matter that the C solution will run 1000x faster if you need a month to develop what should be delivered on Friday.
I'm sorry if this is already covered in the article. I had a brief look before but it won't currently load.
As for high level C++ threading you have OMP. It's incredibly easy to use. In the simplest case you just use a preprocessor directive before a loop to say it should run in parallel. It's probably not as nice as what you get in Haskell because it needs to be done explicitly but it is really easy to use.
GHC Haskell is an advanced optimising compiler, which can get very near the speed of C and C++.
However, to write fast programs, one must use the right data structures and algorithms. Often this means array-based strings and streaming IO; unfortunately, many Haskell textbooks don't cover this.
All the comparisons of Haskell and C I have seen where Haskell comes near the performance of C(++) have compared a highly optimised Haskell program to a moderately fast C program.
I have found that most managed languages generally come in at 2-5 times slower than C and C++, which is good enough for me.
Have you tried specifying a smaller maximum heap size? The biggest memory issue IMHO, is that Java and the JVM do not specialise any "generic" code and so there can be a lot of unnecessary boxing going on.
“Lies, damn lies and benchmarks” I think is the saying.
Python itself is a slow language but it has a lot of fast packages, so it shows poorly when you actually write your benchmark in python.
Haskell is a faster language but because it is high level there are more pitfalls you’ll get into if you don’t know the ins and outs of getting fast code out of the compiler. The guys who write fast benchmark code aren’t ‘average’ developers.
So in Haskell it’s “the code is slow and I don’t know why” vs python “the code is slow because python is slow, import fast package someone wrote to speed it up.”
All that said, I think Haskell is the better language but you have to put in more effort to get experienced in it before you see returns on the investment. Python has a shallower learning curve and an easy way to get “good enough” performance (a bit slower than C).
The best criticism in the article is of the multi-core deficiency of python’s interpreter. But that’s only briefly touched on. It isn’t a friendly environment to write complicated multi core code.
> Python has a shallower learning curve and an easy way to get “good enough” performance (a bit slower than C).
The article we are all replying to claims exactly that: as soon as you can't use e.g. NumPy, it's not "good enough" anymore, and I agree with that. The article also argues that JavaScript is no longer in the same category as Python but much faster, even though it's no less dynamic.
I think the reason for JavaScript's speed vs. Python's slowness is obvious: there were wealthy companies involved, which were, due to competitive pressure, motivated to speed up their own JavaScript engines.
To get to the point where Python has similar speeds somebody would have to be motivated enough to invest heavily, and then it could happen. As far as I know, there aren't technical limitations against that.
Having actively worked on a JIT compiler for CPython: in retrospect, JavaScript had a significant advantage over Python, namely the expectation that one's JavaScript code must run more-or-less compatibly on a variety of interpreters.
So much Python has historically been tied to CPython's specific idiosyncrasies that there is significantly more onus on the upstart VM developers to maintain compatibility with paralinguistic behaviour (things like expectations regarding object destruction sequencing).
It seems like the downfall of anything non-CPython is either the C API or the GIL, although I'm shocked at how good package support is looking for PyPy now (http://packages.pypy.org). Makes me want to try it again.
Javascript also has the "benefit" of an appalling base library, while the base library that CPython provides is quite large, and growing.
Every time this comes up, there are also the good examples of Lisp, Dylan and Smalltalk as languages that are as dynamic as Python while enjoying relatively good JIT compilers.
Yes, naive use of lazy evaluation can cause performance problems. So can naive use of strict evaluation. It's important to have a solid understanding of your language's evaluation model. This is probably the largest barrier to writing highly performant Haskell. Not because it is vastly more difficult or anything, but it is very different from pretty much any other widely used language.
I think you could replace "isn't easy" with "isn't familiar". Haskell is very likely to be the first language a developer encounters which is lazy by default instead of strict by default (for many good reasons).
So no, it's not easy in a similar fashion that pointers or double pointers in C/C++ are not easy. Or understanding call by value vs. call by reference semantics are not easy. The list goes on. It's probably the largest barrier to learning the language.
But once you get a handle on the evaluation model it becomes a lot more natural. At least that was my experience, maybe it is not typical.
I don't think it's true to say that Python's core developers are uninterested in performance. Speeding up Python is a hard problem. He mentions PyPy but even that has only managed modest performance gains in some areas (and not without tradeoffs). He suggests JavaScript as a comparison but doesn't elaborate on how they're comparable beyond the superficial (they're both dynamic scripting languages).
I get that he's frustrated with Python's performance but it would be really interesting to hear from someone who knows the technology involved rather than simple speculation.
Python's little tin god really likes his CPython implementation being the One True Python. Python does the things that are easy to do in an interpreter where everything is a dictionary, and avoids things which are hard to do in that environment. In Python, you can store into any variable in any thread from any other thread. You can replace code being executed in another thread. Even Javascript doesn't let you do that. This functionality is very rarely used, and makes it really hard to optimize Python.
(And no, calling C whenever you need to go fast is not a solution. Calling C from Python is risky; you have to maintain all the invariants of the Python system, manually incrementing and decrementing reference counts, and be very careful about not assuming things don't change in the data structures you're looking at. This is not trivial.)
A generation ago, Pascal had the same problem. Wirth had an elegant recursive-descent compiler that didn't optimize. He insisted it be the One True Compiler, and managed to get the ISO standard for Pascal to reflect that. The decline of Pascal followed, although Turbo Pascal for DOS, a much more powerful dialect, had a good run, and Delphi still lives on.
If you neglect some of the metaclass stuff, which I believe most people do, then Python is nearly isomorphic to JavaScript. I think the comparison is very fair.
I also believe the reason Python is unlikely to ever catch up to JavaScript is the same reason that CPython will always be the dominant implementation - They've exposed so much of the C internals, that everyone is bound to the actual slow and single threaded implementation. JavaScript implementations in web browsers can do lots of magic behind the curtains because the majority of users don't rely on the actual innards being consistent from release to release.
Viper is an interesting approach on speeding up Python.
It's developed for MicroPython, which does give them room for breaking changes, but has trade-offs.
Arithmetic is much faster, but dictionary lookups take much longer compared to CPython.
Viper is a code-emitter from a large subset of Python, and even allows for inline assembly. But it's only for a few architectures at the moment, like ARM and x86.
Zerynth is a development suite. Notably, it makes use of a VM.
Viper is just one of the code emitters buried inside the MicroPython source code, like here [0]. Notably, it produces native code, not bytecode for a VM.
To be honest, Dylan is not exactly alive anymore. CL and Smalltalk still have commercial vendors providing implementations; Dylan currently has two implementations, but one of those (Gwydion) is completely neglected and the other (OpenDylan) has maybe 5 developers working on it in their spare time.
It's a shame because, even with its verbose syntax, Dylan is a nice language, with a module system, an object system based on multimethods, hygienic macros and, as you noted, an AOT compiler. It could have been huge. We could have ended up with Dylan in place of Java, and we'd be better off for it. I consider Dylan one of the biggest missed opportunities in PL space.
PyPy achieves huge performance gains in most long-running code. I don't know what you mean by modest. The main hurdle is slightly longer start-up times while the JIT warms up (comparable to Java).
Is writing extensions a lost art? I read a few blog posts about speeding up Python and Ruby with Rust extensions. This should enable rewriting only the slow parts. Later, you could replace more of it if needed. Is writing extensions so very problematic in practice?
I know Go has runtime issues making it not very good for mixing with other languages, so it often encourages rewriting the whole application in it.
Extensions aren't a total solution, though, which people often sell them as. You have an impedance mismatch between Python and C code, because Python has all of its objects packed in a way that is very strange to C, so you end up essentially deserializing all objects into C, then back out into Python, in a very expensive and allocation-heavy (on both sides) conversion.
If you can set up your computation in Python and run it in C, as with a lot of NumPy code, you can have your entire program basically run at C speeds. But if you have a complicated algorithm in Python, perhaps implementing business logic, you can very easily see a slowdown if you try to move bits of that logic into C piecemeal, as you end up paying more in cross-language serialization and overhead than you can win back.
In addition, writing extensions can be hard. First you've got a maze of choices nowadays, and while many of them are quite good at what they do, it can be difficult to figure out whether you're going to do something they aren't good at and have to switch options later, and it's really hard to figure out how to even analyze what they are and are not good at when you're not already familiar with the space. Then, if you do end up having to delve into the raw C, it's very tedious code, very tricky code to deal with the PyObjs, and code that can segfault the interpreter instantly if you don't get it right, which is not what you want to read about your multi-hour processing code. And for a greenfield or very young product, this maze of options is competing against other ecosystems where you can simply implement your code and get it to run 20-50x faster while writing code that is easier to write than a Python extension.
They are a solution to some problems. I don't deny this. NumPy is an existence proof of that statement. But I'd write this post because I'd say "if it's slow, just write the slow bit in C!" has been oversold in the dynamic language community now for at least the 15 years I've been paying attention, and it still seems to be going strong.
In 2003, it may still have been a good choice; in 2018, my recommendation to anybody writing the sort of code where this matters is to pick up one of the several languages that are simply faster to start with, and are much more convenient (even when statically typed) to work with than the competition was in 2003. And I also want to say that Python is still good for many things; I still whip it out every couple of months for something; it's still on my very short list of best languages. But the ground on which it is the best choice is definitely getting squeezed by a lot of very good competition and the changing nature of computer hardware, and a wise engineer pays attention to that and adjusts as needed.
"Python extension" doesn't mean C. Rust works perfectly for extensions: it covers a lot of the low-level C API integration and it is fast. You can write the whole application in Rust and use Python as a glue language.
That sounds like one of the "maze of choices" I mentioned, no?
And if you're "writing the whole application in Rust and using Python as a glue language", you don't have the problem that this entire discussion is about, which is when you have Python code that is slow. Python as an extension language is a completely different world. Performance problems there are a much less big deal, because you've already got the option to simply use the fast language with only modestly more complexity, if indeed even that given how nice Rust is once you get used to it. It's when your whole app is in Python that these issues emerge, and "Just write extensions" is an option far less often than portrayed.
My point is, you are not limited to using extensions only for optimizing hot loops; in Rust you can write application logic as well. I doubt you should do this in C, for example.
PyO3 is a fork of rust-cpython, which has a nasty abort issue[1] which is unfortunately a show-stopper for me. It isn't clear to me if PyO3 is also affected by this issue.
> Pyo3 is not affected by this issue. pyo3 compiles in c-api interface, it doesn't use separate libs for that (python27-sys)
At some point, it must use a separate "lib" for that. Some of the core functions in the Python API exist in the Python binary; it is debatable whether one considers that a "separate lib" or not, but it isn't possible to compile it into your binary. (Or there would be two of whatever you decide to compile in, and that would be problematic.)
I looked into it, since you said it was not affected. PyO3's own README notes that it is affected by the issue; it lists the same proposed solution as the bug against rust-cpython does. While the solution "works", in the sense that you can build a working module from it, the problem is that its ergonomics are terrible; my understanding is that it completely prevents one from being able to `cargo build`.
That said, I was not aware of either `cargo rustc` or `setuptools-rust`; at the time I was looking into it, setuptools lacked the necessary support to implement `setuptools-rust`, so that's nice to see that that has finally occurred. `cargo rustc` alleviates much of the concern around the ergonomics of building the extension, though that'll still be fun to explain to coworkers. The combination of all that would seem to imply that building Rust extensions might finally be somewhat feasible.
Disagreeing with you and agreeing with the parent, it sounds like a lost art... NumPy doesn't pack and unpack Python data structures, it just uses C structures. Python extensions I've written just use C/C++ data types and only occasionally pass Python native types back to Python. Python is amazing for developer productivity, but the methods are a bit opaque.
"Python extensions I've written just use C/C++ data types and only occasionally passes python native types back to python."
See my other post; was your system basically using Python as an extension language on a system fundamentally implemented in C/C++? That's not the problem this discussion is about, which is when you have a large pile of Python code that is the main component of your system, and has proved to be slow. Piecemeal extensionization is not a very good option there, and non-piecemeal extensionization is "rewriting the system in a faster language".
If it's a lost art, it's because the domain where this is the best option is steadily shrinking. There's an increasing number of languages that interop well with C (and sometimes C++), are more convenient, and are still fast. Many of them are fast enough and convenient enough to simply implement your code in that language in the first place. As a result of that, I personally think that dynamic scripting languages have reached their peak and are now facing inexorable slow decline; the problem they solved in the 1990s is increasingly not a problem as a crop of languages that are both convenient and fast continue marching forward. JavaScript is, as ever, an exception due to its currently-privileged place in the browser ecosystem, though over the next decade that's going to fade as WebAssembly hits maturity.
(But let me emphasize that "slow"; I'm talking decades, not months. There is still plenty of opportunity to graduate this semester, get a job in dynamic scripting languages, and be in that space for 20 years. But I think in another 2-4 years we're all going to be able to agree they've peaked.)
I'm not sure I follow what's hard about writing extensions. You can do it in modern C++ if you will and even old SWIG can generate decent bindings for you if you keep your interface sane.
I agree with that. But for the OPs specific problem, a function returning the genetic diversity as a float between two samples, rewriting it in C or Pyrex would have been an ideal solution.
Not even extensions are required -- you could have C libraries that you hook into rather easily. You don't have to learn to navigate the PyObj or the Python extension docs, just know how to write & expose C functions as a shared library.
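As a rough sketch of that route (the library and its mean() function are hypothetical stand-ins for whatever hot loop you moved to C), ctypes can call a plain shared library without touching the CPython extension API at all:

    # Assumed C side, compiled to ./libfast.so:
    #     double mean(const double *xs, size_t n);
    import ctypes
    import numpy as np

    lib = ctypes.CDLL("./libfast.so")
    lib.mean.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
    lib.mean.restype = ctypes.c_double

    xs = np.random.rand(1_000_000)
    ptr = xs.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
    print(lib.mean(ptr, xs.size))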
If an extension just works, people take it, the author gets two or three positive remarks and is ignored from then on because the extension now has utility status.
Much better to write an application in Python using 30 slow Python module dependencies, so there are always some fires to fight and the application always gets publicity.
> At the same time, data keeps getting bigger and computers come with more and more cores (which Python cannot easily take advantage of), while single-core performance is only slowly getting better. Thus, Python is a worse and worse solution, performance-wise.
PySpark makes it really easy to take advantage of multiple cores & machines. Most operations I want to do to my data I can find in PySpark's pyspark.sql.functions, so I get all the benefits of the JVM. In the cases where I need something from Python, I can just use a UDF; it's a little slower than the JVM but still extremely fast when distributed. I find all problems come down to time or memory complexity, which is independent of whatever you're programming in. Also, it's very easy to take advantage of spot instances with Spark... I'm usually working with 2-20 spot instances, and sometimes go up to 60 depending on what I'm doing.
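A rough sketch of that workflow (the session setup and column names are purely illustrative): stay inside pyspark.sql.functions where possible, and drop to a Python UDF only for the odd missing piece, accepting the per-row serialization cost.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, 2.0), (2, 3.5), (3, 7.25)], ["id", "value"])

    # JVM-side work: no Python invoked per row
    fast = df.withColumn("log_value", F.log(F.col("value")))

    # Python UDF: slower per row, but easy when the built-ins don't cover it
    clip = F.udf(lambda v: min(v, 5.0), DoubleType())
    mixed = fast.withColumn("clipped", clip(F.col("value")))
    mixed.show()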
The original article said that one reason it doesn't matter that pure Python's performance is poor is that you can use numpy (and pandas) to vectorise things, which then has native code performance. It goes on to say that his current problem is that the things he's doing today can't be vectorised with numpy – so that poor performance does matter after all. If he can't even express his code in terms of numpy operations (and other C-based libraries like scipy), I doubt they're going to be expressible in terms of Spark's primitives, which are a considerably smaller subset.
This is true, but using a Distributed system like Spark itself adds a ton of complexity in having to understand and manage it. If one can do something with a set of stateless processes, even if it's more performant, I feel it's a bad idea to use a distributed system instead. Not always, but a good majority of cases that I've seen. I've seen projects where Celery would be enough but instead they chose to use Spark/Storm and never delivered.
Spark is great, but at that point why not just use Scala? It offers Python-like conciseness/productivity, and by using the same language Spark is written in you avoid a big class of possible interop issues.
PySpark should only be used for prototyping. It adds enormous extra overhead to operations due to serializing data back and forth between the Java and Python processes.
I’d like to thank the author for sharing a very practical view of problem solving in the data science space.
Can I suggest Julia? It's very easy to understand coming from Python, and performant code can usually be had with an easy-to-read implementation of the expressions in whatever paper you are basing your work upon.
Dan is thorough, and I trust him to make a good faith effort to understand things. If you'd like to refute the arguments and not the messenger, I would love to learn more.
The post is quite old: while the technical arguments certainly had merit at the time, they have largely been addressed (the exception is probably error handling, but his complaint there is more subjective, and I still don't think any language really has a good answer for that one).
As to the community, I'm not exactly sure what happened with Dan (he only has a handful of posts on GitHub and mailing lists, so it seems to have been largely in private emails), but my experience could not have been more different: even from the early days they have been very friendly and helpful.
The linked article doesn't attempt to refute the significant claims from Dan's article (which has an update from a year later, so 2015, at the bottom):
1. That the language is (was) undertested, and as a result, full of easy to run into bugs
2. The language makes it easy to ignore errors
3. APIs are inconsistent
4. The head branch isn't kept build-clean (i.e., often it fails to build)
5. Code is often undocumented, with non-descriptive naming. The combination makes it difficult to understand what code is doing.
Dan also writes that he expects the performance issues would be fixed and those aren't what concerns him:
> The purely technical problems, like slow load times or the package manager, are being fixed or will be fixed, so there’s not much to say there.
And bonus community problem:
> Update: this post was edited a bit to remove a sentence about how friendly the Julia community is since that no longer seemed appropriate in light of recent private and semi-private communications from one of the co-creators of Julia. They were, by far, the nastiest and most dishonest responses I’ve ever gotten to any blog post. Some of those responses were on a private discussion channel; multiple people later talked to me about how shocked they were at the sheer meanness and dishonesty of the responses. Oh, and there’s also the public mailing list. The responses there weren’t in the same league, but even so, I didn’t stick around long since I unsubscribed when one the Julia co-creators responded with something bad enough that it prompted someone else to to suggest sticking to the facts and avoiding attacks.
Now, I don't know if any or how many of those issues have been fixed since 2014-2015. But the 2018 blog post about microbenchmark improvements doesn't really address these concerns.
There have been over 25,000 commits since that post was written; clearly, any comments about the specifics of the language are terribly out of date.
For example, the base language now has 92% test coverage.
CI is being run on Windows 32/64 bit, Linux 32/64bit, macOS, FreeBSD with ARM CI coming up.
Every day a large number of benchmarks are run on dedicated hardware and the results are tracked. Benchmarks are also run on most PRs that have potential performance implications before merging. Before new point releases, the tests for all registered packages are run and compared to the old version. Every new failure in a package test is tracked down to make sure there is nothing breaking in the point release.
While package load time is still an issue, the precompilation feature that was introduced has helped significantly. Also, new methods of working with the language, like Revise.jl (https://github.com/timholy/Revise.jl), which updates the code that is being executed in real time when you save your file, make load time much less of an issue.
No matter who the author of a blog post is, if it is about something that has remained in rapid development for years after the post was written, the information will have little relevance to the current situation.
(1) Test coverage has dramatically improved.
(2) Improved, and to some degree a matter of taste (Julia is not statically checked, so some errors can only be discovered at runtime)
(3) There has been significant refactoring since 2014 to improve consistency, especially in the recent push toward 1.0.
(4) Developers are much more disciplined now, and just about everything goes through CI.
(5) Docs are much, much better for both users and developers. For the latter, see https://docs.julialang.org/en/latest/devdocs/ast/ and others in that section (none of which existed in 2014, IIRC).
> And bonus community problem:
I can't comment on private discussions to which I was not a party, but people can search the archived google groups discussions (julia-users) for the blog author's name to read the public threads.
That said, the core of the concern, as I understand it, was that these perceived issues would put a damper on growth. Young languages, especially, need growth to outpace attrition to maintain viability (ecosystem, broad testing and platform support, etc.). Three years on, with sustained user-base growth, increased funding, and Julia having been used for a number of articles in high-impact journals (e.g. Nature) -- I think this concern is somewhat less pressing. See
https://pkg.julialang.org/pulse.html for a snapshot of the sustained ecosystem growth, and
https://discourse.julialang.org to get a sense of the breadth, depth, and responsiveness of the community (in my obviously-biased view, of course!).
A related concern was (quoting the blog post):
> A small team of highly talented developers who can basically hold all of the code in their collective heads can make great progress while eschewing anything that isn’t just straight coding at the cost of making it more difficult for other people to contribute. Is that worth it? It’s hard to say. If you have to slow down Jeff, Keno, and the other super productive core contributors and all you get out of it is a couple of bums like me, that’s probably not worth it.
Julia as a whole has 697 contributors as of right now. At the very core of the language, the parser and lowering code have well over 40 contributors, and code generation (LLVM lowering) has over 60.
It's really shocking to me that years later, nobody has addressed the most serious issue (community interaction) Dan brought up. Just reading the archived mailing list thread I could not believe the suspicion, and later outright hostility towards Dan's comments. It obviously dawned on a couple of core contributors, as can be seen by the sheepish attempts to walk things back, but this sort of interaction _with the project leadership_ is a giant red flag about how they view their community and users. If someone like Dan can be treated in that way publicly (and far worse privately, according to his own account), things are going to be so much worse for other people.
FWIW, I have had very positive interactions with the community, including some discussions about proposed features on the github issues with founders as part of the discussion.
My impression is they are thinking very deeply about this work, from a theoretical CS standpoint, but also open to input from average users of the language.
Thanks for mentioning Julia, a good solution to slow Python code. I used Julia for 2 weeks last year on a consulting gig, and despite the rough spots, I think that given more development time Julia might become fairly popular.
I have never been much of a fan of Python. 15 years ago at lunch Peter Norvig was talking about the advantages of Python (he and I wrote Common Lisp books at the same time). I then tried Python for a few months but then went back to Ruby for a scripting language and Java, Common Lisp, and Haskell for non-scripting tasks.
I now manage a machine learning team that is all-in on Python, so that is what I am using also. I am starting to really like Python, and by using type annotations and making heavy use of pylint and PyCharm, I am finding Python really nice to use. Spinning up on Cython is on my want-to-do list as well.
This. I can't emphasize enough how useful type hints + a decent IDE are for productivity.
Cython is also pretty good. My tip for writing fast Cython is to code as if you were writing pure C and forget higher-level constructs. This way it gets translated to C almost 1:1, with all the performance benefits.
Alternatively, if you do need more abstraction, you can write nice C++14 and use Cython as glue code.
This is the productivity problem that Julia aims to solve. The idea is to stay in one language that everyone on the team can work productively in, while still producing high-performance code.
>I used to make this argument. Some of it is just a form of utilitarian programming: having a program that runs 1 minute faster but takes 50 extra hours to write is not worth it unless you run it >3000 times. For code that is written as part of data analysis, this is rarely the case.
I find this argument breaks down if you consider human psychology. Especially with a program taking 15 seconds or 30 minutes (which is a reasonable time span between say a C++/Rust implementation and a Python implementation in some cases I've experienced).
With 15 seconds of exec time you might stay in flow. With 30 minutes you're almost guaranteed to have started something else. Maybe you even forget and only get back to it the next day. All of a sudden your 30-minute delay becomes a day. Then you notice you made some wrong inputs, and you lose another day. In the other case you're still under a minute.
I find small increases in program delay often lead to big increases in time inefficiency. It's hard to constantly context switch in and out of tasks.
I think the utilitarian argument should take human psychology into account and weigh more heavily towards faster programs.
I recently discovered that pypy3 can run all my day to day Python code. It has some issues with slightly different behavior from cpython when using threads but other than that I see a 4x speedup on most of my slowest pure python workloads (parsing large rdf files and reserializing them after computing a total order on all their nodes). Huge win for productivity.
> I see a 4x speedup on most of my slowest pure python workloads
Heh, only 25..250X to go. We did a direct line-for-line translation of some numerically intensive code from Python to C++ and saw a literal 1000X speedup. On other projects, the gap has been more like 100X. That says two things: first, Python can be really slow; second, for some programs, Python doesn't really save on lines of code over modern C++.
I've been very impressed with PyPy however. In testing, it can sometimes sneak up to less than a factor of 2 slower than C. However, the bummer comes when it doesn't hit that mark and you have no idea how to trick the JIT to do better. If it works, great. If it doesn't, you don't have much insight into why.
Finally, I've always been able to get Cython to parity with C++. However, when I'm done, I wonder what I gained. The C++ isn't that much more complicated than adequately type annotated Cython.
While I'm more of a pythonist than a C-ist, hearing "1000x speedup" and "line for line" to me implies that you aren't writing idiomatic python. Idiomatic python is (often) faster than not, and (often) more difficult to translate to lower level languages.
As a simple example, list-comprehensions are faster than loops, and can't be line for line translated into C++.
#include "cpplinq.hpp"
int computes_a_sum ()
{
using namespace cpplinq;
int ints[] = {3,1,4,1,5,9,2,6,5,4};
auto result = from_array (ints)
>> where ([](int i) {return i%2 ==0;}) // Keep only even numbers
>> sum () // Sum remaining numbers
;
return result;
}
Are they really though? In my experience you'll gain a couple of percent because you're getting rid of that call to append, but that's hardly the orders of magnitude OP was looking for.
Yes, I wasn't saying they would fix this problem (they won't), but making the broader point that in general, more idiomatic/pleasant looking code is faster. This is neither of those, and so my hunch is that there are improvements that can be made to both form and function.
> Finally, I've always been able to get Cython to parity with C++. However, when I'm done, I wonder what I gained. The C++ isn't that much more complicated than adequately type annotated Cython.
Depends on who needs to work with the code; but sometimes, it might be nice to have an obviously correct (but slow) pure python version, that can share apis and tests with the convoluted "fast enough" version.
Also, I’ve more than once seen cpython beat C++/Fortran since it’s easier to do the right algo/datastructure things, plus numpy is more optimized than most «amateur» C loop-over-arrays.
Honestly, NumPy is gonna be hard to beat even for someone knowledgeable in certain use cases, especially ones where the overhead in Python is trumped by time spent in library calls. It's the same reason that it's hard to beat MATLAB or Mathematica in cases they are optimized for despite being relatively slow languages. They are calling some of the most heavily optimized libraries in existence (e.g., BLAS) and using heuristics to help choose the smartest evaluation strategy.
Edit: More speed on the Python side is good though, because it gives you flexibility. Sometimes it's hard to figure out how to do stuff optimally in NumPy, versus just banging things out in a for loop. I've definitely done that when I wanted something to just work, versus spending an hour figuring out what arcane incantation I need to pass to np.einsum to get the operation I want.
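For what it's worth, here is the kind of trade-off I mean, with made-up data: row-wise dot products as an obvious Python loop versus the einsum incantation that stays inside the compiled machinery.

    import numpy as np

    a = np.random.rand(10_000, 64)
    b = np.random.rand(10_000, 64)

    # obvious but slow: one Python-level call (and one boxed float) per row
    slow = np.array([np.dot(x, y) for x, y in zip(a, b)])

    # the incantation: for each i, sum a[i, j] * b[i, j] over j
    fast = np.einsum("ij,ij->i", a, b)

    assert np.allclose(slow, fast)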
Sure, I’ve seen C++ experts beat by numpy, i.e BLAS/LAPACK. A lot of people don’t use BLAS from C++ either, which would improve the performance on that side, too.
I don't find this a very compelling argument. The author doesn't mention any attempts to profile or speed up the code.
Specifically with pandas I've found if you aren't careful you can do a lot of unnecessary copying. Not sure if that's what is going on here, but cProfile can help find the bottlenecks.
Seconding this, there are a couple of things that jump out at me as immediately non-optimal, and which together would probably give an order of magnitude speedup.
- Defining compute_diversity inside a double for loop
- `sample1.ix[sample1.index[sample1.index.duplicated()]]` appears overengineered (I think you can just remove the `sample1.index` here (edit: you can't, but I think you could refactor to remove the indexing and reindexing and index resetting, and then you could))
- Depending on the data size, swapping from `[` to `(` everywhere would give a nice speedup just because you no longer need to store everything in memory/swap to disk, whereas in Haskell the list comprehensions would be lazy by default. (edit: seeing as the databases downloaded are 12 and 33 GB, and Pandas generally requires 2-3X RAM, it's likely that there's swapping happening somewhere. I'd bet that using generators would be a big speed boost -- a sketch of the list-vs-generator difference follows after this comment)
- Overall I think genetic_distance can be significantly simplified, a lot of the index-massaging doesn't look necessary. I could be wrong, but this looks sloppy, and sloppy often implies slower than necessary.
Unfortunately, the provided data files are big enough that I can't easily benchmark on my computer. I can't even fit the dataset in memory!
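To illustrate the `[` vs `(` bullet above (toy numbers, nothing to do with the OP's data): the bracketed comprehension materialises every element up front, while the parenthesised one yields lazily, so peak memory stays flat.

    import sys

    rows = range(1_000_000)

    as_list = [r * 2 for r in rows]   # holds a million ints at once
    as_gen = (r * 2 for r in rows)    # holds only the generator's state

    print(sys.getsizeof(as_list))     # several megabytes of pointers alone
    print(sys.getsizeof(as_gen))      # on the order of a hundred bytes
    print(sum(as_gen))                # elements are produced one at a time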
The problem is that libraries that work fine when everything fits in RAM start breaking down if you aren't careful. Not really a Python speed issue, but you lose some of the tools you relied on previously.
While that may be true, my point is that it is almost certainly possible to make your code go faster than it is already, and also become more readable in the process.
And so saying that python is either slow or ugly and unreadable is perhaps an unfair characterization. I may be wrong here. I haven't benchmarked the code in question, but I think that even for the algorithm you're trying to do, with the special casing, that function could be significantly simplified.
Edit: I'd be curious to see example data that is passed into this function.
That may be the case. However, my point is that we started with a rather direct implementation of a formula in a paper. This was very easy to write but took hours on a test set (which we could extrapolate to taking weeks on real data!).
Then, I spent a few hours and ended up with that ugly code that now takes a few seconds (and is dominated by the whole analysis taking several minutes, so it would not be worth it even if you could potentially make this function take zero time).
Maybe with a few more hours, I could get both readability and speed, but that is not worth it (at this moment, at least).
The comment about the benchmark data being large is exactly my point: as datasets are growing faster than CPU speed, low-level performance matters more than it did a few years ago (at least if you are working, as I am, with these large data).
1. You could have gotten similar performance boosts elsewhere, meaning that you wouldn't have needed to refactor this function in the first place (although the implication of a 10,000x speedup means that may not be true, though I can absolutely see the potential for 100x speedups in this code, depending on exactly what the input data is)
2. It's likely that there are more natural, more idiomatic ways to implement the function you have in pandas. These would be both clearer and likely equally fast, possibly faster. (Heck, there are even ways to refactor the code you have to make it look a lot like the direct-from-the-paper implementation.)
In other words, this isn't (necessarily) a case of Python having weak performance; it's a case of unidiomatic Python having weak performance. This is true in any language, though. You can write unidiomatic code in any language, and more often than not it will be slower than a similar idiomatic approach (repeatedly applying `foldl` in Haskell, say). I'm not enough of an expert in pandas multi-level indexes to say that for certain, but I'd bet there are more efficient ways to do what you're doing from within pandas that look a lot less ugly and run similarly fast.
Granted, there's an argument to be made that the idiomatic way should be more obvious. But "uncommon pandas indexing tools should be more discoverable" is not the same as "python is unworkably slow".
1. No, that function was the bottleneck, by far, and I can tell you that >10,000x was what we got between the initial version and the final one.
2. I don't care about faster at this point. The function is fast enough. Maybe there is some magic incantation of pandas that will be readable and compute the same values, but I will believe it when I see it. What I thought was more idiomatic was much slower.
I think this is more of a case of "the problem does not fit numpy/pandas' structure (because of how the duplicated indices need to be handled), so you end up with ugly code."
1. you don't get 10000x speedups by changing languages. It's likely that this optimization would be necessary in any case.
2. You don't care about improving the code, but you did care enough to write an article saying that the language didn't fit your needs without actually doing the due diligence to check and see if the language fit your needs. That's the part that gets me.
Well, I just wanted to use pandas to load a 4GB csv file. After it had used 32GB of my RAM and 4GB of swap, I gave up. I just loaded all that data into Postgres and made a couple of queries. That way I stopped using pandas at all.
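One common workaround for that specific failure mode (not what was done above, and the file/column names here are made up): stream the CSV in chunks and keep only the reduced result, so peak memory is bounded by the chunk size rather than the file size.

    import pandas as pd

    totals = None
    for chunk in pd.read_csv("big.csv", chunksize=1_000_000):
        part = chunk.groupby("key")["value"].sum()
        totals = part if totals is None else totals.add(part, fill_value=0)

    print(totals.sort_values(ascending=False).head())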
I found that pandas is great for data exploration and for data that you know is small (a few hundred MB). Other than that, Python builtins and numpy arrays are a better alternative.
I hardly use pandas at this point besides read_csv, which is very good once you know the syntax for parsing strings/dates, skipping rows, dropping columns, etc.
After that I usually just keep the numpy array since all I need is floats. I guess the index groupby stuff is cool, but I never really needed it. Postgres is fine but if you're just doing numerics it doesn't help much.
I use a lot of Python for web stuff and I haven't been in a situation where Python itself was the performance bottleneck. I always thought that when you run into a situation where Python is the bottleneck, you replace the critical bits with something like C/C++/Rust. Following this approach, you would get the best of both worlds: rapid proof-of-concept/time-to-market with the option to improve performance-critical parts later (which often isn't necessary). Could anybody share some experience with this?
It requires that your code is architected so performance-critical sections of Python can move into C etc. Let's say that your code creates a complex object tree from some configuration settings, and executes Python methods and code from all over it, using heavy OO. That is difficult to move to C++, as your time is spent on Python bookkeeping: you are calling methods and thus looking things up in dictionaries, you are modifying fields and looking up more things in dictionaries, incrementing and decrementing refcounts, etc.
If you have a million 32-bit numbers that you currently run Python code on, great, you don't have to convert Python objects to C at all.
> Let's say that your code creates a complex object tree from some configuration settings, and executes Python methods and code from all over it, using heavy OO.
Luckily, this does not apply to the codebases I'm working on, which are all quite functional (no classes, no inheritance, pure functions exclusively, immutable data types, etc.). I have the feeling that this will not hit me that hard. If you rely on pure functions, you have all the application state that the function needs in its parameters, and you pass all new state back through `return`. I guess all I'd have to do is convert the types once for the C function call (from Python to C) and once for the `return` (from C to Python)?
His example of a function which is unreadable is pretty typical. It still might be slower than a tight loop in C, but it’s only unreadable the first time you write something like that.
I admit I recognized only some general NumPy things like masks, unique, reductions, outer, etc.: I don’t use Pandas and am not sure what the non-NumPy stuff does.
I still don’t think it’s more obscure than equivalent for loops or FP folds or similar.
The go-to solution for speeding up Python code should always be first to use Cython on critical sections of your Python code and tweak your code using type annotations, at least IMHO.
Do type annotations really make any difference to the interpreter? I thought that the interpreter doesn't care about what type a variable is annotated to...
> Do type annotations really make any difference to the interpreter?
Note the advice was first use Cython (a compiler for a superset of Python), and then tweak as needed with type annotations. Cython’s compiler definitely uses type annotations.
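A rough sketch of that "Cython first, then add types" workflow, written in Cython's pure-Python annotation style (assuming Cython is installed) so the same file still runs unmodified under CPython; once compiled with cythonize, the annotated ints and doubles become C-level variables and the loop loses most of the interpreter overhead.

    import cython

    def harmonic(n: cython.int) -> cython.double:
        # under plain CPython these annotations are effectively ignored;
        # compiled with Cython they turn the loop into C arithmetic
        total: cython.double = 0.0
        i: cython.int
        for i in range(1, n + 1):
            total += 1.0 / i
        return total

    print(harmonic(10_000_000))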
OP, I would encourage you to take some courses on high-performance computing and, especially, on architecture awareness in programming. These types of courses will help you increase the performance of your programs by making you aware of what's running "under the hood" and of ways to "help" the compiler/interpreter find better optimisations.
Although it's accurate to say that Haskell or C++ is faster than Python, having had a quick look at the examples you posted around here, I believe there's still a lot of room to improve your Python code (performance-wise) in ways that could bring a significant speedup.
However, bear in mind that you shouldn't expect Python to get close to C++ performance unless you start using libraries such as NumPy that are, essentially, written in C/C++.
To me, the critical quality exposed is the abstraction/synthesis moment of Haskell/types/FP thinking. Python is what I use, but I rely on insights from a Haskell person to get solutions of merit. Left to my own devices, I frequently derive Python solutions with bad scaling, few and weak opportunistic parallelism moments, and heaps of errors.
When driven to think in types and simple function composition, the solutions seem to run better.
Let me start with a disclaimer: I like Python, I've used it for a decade, and it pays my bills.
The author claims that Python has the lowest developer cost. I used to think that was true, and maybe it is in data science applications, but I regularly find that I'm quite a bit more productive in Go than in Python (largely thanks to the type checker and other static analysis tooling). As an added bonus, Go programs are regularly 100 times faster than Python programs, and usually Python programs are much more difficult to optimize than Go programs.
Library availability notwithstanding, starting new projects (of any significance at all) in Python is looking like a worse and worse choice all the time.
> Python, it is slow as molasses. I don’t mean slower in the sense of “wait a couple of seconds”, I mean “wait several hours instead of 2 minutes.”
Python can be multiple orders of magnitude slower than the equivalent in C/C++/Rust, but,
> more cores (which Python cannot easily take advantage of)
Python's multiprocessing makes launching new processes (which can take advantage of more cores) pretty much as easy as launching threads.
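A small illustration of that claim using concurrent.futures: switching from threads (which stay GIL-bound for CPU work) to processes is a one-line change of executor.

    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

    def busy(n):
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        work = [2_000_000] * 8
        with ThreadPoolExecutor() as ex:      # CPU-bound work stays serialized by the GIL
            list(ex.map(busy, work))
        with ProcessPoolExecutor() as ex:     # same call, now spread across cores
            print(sum(ex.map(busy, work)))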
But as a developer, I am frustrated by a lot of the things people believe are options. They are options, but … they're hard to use, and hard to take advantage of.
* Writing a Python extension requires dropping down to C, which has so many foot-guns, I'd like to delay doing so as long as absolutely possible. Even then, you might not be able to win back that much performance, if most of your time is spent manipulating Python objects. Programmers, in my experience, also vastly overestimate their ability to write correct C.
* Cython can compile Python to "C", but in Python 2 (which I am alas stuck with, despite my will; someday…), has a bug that miscompiles code dealing with metaclasses. Worse still, the latest version of six will trigger this bug. (The Cython developers do not consider this — Cython's compiled version of code behaving differently from Python and CPython — a bug.)
* rust-cpython is theoretically great, but has a bug on OS X that causes aborts (it erroneously links against the Python binary, I think, and this causes issues w/ virtual environments, where a different binary ends up getting used. I don't think this affects Linux, but I need to support OS X.)
(Throw in the enormous amount of time that I spend debugging "object of type NoneType has no attribute 'static_typing'" errors, and the amount of time that I spend wondering "what type is this variable supposed to be?" and working it out by reverse-engineering the code, and I honestly wonder if Python is actually "faster".)
It's hard to tell ahead of time if performance matters. When you hack something out, and are able to get it to work, it inevitably starts to grow features. At some point (around 1000 lines in my experience), Python no longer is fast enough, but the code is difficult to port to something faster.
It would be much better if you didn't have to backtrack on all the code you wrote. I've been burned by this enough times that now I am wary of starting anything in Python, because I know it will grow to be bigger, and I will regret having picked Python!
Exactly. For web dev stuff, the database is usually the bottleneck. Caching is the next place to put your efforts. I think I have had one problem where Python was too slow.
At PyCon US last year, Intel had a booth promoting their version of Python and data libraries optimized for Intel processors. Wonder if any of this would have helped.
I don't think they made raw Python code faster (like PyPy does); rather, they bundle some libraries and make those faster (roughly: if the CPU is Intel, enable optimizations they've done elsewhere).
Sad not to see Cython getting a mention in this post. It is a superset of Python, so your vanilla code will run just fine, and you can optimize slow parts to native-speed. It's a fantastic tool.
I have the impression that there are features in Python that add very little programmer productivity but make the language slow. It should be possible to implement a hypothetical FastPython without such features but with great performance gains. Of course it wouldn't be compatible with most of the libraries. I can imagine, though, that porting most of the libraries to FastPython would still be a manageable task. I wonder if such projects have been attempted.
You mention 1TB files. Why do you guys at EMBL not use a database for this sort of stuff? I'd figure that with some proper indexing you could see pretty decent speedups just from that already.
Not OP, but I work on a downstream project and my current boss used to work on EMBL-Bank back in the day.
A lot of this stuff is in databases, e.g. Oracle, and I think for advanced search it was in Teradata.
However, databases are hard to share, so many steps require dumping the database into interchange formats (custom, and often from before the age of XML or JSON -- yay for ASN.1 parsing!).
Sharing database dumps is done, but commercial licenses and version mismatches add issues here as well. Remember, EMBL/ENA is older than MySQL.
The databases tend to have the wrong shape for the next downstream step, i.e. the table design is tied to a workflow, and if your next step is completely different, you end up with issues. Also, some data can't be published until a certain date, so that needs to be filtered from the dumps in some way.
Consider as well that this project is 3 decades old and used to be printed in books at some point, and shipped on DVD as recently as 2004. File based operations can be extremely efficient.
For some things, we do. But databases are not magical and setting up a good table/index system &c is also work and there is overhead.
Thus, if we are talking about having (for example) a webservice where queries have a form that is known a priori, then it's a good solution. If you have output data from your processing that you will be slicing and dicing in different ways which you cannot predict ahead of time, then they are not appropriate.
(Loading Terabytes of data into a database takes a while too).
Bioinformatics is perpetually ten years behind. The de facto standard for sequencing data is effectively a stripped down bzipped plain text file. It's madness.
Naive interpretation of the bytecode (not even pre-decoded, just a switch statement).
And almost everything is resolved in the dynamic environment. For example,
This is a bit misleading. You suggest that local variables are looked up by name in a dictionary, which is not the case. They are looked up by indexing into a C array, with the index being a constant in the bytecode. That's quite a lot simpler. Here is the corresponding code (look above for the definition of the GETLOCAL macro): https://github.com/python/cpython/blob/fc1ce810f1da593648b4d...
But this isn't a very good rendering of the thing because it doesn't show the many redundant reference count increment/decrement pairs every time you touch a variable.
(Also, interpreter dispatch uses computed GOTOs instead of the plain switch on C compilers that support it.)
LOAD_FAST is the normal case for locals inside a function. Not sure off the top of my head where LOAD_NAME would be generated in normal usage (i.e. where you don't evaluate code from a string).
Edit: Also, I'm talking about Python 3. Maybe you aren't.
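An easy way to see the distinction being discussed, using the dis module (Python 3; exact opcode names vary a little across versions): locals compile to index-based LOAD_FAST/STORE_FAST, the module-level name is fetched with LOAD_GLOBAL, and LOAD_NAME mostly shows up in module-level or exec'd code.

    import dis

    SCALE = 3

    def f(x):
        y = x + 1          # x and y: LOAD_FAST / STORE_FAST (array-indexed)
        return y * SCALE   # SCALE: LOAD_GLOBAL (dictionary lookup)

    dis.dis(f)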
> Where are the main blowouts in python performance?
I did some research a few years ago that tried to quantify some of this. If you trust my methodology, the biggest problems (depending on application, of course) are: boxing of numbers; list/array indexing with boxed numbers and bounds checking; and late binding of method calls. Basically, doing arithmetic on lists of numbers in pure Python is about the worst thing you can do.
And it's not just due to dynamic typing: Even if you know that two numbers you want to add are floats, they are still floats stored in boxed form as objects in memory, and you have to go fetch them and allocate a new heap object for the result.
The basic idea of my study was as follows: Compile Python code to "faithful" machine code that preserves all the operations the interpreter has to do: dynamic lookups of all operations, unboxing of numbers, reference counting. Then also compile machine code that eliminates some of these operations by using type information or simple program analysis. Compare the execution time of the different versions; the difference should be a measure of the costs of the operations you optimized away. This is not optimal because there is no way to account for second-order effects due to caching and such. But it was a fun thing to do.
As for how to improve this, I think Stefan Brunthaler did the most, and the most successful, work on purely interpretative optimizations for Python. Here is one paper that claims speedups between 1.5x and 4x on some standard microbenchmarks: https://arxiv.org/abs/1310.2300
Basically, you can apply some standard interpreter/JIT optimization techniques like superinstructions or inline caching to Python. But these things are hard to do, they won't matter for most Python applications, and come with a lot of complications.
tl;dr: Python's dynamic features add lots of overhead to every operation, and CPython's simple implementation means you pay the overhead even when you don't use the dynamic features.
A few things quickly come to mind, after having maintained a patched version of Python 2.7:
- The dot operator (e.g. `foo.x`) hides a /very complicated/ resolution process that can be /very expensive/; see the sketch after this list. (The documentation about this process also deceptively makes you /think/ you understand how it all works, whereas you probably don't unless you're intimate with the C implementation.)
- Global variables are slower to access than local variables in CPython: the former require hash-table lookups, whereas the latter are array-index operations. The globals dict is also not restricted to string keys, which further complicates how globals are handled.
- `import` statements are idiomatically done at the top-level of a module, and often are used as qualified imports! E.g., `import os` followed by the use of `os.path.join(foo, bar)` later on. This hits the costs of both global variables and the dot operator.
- Other syntactically simple constructs, like indexing, relational operators, `len(foo)`, etc., all support overloading, which increases the complexity of their implementations.
- CPython has a simple implementation (bytecode interpreter, not really any optimizations), meaning the cost to support overloading and dynamism is /always paid/.
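To illustrate the first point about the dot operator, here is a rough sketch (with made-up classes; the real resolution lives in C) of the order `foo.x` has to consider: data descriptors on the type, then the instance `__dict__`, then plain class attributes and non-data descriptors, then `__getattr__` as a last resort.

```python
class DataDescriptor:
    def __get__(self, obj, objtype=None):
        return "from data descriptor"
    def __set__(self, obj, value):          # having __set__ makes it a *data* descriptor
        raise AttributeError("read-only")

class Foo:
    x = DataDescriptor()     # data descriptor: beats the instance __dict__
    y = "class attribute"    # plain class attribute: loses to the instance __dict__

    def __getattr__(self, name):
        return f"fallback for {name!r}"     # only reached when nothing else matched

foo = Foo()
foo.__dict__["x"] = "instance value"        # never seen: the data descriptor wins
foo.__dict__["y"] = "instance value"

print(foo.x)        # "from data descriptor"
print(foo.y)        # "instance value"
print(foo.missing)  # "fallback for 'missing'"
```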
Great article. The way I look at it is something like the swordsman scene in Raiders of the Lost Ark. The swordsman's doing fancy optimizations and C just blows away the need.
If you can find a language that's 100x the performance of an interpreted language, that speed delta will cover up a lot of naivety in the code you write.
Maybe there's merit in a Python variant that compiles to JS, where we allow engines like V8 to do the optimisation. The dynamic nature of JS may be enough of an impedance match to allow this to happen.
Writing this up, I'll probably get someone suggesting that this is already a reality with some tool. Glad to be taught.
I don't see it mentioned here, but the dask library looks like a promising solution. It has ways to handle these kinds of large datasets and to efficiently schedule computations that don't fit a numpy model. Worth a look.
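For anyone curious, a minimal sketch of what that looks like (the array shape and chunk sizes here are arbitrary, and this assumes dask is installed):

```python
# Lazily build a chunked array that would be awkward to hold as one numpy array,
# then let dask stream the chunks through memory when the result is requested.
import dask.array as da

x = da.random.random((100_000, 10_000), chunks=(10_000, 10_000))  # ~8 GB of doubles
column_means = x.mean(axis=0)    # nothing computed yet, just a task graph

result = column_means.compute()  # executes chunk by chunk
print(result.shape)              # (10000,)
```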
When I was taking a python class in school, the professor did something to generate C code from the Python code, and it gave something like a 60% speedup.
Probably Cython. Sometimes it helps, sometimes it doesn't work (doesn't like generator comprehensions iirc) but mostly it provides a "sliding scale" into C or C++ land -- after the first compile, you can start littering type declarations around the code and you can stop whenever you hit the speed you want. It has been a good solution for me in the past, but probably only because I started with a Python codebase. If I were starting afresh I wouldn't bother.
CPython is obviously the way it is on purpose, by design, and quite successful. Yet I still find it ironic that this "slow" interpreter is written in C, the go-to, general-purpose, "low level and a half" fast language, and that "C" is right there in the name. Not knowing anything else, I might expect a project with "C" at the front of its name to be at least fast-ish, and that the naming was intended to signal that.
What? No. The C in CPython was not meant to signal performance; it was added after alternative implementations were created, to mean "the original/reference implementation". Python's competitors Perl, Ruby, and PHP are likewise interpreters implemented in C, and they are equally slow.
Anyway, what alternative did they have for implementing a bytecode interpreter? C is portable and fast enough, especially with the computed-goto extension.
The problem here is that most languages are tailored to ease of learning by humans, not to strong performance. If you had a language tailored to strong performance, it would force you to bundle together seemingly unrelated data into structures used in the main hot code path of the core algorithm.
It would then force you to specify data-delivery routes and processing, and would craft a cache-optimal processing loop dependent on the architecture in use.
The result would look like an enormous while loop fed by a seemingly arbitrary number of for loops that preprocess data, glued together in strange overlapping unions, to shove the end result into the main processing algorithm.
This would be optimal for the processor, but writing a language to describe this (and compilers to implement it) - the horror.
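In Python terms, a rough illustration of that layout idea (assuming numpy is available; `Particle` and the sizes are made up): an "array of objects" scatters fields across the heap, while a "structure of arrays" keeps each field contiguous so a single pass stays cache-friendly.

```python
import numpy as np

# Array-of-objects: every particle is a separate heap object holding boxed floats.
class Particle:
    def __init__(self, x, v):
        self.x, self.v = x, v

particles = [Particle(float(i), 1.0) for i in range(100_000)]
for p in particles:          # pointer chasing plus boxing on every step
    p.x += p.v

# Structure-of-arrays: each field is one contiguous buffer; the update is a
# single vectorized, cache-friendly sweep.
xs = np.arange(100_000, dtype=np.float64)
vs = np.ones(100_000, dtype=np.float64)
xs += vs
```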
Something I often wonder in these sort of discussions is why C# is generally omitted. Its performance is comparable to C++, with none of the trappings. It also does an excellent job of integrating some of the most useful features of functional programming into an imperative language. And multi-processor programming with the language is also incredibly simple.
But I think the best part is the programmer time. An anecdote I find endlessly entertaining: on another forum I shared some code to solve a problem people were having trouble with, and it was assumed my code was pseudo-code. It was correct, compilable C#. And they're constantly adding incredibly useful features; for instance, a recent addition is more expressive tuple syntax.
Yet, as is typical in scenarios like this one, the author sees the decision as being between the opposite extremes of C++ and Python. The only downsides of the language I've run into are the lack of some shoot-yourself-in-the-foot features of C++, like multiple inheritance, and the fact that template specialization is awkward. Garbage collection concerns are vastly overblown. My main work is on projects with in-memory collections gigabytes in size, and you'd think the collector would be a huge issue, yet it's mostly transparent and can be controlled if necessary - which, in the vast majority of cases, it isn't.
It has for the longest time been a closed-source, MSFT-only thing. It wasn't open source, and running on Linux was a second-class citizen. Not sure if it is still a second-class citizen.
It's also seen as something fairly heavyweight to write things in, much like Java. You probably don't see it used much for the same reasons Java isn't.
Honestly, I love C#, but if I were looking for a sane language with a wide library to draw from that matches as many use cases as possible, I'd probably go with Kotlin. Java has a solid community for just about everything, and Kotlin is close enough to C# for my taste. But Kotlin doesn't have much name recognition outside of Android.
Why do you think it's been closed source with minimal cross platform effort? Most of everything from Microsoft related to C# is open source. This includes their compiler, the runtime, and the framework and libraries. And it's been this way since 2014. The .NET standard itself has always been open and cross platform implementations like Mono go back to 2004! I'm not sure what you mean by 'heavyweight.' Its performance is head to head with C++.
And if anybody downvoting could take a second to actually chime in, it would be illuminating. From my perspective, all I see is an immense amount of misinformation about the language, and I'm genuinely curious what people may not like about it.
I know little on this topic, but as you were looking for some feedback I thought I would respond.
Because the Microsoft version was closed source for a long time?
It was released in 2000. You give 2014 as the open source date, which was only 4 years ago.
Woolvalley was referring to the implementation, so bringing up the specification is not that relevant.
The ECMA specification does (did?) not include ASP.NET, ADO.NET, and Windows Forms. The implementations of those APIs were potentially covered by patents, as the Microsoft Community Promise did not apply.
As Wikipedia points out, "These technologies are today[when?] not fully implemented in Mono and not required for developing Mono-applications, they are simply there for developers and users who need full compatibility with the Windows system." https://en.wikipedia.org/wiki/Mono_(software)#Mono_and_Micro... .
While that point is moot today, that is part of the history which guides current views.
If by C# you mean the ECMA specification, then that's different than C# as available for Windows by Microsoft.
Is C# a first-class citizen on Linux comparable to how Visual C# is a first-class citizen on Microsoft Windows, or how C++ is a first-class citizen on Linux? I don't have the experience with that, but it doesn't seem to be the case.
What languages do you consider to be "first-class" and "second-class" on Linux?
One clarification, since my own language was sloppy. C# did not get open sourced in 2014. That was the date that Microsoft open sourced just about everything (and that's a weasel word there - to my knowledge, it is everything) that wasn't already open source. Things like ADO/ASP/Windows Forms/etc were open sourced back in 2008, along with Microsoft's implementation of their framework libraries, which Mono rapidly integrated into their implementation.
And so on that note, much of that information on the Wiki page is outdated by at least a decade. Mono, especially since their 2.0 release (back in 2008), has been a fully fleshed out, cross platform, production ready alternative to Microsoft's implementation. So I certainly wouldn't call C# a second class citizen on Linux by any means. The one and only reason Linux is not my primary development platform is the lack of Visual Studio.
Ultimately C# started out pretty awful. The language was a mediocre java clone, performance was abysmal, and the language itself was lacking in features. But that changed relatively rapidly, and certainly today it bears little resemblance to where it started. And I think that is perhaps the problem, people seem to think C# of today is the C# of 2004, but I don't understand why that degree of misinformation is so strongly exemplified in this particular language. I think it's particularly a shame because of what a successful tool the language has evolved into.
> Microsoft to open source more of .NET, and bring it to Linux, Mac OS X
> Microsoft is porting its server-side .NET stack to Linux and Mac OS X, and is making more of that stack available as open source. ...
> In April 2014, Microsoft announced plans to open source a number of its developer technologies, including ASP.NET, the Roslyn .NET compiler platform, the .NET Micro Framework, .NET Rx and the VB and C# programming languages. ...
> Microsoft is not planning to open source the client side .NET stack, which means certain pieces like the Windows Presentation Foundation (WPF) and Windows Forms won't be going open source,
> As a .NET developer you were able to build & run code on more than just Windows for a while now, including Linux, MacOS, iOs and Android.
> The challenge is that the Windows implementation has one code base while Mono has a completely separate code base. The Mono community was essentially forced to re-implement .NET because no open source implementation was available. Sure, the source code was available since Rotor but we didn’t use an OSI approved open source license, which made Rotor a non-starter. Customers have reported various mismatches, which are hard to fix because neither side can look at the code of the other side. This also results in a lot of duplicated work in areas that aren’t actually platform specific.
This would seem to contradict your statement that "Things like ADO/ASP/Windows Forms/etc were open sourced back in 2008, along with Microsoft's implementation of their framework libraries, which Mono rapidly integrated into their implementation."
> Previously, 'cross-platform' with Microsoft was a joke - it was cross-platform but only within the Microsoft Windows operating system family. .NET Core brings the true cross-platform compatibility, which means you can have one single source code base on Windows, Mac, and Linux. This is a huge deal, especially between Windows and Linux - it gives you more choice for deployment, hosting, and scaling.
> By making its fundamental codebase open source, Microsoft is giving .NET developers an incredible opportunity to enter into areas with their existing skills which were previously locked off to them. The opportunities presented are only going to start to emerge over the next months and years - it's well worth your while checking it out and taking .NET Core for a spin.
Now, certainly it's possible that you are correct, and these are all parts of the misinformation derived from the early days of C# and .Net.
If so, could you provide some references? Otherwise it's very easy for me to conclude that you misremember the historical details.
Sure, here's an article from Microsoft describing how to access and view the source code for their library implementation, ASP/ADO/Forms/etc: https://weblogs.asp.net/scottgu/net-framework-library-source... That's from January 2008. This release is specifically what enabled Mono to really go to the next level, more than a decade ago. You can even see the little 'carve out' they made in the license to ensure Mono, in particular, could use the code down in the 'Reference License' section.
You're making a reasonable mistake of confusing .NET Core with the .NET Framework. They're different things. .NET Core is a new development that is not directly compatible with applications relying on the .NET Framework. As you can read on the blog post you linked to, .NET core was announced as open source before it was released -- which was 2015. Its implementation and technologies being open sourced is something altogether different.
The link you gave is indeed to view the source. It is not, however, an open source license, as characterized by the OSI or DFSG, nor free software as characterized by FSF.
> "Reference use" means use of the software within your company as a reference, in read only form, for the sole purposes of debugging your products, maintaining your products, or enhancing the interoperability of your products with the software, and specifically excludes the right to distribute the software outside of your company.
Microsoft agrees that (quoting from a quote in my earlier comment) "the source code was available since Rotor but we didn’t use an OSI approved open source license, which made Rotor a non-starter".
Like I said, I know little about this topic. However, I think you do not understand what open source means.
I'm pretty sure that when woolvalley used the phrase "It has been for the longest time been a closed source MSFT only thing", that "closed source" meant "not open source according to OSI or similar guidelines." Not "available in source code form", which is what you seem to think it means.
From my perspective here, you're shifting the goal posts. In particular this conversation began with you apparently thinking that Mono lacked full implementations of ASP/ADO/etc. And I think that might justify the original comment that Linux implementations have been treated as second class citizens. I say might since ADO/ASP are not really part of .NET, but rather independent projects built on top of .NET. But in any case, that hasn't been an issue for many years now.
You have now shifted into claiming that your issue is that the licenses Microsoft released their code under were insufficiently permissive. I could argue against this, because it's at best misleading -- different code was released under different licenses. You can see an archive of one of the CLI releases [1] with the license it came with. Or this [2] is a blog from the Mono founder expressing thanks for Microsoft and ECMA members releasing code under open source licenses. That blog was from the huge 2.0 release, in 2008.
But ultimately that feels like a red herring, even if a fun one! You seemed to believe (as did the person I was responding to) that C# was a Windows-only thing. That hasn't been the case for a very long time, and the notion is completely ridiculous nowadays. In either case, it might seem we agree if you think that people are basing their views on obsolete information. But that rather begs the question of why that is so especially the case here. If I were discussing e.g. javascript frameworks from the perspective of somebody 4 years out, let alone a decade, I'd be quite justifiably skewered. In other fields, 4 years ago things like TensorFlow did not even exist as public projects. 4 years is a very long time in software, so why are people so particularly slow on the uptake in this instance? Bias, bad marketing, something else?
You asked for possible reasons why you are being downvoted for your reply to the following comment by woolvalley: "It has for the longest time been a closed-source, MSFT-only thing. It wasn't open source, and running on Linux was a second-class citizen. Not sure if it is still a second-class citizen."
Your reply was: "Why do you think it's been closed source with minimal cross platform effort?"
I don't think you can argue that I'm shifting the claim to the topic of being insufficiently permissive when that was part of the original thread.
The license[1] that you pointed to is not open source. It says "You may not use or distribute this Software or any derivative works in any form for commercial purposes".
Again, just because the source code is available, that doesn't mean it's open source. There are few that would agree with you that the early Microsoft licenses you pointed to meet the usual criteria for "Open Source" as it's used in the industry.
Your argument seems to be that things have changed and we should forget about the history. As woolvalley's comments were made in the past tense ("wasn't open source", "was a second-class citizen"), I think you could have made a better reply along the lines of "a lot has changed since you last looked into C#", rather than the more accusatory approach you took, which challenges a viewpoint that seems to be historically justified.
I find the idea that "4 years is a very long time in software" to be laughable.
I've been using Python since the late 1990s, and am still going through the Python 2->3 upgrade cycle. My primary development environment of a Unix-like terminal environment and emacs would be recognizable to people in the 1980s. I've been developing and selling my software product for 8 years, with essentially the same core codebase.
In hindsight, something I think I should have emphasized in our discussion earlier is the exact state of Mono. Everything else is mostly tangential as 'the Mono question' essentially closes the discussion of whether C# was a 'mostly closed source Microsoft only thing.' And I think the evidence is bountiful and evident that Mono has long since been a very well developed production ready environment.
Just listing a few projects built on it should really suffice to make the point. The Sims 3 was built using Mono and launched in 2009, Second Life went live with it in early 2008, and Unity swapped to it after their initial 'front end' language was, in their words, "proving to be too slow and unwieldy." Their original language, quite appropriately for this topic, was Python! There's really infinite room for discussion about this, but I think it's all a subset of 'The Mono Question.' For instance, your issue about exactly which open source license was used is moot given Mono.
As for your divergence on 4 years not being a long time: I personally do agree. But our agreement stands in contradiction to the standards of software today. Entire languages are born and die in this span, libraries that didn't exist become ubiquitous, and in general 4 years is certainly far longer than necessary to expect some reasonable evolution of adaptation and opinion. The fact that it has not happened here is a peculiarity I find bemusing, and fun to consider. If nothing else, it leads to enjoyable discussion - which I suppose is the ultimate point of these forums.
Yes, that would have been appropriate, as woolvalley specifically brought up Microsoft, and your reply talked a lot about Microsoft's source-available release as if it were a meaningful counter-example to woolvalley's reference to "closed source."
You wondered why you got downvotes? That's why.
I have no dog in this race.
If you "personally do agree" then why do you bring up an argument that you disagree with, in order to justify your views? It comes across as if you are making the argument to win some sort of rhetorical point, where the end - your advocacy of C# - justifies any tactic.
I do not care to follow up on this discussion any longer.
Mono is a product of the open source nature of C#, not the cause. In general, I think it's wise to go as close to the 'first party reason' as possible. But the nuance there makes it surprisingly intricate, and nuance is often lost in online discussion...
As for '4 years being a long time in software', there's a difference between considering the industry at large, and individual experience. I, like everybody, have anecdotal experience and opinions that run contrary to the norm. And in this case my personal view is that 4 years is not really a long time in terms of software, yet for the industry and people as a whole, I think that couldn't be further from the average truth.
C# started out as a boring Java clone enterprise language from the company famous for its "embrace, extend, extinguish" strategy. In fact they invented C# because they got sued for trying to "extend" Java.
And until now no amount of cool features and Microsoft PR has been able to remove that stench.
The author is a scientist analyzing his data. I've never met anyone in that crowd using C#. Are there even any good data science/numerics libs out there? C++ has a lot of number-crunching libs, Python even more.
This is not my field, but I have used Accord.NET (http://accord-framework.net/) while learning machine learning and have no complaints, though I also have little basis for comparison. Is there any C++ library you'd consider definitive, as a basis to compare against? One other thing: calling C++ libraries from C# is, in most cases, quite trivial.
It's probably more common for data scientists in the .NET world to use F#. FsLab (https://fslab.org/) seems cool from some limited experimentation (disclaimer: I'm not a data scientist, but am data curious).