"Sometimes, and you have options for those cases."
The OP found a "sometimes", and he's using one of those options. In this case, he's got Python for prototyping and glue, with Haskell improving performance. This is as it should be.
I don't know of any Python advocates who say it's the right tool for every part of every job. What we will say is it's usually a good "first" tool for every job. Building a system out in Python allows you to get something representative fairly quickly, which helps identify if there are areas where Python alone is not enough.
I would argue that performance always matters, and that Python is never the right tool for the job in an absolute sense.
Python may be the right tool for the job given the options we have today, but there is no reason we cannot have a language exactly as nice to use as Python that also provides good performance. Languages like Nim or F# approximate that ideal, for instance. And while I realize there are high-performance variants of Python, those should be the standard and only path. The slow path shouldn't exist.
It is a failure of our community that we allow languages to proliferate while remaining slow. This is bad because allowing slow tools to become popular means people create slow things, which wastes other people's time and energy.
Electron becoming a standard way to make cross platform desktop apps is another example. Someone is not wrong to choose electron for that job, but we the software community are wrong to have let something that inefficient become the easiest way to do that job.
You cannot simply dismiss this issue by saying "Well don't use the slow software you don't like then" as many of these things become de-facto standards that you cannot avoid. Your place of work may require Microsoft Teams as the chat software, and now you are using a huge % of your laptop's RAM and battery for simple text transmission. Atom becomes the popular target for language plugins and ends up the only usable way to get good IDE features for your language, and you suffer the performance hit for it.
> It is a failure of our community that we allow languages to proliferate while remaining slow. This is bad because allowing slow tools to become popular means people create slow things, which wastes other people's time and energy.
Make it work, then make it work right, then make it work fast. I mean yes, a lot of things are slower than they should be, but the level of outright correctness bugs in software today is mindblowing. So while replacing our tools with faster tools should be something we do in the long term, I'd put a higher priority on lowering defect rates and making it easier to produce working software.
This "make it work" adage makes no sense at all when scrutinized.
Does anyone believe that other industries think like that? First let's invent a washing machine that washes, but occasionally sets clothes on fire and takes 24h for a washing cycle. Then we'll redesign it so it doesn't set things on fire, and finally redesign it yet again to finish in 2h.
It's incredibly wasteful. For any complex project, making it work and fast only at the end will either result in massive cost overruns or an outright canceled project.
Yeah. Speed is the dominant cost in software, but other industries treat other costs similarly. To pick another darling, look at Tesla: the Roadster is expensive, flammable, suitable only for enthusiasts. The S is generally useful, but too expensive. Still a narrow market. The E is their first general-purpose product.
That’s normal. Same deal with Apple II, Mac, iPhone. Same deal with ether, coarse general anesthetic, modern mixes.
Yes, this is completely normal, and software based products should expect to evolve similarly.
All of the things you mentioned worked and had good enough performance. There was no crappy version of the iPhone which ran out of battery in 1h and took seconds to refresh when scrolling.
So obviously they thought about the performance aspect from the start.
There was no crappy version of the iPhone which ran out of battery in 1h and took seconds to refresh when scrolling.
I'm pretty sure there probably was. They were just smart enough to make sure the only people that saw that version were a handful of engineers in their R&D lab.
"Make it work, make it right, make it fast" should primarily be applied in the context of individual functions and modules not whole complex projects. Don't start micro-optimizing your function for speed before you've gotten it to return the correct result. Also with the additional caveat that you should stop at make it work right until you're at least fairly sure that that function is actually important for overall performance.
Yes. It's the prototyping phase. Medicine, hardware, even food. Cliff made his energy bars "work" first in his kitchen, then started to figure out how to make them scalable and optimized for taste, etc.
Any hardware startup does the same. Get a giant, hulking prototype going, with shoddy wiring and way too much weight, and get it working. The "make it work first" adage exists because you don't know what the final product/project will need to be (e.g. do it in python and then figure out how to make it better).
Ah food. I believe Soylent was experimenting with this idea of make it work and then make it work right, which resulted in a lot of entertainment for those that read the toilet stories of Soylent customers.
I don't see why there has to be a hard cut between make it work and make it work right or fast in prototyping. With more thought, perhaps the product can already be largely right and fast enough and also easily improvable.
I can't help but look at this saying as an excuse to get something done fast with the hope it can be improved later. This is not a general truth.
Software is different, but it doesn't bend space and time. Redesign costs can be significantly cheaper, but they are still present, and if the project is complex enough can be just as high as in physical world project catastrophes. I assume you have heard of various software projects which cost millions and overshot their budget by 50%, 100%, 200%, etc...
Now regarding the "make it work -> make it right -> make it fast" cliché:
I tend to have architectural discussions at the beginning of the project, which include among other things performance.
Depending on the project it might be a surface discussion, or go deeper.
Then based on the performance requirements and design decisions, the system is implemented, performance is tracked and adjustments are made. So performance should not be put first, middle or last, it's an architectural level concern which is controlled throughout the project.
How do those of us who like to complain about "premature optimisation" whenever performance is discussed actually design software? Because based on HN discussions it looks like code & pray.
> Redesign costs can be significantly cheaper, but they are still present, and if the project is complex enough can be just as high as in physical world project catastrophes. I assume you have heard of various software projects which cost millions and overshot their budget by 50%, 100%, 200%, etc...
Indeed; in my experience such failures tend to be caused by too much rather than too little design and architecture up front.
> How do those of us who like to complain about "premature optimisation" whenever performance is discussed actually design software? Because based on HN discussions it looks like code & pray.
Don't try to design at the start when you don't know anything; let the design emerge. Do the simplest thing that might possibly work; 90% of the time it does work, the other 10% of the time you learn about another constraint. Get the simplest use case working, then iteratively expand to the full functionality. Refactor continuously and fearlessly (and adopt whatever testing/verification practices you need to make it fearless), driven by the changes that you need to make.
It's easy to mock, but it works much better than trying to design as a separate activity.
Type systems and languages that make them easy to use. I mean, other approaches are possible, but type systems are something we already all use and understand, and they seem to be enough.
You're conflating Python the language and Python the default runtime implementation (CPython). PyPy, a Python JIT compiler, has shown you can have an incredibly fast Python implementation. In some cases, it's faster than C.
I get that, and I agree, but my point was that if you're writing Python and finding that your code is too slow, it's far easier to just drop in a faster runtime than it is to rewrite your entire existing codebase in a new language.
As a guy who just spent the past summer doing (his 19th year of) Python, (6th year of) Cython(/Pyrex), and (8th year of) C, poking around PyPy, and writing a couple of transpilers: the tools are incredible. And I think you're talking out your butt here.
Speed is a matter of choice and a bit of manual tooling. You seem like a Golang advocate; great. I hope it continues to work for you. Python solves 95% of problems today, and with the continued improvement of tooling (via PyPy, Cython, Pandas, SciPy, PyTorch, etc.), Python and its dozens of active interest groups (each with 100k+ users globally) keep working independently on performance improvements in every niche until it takes the market. We embrace, extend, wrap, and improve.
I remember when Unladen Swallow was laid upon the laps of the CPython developers. That was an LLVM-backend JIT for CPython (CPython had another JIT at the time, Psyco), combined with a collection of other improvements to the CPython runtime, all courtesy of Google. It was a patchset against an old (and unmaintained) version of CPython, and might have taken years of further development to merge back in. Meanwhile PyPy, Pyrex->Cython, etc., were all pretty solid, so the CPython devs left Unladen Swallow behind, because it didn't seem worth it (I agree with them). See PEP 3146 for details.
But that doesn't mean CPython devs haven't been at it; what the heck do you think all of those "optional" type annotations are for in Python 3.x? Type checkers for one, compilers for two. Oh... yeah, we've been moving towards optional static type checkers and static compilation in the Python community for years; and as a community, you can basically piece it all together. Is it 100% yet? No, but it's a solid 95% for most use-cases.
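To make that concrete, here is a minimal sketch of what those annotations buy you (plain CPython ignores them at runtime; a checker like mypy verifies them, and ahead-of-time compilers such as mypyc, and increasingly Cython, can use them to emit faster code):

    from typing import List

    def mean(values: List[float]) -> float:
        # the interpreter itself ignores these hints; static tools check
        # them, and compilers can exploit them for specialization
        total: float = 0.0
        for v in values:
            total += v
        return total / len(values)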
Watching Golang over the years; it feels a lot like if it wasn't in Golang, it wasn't worth using with Go. I say this as a user of cgo from 2012-2013 in a failed attempt to make something in Golang + sockets + goroutines faster than the equivalent in Python + sockets + threads + C. But when I was seeing better performance in Python, better error reporting in Python, better C library wrapping in Python, and better tooling in Python - I went back to Python (funny how 20+ years of tooling will do that).
Don't get me wrong, I love the speed of the Golang compiler. It's just mostly everything else in the ecosystem I don't like; including the lack of a viable C/Golang interface for anything nontrivial (like leaving C threads running with references to Golang objects/structures), the hilariously short official documentation for cgo, the basic need to re-implement the world in Golang to get good performance, and still being subject to the whims of Rob Pike - whose bad decisions (in the form of Sawzall) already wasted a week of my life when I was at Google.
You want performance in Python? Okay. Where do you want performance? If there isn't already a library there to help you, I'd be very surprised.
I don't buy these arguments. You hit on a pathological case in Go where Python managed to be faster. That's not remotely typical, and even the "just use C/Cython/Numpy" is a huge oversell--there are some problems that are amenable to dropping into a lower level language, but many more are not. At best, it's just very, very hard to make Python as fast as unoptimized Go.
Besides performance, CGo and tooling seem to make up the bulk of your criticisms. I agree that Python's C interop story is strictly better than Go's, but this seems like throwing the baby out with the bathwater for a huge suite of applications. I also strongly disagree that Python's tooling is better; I think your 20 years of experience with it has blinded you to the frustrations the rest of us have when we try to pick it up. By contrast, I've found Go's tooling to be superb--profiling, testing, benchmarking, documentation, etc all included out of the box. There are still holes (like debugging), but I daresay Go's tooling is categorically better than Python's.
"You want performance in Python? Okay. Where do you want performance? If there isn't already a library there to help you, I'd be very surprised."
I assume you haven't read the article, but give it a try. The author explains where they wanted performance and that there was no library, so they had to write some ugly code instead.
I agree that we can do better. But I don't think you'll ever be able to avoid some trade-off between CPU performance and ease-of-use. We might shrink the gap, but there will always be a market for languages that allow easy prototyping and new developer onboarding, and those languages will always be slower than C++.
It's a timely article, because a number of things have changed in recent years to make the tradeoffs around using Python quite different from what they once were. Ten years ago, Python was slower than the alternatives by a small constant factor, datasets weren't big enough for python performance to be an issue, Python had a world-class tooling/library ecosystem and higher-performance languages at a similar level of conciseness/productivity were basically unknown.
Today, as the article says, things are different: core counts are rising, so practical Python performance is falling further and further behind; datasets have gotten large enough for Python performance to be an issue; Javascript has proven that it's possible to get much higher performance out of a scripting language; and languages like Haskell have gone mainstream, offering a comparable-to-Python (better, in fact, given what a mess Python's packaging situation is) tool/ecosystem experience and comparable levels of productivity with much higher performance.
Every tool is a "sometimes", but good engineering is knowing when a given approach moves from being the right one 90% of the time to being the right one 10% of the time.
I think the article makes a lot more sense if you consider it in the context of "Python for data science".
In the last few years, there's been a lot of hype about replacing other number crunching solutions (R, SPSS, even Matlab) with the Python ecosystem of tools (Pandas, SciPy, etc.).
I don't seem to follow. If you are doing data science, all the bottlenecked stuff will be running in numpy or pyspark. Choosing Python over R, SPSS, or Matlab usually doesn't come down to which one is faster, and R, as far as I know, is at least not vastly superior in speed.
This was explicitly addressed in the article: as soon as you have to do anything which isn't a trivial numpy operation, performance goes off a cliff, and that can be a problem.
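A toy illustration of that cliff (made up here, not from the article): the same arithmetic is fast while it stays inside numpy and slow the moment a Python-level loop takes over:

    import numpy as np

    a = np.random.rand(10_000_000)

    # stays in compiled code end to end
    fast = np.sqrt(a).sum()

    # same arithmetic, but every element is boxed into a Python object
    # and dispatched dynamically; typically orders of magnitude slower
    slow = 0.0
    for x in a:
        slow += x ** 0.5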
This also isn't necessarily true. Take the example of TensorFlow. You build a representation of the computation you want to run, and then you can run nearly the whole thing end-to-end in native C++ using Eigen data structures, with occasional shuttling of data back into PyObjects (rare) or numpy (common) for metrics tracking.
Cython is a much more powerful tool than I think the author of this article realizes.
I don't understand what you mean about "hype about replacing ... with Python". How is it hype when the majority of people already use Python (see link)?
This is a pattern I've had some success with several times now. Create a quick Python implementation for parts of data pipelines and then go back and re-write in Java/Go/C/whatever the best tool is for that bit of the job later when we know where the bottlenecks are.
The quoted argument that "easy to write but slow languages are better because programmer time is far more costly than CPU speed" was pretty common, and I honestly think correct, 10-15 years ago. But things have changed.
CPU performance long ago hit physical limits, and more and more we are scaling out applications across hundreds, thousands, or millions of servers. We've passed the inflection point where CPU speed really is more expensive than programmer time, if you are running that code at a big enough scale.
Add in containerization and cloud VM platforms where the tradeoffs of space and performance versus money start to become very clear. Add in better, safer languages like Rust and Go for writing high-performance code. And today if you can spend three times the programming time writing in a faster and more efficient language, and it runs 100x as fast in a tenth the memory footprint, you are talking about massive overall cost savings.
> CPU performance long ago hit physical limits, and more and more we are scaling out applications across hundreds, thousands, or millions of servers. We've passed the inflection point where CPU speed really is more expensive than programmer time, if you are running that code at a big enough scale.
When you're starting a startup, scaling out your application to hundreds, thousands, or millions of servers isn't something you're going to do right off the bat, and more likely, that's never going to happen, no matter how successful your company is. The number of companies operating at that sort of scale can be counted on one hand. Most startups can run on a few boxes.
For those situations (which probably pertain to most new projects outside of big companies), programmer time is indeed still the most important and costly input. If you're a startup and your 100x more performant Go code takes an extra 3-6 months to write, and a competitor beats you to market, no one is going to care how much faster your runtime performance is on the CPU, especially if you're a web app, where CPU time should probably be last on the list of items that could lead to slow application performance for end users.
100x difference in CPU time is nothing compared to the 1000x loss from a cache miss or a 10000x disk read. I'd love to see an example where the CPU difference outweighs any influence from disk or memory.
I was with you up until the last paragraph. Taken literally you seem to suggest there is no such thing as a CPU-bound workload. That's obviously not the case (cryptography is just one such example), but I would agree that many people think they are CPU-bound when they are really constrained by something else.
Secondly, Python and the patterns its expressiveness encourages are terrible for cache performance. In a simple C program it's easy to do something non-trivial in the space provided by L1 cache — in Python it's quite difficult even to reason about what's going to be in L1 if you're using any of the fancy features.
Good point. I see what you're saying and I was definitely not suggesting that. I was speaking mainly from a web application perspective, where program execution on the server is very low on the list of items affecting application speed as perceived by the end-user.
Scientific computing, on the other hand, is a completely different animal.
I haven't looked at it in a while, so I could be wrong, but I think with small enough programs you can still squeeze some payload into L1 in long tight loops where you're not jumping up and down the Python stack a lot.
But your overall point stands: if you're writing non-trivial Python programs your L1 is usually spent on language/runtime overhead.
> If you're a startup and your 100x more performant Go code takes an extra 3-6 months to write
I can write Go code nearly as fast as Python, sometimes faster if I need to refactor. Obviously this depends on familiarity with languages, but I think most of the difference is probably experience with the language more than anything else.
> 100x difference in CPU time is nothing compared to the 1000x loss from a cache miss or a 10000x disk read. I'd love to see an example where the CPU difference outweighs any influence from disk or memory.
This is a little confusing. A language like Python doesn't just use more/slower instructions to do things: it has worse cache locality too.
You always deal with references to objects instead of the data. So indirection is everywhere, with no way to do anything about it. Primitives use more memory. On my machine, an int requires 28(!) bytes instead of 8. Then there's the fact that you have to fit the interpreter itself in the cache instead of just the code you wrote.
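Easy to check from a REPL (numbers from 64-bit CPython 3.x; they vary slightly by version):

    import sys

    sys.getsizeof(1)          # 28: a boxed int, vs 8 bytes for a C int64
    sys.getsizeof(1.0)        # 24: a boxed float
    sys.getsizeof([1, 2, 3])  # the list stores pointers to those boxes, not the values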
If you care about performance, don't use Python.
Unless of course you're using numpy or something similar
To me the last sentence nullifies the entire argument. People doing a lot of numeric computations aren't (typically) doing it all in Python. They write critical sections in C (or some other "fast" language), or use existing libraries that already have done so. If they aren't writing those sections in C already, perhaps because they don't know how, or it would take too long to do right, then why would they ever choose to do the whole thing in C?
Firstly, not everyone doing performance-sensitive work is doing numeric work (that seems to have been a motivator for writing this article), so numpy isn't always practical. Secondly, I think the "if it's slow just rewrite the hard parts in C" is generally out of step with more modern options. Python gets you a really nice environment that's super pleasant to work with... until suddenly it doesn't and you're backed into a performance corner, and then you potentially need to conquer a major learning curve, take a huge usability hit, sacrifice memory safety, etc. There are increasingly common options now that let you get near-C performance for many applications while also totally avoiding that cliff. For applications that have even a decent chance of eventually needing that kind of work, I think it's reasonable to ask why you'd want to chance it when you could just write it in something both reasonably fast and much more pleasant than C from the get-go.
> Secondly, I think the "if it's slow just rewrite the hard parts in C" is generally out of step with more modern options.
Right, including modern options like “use Cython”, which opens up C-like power and performance within the Python ecosystem while maintaining Python ergonomics (because Cython is both a Python language superset and has tooling integrated with Python's distutils, etc.).
I write Python for my day job, but I often prototype in Go because it's so much easier. Also, regarding I/O, Python does nothing during I/O unless you take care to write async code and only call async libraries. Go does async by default and truly outclasses Python at IO bound tasks. For CPU bound tasks, Go is only 2 orders of magnitude faster. For I/O bound tasks, Go is about 4-6 orders of magnitude faster unless Python is carefully optimized.
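To make the "take care to write async code" point concrete, here is a rough sketch (the hostnames are just placeholders); one accidental blocking call inside a coroutine stalls the whole event loop, whereas Go schedules goroutines around blocking I/O for you:

    import asyncio

    async def head_request(host):
        # only awaitable, non-blocking I/O yields to the event loop;
        # a blocking call like requests.get() here would freeze every other task
        reader, writer = await asyncio.open_connection(host, 80)
        writer.write(b"HEAD / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
        await writer.drain()
        status = await reader.readline()
        writer.close()
        return status

    async def main():
        hosts = ["example.com", "example.org"]
        return await asyncio.gather(*(head_request(h) for h in hosts))

    print(asyncio.run(main()))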
I've been "premature optimization is the root of all evil"ed in more code reviews than I can count - yet when people complain about software, they first complain that it's ugly and then they complain that it's slow. Users don't care what it's written in.
There is fast (optimization) and slow (the most straightforward thing with no consideration for perf) and then not-slow. I think not-slow is a happy medium. Fast code should be, and probably is, totally unreadable, looking like AlphaGo optimized your program.
In theory I think you're correct - you _can_ go crazy overoptimizing code to the point where it's completely unreadable. However, in my experience, it takes a lot of effort to get to this point, especially in modern languages. It seems to me that most developers err in the other direction: whatever works as long as it took the least amount of time to program (although this might well be partly driven by the management "agile" obsession with accounting for how every 15-minute increment of time was spent).
>Fast code should be, and probably is, totally unreadable, looking like AlphaGo optimized your program.
Yes and no; multithreading and optimizing to remove branches can do a number on readability, but I find optimizing for cache locality and compile-time evaluation often makes code more readable. It depends on the language obviously.
How many people are there who actually need to scale an application across hundreds, thousands, or millions of servers vs the ones that think they are going to need it?
I agree that at some point CPU speed is more expensive than programming time, but how many applications in the wild are actually at or beyond that point? I would be surprised if it is more than a couple of percent.
Sure, if I know that a product has to serve millions from the get go, I'd choose a more performance-focused language than Python. But if it is about building an MVP for a new startup, it seems quite unreasonable to me to spend 3x the time to get it out of the door just for the extremely slim chance that the product takes off faster than Python is able to keep up with.
I'd quite like to be able to write some plain python (not PIL or numpy or Cython or whatever) to change all the pixels in an image at a similar speed to say Java.
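For reference, this is the kind of plain-Python loop being wished for here (a hypothetical in-memory image as nested lists); CPython runs it orders of magnitude slower than the equivalent Java loop:

    # invert a (hypothetical) 1000x1000 RGB image stored as nested lists of tuples
    width, height = 1000, 1000
    image = [[(0, 0, 0) for _ in range(width)] for _ in range(height)]

    for y in range(height):
        for x in range(width):
            r, g, b = image[y][x]
            # every pixel allocates a fresh tuple and three boxed ints
            image[y][x] = (255 - r, 255 - g, 255 - b)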
> The quoted argument that "easy to write but slow languages are better because programmer time is far more costly than CPU speed" was pretty common, and I honestly think correct, 10-15 years ago. But things have changed.
I don't think things have fundamentally changed in the programmer time is cheaper than cpu time calculation. What has changed is:
1. Classic dynamic languages (ruby, python, etc) all heavily assume the world is single threaded and that blocking io can be scaled well enough. Unfortunately, these maxims from the late 90s do not hold for modern CPUs. Languages that embrace non-blocking io, such as node.js, have been incredibly successful as a result.
2. Software projects have gotten bigger. Extremely anecdotally, projects in dynamic languages get uncomfortable at about 10,000 lines and unworkable at about 100,000 lines. A lot of the popularity of these dynamic languages was that you didn't need to type out your types all the time. Newer languages have shown you can get the best of both worlds with type inference and structural typing. Typescript's meteoric rise is because of exactly this.
Go and to a lesser extent Rust, have recognized these two issues facing last generation dynamic languages and have been successful because they address these issues head on. You can start a new Go application and have a similar time to market as python or ruby, but have it scale with your company, both in developer time and performance, for a long time. Why wouldn't you choose Go or some other new language?
Absolutely, but it's arguably 10 years too late. Node has already eaten python's lunch in the server space. To get your existing python code working on async you may basically need a full rewrite, including any libraries that you pulled in. At that point most companies ask themselves if python is really the right language to conduct a rewrite in.
I also firmly believe that "easy to write but slow languages" sets up a false dilemma.
Haskell, JavaScript and C# are all fairly instructive examples. Haskell for demonstrating the level of flexibility you can achieve with a really well-thought-out static type system. JavaScript for showing how much of the "dynamic tax" you can avoid at run-time with an aggressive enough JIT compiler. C# for showing that you can get a pretty nice final result out of pragmatically blending a little bit of dynamic typing and a little bit of the ML-style static typing experience into a language that started out with a Java-style type system.
When people say cpython is slow they are generally pointing to two things
1) The interpreter is _slow_
2) You can't achieve real thread based parallelism
I think in general people have a high-level concept of (1): static vs. dynamic and interpreted vs. compiled is an understandable trade-off. CPython as an implementation generally gets typecast as slow for its default dynamism, which has understandable negative performance implications[1]. However, CPython also gives you lots of ergonomic ways to push your program towards the static/compiled end of the spectrum with things like pandas/numpy/numba/extensions. In general, using these correctly puts you within the ballpark of _faster_ languages. Could you write faster assembly by hand? Sure! Is optimizing in another language worth your time? I don't know.
I've never really understood (2), the lack of threading, as a problem: multi-process parallelism can be accomplished fairly easily if you are CPU bound, and projects like uvloop[2] make async tasks fast enough to compete with any web framework out there. Furthermore, even though your CPU-cycle cost may be multiplied considerably when operating over hundreds or thousands of servers, doing distributed computing right is still hard, and developing something like celery/airflow/luigi/dask from scratch is not cheap either. Leaning on CPython's massive ecosystem can massively lower the barrier to entry on a lot of big problems.
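For the CPU-bound case, the standard-library route really is short; a minimal sketch (the crunch function and chunk sizes are just placeholders):

    from multiprocessing import Pool

    def crunch(chunk):
        # CPU-bound work runs in a separate process, so the GIL is not a factor
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        chunks = [range(i, i + 1_000_000) for i in range(0, 8_000_000, 1_000_000)]
        with Pool() as pool:
            print(sum(pool.map(crunch, chunks)))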
I think there are plenty of examples of re-writes in go[3]/rust[4] that have worked great for people, I have no doubt that python is _not_ the end all be all language, but I think the "python is slow" worry is generally overplayed.
Specific problems require specific solutions. I'm glad Haskell seems to work for the author in genetic analysis, but I think this could have been a more interesting article with some specific Python-Haskell comparisons rather than the generic "Python is slow" argument.
I love Python. The language is a joy, the ecosystem is fantastic. But yes, let’s be honest, if you cannot vectorise your code it is slow, and I think that will be its downfall eventually.
I’m excited about Julia, and I hope it gains popularity and the ecosystem grows. Until then, and in particular until the data frames story can compete with pandas, it’s Python with Cython for me, but I’d rather skip the Cython if it were not necessary for performance.
Any early adopters running Julia in production with stories to share?
I feel a bit ambivalent about Python as it's a nice language for prototyping and quickly hacking things together. Yet I'm always baffled when I read Numpy's or Matplotlib's documentation and try to make sense of it, as it can be (or at least feel) so complex and highly ambiguous. E.g. sometimes there are no examples (or only very brief ones) on Numpy's documentation pages showing how the method works, and most results from Google are only about advanced usage, not about the basics of the method itself. In Matplotlib I still don't understand what the right way of initializing a pyplot is; there seem to be a million ways to do it and a million parameters you can give. API changes and inconsistencies pain me at times too (Pandas comes to mind). While not a fault of Python as a language, I think these greatly contribute to the experience of using Python.
Also, I don't feel like the culture of Python programming focuses much on documenting things, which makes reading code at times feel like transcribing ancient Latin manuscripts. Maybe a good analogy would be JS back in the days of global jQuery scripts. Too unrestricted and free-form, maybe. I'd wish Python became more like Kotlin, with very clear patterns and great IDE support (in addition to PyCharm).
Well those are at least my experiences and feel free to disagree with me.
> In Matplotlib I still don't understand what the right way of initializing a pyplot is; there seem to be a million ways to do it and a million parameters you can give. API changes and inconsistencies pain me at times too
Matplotlib has the worst API of all Python libraries I have used over the years!
If there were a fork of it that got rid of Matlab-way of doing things (keeping only OOP style) and with consistent names (no more `twowords` and `two_words`), I would gladly switch in a heartbeat.
This is maybe more hearsay; I only briefly tried to use Julia.
It is true, Julia has great features for performant code. It is, however, focussing too much on being a matlab competitor in my opinion. It will not be a language that you use to write a "normal" (i.e. non-numeric or CRUD) dynamic website in. My general observation, however, is that you need to attract this crowd if you want to have an ecosystem with a variety of tooling. And that is the neat thing about Python (and Haskell).
I'm using Julia and loving it. I've built a bunch of differential equation solvers which routinely outperform the classic C++/Fortran codes. I started out without "software development" experience but Julia and its community got me up to speed and helped me build something quite unique. Now Julia is the only language that has the numerical libraries I need to do my research.
In fact, the whole library story in Python/MATLAB is quite overblown. If you're doing something which is actually new, like PhD methods research, you need to be writing a lot of stuff from scratch. And in that case, you usually cannot get by with vectorizing everything... and vectorization always has the issue with temporary arrays too. Meanwhile, Julia's type system makes everything fast (which is a plus when trying to publish a paper on it!) but also get cool extra features for free like GPU support and arbitrary precision. For people developing and testing new methods, Julia is the best tool right now.
>If you're doing something which is actually new, like PhD methods research, you need to be writing a lot of stuff from scratch.
Sometimes you also reuse a lot of stuff, it depends. For machine learning in particular, almost everything uses some sort of gradient-based optimisation algorithm. In these cases, it is very useful to have an automatic differentiation library. My coworker said Julia has about 3, IIRC, and until an amalgamation of them is merged into the standard library Julia isn't completely ready.
> until an amalgamation of them is merged into the standard library Julia isn't completely ready
That doesn't quite make sense. I'm sure there are a few more autodiff libraries in Julia than 3. You would just use the one that fits your use case.
PS. Ohh re-reading your comment, you want an autodiff library in Julia's standard library. That is very unlikely to happen since it's not (very) hard to cook up an autodiff library and autodiff is not widely used. Julia is not like Matlab, where you have to have everything in the standard library.
Well, my co-worker says people are working on it. I asked him whether I should try to adopt Julia and "not until autodiff is in the standard library" was his response.
>autodiff is not widely used
If you look only at the machine learning community, that is false. Autodiff is used all the time, to optimise loss functions using a variant of gradient descent. For neural nets, gaussian processes, SVMs... Not decision trees/forests, but these are not more popular than all the others combined.
Maybe there was some small miscommunication here – we are indeed working on "one AD to rule them all" (Capstan, [1]), but it won't go in the standard library as such (there's no need as it will be just as good as a package).
That said, Capstan relies on very new compiler technology and still requires some deep changes there, so will only work on future Julia versions (which may be what he was referring to).
Until then, Flux [2] has an AD that's well-suited to ML and works on current Julia versions.
> It is true that programmer time is more valuable than computer time, but waiting for results to finish computing is also a waste of my time (I suppose I could do something else in the meanwhile, but context switches are such a killer of my performance that I often just wait).
In film CG production, we had a rule of thumb. If running an interactive program takes longer than about ten seconds, the artist (user) becomes more likely than not to get up and go get some coffee or talk to someone else. We consciously made an effort to keep anything someone needed to wait for to under ten seconds, and save anything longer than that for nightly farm renders. We were writing the code in C, btw.
Somewhat tangential to this, but I remember reading an article on HN about a group creating some web app that was constrained by some size limit (50 KB or so). Groups ended up putting 'dead code' in their projects to guard against other groups taking their space while they were working on their feature. I think this made it so the app never got less than 50...
Did you ever see something similar where devs put in some code as filler to make room for future features they were working on?
Oh yeah, this is very common in game development, and I'm pretty sure in other embedded dev too. There's a famous story about a game programmer who saved an entire production when it started crashing something like two weeks before shipping by commenting one line of code. Turned out the line of code was a malloc (of like a megabyte) that he'd added a year earlier, in anticipation of the game running out of memory. My details are probably wrong, but I'm pretty sure I've seen this story linked on HN.
I was in game dev for a decade, and I saw this happen where I worked: the studio technical director adopted the practice of saving some space, because we always started running out of memory near the deadline as artists threw in all their content.
In my experience of vfx, usually the same author has ownership over the whole tool, so there's no incentive to pad time. We have joked about adding in sleeps and then removing them to look impressive to users, though.
Games probably have a situation more similar to what you're describing. Many games target 30fps, which is 33.3 milliseconds per frame. Each department usually gets a "budget" of how long they can spend. I've heard similar stories about padding memory usage and time, but it's hard to tell if they were serious and it's not common.
It's typical in mil & aviation software to get specs with a required 50% CPU & memory reserve, in anticipation of updates over the maintenance window (which can span decades for many products).
If you would like 10-100x faster performance than Python, but would like to keep the easy-to-read code, give Nim [0] a try.
I do all my work in Python, and I've been using Nim in last couple of months - it took me a week or two until I was able to be productive in Nim.
Don't expect Python's large ecosystem, nor some Python goodies, but if you're looking for a readable, writable, high-performance post-Python language - Nim is the way to go!
Exactly, the ecosystem is the only reason IMHO why Python is "easy". I find I'm much quicker at developing C#/F# if the library is available on NuGet (which is often, but not always the case). They're both reasonably fast, too. Python only has a very large community and thus ecosystem to offer.
It can't be the only reason, or how would it ever have attracted such an ecosystem in the first place? Especially with its performance characteristics, and lack of corporate backing.
No, Python was invented at a time when its closest competitor was Perl - and you need only compare typical Perl with typical Python to appreciate that Python really was a usability revelation.
But that was nearly 30 years ago. I do think we can do better now.
• the language is easy, very attractive for non-CS people
• the language is consistent (everything is an object, or a pointer to an object rather, you can only pass by pointer, scope and namespace that make sense all the time, etc...)
• you can learn it gradually, you can start using it even if you know only 10% of the language
• Amazing documentation. The official tutorial is easy to read, and by the time you are through with it you are fairly proficient with python
• error messages that tell you exactly what the error is and where it is happening. A lot of other languages do that now, but that wasn't the case 25 years ago.
• inline help / magical docstring (`a=4 ; help(a)`)
• The REPL, especially being able to get into the REPL with your apps (`python -i`)
• batteries included: again, 25 years ago no language came with such a huge standard library
• PEP 8, the grandfather to gofmt and rustfmt. Makes a big difference when working on a team
• the Zen (`import this`)
And of course as the language grew in popularity:
• the ecosystem (pypi.org)
• the amount of resources, from books to websites, the answer to every question you can think of on SO, very active and helpful mailing lists/usenet groups, and an active and helpful /r/Python
Yes, my point is that function scoping can hardly be described as "makes sense all the time". Especially given that variable declarations in python are _very_ implicit, without even the "var" that will indicate to you that you're messing with your function scope in JavaScript.
> Yes, my point is that function scoping can hardly be described as "makes sense all the time".
Block and function scoping both "make sense all the time."
Whether or not you personally are accustomed to a language's design decisions does not have any bearing on whether or not those decisions "make sense."
In the particular case you cited, of course it is natural that `foo` is in scope outside of the loop; it appears in a line outside of the loop. This is trivially evident in the indentation.
The context is that we're talking about people picking up the language for the first time, coming from other languages. What matters is whether the scoping rules make sense to such people.
For that, it matters whether they're coming from another function-scoped language (or are familiar with one), and whether they're used to having their variable declarations being obvious or not.
For me, when I was first learning Python, this scoping issue was a significant pain point. This is obviously an anecdote, not data, but again for purposes of "people first coming to the language" it's relevant.
Obviously once one has worked in Python for a while one internalizes things like this, just like one internalizes various things in other languages that are obvious pain points for beginners.
Python is and always has been designed to be an introductory programming language, so your criticism does not apply in the case of this design intent. More people learn Python first than learn another language first. Having seen beginning programmers learn Python, I can say this is an utter nonissue to them.
For the case of experienced programmers, the behavior is consistent and simple. In what way does this not "make sense?" Does Haskell's normal order evaluation not "make sense?" Does Ruby's optional parentheses for method calls not "make sense?" Does a regular expression literal syntax not "make sense?" Does JSX not "make sense?" Do public/protected/private modifiers in Java not "make sense?"
Stop confusing a match to your personal comfort zone for actual quality or fitness-for-purpose.
    for foo in bar:
        pass
    # Why is "foo" in scope here???
Well, the answer could be "it's not". It depends on whether bar actually produced anything to assign to foo. Doing a "print(foo)" after that loop might print something, or it might throw a NameError.
Now I understand why that happens (in terms of what the loop actually desugars into), and you understand why it happens. But it's not as consistent and simple as you're trying to make it out to be. You have to really understand what a for-in loop is doing under the hood to explain the behavior.
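A minimal demonstration of both outcomes:

    xs = [1, 2, 3]
    for x in xs:
        pass
    print(x)   # prints 3: the loop variable leaks into the enclosing scope

    for y in []:
        pass
    print(y)   # NameError: the loop body never ran, so y was never bound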
You seem to feel that I'm attacking Python or something. I'm not. It's a nice language to work in, with a lot to recommend it. But it's not a language I'd choose as a poster child for scope and namespace making sense all the time unless you really dig into what's going on "under the hood".
> Does Haskell's normal order evaluation not "make sense?"
No opinion, really; not enough intimate familiarity with the problem space to have one.
> Does Ruby's optional parentheses for method calls not "make sense?"
Again, no opinion.
> Does a regular expression literal syntax not "make sense?"
It depends on the regexp. If your regexp is simple enough, it's fine. In far too many cases you end up with a write-only monstrosity. Also, you say "syntax" as if there were only one; there are multiple and some make more sense than others.
> Does JSX not "make sense?"
Again, no opinion.
> Do public/protected/private modifiers in Java not "make sense?"
It depends on how they're used.
> Stop confusing a match to your personal comfort zone for actual quality or fitness-for-purpose.
Stop confusing "doesn't always make sense" for "doesn't make sense" (totally different statements, there), and the former statement for a statement about quality of fitness-for-purpose. Lots of things are considered high-quality and fit for purpose while still having flaws. Possibly flaws that could not have been avoided without sacrificing other goals. But determining that requires first admitting that the flaws exist and then evaluating them. If we either pretend that the flaws don't exist, or that flaws existing is somehow an indicator that the entire system is unfit for purpose, it's hard to think productively about the design of the next system.
> It can't be the only reason ... that was nearly 30 years ago. I do think we can do better now.
Maybe the ecosystem is the only reason left today. The reasons for initial adoption 25 years ago and the reasons for widespread usage today probably aren't the same reasons. Numpy, Anaconda & Jupyter notebooks didn't exist then, and now they're a huge reason for Python usage. I can't even think of a language that comes with a standard library that rivals Python's, let alone the ecosystem.
Fair point, I was importing the diapers module 30 years ago. But I think in today's world the advantage is mostly in the ecosystem. How the ecosystem came to be is another story.
Why Nim rather than e.g. Haskell (mentioned in the article) or OCaml, which are much more mature and have much bigger, more established tool/library ecosystems?
Because Nim syntax will be familiar to a Python developer. Sometimes all you need to do is add variable declarations and rename `def` to `proc`.
Haskell has a much steeper learning curve. Been there, struggled with that. If I were to recommend a functional language to a Python developer, I would go with F#.
Which is error prone and often actually slower depending on calling patterns between C and Python. There are many languages today that offer better ergonomics than Python/C, and a few (like Go) which offer better ergonomics than Python by itself, all while besting it in performance by one or more orders of magnitude.
I like Python; I just wish I could say the same for its developers...
> There are many languages today that offer better ergonomics than Python/C
Including Cython. I mean, if you've got a C library, sure, interface it with Python; but unless you want something to be called from something else in addition to Python, dropping to C for performance needn't be the default choice in Python; that's the whole reason Cython exists.
> and a few (like Go) which offer better ergonomics than Python by itself
I find Go’s ergonomics to be worse than Python but better, mostly, than Java.
> all while besting it in performance by one or more orders of magnitude.
I would consider Cython if the particular bottlenecks were amenable to calling into a lower level language and if the alternative were porting a large application from Python to something else wholesale, but I would probably never start a new application in Python/Cython if there was any chance that performance would ever matter. The alternatives are simply too good.
> I find Go’s ergonomics to be worse than Python but better, mostly, than Java.
This is surprising. I'm a Python developer, and I still prototype new features in Go.
> The result is that I find myself doing more and more things in Haskell, which lets me write high-level code with decent performance (still slower than what I get if I go all the way down to C++, but with very good libraries).
This strikes me as an odd conclusion to come to if speed was the main motivator.
Speed is the main motivation, but total time is TimeToWriteCode + TimeToRunCode.
Python has the lowest TimeToWriteCode, but very high TimeToRunCode. C++ has the lowest TimeToRunCode, but high TimeToWriteCode. Haskell is often a good compromise for me.
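A made-up illustration of that trade-off (all numbers invented, just to show the shape of the calculation): suppose an analysis is written once and then run 50 times.

    # hypothetical numbers, purely to illustrate TimeToWriteCode + TimeToRunCode (hours)
    runs = 50
    python_total  = 2 * 8  + runs * 3.0   # 2 days to write,  3 h per run   -> 166 h
    cpp_total     = 10 * 8 + runs * 0.1   # 10 days to write, 6 min per run ->  85 h
    haskell_total = 5 * 8  + runs * 0.3   # 5 days to write, 18 min per run ->  55 h

Shift the number of runs or the per-run times and the ranking flips, which is why the answer is so workload-dependent.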
Also, with Haskell, it can be very easy to take advantage of 20 CPU cores, while I don't have as much familiarity with high-level C++ threading libraries.
@ the OP - not to sound hostile, but you write code (like in the example here [1]) that is bound to be slow, just from a glance at it: vstacking, munging with pandas indices (and pandas in general), etc. In order for it to be fast, you want pure numpy, with as few allocations happening as possible. I help my coworkers “make things faster” with snippets like this all the time.
If you provide me with a self-contained code example (with data required to run it) that is “too slow”, I’d be willing to try and optimise it to support my point above.
Also, have you tried Numba? It may be a matter of just applying a “@jit” decorator and restructuring your code a bit, in which case it may get magically boosted a few hundred times in speed.
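For completeness, this is roughly what the Numba route looks like on a numeric kernel (a made-up example, not the OP's code); as noted elsewhere in the thread, it only helps when the hot loop is over arrays and scalars rather than dicts and DataFrames:

    import numpy as np
    from numba import jit

    @jit(nopython=True)   # compiled to machine code on first call
    def squared_distance(a, b):
        total = 0.0
        for i in range(a.shape[0]):
            d = a[i] - b[i]
            total += d * d
        return total

    a = np.random.rand(1_000_000)
    b = np.random.rand(1_000_000)
    print(squared_distance(a, b))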
It's not so easy to post the data to reproduce a real use-case as it's a few Terabytes :)
Here's a simple piece of code that is incredibly slow in Python:
    interesting = set(line.strip() for line in open('interesting.txt'))
    total = 0
    for line in open('data.txt'):
        id, val = line.split('\t')
        if id in interesting:
            total += int(val)
This is not unlike a lot of code I write, actually.
I've also found that loops with dictionary (or set) lookups are a pain point in python performance. However, this example strikes me as a pretty-obvious pandas use-case:
    import pandas as pd

    interesting = set(line.strip() for line in open('interesting.txt'))
    total = 0
    for c in chunks:  # I'm too lazy to actually write the chunking
        df = pd.read_csv('data.txt', sep='\t', skiprows=c.start, nrows=c.length, names=['id', 'val'])
        total += df['val'][df['id'].isin(interesting)].sum()
I'm not exactly sure, but pretty sure that isin() doesn't use python set lookups, but some kind of internal implementation, and is thus really fast. I'd be quite surprised if disk IO wasn't the bottleneck in the above example.
`isin` is worse in terms of performance as it does linear iteration of the array.
Reading in chunks is not bad (and you can just use `chunksize=...` as a parameter to `read_csv`), but pandas `read_csv` is not so efficient either. Furthermore, even replacing `isin` with something like `df['id'].map(interesting.__contains__)` is still pretty slow.
Btw, deleting `interesting` (when it goes out of scope) might take hours(!) and there is no way around that. That's a bona fide performance bug.
In my experience, disk IO (even when using network disks) is not the bottleneck for the above example.
Ok, I said I wasn't sure about the implementation, so I looked it up. In fact `isin` uses either hash tables or np.in1d (for larger sets, since according to pandas authors it is faster after a certain threshold). See https://github.com/pandas-dev/pandas/blob/master/pandas/core...
Could you give a hint of how the data ("sample1", "sample2") looks like, or how to randomly generate it in order to benchmark it sensibly? I guess these are similarly-indexed float64 series where the index may contain duplicates? Maybe you could share a chunk of data (as input to genetic_distance() function) as an example if it's not too proprietary and if it's sufficient to run a micro benchmark.
There's also code in genetic_distance() function that IIUC is meant to handle the case when sample1 and sample2 are not similarly-indexed, however (a) you essentially never use it, since you only pass sample1 and sample2 that are columns of the same dataframe (what's the point then?), and (b) your code would actually throw an exception if you tried doing that.
P.S. I like the part where you've removed the comment "note that this is a slow computation" :)
The speed could possibly be improved by using map. Also, not related to speed if this is all of the code, but it might affect larger programs: you should make sure your file pointers are closed. Something like:
    with open('interesting.txt') as interesting_file:
        interesting = {line.strip() for line in interesting_file}
    with open('data.txt') as data_file:
        total = sum(int(val) for id, val in map(lambda line: line.split('\t'), data_file) if id in interesting)
Have you tried using Cython to compile code like the above? Python's sets / maps / reading data etc should be fairly optimised, so Cython might let you bypass boxing counter variables instead using native C ints or whatever.
Also, if the data you're reading is numeric only - or at least non-unicode / character data - you might be able to get a speed boost reading the data as binary not as python text strings.
Numba does not support dictionaries and has limited support for pandas dataframes (only underlying arrays, when convertible to NumPy buffers, if I understand correctly). This limits usefulness for many non-array situations, as well as some existing code-bases (the dictionary is fundamental in Python and typically used everywhere -- often for performance).
Interesting assertion re: TimeToWriteCode, but I think there's TimeToWriteCode vs. TimeToWriteGoodCode.
I'm working on my first serious Python project right now, and I find it's super easy to throw together some code that more or less works; but for solid, readable, documented, properly unit-tested code I hope is production-ready, it's not any faster than Perl or Golang.
(Sure, if you're a Python expert it's faster for you than for me, but if it's about TimeForExpertsToWriteGoodCode I'm not any more convinced.)
Production-ready is so complex, it's hard to make any comparison. E.g. for a library, writing good documentation (with diagrams and decent technical writing) takes me way longer than the coding anyway - probably by an order of magnitude.
Proper unit-testing is also going to take roughly the same time in any language, just because you have to think hard about sensible tests (although I still love mocking/patching in Python, so I'd give it an edge, plus pdb/ipdb for debugging tests is cool). Production-ready also includes deployment, which for anything non-trivial I'd say Golang > Python > Perl.
Finally, if we're talking "serious project", IMO tooling and how that tooling integrates into a CI pipeline are more important than development speed, because as a team or project goes, terrible CI will slow developers more than any language. Although again here I think Python does quite well with decent linting, unit test frameworks, and code coverage options, Golang's opinionated tools are simpler in this respect.
(I enjoyed C# for similar reasons, although I don't think it's kept up w.r.t. tooling - been ages since I used it though.)
Good points. So far I find I really like Python's mocking, "with self.some_useful_patch()" is really nice, and I like the idea of side effects especially with boto. Of course in some cases it's really difficult, but every language has its tricky unit-testing problems.
One big point I would give to Golang, about which lots of people disagree with me, is the "opinionatedness" of it. It seems to me that Python, like Perl, has a "There's More Than One Way To Do It" mentality, and after many years of that I really appreciated Golang's emphasis on the "idiomatic." That goes for the tooling too.
I have also noticed that the Python ecosystem doesn't have a strong documentation culture, which I find annoying as a relative newbie. But that presumably matters less over time, and it seems to be part of the Python Way to use libraries that "just work" and not worry about the details.
>Interesting assertion re: TimeToWriteCode, but I think there's TimeToWriteCode vs. TimeToWriteGoodCode.
In lots of areas, "good code" doesn't matter much, if at all.
Scientific computing is full of those cases -- you write code to run a few times, and don't care for maintaining it and running it ever again (as long as the results are correct).
Sometimes, even for a one-shot job, you dive in and write passable code; then, as you start to tackle the complexities of the problem at hand, you realise that the pile of ropy code has tied your hands, and it gets increasingly harder to wrap your head around your implementation and finally complete the one-shot job.
> In lots of areas, "good code" doesn't matter much, if at all.
This is the received wisdom in biological science but I’m convinced that it’s trivially wrong. I’ve seen a lot of research code, most of it bad. I have no idea how many bugs are in this code, and I know for a fact that the original authors also don’t know. And it would be truly exceptional if these pieces of code were bug-free (in fact, there’s enough software engineering know-how to categorically conclude that a very high percentage of such code has bugs). How many of these bugs affect the correctness of the results?
… since the code quality is so bad, this is impossible to quantify. So, yes, code quality does matter in science, since it affects the probability of publishing wrong results.
Incidentally, there are cases of retractions of high-impact papers due to errors in code. Of course this will also happen with better code quality; but if conventional software engineering wisdom is right then it will happen substantially less.
That’s easy with Python, too, in a lot of number crunching cases. NumPy with MKL will use all your cores, as will e.g. dask and other libraries built on NumPy. Farming out embarrassingly parallel work to threads or processes is also easy.
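A minimal sketch of the "farm out embarrassingly parallel work" point; the chunking and the work function here are placeholders, but the pattern is just a process pool, which sidesteps the GIL for CPU-bound work.

    import numpy as np
    from multiprocessing import Pool

    def summarize(chunk):
        # any independent, CPU-bound work on one chunk
        return chunk.mean(), chunk.std()

    if __name__ == "__main__":
        data = np.random.rand(8, 1_000_000)   # 8 independent chunks
        with Pool(processes=4) as pool:
            results = pool.map(summarize, list(data))
        print(results[:2])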
If you write more C++ than python, it will have a lower TimeToWriteCode. Despite having spent years writing python I don't find it any more productive than C++.
C++11 has all the nice features you might expect from python with the only drawback being the lack of a REPL.
The lack of a REPL compounds with long compilation times, which are practically a feature of C++ and not going to go away anytime soon. The effect is that, when you explore a new API or need to tune parameters to some function call deep in the call stack, you're an order of magnitude slower than with Python (or Lisp, Scala, F#, Haskell, or even Nim or plain C, thanks to their short compile times).
If you know exactly what you need to write, you're just as quick in C++ as in Python, that's true. Programming is mostly about learning what to write, though, and here C++ loses.
No, it will help with lack of REPL but not with long compilation times. Long compilation times are bad across the board. Go advertises "fast compilation" as one of its key features for a reason.
EDIT: Not to mention, if you write your code as a lot of tiny functions you could just as well write it in C. Once you go for classes and templates, that's where C++ power is visible, but that's also where its compile times suck.
They're great and incredibly useful. And one should not forget that you can easily use them in a Python extension written in C++14 and exported using Cython or SWIG.
Is total time really that interesting as a metric? Factor in cost, both in terms of, say, what the employer pays you and what they pay for CPU time, sprinkle it with costs in terms of externalities (e.g. the cost of millions of clients executing poorly performing code vs the cost of millions of clients paying for the additional development overhead of well performing code) and the equation is a lot more complex and application-dependent.
Then weigh in the hard realities of some engineering problems. It won't matter that it takes 1% of the time to implement a video decoder in python if it can't deliver decoded frames in a timely manner. It won't matter that the C solution will run 1000x faster if you need a month to develop what should be delivered on Friday.
I'm sorry if this is already covered in the article. I had a brief look before but it won't currently load.
As for high level C++ threading you have OMP. It's incredibly easy to use. In the simplest case you just use a preprocessor directive before a loop to say it should run in parallel. It's probably not as nice as what you get in Haskell because it needs to be done explicitly but it is really easy to use.
GHC Haskell is an advanced optimising compiler, which can get very near the speed of C and C++.
However, to write fast programs, one must use the right data structures and algorithms. Often this means array-based strings and streaming IO; unfortunately, many Haskell textbooks don't cover this.
All the comparisons of Haskell and C I have seen where Haskell comes near the performance of C(++) have compared a highly optimised Haskell program to a moderately fast C program.
I have found that most managed languages generally come in at 2-5 times slower than C and C++, which is good enough for me.
Have you tried specifying a smaller maximum heap size? The biggest memory issue IMHO, is that Java and the JVM do not specialise any "generic" code and so there can be a lot of unnecessary boxing going on.
“Lies, damn lies and benchmarks” I think is the saying.
Python itself is a slow language but it has a lot of fast packages, so it shows poorly when you actually write your benchmark in python.
Haskell is a faster language but because it is high level there are more pitfalls you’ll get into if you don’t know the ins and outs of getting fast code out of the compiler. The guys who write fast benchmark code aren’t ‘average’ developers.
So in Haskell it’s “the code is slow and I don’t know why” vs python “the code is slow because python is slow, import fast package someone wrote to speed it up.”
All that said, I think Haskell is the better language but you have to put in more effort to get experienced in it before you see returns on the investment. Python has a shallower learning curve and an easy way to get “good enough” performance (a bit slower than C).
The best criticism in the article is of the multi-core deficiency of python’s interpreter. But that’s only briefly touched on. It isn’t a friendly environment to write complicated multi core code.
> Python has a shallower learning curve and an easy way to get “good enough” performance (a bit slower than C).
The article we are all replying to claims exactly that: as soon as you can't use e.g. NumPy, it's not "good enough" anymore, and I agree with that. The article also argues that JavaScript is no longer in the same category as Python but much faster, even though it's no less dynamic.
I think the reason for JavaScript's speed vs. Python's slowness is obvious: there were wealthy companies involved, which were, due to competitive pressure, motivated to speed up their own JavaScript engines.
To get to the point where Python has similar speeds somebody would have to be motivated enough to invest heavily, and then it could happen. As far as I know, there aren't technical limitations against that.
Having actively worked on a JIT compiler for CPython: in retrospect, JavaScript had a significant advantage over Python, namely the expectation that one's JavaScript code must run more-or-less compatibly on a variety of interpreters.
So much Python has historically been tied to CPython's specific idiosyncrasies that there is significantly more onus on the upstart VM developers to maintain compatibility with paralinguistic behaviour (things like expectations regarding object destruction sequencing).
It seems like the downfall of anything non-CPython is either the C API or the GIL, although I'm shocked at how good package support is looking for PyPy now (http://packages.pypy.org). Makes me want to try it again.
Javascript also has the "benefit" of an appalling base library, while the base library that CPython provides is quite large, and growing.
Every time this comes up, there are also the good examples of Lisp, Dylan and Smalltalk as languages that are as dynamic as Python while enjoying relatively good JIT compilers.
Yes, naive use of lazy evaluation can cause performance problems. So can naive use of strict evaluation. It's important to have a solid understanding of your language's evaluation model. This is probably the largest barrier to writing highly performant Haskell. Not because it is vastly more difficult or anything, but it is very different from pretty much any other widely used language.
I think you could replace "isn't easy" with "isn't familiar". Haskell is very likely to be the first language a developer encounters which is lazy by default instead of strict by default (for many good reasons).
So no, it's not easy in a similar fashion that pointers or double pointers in C/C++ are not easy. Or understanding call by value vs. call by reference semantics are not easy. The list goes on. It's probably the largest barrier to learning the language.
But once you get a handle on the evaluation model it becomes a lot more natural. At least that was my experience, maybe it is not typical.
I don't think it's true to say that Python's core developers are uninterested in performance. Speeding up Python is a hard problem. He mentions PyPy but even that has only managed modest performance gains in some areas (and not without tradeoffs). He suggests JavaScript as a comparison but doesn't elaborate on how they're comparable beyond the superficial (they're both dynamic scripting languages).
I get that he's frustrated with Python's performance but it would be really interesting to hear from someone who knows the technology involved rather than simple speculation.
Python's little tin god really likes his CPython implementation being the One True Python. Python does the things that are easy to do in an interpreter where everything is a dictionary, and avoids things which are hard to do in that environment. In Python, you can store into any variable in any thread from any other thread. You can replace code being executed in another thread. Even Javascript doesn't let you do that. This functionality is very rarely used, and makes it really hard to optimize Python.
(And no, calling C whenever you need to go fast is not a solution. Calling C from Python is risky; you have to maintain all the invariants of the Python system, manually incrementing and decrementing reference counts, and be very careful about not assuming things don't change in the data structures you're looking at. This is not trivial.)
A generation ago, Pascal had the same problem. Wirth had an elegant recursive-descent compiler that didn't optimize. He insisted it be the One True Compiler, and managed to get the ISO standard for Pascal to reflect that. The decline of Pascal followed, although Turbo Pascal for DOS, a much more powerful dialect, had a good run, and Delphi still lives on.
If you neglect some of the metaclass stuff, which I believe most people do, then Python is nearly isomorphic to JavaScript. I think the comparison is very fair.
I also believe the reason Python is unlikely to ever catch up to JavaScript is the same reason that CPython will always be the dominant implementation - They've exposed so much of the C internals, that everyone is bound to the actual slow and single threaded implementation. JavaScript implementations in web browsers can do lots of magic behind the curtains because the majority of users don't rely on the actual innards being consistent from release to release.
Viper is an interesting approach on speeding up Python.
It's developed for MicroPython, which does give them room for breaking changes, but has trade-offs.
Arithmetic is much faster, but dictionary lookups take much longer compared to CPython.
Viper is a code-emitter from a large subset of Python, and even allows for inline assembly. But it's only for a few architectures at the moment, like ARM and x86.
Zerynth is a development suite. Notably, it makes use of a VM.
Viper is just one of the code emitters buried inside the MicroPython source code, like here [0]. Notably, it produces native code, not bytecode for a VM.
To be honest, Dylan is not exactly alive anymore. CL and Smalltalk still have commercial vendors providing implementations; Dylan currently has two implementations, but one of those (Gwydion) is completely neglected and the other (OpenDylan) has maybe 5 developers working on it in their spare time.
It's a shame because, even with its verbose syntax, Dylan is a nice language, with a module system, an object system based on multimethods, hygienic macros and, as you noted, an AOT compiler. It could have been huge. We could have ended up with Dylan in place of Java, and we'd be better off for it. I consider Dylan one of the biggest missed opportunities in PL space.
PyPy achieves huge performance gains in most long-running code. I don't know what you mean by modest. The main hurdle is slightly longer start-up times while the JIT warms up (comparable to Java).
Is writing extensions a lost art? I read a few blog posts about speeding up Python and Ruby with Rust extensions. This should enable rewriting only the slow parts. Later, you could replace more of it if needed. Is writing extensions so very problematic in practice?
I know Go has runtime issues making it not very good for mixing with other languages, so it often encourages rewriting the whole application in it.
Extensions aren't a total solution, though, which people often sell them as. You have an impedance mismatch between Python and C code, because Python has all of its objects packed in a way that is very strange to C, so you end up essentially deserializing all objects into C, then back out into Python, in a very expensive and allocation-heavy (on both sides) conversion.
If you can set up your computation in Python and run it in C, as with a lot of NumPy code, you can have your entire program basically run at C speeds. But if you have a complicated algorithm in Python, perhaps implementing business logic, you can very easily see a slowdown if you try to move bits of that logic into C piecemeal, as you end up paying more in cross-language serialization and overhead than you can win back.
In addition, writing extensions can be hard. First you've got a maze of choices nowadays, and while many of them are quite good at what they do, it can be difficult to figure out whether you're going to do something they aren't good at and have to switch options later, and it's really hard to figure out how to even analyze what they are and are not good at when you're not already familiar with the space. Then, if you do end up having to delve into the raw C, it's very tedious code, very tricky code to deal with the PyObjs, and code that can segfault the interpreter instantly if you don't get it right, which is not what you want to read about your multi-hour processing code. And for a greenfield or very young product, this maze of options is competing against other ecosystems where you can simply implement your code and get it to run 20-50x faster while writing code that is easier to write than a Python extension.
They are a solution to some problems. I don't deny this. NumPy is an existence proof of that statement. But I'd write this post because I'd say "if it's slow, just write the slow bit in C!" has been oversold in the dynamic language community now for at least the 15 years I've been paying attention, and it still seems to be going strong.
In 2003, it may still have been a good choice; in 2018, my recommendation to anybody writing the sort of code where this matters is to pick up one of the several languages that are simply faster to start with, and are much more convenient (even when statically typed) to work with than the competition was in 2003. And I also want to say that Python is still good for many things; I still whip it out every couple of months for something; it's still on my very short list of best languages. But the ground on which it is the best choice is definitely getting squeezed by a lot of very good competition and the changing nature of computer hardware, and a wise engineer pays attention to that and adjusts as needed.
"Python extension" doesn't mean C. Rust works perfectly for extensions: it covers a lot of the low-level C API integration and it is fast. You can write the whole application in Rust and use Python as a glue language.
That sounds like one of the "maze of choices" I mentioned, no?
And if you're "writing the whole application in Rust and using Python as a glue language", you don't have the problem that this entire discussion is about, which is when you have Python code that is slow. Python as an extension language is a completely different world. Performance problems there are a much less big deal, because you've already got the option to simply use the fast language with only modestly more complexity, if indeed even that given how nice Rust is once you get used to it. It's when your whole app is in Python that these issues emerge, and "Just write extensions" is an option far less often than portrayed.
My point is, you are not limited to using extensions only for optimizing hot loops; in Rust you can write application logic as well. I doubt you should do this in C, for example.
PyO3 is a fork of rust-cpython, which has a nasty abort issue[1] which is unfortunately a show-stopper for me. It isn't clear to me if PyO3 is also affected by this issue.
> Pyo3 is not affected by this issue. pyo3 compiles in c-api interface, it doesn't use separate libs for that (python27-sys)
At some point, it must use a separate "lib" for that. Some of the core functions in the Python API exist in the Python binary; it is debatable whether one considers that a "separate lib" or not, but it isn't possible to compile it into your binary. (Or there would be two of whatever you decide to compile in, and that would be problematic.)
I looked into it, since you said it was not affected. PyO3's own README notes that it is affected by the issue; it lists the same proposed solution as the bug against rust-cpython does. While the solution "works", in the sense that you can build a working module from it, the problem is that its ergonomics are terrible; my understanding is that it completely prevents one from being able to `cargo build`.
That said, I was not aware of either `cargo rustc` or `setuptools-rust`; at the time I was looking into it, setuptools lacked the necessary support to implement `setuptools-rust`, so that's nice to see that that has finally occurred. `cargo rustc` alleviates much of the concern around the ergonomics of building the extension, though that'll still be fun to explain to coworkers. The combination of all that would seem to imply that building Rust extensions might finally be somewhat feasible.
Disagreeing with you and agreeing with the parent, it sounds like a lost art... NumPy doesn't pack and unpack Python data structures, it just uses C structures. Python extensions I've written just use C/C++ data types and only occasionally pass Python native types back to Python. Python is amazing for developer productivity, but the methods are a bit opaque.
"Python extensions I've written just use C/C++ data types and only occasionally passes python native types back to python."
See my other post; was your system basically using Python as an extension language on a system fundamentally implemented in C/C++? That's not the problem this discussion is about, which is when you have a large pile of Python code that is the main component of your system, and has proved to be slow. Piecemeal extensionization is not a very good option there, and non-piecemeal extensionization is "rewriting the system in a faster language".
If it's a lost art, it's because the domain where this is the best option is steadily shrinking. There's an increasing number of languages that interop well with C (and sometimes C++), are more convenient, and are still fast. Many of them are fast enough and convenient enough to simply implement your code in that language in the first place. As a result of that, I personally think that dynamic scripting languages have reached their peak and are now facing inexorable slow decline; the problem they solved in the 1990s is increasingly not a problem as a crop of languages that are both convenient and fast continue marching forward. JavaScript is, as ever, an exception due to its currently-privileged place in the browser ecosystem, though over the next decade that's going to fade as WebAssembly hits maturity.
(But let me emphasize that "slow"; I'm talking decades, not months. There is still plenty of opportunity to graduate this semester, get a job in dynamic scripting languages, and be in that space for 20 years. But I think in another 2-4 years we're all going to be able to agree they've peaked.)
I'm not sure I follow what's hard about writing extensions. You can do it in modern C++ if you will and even old SWIG can generate decent bindings for you if you keep your interface sane.
I agree with that. But for the OPs specific problem, a function returning the genetic diversity as a float between two samples, rewriting it in C or Pyrex would have been an ideal solution.
Not even extensions are required -- you could have C libraries that you hook into rather easily. You don't have to learn to navigate the PyObj or the Python extension docs, just know how to write & expose C functions as a shared library.
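As a rough sketch of that route (the library and its mean() function are hypothetical stand-ins for whatever hot loop you moved to C), ctypes can call a plain shared library without touching the CPython extension API at all:

    # Assumed C side, compiled to ./libfast.so:
    #     double mean(const double *xs, size_t n);
    import ctypes
    import numpy as np

    lib = ctypes.CDLL("./libfast.so")
    lib.mean.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
    lib.mean.restype = ctypes.c_double

    xs = np.random.rand(1_000_000)
    ptr = xs.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
    print(lib.mean(ptr, xs.size))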
If an extension just works, people take it, the author gets two or three positive remarks and is ignored from then on because the extension now has utility status.
Much better to write an application in Python using 30 slow Python module dependencies, so there are always some fires to fight and the application always gets publicity.
> At the same time, data keeps getting bigger and computers come with more and more cores (which Python cannot easily take advantage of), while single-core performance is only slowly getting better. Thus, Python is a worse and worse solution, performance-wise.
PySpark makes it really easy to take advantage of multiple cores & machines. Most operations I want to do to my data I can find in PySpark's pyspark.sql.functions, so I get all the benefits of the JVM. In the cases where I need something from Python, I can just use a UDF; it's a little slower than the JVM but still extremely fast when distributed. I find all problems come down to time or memory complexity, which is independent of whatever you're programming in. Also, it's very easy to take advantage of spot instances with Spark... I'm usually working with 2-20 spot instances, and sometimes go up to 60 depending on what I'm doing.
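A rough sketch of that workflow (the session setup and column names are purely illustrative): stay inside pyspark.sql.functions where possible, and drop to a Python UDF only for the odd missing piece, accepting the per-row serialization cost.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, 2.0), (2, 3.5), (3, 7.25)], ["id", "value"])

    # JVM-side work: no Python invoked per row
    fast = df.withColumn("log_value", F.log(F.col("value")))

    # Python UDF: slower per row, but easy when the built-ins don't cover it
    clip = F.udf(lambda v: min(v, 5.0), DoubleType())
    mixed = fast.withColumn("clipped", clip(F.col("value")))
    mixed.show()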
The original article said that one reason it doesn't matter that pure Python's performance is poor is that you can use numpy (and pandas) to vectorise things, which then has native code performance. It goes on to say that his current problem is that the things he's doing today can't be vectorised with numpy – so that poor performance does matter after all. If he can't even express his code in terms of numpy operations (and other C-based libraries like scipy), I doubt they're going to be expressible in terms of Spark's primitives, which are a considerably smaller subset.
This is true, but using a Distributed system like Spark itself adds a ton of complexity in having to understand and manage it. If one can do something with a set of stateless processes, even if it's more performant, I feel it's a bad idea to use a distributed system instead. Not always, but a good majority of cases that I've seen. I've seen projects where Celery would be enough but instead they chose to use Spark/Storm and never delivered.
Spark is great, but at that point why not just use Scala? It offers Python-like conciseness/productivity, and by using the same language Spark is written in you avoid a big class of possible interop issues.
PySpark should only be used for prototyping. It adds enormous extra overhead to operations due to serializing data back and forth between the Java and Python processes.
I’d like to thank the author for sharing a very practical view of problem solving in the data science space.
Can I suggest Julia? It's very easy to understand coming from Python, and performant code can usually be had with an easy-to-read implementation of the expressions in whatever paper you are basing your work upon.
Dan is thorough, and I trust him to make a good faith effort to understand things. If you'd like to refute the arguments and not the messenger, I would love to learn more.
The post is quite old: while the technical arguments certainly had merit at the time, they have largely been addressed (the exception is probably error handling, but his complaint there is more subjective, and I still don't think any language really has a good answer for that one).
As to the community, I'm not exactly sure what happened with Dan (he only has a handful of posts on GitHub and mailing lists, so it seems to have been largely in private emails), but my experience could not have been more different: even from the early days they have been very friendly and helpful.
The linked article doesn't attempt to refute the significant claims from Dan's article (which has an update from a year later, so 2015, at the bottom):
1. That the language is (was) undertested, and as a result, full of easy to run into bugs
2. The language makes it easy to ignore errors
3. APIs are inconsistent
4. The head branch isn't kept build-clean (i.e., often it fails to build)
5. Code is often undocumented, with non-descriptive naming. The combination makes it difficult to understand what code is doing.
Dan also writes that he expects the performance issues would be fixed and those aren't what concerns him:
> The purely technical problems, like slow load times or the package manager, are being fixed or will be fixed, so there’s not much to say there.
And bonus community problem:
> Update: this post was edited a bit to remove a sentence about how friendly the Julia community is since that no longer seemed appropriate in light of recent private and semi-private communications from one of the co-creators of Julia. They were, by far, the nastiest and most dishonest responses I’ve ever gotten to any blog post. Some of those responses were on a private discussion channel; multiple people later talked to me about how shocked they were at the sheer meanness and dishonesty of the responses. Oh, and there’s also the public mailing list. The responses there weren’t in the same league, but even so, I didn’t stick around long since I unsubscribed when one the Julia co-creators responded with something bad enough that it prompted someone else to to suggest sticking to the facts and avoiding attacks.
Now, I don't know if any or how many of those issues have been fixed since 2014-2015. But the 2018 blog post about microbenchmark improvements doesn't really address these concerns.
There have been over 25,000 commits since that post was written; clearly, any comments about the specifics of the language are terribly out of date.
For example, the base language now has 92% test coverage.
CI is being run on Windows 32/64 bit, Linux 32/64bit, macOS, FreeBSD with ARM CI coming up.
Every day a large number of benchmarks are run on dedicated hardware and the results are tracked. Benchmarks are also run on most PRs that have potential performance implications before merging. Before new point releases, the tests for all registered packages are run and compared to the old version. Every new failure in a package test is tracked down to make sure there is nothing breaking in the point release.
While package load time is still an issue, the precompilation feature that was introduced has helped significantly. Also, new methods of working with the language, like Revise.jl (https://github.com/timholy/Revise.jl), which updates the code that is being executed in real time when you save your file, make load time much less of an issue.
No matter who the author of a blog post is, if it is about something that has remained in rapid development for years after the post was written, the information will have little relevance to the current situation.
(1) Test coverage has dramatically improved.
(2) Improved, and to some degree a matter of taste (Julia is not statically checked, so some errors can only be discovered at runtime)
(3) There has been significant refactoring since 2014 to improve consistency, especially in the recent push toward 1.0.
(4) Developers are much more disciplined now, and just about everything goes through CI.
(5) Docs are much, much better for both users and developers. For the latter, see https://docs.julialang.org/en/latest/devdocs/ast/ and others in that section (none of which existed in 2014, IIRC).
> And bonus community problem:
I can't comment on private discussions to which I was not a party, but people can search the archived google groups discussions (julia-users) for the blog author's name to read the public threads.
That said, the core of the concern, as I understand it, was that these perceived issues would put a damper on growth. Young languages, especially, need growth to outpace attrition to maintain viability (ecosystem, broad testing and platform support, etc.). Three years on, with sustained user-base growth, increased funding, and Julia having been used for a number of articles in high-impact journals (e.g. Nature) -- I think this concern is somewhat less pressing. See
https://pkg.julialang.org/pulse.html for a snapshot of the sustained ecosystem growth, and
https://discourse.julialang.org to get a sense of the breadth, depth, and responsiveness of the community (in my obviously-biased view, of course!).
A related concern was (quoting the blog post):
> A small team of highly talented developers who can basically hold all of the code in their collective heads can make great progress while eschewing anything that isn’t just straight coding at the cost of making it more difficult for other people to contribute. Is that worth it? It’s hard to say. If you have to slow down Jeff, Keno, and the other super productive core contributors and all you get out of it is a couple of bums like me, that’s probably not worth it.
Julia as a whole has 697 contributors as of right now. At the very core of the language, the parser and lowering code have well over 40 contributors, and code generation (LLVM lowering) has over 60.
It's really shocking to me that years later, nobody has addressed the most serious issue (community interaction) Dan brought up. Just reading the archived mailing list thread I could not believe the suspicion, and later outright hostility towards Dan's comments. It obviously dawned on a couple of core contributors, as can be seen by the sheepish attempts to walk things back, but this sort of interaction _with the project leadership_ is a giant red flag about how they view their community and users. If someone like Dan can be treated in that way publicly (and far worse privately, according to his own account), things are going to be so much worse for other people.
FWIW, I have had very positive interactions with the community, including some discussions about proposed features on the github issues with founders as part of the discussion.
My impression is they are thinking very deeply about this work, from a theoretical CS standpoint, but also open to input from average users of the language.
Thanks for mentioning Julia, a good solution to slow Python code. I used Julia for 2 weeks last year on a consulting gig, and despite the rough spots, I think that given more development time Julia might become fairly popular.
I have never been much of a fan of Python. 15 years ago at lunch Peter Norvig was talking about the advantages of Python (he and I wrote Common Lisp books at the same time). I then tried Python for a few months but then went back to Ruby for a scripting language and Java, Common Lisp, and Haskell for non-scripting tasks.
I now manage a machine learning team that is all-in on Python, so that is what I am using also. I am starting to really like Python, and by using type annotations and making heavy use of pylint and PyCharm, I am finding Python really nice to use. Spinning up on Cython is on my want-to-do list as well.
This. I can't emphasize enough how useful type hints + a decent IDE are for productivity.
Cython is also pretty good. My tip for writing fast Cython is to code as if you were writing pure C and forget higher-level constructs. This way it gets translated to C almost 1:1, with all the performance benefits.
Alternatively, if you do need more abstraction, you can write nice C++14 and use Cython as glue code.
This is the productivity problem that Julia aims to solve. The idea is to stay in one language that everyone on the team can work productively in, while still producing high-performance code.
>I used to make this argument. Some of it is just a form of utilitarian programming: having a program that runs 1 minute faster but takes 50 extra hours to write is not worth it unless you run it >3000 times. For code that is written as part of data analysis, this is rarely the case.
I find this argument breaks down if you consider human psychology. Especially with a program taking 15 seconds or 30 minutes (which is a reasonable time span between say a C++/Rust implementation and a Python implementation in some cases I've experienced).
With 15 seconds of exec time you might stay in flow. With 30 minutes you're almost guaranteed to have started something else. Maybe you even forget and only get back to it the next day. All of a sudden your 30-minute delay becomes a day. Then you notice you made some wrong inputs, and you lose another day. In the other case you're still under a minute.
I find small increases in program delay often lead to big increases in time inefficiency. It's hard to constantly context switch in and out of tasks.
I think the utilitarian argument should take human psychology into account and weigh more heavily towards faster programs.
I recently discovered that pypy3 can run all my day to day Python code. It has some issues with slightly different behavior from cpython when using threads but other than that I see a 4x speedup on most of my slowest pure python workloads (parsing large rdf files and reserializing them after computing a total order on all their nodes). Huge win for productivity.
> I see a 4x speedup on most of my slowest pure python workloads
Heh, only 25..250X to go. We did a direct line-for-line translation of some numerically intensive code from Python to C++ and saw a literal 1000X speedup. On other projects, the gap has been more like 100X. That says two things: first, Python can be really slow; second, for some programs, Python doesn't really save on lines of code over modern C++.
I've been very impressed with PyPy however. In testing, it can sometimes sneak up to less than a factor of 2 slower than C. However, the bummer comes when it doesn't hit that mark and you have no idea how to trick the JIT to do better. If it works, great. If it doesn't, you don't have much insight into why.
Finally, I've always been able to get Cython to parity with C++. However, when I'm done, I wonder what I gained. The C++ isn't that much more complicated than adequately type annotated Cython.
While I'm more of a pythonist than a C-ist, hearing "1000x speedup" and "line for line" to me implies that you aren't writing idiomatic python. Idiomatic python is (often) faster than not, and (often) more difficult to translate to lower level languages.
As a simple example, list-comprehensions are faster than loops, and can't be line for line translated into C++.
#include "cpplinq.hpp"
int computes_a_sum ()
{
using namespace cpplinq;
int ints[] = {3,1,4,1,5,9,2,6,5,4};
auto result = from_array (ints)
>> where ([](int i) {return i%2 ==0;}) // Keep only even numbers
>> sum () // Sum remaining numbers
;
return result;
}
Are they really though? In my experience you'll gain a couple of percent because you're getting rid of that call to append, but that's hardly the orders of magnitude OP was looking for.
Yes, I wasn't saying they would fix this problem (they won't), but making the broader point that in general, more idiomatic/pleasant looking code is faster. This is neither of those, and so my hunch is that there are improvements that can be made to both form and function.
> Finally, I've always been able to get Cython to parity with C++. However, when I'm done, I wonder what I gained. The C++ isn't that much more complicated than adequately type annotated Cython.
Depends on who needs to work with the code; but sometimes, it might be nice to have an obviously correct (but slow) pure python version, that can share apis and tests with the convoluted "fast enough" version.
Also, I’ve more than once seen cpython beat C++/Fortran since it’s easier to do the right algo/datastructure things, plus numpy is more optimized than most «amateur» C loop-over-arrays.
Honestly, NumPy is gonna be hard to beat even for someone knowledgeable in certain use cases, especially ones where the overhead in Python is trumped by time spent in library calls. It's the same reason that it's hard to beat MATLAB or Mathematica in cases they are optimized for despite being relatively slow languages. They are calling some of the most heavily optimized libraries in existence (e.g., BLAS) and using heuristics to help choose the smartest evaluation strategy.
Edit: More speed on the Python side is good though, because it gives you flexibility. Sometimes it's hard to figure out how to do stuff optimally in NumPy, versus just banging things out in a for loop. I've definitely done that when I wanted something to just work, versus spending an hour figuring out what arcane incantation I need to pass to np.einsum to get the operation I want.
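For what it's worth, here is the kind of trade-off I mean, with made-up data: row-wise dot products as an obvious Python loop versus the einsum incantation that stays inside the compiled machinery.

    import numpy as np

    a = np.random.rand(10_000, 64)
    b = np.random.rand(10_000, 64)

    # obvious but slow: one Python-level call (and one boxed float) per row
    slow = np.array([np.dot(x, y) for x, y in zip(a, b)])

    # the incantation: for each i, sum a[i, j] * b[i, j] over j
    fast = np.einsum("ij,ij->i", a, b)

    assert np.allclose(slow, fast)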
Sure, I’ve seen C++ experts beat by numpy, i.e BLAS/LAPACK. A lot of people don’t use BLAS from C++ either, which would improve the performance on that side, too.
I don't find this a very compelling argument. The author doesn't mention any attempts to profile or speed up the code.
Specifically with pandas I've found if you aren't careful you can do a lot of unnecessary copying. Not sure if that's what is going on here, but cProfile can help find the bottlenecks.
Seconding this, there are a couple of things that jump out at me as immediately non-optimal, and which together would probably give an order of magnitude speedup.
- Defining compute_diversity inside a double for loop
- `sample1.ix[sample1.index[sample1.index.duplicated()]]` appears overengineered (I think you can just remove the `sample1.index` here (edit: you can't, but I think you could refactor to remove the indexing and reindexing and index resetting, and then you could))
- Depending on the data size, swapping from `[` to `(` everywhere would give a nice speedup just because you no longer need to store everything in memory/swap to disk, whereas in Haskell the list comprehensions would be lazy by default. (edit: seeing as the databases downloaded are 12 and 33 GB, and Pandas generally requires 2-3X RAM, it's likely that there's swapping happening somewhere. I'd bet that using generators would be a big speed boost -- a sketch of the list-vs-generator difference follows after this comment)
- Overall I think genetic_distance can be significantly simplified, a lot of the index-massaging doesn't look necessary. I could be wrong, but this looks sloppy, and sloppy often implies slower than necessary.
Unfortunately, the provided data files are big enough that I can't easily benchmark on my computer. I can't even fit the dataset in memory!
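To illustrate the `[` vs `(` bullet above (toy numbers, nothing to do with the OP's data): the bracketed comprehension materialises every element up front, while the parenthesised one yields lazily, so peak memory stays flat.

    import sys

    rows = range(1_000_000)

    as_list = [r * 2 for r in rows]   # holds a million ints at once
    as_gen = (r * 2 for r in rows)    # holds only the generator's state

    print(sys.getsizeof(as_list))     # several megabytes of pointers alone
    print(sys.getsizeof(as_gen))      # on the order of a hundred bytes
    print(sum(as_gen))                # elements are produced one at a time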
The problem is that libraries that work fine when everything fits in RAM start breaking down if you aren't careful. Not really a Python speed issue, but you lose some of the tools you relied on previously.
While that may be true, my point is that it is almost certainly possible to make your code go faster than it is already, and also become more readable in the process.
And so saying that python is either slow or ugly and unreadable is perhaps an unfair characterization. I may be wrong here. I haven't benchmarked the code in question, but I think that even for the algorithm you're trying to do, with the special casing, that function could be significantly simplified.
Edit: I'd be curious to see example data that is passed into this function.
That may be the case. However, my point is that we started with a rather direct implementation of a formula in a paper. This was very easy to write but took hours on a test set (which we could extrapolate to taking weeks on real data!).
Then, I spent a few hours and ended up with that ugly code that now takes a few seconds (and is dominated by the whole analysis taking several minutes, so it would not be worth it even if you could potentially make this function take zero time).
Maybe with a few more hours, I could get both readability and speed, but that is not worth it (at this moment, at least).
The comment about the benchmark data being large is exactly my point: as datasets are growing faster than CPU speed, low-level performance matters more than it did a few years ago (at least if you are working, as I am, with these large data).
1. You could have gotten similar performance boosts elsewhere, meaning that you wouldn't have needed to refactor this function in the first place (although the implication of a 10,000x speedup means that may not be true, though I can absolutely see the potential for 100x speedups in this code, depending on exactly what the input data is)
2. It's likely that there are more natural, more idiomatic ways to implement the function you have in pandas. These would be both clearer and likely equally fast, possibly faster. (Heck, there are even ways to refactor the code you have to make it look a lot like the direct-from-the-paper implementation.)
In other words, this isn't (necessarily) a case of Python having weak performance; it's a case of unidiomatic Python having weak performance. This is true in any language, though. You can write unidiomatic code in any language, and more often than not it will be slower than a similar idiomatic approach (repeatedly applying `foldl` in Haskell, say). I'm not enough of an expert in pandas multi-level indexes to say that for certain, but I'd bet there are more efficient ways to do what you're doing from within pandas that look a lot less ugly and run similarly fast.
Granted, there's an argument to be made that the idiomatic way should be more obvious. But "uncommon pandas indexing tools should be more discoverable" is not the same as "python is unworkably slow".
1. No, that function was the bottleneck, by far, and I can tell you that >10,000x was what we got between the initial version and the final one.
2. I don't care about faster at this point. The function is fast enough. Maybe there is some magic incantation of pandas that will be readable and compute the same values, but I will believe it when I see it. What I thought was more idiomatic was much slower.
I think this is more of a case of "the problem does not fit numpy/pandas' structure (because of how the duplicated indices need to be handled), so you end up with ugly code."
1. you don't get 10000x speedups by changing languages. It's likely that this optimization would be necessary in any case.
2. You don't care about improving the code, but you did care enough to write an article saying that the language didn't fit your needs without actually doing the due diligence to check and see if the language fit your needs. That's the part that gets me.
Well, I just wanted to use pandas to load a 4GB csv file. After it had used 32GB of my RAM and 4GB of swap, I gave up. I just loaded all that data into Postgres and made a couple of queries. That way I stopped using pandas at all.
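One common workaround for that specific failure mode (not what was done above, and the file/column names here are made up): stream the CSV in chunks and keep only the reduced result, so peak memory is bounded by the chunk size rather than the file size.

    import pandas as pd

    totals = None
    for chunk in pd.read_csv("big.csv", chunksize=1_000_000):
        part = chunk.groupby("key")["value"].sum()
        totals = part if totals is None else totals.add(part, fill_value=0)

    print(totals.sort_values(ascending=False).head())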
I found that pandas is great for data exploration and for data that you know is small (a few hundred MB). Other than that, Python builtins and numpy arrays are a better alternative.
I hardly use pandas at this point besides read_csv, which is very good once you know the syntax for parsing strings/dates, skipping rows, dropping columns, etc.
After that I usually just keep the numpy array since all I need is floats. I guess the index groupby stuff is cool, but I never really needed it. Postgres is fine but if you're just doing numerics it doesn't help much.
I use a lot of Python for web stuff and I haven't been in a situation where Python itself was the performance bottleneck. I always thought that when you run into a situation where Python is the bottleneck, you replace the critical bits with something like C/C++/Rust. Following this approach, you would get the best of both worlds: rapid proof-of-concept/time-to-market with the option to improve performance-critical parts later (which often isn't necessary). Could anybody share some experience with this?
It requires that your code is architected so performance-critical sections of Python can move into C etc. Let's say that your code creates a complex object tree from some configuration settings, and executes Python methods and code from all over it, using heavy OO. That is difficult to move to C++, as your time is spent on Python bookkeeping: you are calling methods and thus looking things up in dictionaries, you are modifying fields and looking up more things in dictionaries, incrementing and decrementing refcounts, etc.
If you have a million 32-bit numbers that you currently run Python code on, great, you don't have to convert Python objects to C at all.
> Let's say that your code creates a complex object tree from some configuration settings, and executes Python methods and code from all over it, using heavy OO.
Luckily, this does not apply to the codebases I'm working on, which are all quite functional (no classes, no inheritance, pure functions exclusively, immutable data types, etc.). I have the feeling that this will not hit me that hard. If you rely on pure functions, you have all the application state that the function needs in its parameters, and you pass all new state back through `return`. I guess all I'd have to do is convert the types once for the C function call (from Python to C) and once for the `return` (from C to Python)?
His example of a function which is unreadable is pretty typical. It still might be slower than a tight loop in C, but it’s only unreadable the first time you write something like that.
I admit I recognized only some general NumPy things like masks, unique, reductions, outer, etc.: I don’t use Pandas and am not sure what the non-NumPy stuff does.
I still don’t think it’s more obscure than equivalent for loops or FP folds or similar.
The go-to solution for speeding up Python code should always be first to use Cython on critical sections of your Python code and tweak your code using type annotations, at least IMHO.
Do type annotations really make any difference to the interpreter? I thought that the interpreter doesn't care about what type a variable is annotated to...
> Do type annotations really make any difference to the interpreter?
Note the advice was first use Cython (a compiler for a superset of Python), and then tweak as needed with type annotations. Cython’s compiler definitely uses type annotations.
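A rough sketch of that "Cython first, then add types" workflow, written in Cython's pure-Python annotation style (assuming Cython is installed) so the same file still runs unmodified under CPython; once compiled with cythonize, the annotated ints and doubles become C-level variables and the loop loses most of the interpreter overhead.

    import cython

    def harmonic(n: cython.int) -> cython.double:
        # under plain CPython these annotations are effectively ignored;
        # compiled with Cython they turn the loop into C arithmetic
        total: cython.double = 0.0
        i: cython.int
        for i in range(1, n + 1):
            total += 1.0 / i
        return total

    print(harmonic(10_000_000))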
OP, I would encourage you to take some courses on high-performance computing and, especially, on architecture awareness in programming. These types of courses will help you increase the performance of your programs by making you aware of what's running "under the hood" and of ways to "help" the compiler/interpreter find better optimisations.
Although it's accurate to say that Haskell or C++ is faster than Python, having had a quick look at the examples you posted around here, I believe there's still a lot of room to improve your Python code (performance-wise) in ways that could bring a significant speedup.
However, bear in mind that you shouldn't expect Python to get close to C++ performance unless you start using libraries such as NumPy that are, essentially, written in C/C++.
To me, the critical quality exposed is the abstraction/synthesis moment of Haskell/types/FP thinking. Python is what I use, but I rely on insights from a Haskell person to get solutions of merit. Left to my own devices, I frequently derive Python solutions with bad scaling, few and weak opportunistic parallelism moments, and heaps of errors.
When driven to think in types and simple function composition, the solutions seem to run better.
Let me start with a disclaimer: I like Python, I've used it for a decade, and it pays my bills.
The author claims that Python has the lowest developer cost. I used to think that was true, and maybe it is in data science applications, but I regularly find that I'm quite a bit more productive in Go than in Python (largely thanks to the type checker and other static analysis tooling). As an added bonus, Go programs are regularly 100 times faster than Python programs, and usually Python programs are much more difficult to optimize than Go programs.
Library availability notwithstanding, starting new projects (of any significance at all) in Python is looking like a worse and worse choice all the time.
> Python, it is slow as molasses. I don’t mean slower in the sense of “wait a couple of seconds”, I mean “wait several hours instead of 2 minutes.”
Python can be multiple orders of magnitude slower than the equivalent in C/C++/Rust, but,
> more cores (which Python cannot easily take advantage of)
Python's multiprocessing makes launching new processes (which can take advantage of more cores) pretty much as easy as launching threads.
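A small illustration of that claim using concurrent.futures: switching from threads (which stay GIL-bound for CPU work) to processes is a one-line change of executor.

    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

    def busy(n):
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        work = [2_000_000] * 8
        with ThreadPoolExecutor() as ex:      # CPU-bound work stays serialized by the GIL
            list(ex.map(busy, work))
        with ProcessPoolExecutor() as ex:     # same call, now spread across cores
            print(sum(ex.map(busy, work)))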
But as a developer, I am frustrated by a lot of the things people believe are options. They are options, but … they're hard to use, and hard to take advantage of.
* Writing a Python extension requires dropping down to C, which has so many foot-guns, I'd like to delay doing so as long as absolutely possible. Even then, you might not be able to win back that much performance, if most of your time is spent manipulating Python objects. Programmers, in my experience, also vastly overestimate their ability to write correct C.
* Cython can compile Python to "C", but in Python 2 (which I am alas stuck with, despite my will; someday…), has a bug that miscompiles code dealing with metaclasses. Worse still, the latest version of six will trigger this bug. (The Cython developers do not consider this — Cython's compiled version of code behaving differently from Python and CPython — a bug.)
* rust-cpython is theoretically great, but has a bug on OS X that causes aborts (it erroneously links against the Python binary, I think, and this causes issues w/ virtual environments, where a different binary ends up getting used. I don't think this affects Linux, but I need to support OS X.)
(Throw in the enormous amount of time that I spend debugging "object of type NoneType has no attribute 'static_typing'" errors, and the amount of time that I spend wondering "what type is this variable supposed to be?" and working it out by reverse-engineering the code, and I honestly wonder if Python is actually "faster".)
It's hard to tell ahead of time if performance matters. When you hack something out, and are able to get it to work, it inevitably starts to grow features. At some point (around 1000 lines in my experience), Python no longer is fast enough, but the code is difficult to port to something faster.
It would be much better if you didn't have to backtrack on all the code you wrote. I've been burned by this enough times that now I am wary of starting anything in Python, because I know it will grow to be bigger, and I will regret having picked Python!
Exactly. For web dev stuff, the database is usually the bottleneck. Caching is the next place to put your efforts. I think I have had one problem where Python was too slow.
At PyCon US last year, Intel had a booth promoting their version of Python and data libraries optimized for Intel processors. Wonder if any of this would have helped.
I don't think they made raw Python code faster (like PyPy does); rather, they bundle some libraries and make those faster (roughly: if the CPU is Intel, enable optimizations they've done elsewhere).
Sad not to see Cython getting a mention in this post. It is a superset of Python, so your vanilla code will run just fine, and you can optimize slow parts to native-speed. It's a fantastic tool.
I have the impression that there are features in Python that add very little programmer productivity but make the language slow. It should be possible to implement a hypothetical FastPython without such features but with great performance gains. Of course it wouldn't be compatible with most of the libraries. I can imagine, though, that porting most of the libraries to FastPython would still be a manageable task. I wonder if such projects have been attempted.
You mention 1TB files. Why do you guys at EMBL not use a database for this sort of stuff? I'd figure that with some proper indexing you could see pretty decent speedups just from that already.
Not OP, but I work on a downstream project and my current boss used to work on EMBL-Bank back in the day.
A lot of this stuff is in databases, e.g. Oracle, and I think for advanced search it was in Teradata.
However, databases are hard to share, so many steps require dumping the database into interchange formats (custom, and often from before the age of XML or JSON -- yay for ASN.1 parsing!).
Sharing database dumps is done, but commercial licenses and version mismatches add issues here as well. Remember, EMBL/ENA is older than MySQL.
The databases tend to have the wrong shape for the next downstream step, i.e. the table design is tied to a workflow, and if your next step is completely different, you end up with issues. Also, some data can't be published until a certain date, so that needs to be filtered from the dumps in some way.
Consider as well that this project is 3 decades old and used to be printed in books at some point, and shipped on DVD as recently as 2004. File based operations can be extremely efficient.
For some things, we do. But databases are not magical and setting up a good table/index system &c is also work and there is overhead.
Thus, if we are talking about having (for example) a webservice where queries have a form that is known a priori, then it's a good solution. If you have output data from your processing that you will be slicing and dicing in different ways which you cannot predict ahead of time, then they are not appropriate.
(Loading Terabytes of data into a database takes a while too).
Bioinformatics is perpetually ten years behind. The de facto standard for sequencing data is effectively a stripped down bzipped plain text file. It's madness.
Naive interpretation of the bytecode (not even pre-decoded, just a switch statement).
And almost everything is resolved in the dynamic environment. For example,
This is a bit misleading. You suggest that local variables are looked up by name in a dictionary, which is not the case. They are looked up by indexing into a C array, with the index being a constant in the bytecode. That's quite a lot simpler. Here is the corresponding code (look above for the definition of the GETLOCAL macro): https://github.com/python/cpython/blob/fc1ce810f1da593648b4d...
But this isn't a very good rendering of the thing because it doesn't show the many redundant reference count increment/decrement pairs every time you touch a variable.
(Also, interpreter dispatch uses computed GOTOs instead of the plain switch on C compilers that support it.)
LOAD_FAST is the normal case for locals inside a function. Not sure off the top of my head where LOAD_NAME would be generated in normal usage (i.e. where you don't evaluate code from a string).
Edit: Also, I'm talking about Python 3. Maybe you aren't.
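An easy way to see the distinction being discussed, using the dis module (Python 3; exact opcode names vary a little across versions): locals compile to index-based LOAD_FAST/STORE_FAST, the module-level name is fetched with LOAD_GLOBAL, and LOAD_NAME mostly shows up in module-level or exec'd code.

    import dis

    SCALE = 3

    def f(x):
        y = x + 1          # x and y: LOAD_FAST / STORE_FAST (array-indexed)
        return y * SCALE   # SCALE: LOAD_GLOBAL (dictionary lookup)

    dis.dis(f)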
> Where are the main blowouts in python performance?
I did some research a few years ago that tried to quantify some of this. If you trust my methodology, the biggest problems (depending on application, of course) are: boxing of numbers; list/array indexing with boxed numbers and bounds checking; and late binding of method calls. Basically, doing arithmetic on lists of numbers in pure Python is about the worst thing you can do.
And it's not just due to dynamic typing: Even if you know that two numbers you want to add are floats, they are still floats stored in boxed form as objects in memory, and you have to go fetch them and allocate a new heap object for the result.
The basic idea of my study was as follows: Compile Python code to "faithful" machine code that preserves all the operations the interpreter has to do: dynamic lookups of all operations, unboxing of numbers, reference counting. Then also compile machine code that eliminates some of these operations by using type information or simple program analysis. Compare the execution time of the different versions; the difference should be a measure of the costs of the operations you optimized away. This is not optimal because there is no way to account for second-order effects due to caching and such. But it was a fun thing to do.
As for how to improve this, I think Stefan Brunthaler did the most, and the most successful, work on purely interpretative optimizations for Python. Here is one paper that claims speedups between 1.5x and 4x on some standard microbenchmarks: https://arxiv.org/abs/1310.2300
Basically, you can apply some standard interpreter/JIT optimization techniques like superinstructions or inline caching to Python. But these things are hard to do, they won't matter for most Python applications, and come with a lot of complications.
tl;dr: Python's dynamic features add lots of overhead to every operation, and CPython's simple implementation means you pay the overhead even when you don't use the dynamic features.
A few things quickly come to mind, after having maintained a patched version of Python 2.7:
- The dot operator (e.g. `foo.x`) hides a /very complicated/ resolution process that can be /very expensive/; see the sketch after this list. (The documentation about this process also deceptively makes you /think/ you understand how it all works, whereas you probably don't unless you're intimate with the C implementation.)
- Global variables are slower to access than local variables in CPython: the former require hash-table lookups, whereas the latter are array-index operations. The globals dict is also not restricted to string keys, which further complicates how globals are handled.
- `import` statements are idiomatically done at the top-level of a module, and often are used as qualified imports! E.g., `import os` followed by the use of `os.path.join(foo, bar)` later on. This hits the costs of both global variables and the dot operator.
- Other syntactically simple constructs, like indexing, relational operators, `len(foo)`, etc., all support overloading, which increases the complexity of their implementations.
- CPython has a simple implementation (bytecode interpreter, not really any optimizations), meaning the cost to support overloading and dynamism is /always paid/.
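To illustrate the first point about the dot operator, here is a rough sketch (with made-up classes; the real resolution lives in C) of the order `foo.x` has to consider: data descriptors on the type, then the instance `__dict__`, then plain class attributes and non-data descriptors, then `__getattr__` as a last resort.

```python
class DataDescriptor:
    def __get__(self, obj, objtype=None):
        return "from data descriptor"
    def __set__(self, obj, value):          # having __set__ makes it a *data* descriptor
        raise AttributeError("read-only")

class Foo:
    x = DataDescriptor()     # data descriptor: beats the instance __dict__
    y = "class attribute"    # plain class attribute: loses to the instance __dict__

    def __getattr__(self, name):
        return f"fallback for {name!r}"     # only reached when nothing else matched

foo = Foo()
foo.__dict__["x"] = "instance value"        # never seen: the data descriptor wins
foo.__dict__["y"] = "instance value"

print(foo.x)        # "from data descriptor"
print(foo.y)        # "instance value"
print(foo.missing)  # "fallback for 'missing'"
```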
Great article. The way I look at it is something like the swordsman scene in Raiders of the Lost Ark. The swordsman's doing fancy optimizations and C just blows away the need.
If you can find a language that's 100x the performance of an interpreted language, that speed delta will cover up a lot of naivety in the code you write.
Maybe there's merit in a Python variant that compiles to JS, where we allow engines like V8 to do the optimisation. The dynamic nature of JS may be enough of an impedance match to allow this to happen.
Writing this up, I'll probably get someone suggesting that this is already a reality with some tool. Glad to be taught.
I don't see it mentioned here, but the dask library looks like a promising solution. It has ways to handle these kinds of large datasets and to efficiently schedule computations that don't fit a numpy model. Worth a look.
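For anyone curious, a minimal sketch of what that looks like (the array shape and chunk sizes here are arbitrary, and this assumes dask is installed):

```python
# Lazily build a chunked array that would be awkward to hold as one numpy array,
# then let dask stream the chunks through memory when the result is requested.
import dask.array as da

x = da.random.random((100_000, 10_000), chunks=(10_000, 10_000))  # ~8 GB of doubles
column_means = x.mean(axis=0)    # nothing computed yet, just a task graph

result = column_means.compute()  # executes chunk by chunk
print(result.shape)              # (10000,)
```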
When I was taking a python class in school, the professor did something to generate C code from the Python code, and it gave something like a 60% speedup.
Probably Cython. Sometimes it helps, sometimes it doesn't work (doesn't like generator comprehensions iirc) but mostly it provides a "sliding scale" into C or C++ land -- after the first compile, you can start littering type declarations around the code and you can stop whenever you hit the speed you want. It has been a good solution for me in the past, but probably only because I started with a Python codebase. If I were starting afresh I wouldn't bother.
CPython is obviously the way it is on purpose, by design, and quite successful. Yet I still find it ironic that this "slow" interpreter is written in C, the go-to, general-purpose, "low level and a half" fast language, and that "C" is right there in the name. Not knowing anything else, I might expect a project with "C" at the front of its name to be at least fast-ish, and that the naming was intended to signal that.
What? No. The C in CPython was not meant to signal performance; it was added after alternative implementations were created, to mean "the original/reference implementation". Python's competitors Perl, Ruby, and PHP are likewise interpreters implemented in C, and they are equally slow.
Anyway, what alternative did they have for implementing a bytecode interpreter? C is portable and fast enough, especially with the computed-goto extension.
The problem here is that most languages are tailored to ease of learning by humans, not to strong performance. If you had a language tailored to strong performance, it would force you to bundle together seemingly unrelated data into structures used in the main hot code path of the core algorithm.
It would then force you to specify data-delivery routes and processing, and would craft a cache-optimal processing loop dependent on the architecture in use.
The result would look like an enormous while loop fed by a seemingly arbitrary number of for loops that preprocess data, glued together in strange overlapping unions, to shove the end result into the main processing algorithm.
This would be optimal for the processor, but writing a language to describe this (and compilers to implement it) - the horror.
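In Python terms, a rough illustration of that layout idea (assuming numpy is available; `Particle` and the sizes are made up): an "array of objects" scatters fields across the heap, while a "structure of arrays" keeps each field contiguous so a single pass stays cache-friendly.

```python
import numpy as np

# Array-of-objects: every particle is a separate heap object holding boxed floats.
class Particle:
    def __init__(self, x, v):
        self.x, self.v = x, v

particles = [Particle(float(i), 1.0) for i in range(100_000)]
for p in particles:          # pointer chasing plus boxing on every step
    p.x += p.v

# Structure-of-arrays: each field is one contiguous buffer; the update is a
# single vectorized, cache-friendly sweep.
xs = np.arange(100_000, dtype=np.float64)
vs = np.ones(100_000, dtype=np.float64)
xs += vs
```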
Something I often wonder in these sort of discussions is why C# is generally omitted. Its performance is comparable to C++, with none of the trappings. It also does an excellent job of integrating some of the most useful features of functional programming into an imperative language. And multi-processor programming with the language is also incredibly simple.
But I think the best part is the programmer time. An anecdote I find endlessly entertaining: on another forum I shared some code to solve a problem people were having trouble with, and it was assumed my code was pseudo-code. It was correct, compilable C#. And they're constantly adding incredibly useful features; for instance, a recent addition is more expressive tuple syntax.
Yet, as is typical in scenarios like this one, the author sees the decision as being between the opposite extremes of C++ and Python. The only downsides of the language I've run into are the lack of some shoot-yourself-in-the-foot features of C++, like multiple inheritance, and the fact that template specialization is awkward. Garbage collection concerns are vastly overblown. My main work is on projects with in-memory collections gigabytes in size, and you'd think the collector would be a huge issue, yet it's mostly transparent and can be controlled if necessary - which, in the vast majority of cases, it isn't.
It has for the longest time been a closed-source, MSFT-only thing. It wasn't open source, and running on Linux was a second-class citizen. Not sure if it is still a second-class citizen.
It's also seen as something fairly heavyweight to write things in, much like Java. You probably don't see it used much for the same reasons Java isn't.
Honestly, I love C#, but if I were looking for a sane language with a wide library to draw from that matches as many use cases as possible, I'd probably go with Kotlin. Java has a solid community for just about everything, and Kotlin is close enough to C# for my taste. But Kotlin doesn't have much name recognition outside of Android.
Why do you think it's been closed source with minimal cross platform effort? Most of everything from Microsoft related to C# is open source. This includes their compiler, the runtime, and the framework and libraries. And it's been this way since 2014. The .NET standard itself has always been open and cross platform implementations like Mono go back to 2004! I'm not sure what you mean by 'heavyweight.' Its performance is head to head with C++.
And if anybody downvoting could take a second to actually chime in, it would be illuminating. From my perspective, all I see is an immense amount of misinformation about the language, and I'm genuinely curious what people may not like about it.
I know little on this topic, but as you were looking for some feedback I thought I would respond.
Because the Microsoft version was closed source for a long time?
It was released in 2000. You give 2014 as the open source date, which was only 4 years ago.
Woolvalley was referring to the implementation, so bringing up the specification is not that relevant.
The ECMA specification does (did?) not include ASP.NET, ADO.NET, and Windows Forms. The implementations of those APIs were potentially covered by patents, as the Microsoft Community Promise did not apply.
As Wikipedia points out, "These technologies are today[when?] not fully implemented in Mono and not required for developing Mono-applications, they are simply there for developers and users who need full compatibility with the Windows system." https://en.wikipedia.org/wiki/Mono_(software)#Mono_and_Micro... .
While that point is moot today, that is part of the history which guides current views.
If by C# you mean the ECMA specification, then that's different than C# as available for Windows by Microsoft.
Is C# a first-class citizen on Linux comparable to how Visual C# is a first-class citizen on Microsoft Windows, or how C++ is a first-class citizen on Linux? I don't have the experience with that, but it doesn't seem to be the case.
What languages do you consider to be "first-class" and "second-class" on Linux?
One clarification, since my own language was sloppy. C# did not get open sourced in 2014. That was the date that Microsoft open sourced just about everything (and that's a weasel word there - to my knowledge, it is everything) that wasn't already open source. Things like ADO/ASP/Windows Forms/etc were open sourced back in 2008, along with Microsoft's implementation of their framework libraries, which Mono rapidly integrated into their implementation.
And so on that note, much of that information on the Wiki page is outdated by at least a decade. Mono, especially since their 2.0 release (back in 2008), has been a fully fleshed out, cross platform, production ready alternative to Microsoft's implementation. So I certainly wouldn't call C# a second class citizen on Linux by any means. The one and only reason Linux is not my primary development platform is the lack of Visual Studio.
Ultimately C# started out pretty awful. The language was a mediocre java clone, performance was abysmal, and the language itself was lacking in features. But that changed relatively rapidly, and certainly today it bears little resemblance to where it started. And I think that is perhaps the problem, people seem to think C# of today is the C# of 2004, but I don't understand why that degree of misinformation is so strongly exemplified in this particular language. I think it's particularly a shame because of what a successful tool the language has evolved into.
> Microsoft to open source more of .NET, and bring it to Linux, Mac OS X
> Microsoft is porting its server-side .NET stack to Linux and Mac OS X, and is making more of that stack available as open source. ...
> In April 2014, Microsoft announced plans to open source a number of its developer technologies, including ASP.NET, the Roslyn .NET compiler platform, the .NET Micro Framework, .NET Rx and the VB and C# programming languages. ...
> Microsoft is not planning to open source the client side .NET stack, which means certain pieces like the Windows Presentation Foundation (WPF) and Windows Forms won't be going open source,
> As a .NET developer you were able to build & run code on more than just Windows for a while now, including Linux, MacOS, iOs and Android.
> The challenge is that the Windows implementation has one code base while Mono has a completely separate code base. The Mono community was essentially forced to re-implement .NET because no open source implementation was available. Sure, the source code was available since Rotor but we didn’t use an OSI approved open source license, which made Rotor a non-starter. Customers have reported various mismatches, which are hard to fix because neither side can look at the code of the other side. This also results in a lot of duplicated work in areas that aren’t actually platform specific.
This would seem to contradict your statement that "Things like ADO/ASP/Windows Forms/etc were open sourced back in 2008, along with Microsoft's implementation of their framework libraries, which Mono rapidly integrated into their implementation."
> Previously, 'cross-platform' with Microsoft was a joke - it was cross-platform but only within the Microsoft Windows operating system family. .NET Core brings the true cross-platform compatibility, which means you can have one single source code base on Windows, Mac, and Linux. This is a huge deal, especially between Windows and Linux - it gives you more choice for deployment, hosting, and scaling.
> By making its fundamental codebase open source, Microsoft is giving .NET developers an incredible opportunity to enter into areas with their existing skills which were previously locked off to them. The opportunities presented are only going to start to emerge over the next months and years - it's well worth your while checking it out and taking .NET Core for a spin.
Now, certainly it's possible that you are correct, and these are all parts of the misinformation derived from the early days of C# and .Net.
If so, could you provide some references? Otherwise it's very easy for me to conclude that you misremember the historical details.
Sure, here's an article from Microsoft describing how to access and view the source code for their library implementation, ASP/ADO/Forms/etc: https://weblogs.asp.net/scottgu/net-framework-library-source... That's from January 2008. This release is specifically what enabled Mono to really go to the next level, more than a decade ago. You can even see the little 'carve out' they made in the license to ensure Mono, in particular, could use the code down in the 'Reference License' section.
You're making a reasonable mistake of confusing .NET Core with the .NET Framework. They're different things. .NET Core is a new development that is not directly compatible with applications relying on the .NET Framework. As you can read on the blog post you linked to, .NET core was announced as open source before it was released -- which was 2015. Its implementation and technologies being open sourced is something altogether different.
The link you gave is indeed to view the source. It is not, however, an open source license, as characterized by the OSI or DFSG, nor free software as characterized by FSF.
> "Reference use" means use of the software within your company as a reference, in read only form, for the sole purposes of debugging your products, maintaining your products, or enhancing the interoperability of your products with the software, and specifically excludes the right to distribute the software outside of your company.
Microsoft agrees that (quoting from a quote in my earlier comment) "the source code was available since Rotor but we didn’t use an OSI approved open source license, which made Rotor a non-starter".
Like I said, I know little about this topic. However, I think you do not understand what open source means.
I'm pretty sure that when woolvalley used the phrase "It has been for the longest time been a closed source MSFT only thing", that "closed source" meant "not open source according to OSI or similar guidelines." Not "available in source code form", which is what you seem to think it means.
From my perspective here, you're shifting the goal posts. In particular this conversation began with you apparently thinking that Mono lacked full implementations of ASP/ADO/etc. And I think that might justify the original comment that Linux implementations have been treated as second class citizens. I say might since ADO/ASP are not really part of .NET, but rather independent projects built on top of .NET. But in any case, that hasn't been an issue for many years now.
You have now shifted into claiming that your issue is that the licenses Microsoft released their code under were insufficiently permissive. I could argue against this, because it's at best misleading -- different code was released under different licenses. You can see an archive of one of the CLI releases [1] with the license it came with. Or this [2] is a blog from the Mono founder expressing thanks for Microsoft and ECMA members releasing code under open source licenses. That blog was from the huge 2.0 release, in 2008.
But ultimately that feels like a red herring, even if a fun one! You seemed to believe (as did the person I was responding to) that C# was a Windows-only thing. That hasn't been the case for a very long time, and the notion is completely ridiculous nowadays. In either case, it might seem we agree if you think that people are basing their views on obsolete information. But that rather begs the question of why that is so especially the case here. If I were discussing e.g. javascript frameworks from the perspective of somebody 4 years out, let alone a decade, I'd be quite justifiably skewered. In other fields, 4 years ago things like TensorFlow did not even exist as public projects. 4 years is a very long time in software, so why are people so particularly slow on the uptake in this instance? Bias, bad marketing, something else?
You asked for possible reasons why you are being downvoted for your reply to the following comment by woolvalley: "It has for the longest time been a closed-source, MSFT-only thing. It wasn't open source, and running on Linux was a second-class citizen. Not sure if it is still a second-class citizen."
Your reply was: "Why do you think it's been closed source with minimal cross platform effort?"
I don't think you can argue that I'm shifting the claim to the topic of being insufficiently permissive when that was part of the original thread.
The license[1] that you pointed to is not open source. It says "You may not use or distribute this Software or any derivative works in any form for commercial purposes".
Again, just because the source code is available, that doesn't mean it's open source. There are few that would agree with you that the early Microsoft licenses you pointed to meet the usual criteria for "Open Source" as it's used in the industry.
Your argument seems to be that things have changed and we should forget about the history. As woolvalley's comments were made in the past tense ("wasn't open source", "was a second-class citizen"), I think you could have made a better reply along the lines of "a lot has changed since you last looked into C#", rather than the more accusatory approach you took, which challenges a viewpoint that seems to be historically justified.
I find the idea that "4 years is a very long time in software" to be laughable.
I've been using Python since the late 1990s, and am still going through the Python 2->3 upgrade cycle. My primary development environment of a Unix-like terminal environment and emacs would be recognizable to people in the 1980s. I've been developing and selling my software product for 8 years, with essentially the same core codebase.
In hindsight, something I think I should have emphasized in our discussion earlier is the exact state of Mono. Everything else is mostly tangential as 'the Mono question' essentially closes the discussion of whether C# was a 'mostly closed source Microsoft only thing.' And I think the evidence is bountiful and evident that Mono has long since been a very well developed production ready environment.
Just listing a few projects built on it should really suffice to make the point. The Sims 3 was built using Mono and launched in 2009, Second Life went live with it in early 2008, and Unity swapped to it after their initial 'front end' language was, in their words, "proving to be too slow and unwieldy." Their original language, quite appropriately for this topic, was Python! There's really infinite room for discussion about this, but I think it's all a subset of 'The Mono Question.' For instance, your issue about exactly which open source license was used is moot given Mono.
As for your divergence on 4 years not being a long time: I personally do agree. But our agreement stands in contradiction to the standards of software today. Entire languages are born and die in this span, libraries that didn't exist become ubiquitous, and in general 4 years is certainly far longer than necessary to expect some reasonable evolution of adaptation and opinion. The fact that it has not happened here is a peculiarity I find bemusing, and fun to consider. If nothing else, it leads to enjoyable discussion - which I suppose is the ultimate point of these forums.
Yes, that would have been appropriate, as woolvalley specifically brought up Microsoft, and your reply talked a lot about Microsoft's source-available release as if it were a meaningful counter-example to woolvalley's reference to "closed source."
You wondered why you got downvotes? That's why.
I have no dog in this race.
If you "personally do agree" then why do you bring up an argument that you disagree with, in order to justify your views? It comes across as if you are making the argument to win some sort of rhetorical point, where the end - your advocacy of C# - justifies any tactic.
I do not care to follow up on this discussion any longer.
Mono is a product of the open source nature of C#, not the cause. In general, I think it's wise to go as close to the 'first party reason' as possible. But the nuance there makes it surprisingly intricate, and nuance is often lost in online discussion...
As for '4 years being a long time in software', there's a difference between considering the industry at large, and individual experience. I, like everybody, have anecdotal experience and opinions that run contrary to the norm. And in this case my personal view is that 4 years is not really a long time in terms of software, yet for the industry and people as a whole, I think that couldn't be further from the average truth.
C# started out as a boring Java clone enterprise language from the company famous for its "embrace, extend, extinguish" strategy. In fact they invented C# because they got sued for trying to "extend" Java.
And until now no amount of cool features and Microsoft PR has been able to remove that stench.
The author is a scientist analyzing his data. I've never met anyone in that crowd using C#. Are there even any good data science/numerics libs out there? C++ has a lot of number-crunching libs, Python even more.
This is not my field, but I have used Accord.NET (http://accord-framework.net/) while learning machine learning and have no complaints, though I also have little basis for comparison. Is there any C++ library you'd consider definitive, as a basis to compare against? One other thing: calling C++ libraries from C# is, in most cases, quite trivial.
It's probably more common for data scientists in the .NET world to use F#. FsLab (https://fslab.org/) seems cool from some limited experimentation (disclaimer: I'm not a data scientist, but am data curious).