Node.js 14 is over 20x faster than Python3.8 for fib(n) (jott.live)
70 points by brrrrrm on Feb 9, 2021 | 114 comments



An interpreter with a JIT is obviously faster than one without. Especially when dealing with CPU bound work.

I'm not sure this is entirely noteworthy unless you somehow think CPython has a JIT. Would be much more interesting to compare to pypy.


Indeed. Furthermore this basically just benches function call overhead by using the worst possible implementation of fib().

Function calls are a well-known weak point of CPython, even among all its other performance weak points.

It's hard to express how utterly uninteresting and useless TFA is, and if its author is surprised by the result… really the only component this tells us about is the author.
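
That call-overhead point is easy to check with nothing but the stdlib timeit module; a minimal sketch (absolute numbers will vary by machine):

    import timeit

    # the same addition, with and without a function call around it
    inline = timeit.timeit("a + b", setup="a, b = 1, 2", number=10_000_000)
    called = timeit.timeit("f(a, b)", setup="a, b = 1, 2\ndef f(x, y): return x + y", number=10_000_000)
    print(f"inline: {inline:.3f}s  called: {called:.3f}s")
On CPython the called version typically runs several times slower than the inline one, and that per-call cost is most of what TFA's benchmark measures.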

> Would be much more interesting to compare to pypy.

I'm not sure it is more interesting at all, let alone much more, but here are the results on my (obviously much slower than TFA's) machine:

    > python3.9 --version
    Python 3.9.1
    > python3.9 fib.py
    8555.904865264893 ms
    > pypy37 --version
    Python 3.7.9 (7e6e2bb30ac5fbdbd443619cae28c51d5c162a02, Jan 15 2021, 06:03:20)
    [PyPy 7.3.3-beta0 with GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]
    > pypy37 fib.py
    715.0719165802002 ms
    > node --version
    v14.15.4
    > node fib.js
    247.19056797027588 ms
(can I note that the 12-decimal-place precision of the script is hilarious? Because clearly when you're benching fib(35) you need that femtosecond-scale precision)


Seems like an unduly harsh take. Is everyone just supposed to inherently know this fact about cpython or find it unsurprising when they run across it?

"Obviously this isn't the most comprehensive benchmark, but the results are surprising to me."

I fully agree with this and I learned something new about CPython's weak points today!


It does seem a bit harsh, but I would put it like this: If you need to care about performance, then this is an extremely basic difference between the two language implementations that should not surprise you. If you don’t yet, then this is the very beginning of your education in how to care about performance—welcome to the next level!


Oh, I think it is a quite reasonable comment. The work being done in the benchmark isn't interesting, and the benchmark itself is short.

If you want to get ultimate performance from Python then write a C function...

If you want to inform me about runtime performance then show me how the language runtimes are spending cycles. If you wish to convince me about a language being great then tell me about the engineering effort to create and then run something in production.


It's a 2 sentence post, the latter of which seems to be trying to dispel the idea that it's about what you're implying it is. Uninteresting to experts in the area that consider this common knowledge maybe, not a failure of the author to do something worthwhile though.


If I’m writing an article about benchmarking, then yes.


In the author's defence, the "article" isn't even 50 words long.


These details aren't important. It's like benchmarking C++ and Node and then complaining about implementation details. Node should be, and definitively is, faster than Python in practically every benchmark.

I still prefer python over node though, but I can't deny the reality.


I would say it's faster generally but not definitively.


Not only that, it wasn't until 3.9 that Python got optimizations for calling callables.


Doesn't really matter; I ran it on 3.9 and it's still slow. Between the kind of language Python is and the optimisations CPython allows itself, there is simply no way it could be competitive.


  pypy test.py
  305.4699897766113 ms

  node test.js
  111.49054491519928 ms

  python3 test.py
  3576.0366916656494 ms

Node still wins by a healthy margin.


Numba (asked by a commenter that has since deleted their comment):

638.8082504272461 ms


I asked; run it twice. The first run includes compilation. The second time it reaches 90 ms on my machine.


Second run was 200ms for me, still 2x node.


I would expect numba to win, but I also do not think it is a fair comparison.


To be fair, python itself isn't jitting anything, while node is.


Numba doesn't really qualify as JIT if what it's really doing is compiling it and caching the results to disk then reading from them in the future for faster execution... that's just a compiler.


> An interpreter with a JIT is obviously faster than one without.

And why doesn't python's default runtime environment obviously come with a JIT then? I think they should absolutely go for it, given the huge user base of python.

Reasons like "but named functions can dynamically change" are not applicable, since JS has those same properties and can do it. They could start with optimizing the case where you call the same function over and over in a for loop.

An alternative python interpreter is also not the solution, normally what you have is the main standard python interpreter, and that is the one that should be fast, period.


> And why doesn't python's default runtime environment come with JIT? I think they should absolutely go for it, ensure the default python you get when you run python has a JIT, given the huge user base of python.

1. because CPython aims to be relatively simple and straightforward by choice

2. because the "huge user base" comes in large parts from the deep and extensive C API, which is absolute hell on a JIT

3. because most of the userbase would not give a shit anyway, it has not exactly migrated en masse to pypy: much of the userbase sees and uses Python as a glue language, an interface to optimised C routines without being a pain in the ass to develop in.


> because CPython aims to be relatively simple and straightforward by choice

Sacrificing performance for core interpreter developer convenience may have been the right choice when Python was getting started; it's no longer the right choice today. Today it's short-sighted.

> because the "huge user base" comes in large parts from the deep and extensive C API, which is absolute hell on a JIT

We can have both a JIT and a "deep and extensive" (or more importantly, stable) native API, as demonstrated by node.

> because most of the userbase would not give a shit anyway

Actually, a lot of the userbase doesn't use libraries with native extensions, is painfully aware of Python's performance issues, and is intensely interested in addressing them.


> We can have both a JIT and a "deep and extensive" (or more importantly, stable) native API, as demonstrated by node.

Yes, but not that native API. Designing a native API that doesn't create huge problems later is difficult. The JVM, .NET and V8 guys managed it (mostly) but the scripting languages generally didn't. Their API is just literally the entire internals of the interpreter.

Figuring out how to JIT code in the presence of native extensions that expect the implementation to work in exactly the same way it always worked is a research problem. The only people who have got close to solving it are the GraalVM guys. They do it by virtualising the interpreter API and also JIT-compiling the C code! They run LLVM bitcode on the same engine that runs the scripting engine.


Thanks for the context, those are great points.


> 1. because CPython aims to be relatively simple and straightforward by choice

It's a painful choice. Having to use numpy for everything a loop could normally do (but is too slow for), or being discouraged from writing functions because function calls are so slow, makes an otherwise elegant language less so.


Armin Ronacher, the author of Flask, actually has a good talk about this. The gist of it is that the way Python's internals leak into the language makes it very difficult to build a performant JIT that wouldn't break a large amount of userspace code.

Python lets you do _far_ more shenanigans than JavaScript does; and a lot of large libraries depend on some of that behavior. Breaking it would probably cause a new 2 -> 3 situation.

https://www.youtube.com/watch?v=qCGofLIzX6g&feature=emb_titl...


> a lot of large libraries depend on some of that behavior.

Armin makes great points about path dependence of API design and how the CPython API leaks into the Python language spec. But the features being discussed are actually obscure (example: slots) or intended for debugging (example: frame introspection), and most libraries don't have a good reason to use them. We're stuck in a loop: people talk about how Python is special and can't use a JIT because its internals are not JIT-friendly, so we don't have a JIT, so implementers continue to make choices that are not JIT-friendly - not because they want to, but because they have no guidance.

The JIT doesn't have to be amazing on day 1. What it does have to do is show a commitment and a path to performant code, and illuminate situations where optimizations turn off. There's nothing fundamental in Python's design that prevents a JIT from working; a small number of rarely used dynamic features (that most people don't know about and don't know that they can negatively affect performance) should not be used to hold up interpreter design.


GraalPython is on its way to solving this, by co-JIT-compiling both Python and the code of the native extensions simultaneously. However Python is a large language and ecosystem, so it'll take a while for the implementation to mature.


> Python lets you do _far_ more shenanigans than JavaScript does

Does JavaScript let you do this monstrosity of terrible code?

    Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class Dog(object):
    ...     def speak(self):
    ...             print("bark!")
    ... 
    >>> class Cat(object):
    ...     def speak(self):
    ...             print("meow!")
    ...
    >>> animal = Dog()
    >>> animal.speak()
    bark!
    >>> animal.__class__ = Cat
    >>> animal.speak()
    meow!
    >>>


A lot of people have responded "yes", but no one's taught you how yet, so here's some example code that you can paste into your browser's Devtools or Node's REPL:

    class Dog {
      speak() {
        console.log("bark!");
      }
    }
    class Cat {
      speak() {
        console.log("meow!");
      }
    }
    animal = new Dog();
    animal.speak();
    Object.setPrototypeOf(animal, Cat.prototype);
    animal.speak();
This will produce:

    bark!
    meow!


Yes, you absolutely can. In fact, this was part of my ES5 Date constructor polyfill.


You can replace an object's prototype in JS, wouldn't that do the same?


Let me help you detox that:

> An interpreter with a JIT is faster than one without. Especially when dealing with CPU bound work. Would be interesting to compare to pypy.

Sure, here is pypy:

  > pypy3 main.py
  282 ms

  > node main.js
  105 ms
Without JIT:

  > python3 main.py
  2818 ms

  > node --jitless main.js
  998 ms
For fun:

  > cargo run --release -q
  20 ms
I enjoy brrrrrm's post for its brevity and the acknowledgement of common folk tools.


Tried it in PHP 8 (on Windows):

    .\php fib.php
    1402.1289348602 ms

and with PHP 8's new JIT opcache on:

    .\php -dopcache.enable_cli=1 -dopcache.jit_buffer_size=200M fib.php
    219.55609321594 ms


It's noteworthy to the extent that the CPython development team continues to refuse to implement a JIT runtime, to the detriment of the Python community. PyPy is not the default Python interpreter, and most Python libraries don't target compatibility with it.


How does the JIT compilation help in terms of CPU bound work? Is node somehow able to automatically parallelize this code? I know cpython is limited to a single core unless you specifically use multiprocessing. Or is this related to something else?


> Or is this related to something else?

Overhead. Each operation translates to Python bytecode, and the Python interpreter performs a full iteration of its evaluation loop for each bytecode instruction.

The humble

    a + b
is

    LOAD_FAST a
    LOAD_FAST b
    BINARY_ADD
each of which gets painstakingly executed by the corresponding completely static handler, which yields something along the lines of:

    fetch the bytecode
    jump to the handler
    access the function locals
    push the value for `a` (which TBF is just an offset into an array) onto the stack
    increment the bytecode index
    fetch the bytecode
    jump to the handler
    access the function locals
    push the value for `b` onto the stack
    increment the bytecode index
    fetch the bytecode
    jump to the handler
    pop both values off the stack
    dereference the type of `a`
    look for the pointer to the add method
    check if it's set
    call it with `a` and `b`
        which performs various runtime typechecks (e.g. are both parameters objects and integers) and does the actual addition
    push the result back onto the stack
    increment the bytecode index
Assuming a hot loop, a JIT might literally just emit an assembly-level

    add r10, r11
or whatever register it allocated to those locals.

Another component is that these are likely comparing apples and pears: CPython uses infinite-precision integer arithmetic. And due to not using a JIT, it has no way to even remotely optimise any of that away. Infinite-precision arithmetic is pretty expensive, as it requires lots of overflow checking.
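
For reference, you can see that bytecode for yourself with the stdlib dis module; a minimal sketch (output shown for CPython 3.9, where the opcode is still BINARY_ADD; 3.11+ uses BINARY_OP):

    import dis

    def add(a, b):
        return a + b

    dis.dis(add)
    # prints something like:
    #   0 LOAD_FAST     0 (a)
    #   2 LOAD_FAST     1 (b)
    #   4 BINARY_ADD
    #   6 RETURN_VALUE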


Python 3 uses interpreted bytecode, it’s faster than running an interpreter over an AST but much slower than using raw machine code. For most things people use Python for this is fine, especially since there are many extensions to do cpu-intensive tasks written in C.


Crossing the boundary between interpreted and native code is expensive, primarily because of poor cache usage.

You want either fully JIT-compiled code or, at the other extreme, as many functions and libraries as possible written in native code, with the API styled to avoid interpreter-level loops.


JITs work based on assumptions that types/values will stay constant. So, here it probably assumes that it will always be working with integers. So it will be much more efficient in cases like this as it is pretty much pure, simple computation. At worst there could be one deoptimisation when it moves from 32-bit to 64-bit integers, if v8 uses 32-bit integers first.

So it can emit extremely efficient instructions based on this assumption, while CPython struggles along with infinite precision numbers.

Function calls will have a much lower overhead also since it will just be a single `call` instruction.


Agreed. The speed difference between Node and Python is well known. V8 is one of the fastest things around. However, it's more than just JIT. There are business reasons behind why Node is so fast. The amount of resources Google has thrown at developing V8 means that pretty much nothing can surpass it in speed any time soon.

The fact this is on someone's blog and posted to the front page means that a lot of people didn't know this. Well guess what, for you guys who don't know... here's another fun fact: C++ is about 10x faster than Node, which makes it about 200x faster than Python.


Ignoring the bad implementation of fib(n) and use of a Python interpreter without JIT... why does this matter?

Python and JavaScript are scripting languages. Their advantages are being highly portable and relatively easy to develop and maintain. Performance has always been a weakness of both languages when compared to their pre-compiled siblings. That limitation is often mitigated by "gluing" together functionality implemented in a more performance-oriented language, but even then nobody chooses a scripting language because they expect it to be faster than the competition.

Besides, this is hardly a fair comparison. Many tech giants have competed in the browser space for years by hyper-optimizing their JavaScript engines. Comparing the performance of Node/V8 to that of CPython is like me comparing my strength to that of a child.


I think it matters because a lot of people are doing math in python.

And if good optimizations can help run a small simulation in 1 minute instead of 2, multiplied by the number of python users, it is a lot of time gained.

And I suspect the fib(n) situation is actually quite common. I mean, not everyone who writes python is a "real" developer; there are a lot of scientists who just want the computer to run their formulas, and they write them in the most straightforward way possible. They won't bother with high performance libraries and optimizing their algorithms just to save a few minutes, but they would appreciate it if the language could make things a little faster.

And it doesn't matter in the way of "hey look, Python slow, use JS". But it is good information for Python developers and advanced users, the people who non-specialists rely on.


Yeah, but it's like comparing the effectiveness of a shovel vs a spade when it comes to hammering nails. You should obviously use a hammer. If it's not worthwhile to get a hammer, then does it really matter whether you use the shovel or the spade?


If a lot of people use a shovel as a hammer, the people making shovels should start considering that use case. The spade manufacturers already did so they could take inspiration.

Real life tool manufacturers take "wrong" use into account. One of my favorite anecdotes is the IMI Galil assault rifle, which has a built-in bottle opener. It was added because they noticed soldiers used magazines to open bottles, and that could cause damage.

Of course, it is not about choosing between a shovel and a spade, people who have a choice will use a hammer. But it doesn't mean we shouldn't do a favor to shovel bearers.


> If a lot of people use a shovel as a hammer, the people making shovels should start considering that use case. The spade manufacturers already did so they could take inspiration.

I completely disagree.

Your example of soldiers and their multi-tool use falsely conflates utility with necessity.

First, civilians generally don't have access to assault rifles (to even consider using it as a bottle opener) - i.e. soldiers are an edge case. Secondly, civilians who have direct access to bottle openers don't even need assault rifles - i.e. most people don't need edge-case solutions.

In the real world, when people want a hammer, they buy a hammer. They don't use shovels for hammering, unless they're forced to use one, or have no other option.

In the programming world all popular programming languages are virtually free. Programmers can choose and select between them. Modifying a language just so that it can do every specific thing (rather than the few things it is good at, or which its ecosystem is good at) is architecturally poor language design. And an excellent way to mess up the language - e.g. see PHP.

A tool should not need to accommodate the whims and needs of every user. To use your analogy, if the shovel users have easy access to hammers, then they can grab a hammer when they need one, rather than forcing shovels to be hammers.

To do otherwise is a stupid choice on the part of users, and is in no way a weakness of the design/utility of the language/tool itself.


But what's the implication here if not that people should use JavaScript for better performance? That the Python Software Foundation should invest more in the performance of the reference interpreter, CPython?

The Galil magazine example has an obvious, easy-to-implement, low-cost solution with an obvious ROI. If that weren't the case, don't you think IMI would have just cautioned soldiers about potential damage to the magazines and instructed them to use a bottle opener instead?


I think it's notable that the gap between dominant scripting languages for simple function-call bound code has become quite large.


I do not consider the difference in performance for that particular implementation of that particular function on those particular interpreters to be notable. I don't think it's a fair match-up and I don't think there's much value in comparing the effectiveness of tools for a job they aren't made for.

For additional comparison, I quickly rewrote the function in C (output and code below). The code was comparable in length and complexity (at least the fib(n) implementation was), but the relative performance is enough to make JavaScript blush. If you really wanted to do something as trivial as this, why would you even use pure JavaScript or Python to begin with? And if performance was a concern, why would you choose a reference interpreter like CPython?

Output:

    $ ./fib 35
    14930352
    68 ms
Code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>
    
    int fib(int n) {
        if (n == 1 || n == 0) return 1;
        return fib(n - 1) + fib(n - 2);
    }
    
    int main(int argc, char *argv[]) {
        if (argc < 2) return 1;
        int x = atoi(argv[1]);
        struct timeval start, end;
        gettimeofday(&start, NULL);
        int n = fib(x);
        gettimeofday(&end, NULL);
        printf("%d\n", n);
        long unsigned int udiff = (
            (end.tv_sec - start.tv_sec) * 1000000 +
            end.tv_usec - start.tv_usec);
        printf("%lu ms\n", udiff / 1000);
        return 0;
    }


Doesn't the Node.js version use double-precision floating point vs Python using infinite-precision integers? That would explain part of the difference in performance, and make the Python version exact but the JS version inexact.


I assumed, since this seems to be testing function call overhead rather than the math, that an equivalent function with BigInts would be about the same for JS. But I tried it just for fun:

    const { performance } = require('perf_hooks');

    function fib(n) {
       if (n == 0 || n == 1) { return 1; }
       return fib(n - 1) + fib(n - 2);
    }

    function fibn(n) {
       if (n == 0n || n == 1n) { return 1n; }
       return fibn(n - 1n) + fibn(n - 2n);
    }

    var t0 = performance.now(); fib(35); console.log("fib:", performance.now() - t0);
    var t0 = performance.now(); fibn(35n); console.log("fibn:", performance.now() - t0);
results:

    fib: 127.30008998513222
    fibn: 2134.1405459940434

yikes, I think you're right. (nodejs v15.6.0)

(An equivalent version with Python on my machine: ~2657.3ms)


This looks like most of the difference.

  JS with number              138 ms   (x1)
  JS with BigInt (eg. 35n)   2620 ms   (x19)
  Python                     3260 ms   (x24)


This is correct, though both are exact at the scales in question (14930352). There's a more general question of whether it's beneficial to take a perf hit on all arithmetic in order to support some corner cases of infinite precision, or to be explicit that standard arithmetic won't be precise but let you opt into it via some mechanism (BigInt, in the JS case).

I prefer the JS approach.


The specifications do say that JavaScript only has floating point numbers (and no integers) and that Python has infinite precision integers. But in implementation, for whole numbers small enough to fit into a regular 32-bit integer both Python and Node.js use regular integers as an optimization. fib(35) comfortably fits that.

I guess node.js might have the edge here with the necessary overflow checks in the addition. Both languages have to do them to fall back to either doubles or bigints respectively, but Node.js's JIT can probably do them faster.


The numbers are small so there shouldn’t be any difference for fib(35)


They're fundamentally stored as and operated on in different ways. A BigInt doesn't suddenly become a BigInt when in excess of MAX_SAFE_INTEGER; it's a BigInt even with a value of 1n.


fib(35) is much, much smaller than the safe max; that is the whole point of my comment.


That's one factor, but there's an overall reason behind the performance difference. The main reason is V8 and the inordinate amount of resources Google has thrown at that thing to make it ultra fast. You should read up on the people who work on it.


FYI: pypy3 is 10x faster than CPython on this benchmark.

  ~> python fib.py
  4825.7598876953125 ms
  ~> pypy3 fib.py
  514.7459506988525 ms


Numba version:

    @numba.jit
    def fib(n):
      if n == 1 or n == 0:
        return 1
      return fib(n - 1) + fib(n - 2)
first run:

498.46601486206055 ms.

second run:

89.19310569763184 ms.

For completeness, with njit:

    @numba.njit
    def fib(n):
      if n == 1 or n == 0:
        return 1
      return fib(n - 1) + fib(n - 2)
first run:

152.62889862060547 ms

second run:

86.35592460632324 ms


Weird. On Mac I get 282 ms from pypy3, but 233 ms from numba.

    from numba import jit
    @jit
    def fib(n: int) -> int:


Even on subsequent runs?


Yeah, that's the weird part.


Also, add a couple of type annotations to the program:

  import time

  def fib(n: int) -> int:
    if n == 1 or n == 0:
      return 1

    return fib(n - 1) + fib(n - 2)

  t0 = time.time()
  fib(35)
  t1 = time.time()
  print(f"{(t1 - t0) \* 1000} ms")
Then run:

  ~> mypyc fib.py
And boom:

  ~> python
  >>> import fib
  332.64994621276855 ms
(FYI, mypyc is a compiler that's part of the mypy package).


Wouldn't just straight running this through cython make more sense? Or numba?


Here are some timings using three different approaches using Cython, and also numba: https://share.cocalc.com/share/df81e09e5b8f16f28b3a2e818dcdd...

Numba wins and is about twice as fast as my most clever Cython code. Naive Cython is pretty bad, but more clever Cython is reasonably good (though not as good as numba).


Well, a point in mypyc’s favor is that the above code is still syntactically valid Python code.


Both numba and Cython work with Python code AFAIK.

I don't know whether Cython would take advantage of mypy annotations, and you can make a Cython program not Python-compatible, but you don't have to.


there's a backslash in the print statement that crashes it


I'm honestly not too surprised. I follow the v8 blog and they talk extensively about performance improvements on most releases.


V8 is a monster.

If you can trust language benchmarks, you only get notable performance benefits when switching from JS to Rust/C/C++.


1. Python doesn't have a JIT. This shouldn't be surprising or headline worthy.

2. Why benchmark an O(2^n) fibonacci? It's basically benchmarking call frame creation.

3. A single order of magnitude is honestly not that impressive a speedup for JIT vs. no JIT. Might have a lot to do with the inefficiency of the recursive fibonacci func.

Also, to quote another poster:

    @numba.njit
    def fib(n):
      if n == 1 or n == 0:
        return 1
      return fib(n - 1) + fib(n - 2)
After JIT warms up:

~86 ms

Which is...effectively identical to Node?


Of course in the category of 'single line' performance optimizations the most effective one is going to be:

    from functools import lru_cache

    @lru_cache(1000)
    def fib(n):
        if n == 1 or n == 0:
            return 1
        return fib(n - 1) + fib(n - 2)
which lowers the time to somewhere around 20 microseconds.

I'd be interested to know if Node.js can do anything similar.
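
For what it's worth, Node has no stdlib equivalent of lru_cache, but a hand-rolled memoizer is only a few lines; a minimal sketch using a Map:

    const memo = new Map();
    function fib(n) {
      if (n === 0 || n === 1) return 1;
      if (!memo.has(n)) memo.set(n, fib(n - 1) + fib(n - 2));
      return memo.get(n);
    }
    console.log(fib(35)); // 14930352, effectively instant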


> Which is...effectively identical to Node?

You'd have to also run the node version on your machine to know, as it's unlikely you have the exact same setup as TFA.


Comparing languages on performance is weird. Every language is made with a different philosophy.

Python isn't made for performance, but for good code readability: write clear, logical code for small and large-scale projects. It was also made by Guido to build applications in less time. Guido knows his language isn't the fastest, but that wasn't the goal with Python.

Node.js is also made for other applications than Python. It's great for web developers to be in the "JavaScript everywhere" world, so it is easy for them to write backend and frontend code with the "same" language. It's also great for events and "real time" communication with the (web) application.


Syntactic differences aside, the capabilities of JavaScript and Python do not differ significantly. They are both highly popular highly dynamic object-oriented-ish languages.

Knowing that the most prevalent implementation of one is significantly faster than the most prevalent implementation of the other is useful information for someone deciding between the two for a project where performance is important.

Comparing alternative implementations of the languages, which are geared toward performance, would also be valuable. But simply knowing that, if your choice to use CPython is more-or-less arbitrary, you're leaving performance on the table, is valuable.

(Of course, as discussed in the top-level thread by @SuchAnonMuchWow, this particular benchmark seems to be comparing apples to oranges, and thus may not be of much value.)


Who would've thought that V8 that gets millions invested in it by a huge company would outperform community made Python.


I'd be hard pressed to find any mainstream language that underperforms naïve CPython. Community driven or not.


Bash is pretty damn slow.


I mean sure, code the thing in assembly and it’ll run circles around node. But who would do that? I’d also take python any day over js which makes my eyes bleed and my face hurt from all the face palming at the ecosystem’s insanity.


Python feels noticeably slower than Node.js, particularly when running scripts with large amounts of imports. I now pretty much always use JavaScript for, well, scripting. Not sure why Python and Ruby are considered go-tos there.


Because not all scripting has strict performance requirements, and as much as I love JavaScript, Python blows it out of the water in scripting ergonomics and semantics IMO.

Opening / reading / writing files, parsing args, making HTTP requests, etc. which are extremely common scripting operations are easier with Python, and the end result tends to be much more succinct thanks to its syntax (this applies to all Python code compared to JavaScript, but I'd argue it's especially valuable in shorter scripts).
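
As a small illustration of that ergonomics point, a stdlib-only sketch of a hypothetical line-counting script:

    import argparse
    from pathlib import Path

    parser = argparse.ArgumentParser(description="count lines in a file")
    parser.add_argument("path", type=Path)
    args = parser.parse_args()

    # pathlib handles open/close and encoding defaults for us
    print(sum(1 for _ in args.path.open()))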


I’ve been scripting in Python since 2003. Node wasn’t mature/viable at the time.

Edit: didn’t exist at the time!

Python is my go-to because I know the ecosystem well and can go from zero to done with minimal effort and research.

Node has since (come into existence) and matured, and while it may be better/faster in some situations now, I haven’t encountered a project that has compelled me to switch.

I suspect the answer to your question boils down to a combination of: familiarity, preference, individual productivity, and pragmatism.

This will likely shift over time.


Node.js was created in 2009, so it wasn't even born in 2003:

https://en.wikipedia.org/wiki/Node.js


Node.js not existing probably contributed significantly to its non-viability in 2003. ;-)


Good catch. Brain fart on my part.


Python gives you better escape hatches for typical compute-heavy work (lots of libraries use C under the hood), and Python has a better multi-threading story than Node.js.

But I guess the biggest differentiators are really the ecosystems and standard libraries. Node.js is certainly catching up, but Python has generally a better selection of high-quality libraries for tasks beyond the web.


Lack of library support is probably the most valid reason here. Most of the other reasons people have posted seem like opinions on ergonomics. I think those held true 10 years ago, but modern JavaScript feels very minimal and lisp-y and is more enjoyable to program with than Python IMO.

What libraries or escape hatches do you actually use in your daily work that are exclusive to Python? The ones I'm familiar with would probably be OpenCV and the whole machine-learning family of libraries (sci-kit, spacy, tensorflow, etc).


I believe libraries like v8pp are helping to shrink the gap with C library integration.


For a lot of workloads, performance is less important than maintainability. Python's cultural and technical adherence to an ideal code structure makes it a good candidate for long-haul code where CPU-bound performance isn't going to be a factor.

IMO the opposite end of that particular scale is Perl.


Eh, when it comes to scripting, performance is almost never a concern.

Different tools are best for different jobs; for example, if you need to run some basic NLP jobs, Python's going to be much better.


Python as an implementation is slow, but Python as a language is much better designed than JavaScript. That is the main reason why they are go-tos.


Looks like this is not fully compatible with Apple Silicon M1. I tried to install Pandas along with PyPy and ran into failure on building numpy. Looks like it's a known issue https://github.com/numpy/numpy/issues/17807


I've seen these types of comparisons, and I understand there are loads of factors playing a part in them.

However, it does make me wonder: why has Python become the standard for data science? Is it library support or purely community-based?


It's because, to a reasonable approximation, none of the actual data science runs in Python; it's all hyper-customized libraries which do run (close to) metal fast once the data has been loaded into the appropriate data structures. Pandas is a shim on top of Numpy, which heavily leverages the Fortran77 BLAS/LAPACK libraries.

Python is used at the top of the stack because it's an easy language to learn, you can get started fast, and, for places where performance is important - nothing is running in Python anyways.


> Python is used at the top of the stack because it's an easy language to learn, you can get started fast, and, for places where performance is important - nothing is running in Python anyways.

Also interactivity and quick feedback cycle, stuff like Jupyter Notebooks (né IPython Notebooks, a spinoff from the IPython project), matplotlib, ...


> why has python become to standard for data science?

Because it's glue, so its speed doesn't matter overly much.

> Is it Library support or purely community based?

That's a dichotomy which doesn't really make sense. Python has cultivated and attracted attention from scientific communities from the start: the matrix-sig (a special interest group focusing on array computing packages) was created back in '95 and a number of their suggestions were added as language-level conveniences (that continues to this day, `@` was recently added as the "matrix multiplication" operator).
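
The `@` operator mentioned there came from PEP 465 (Python 3.5); a quick sketch of it with numpy:

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[5, 6], [7, 8]])
    print(A @ B)  # matrix product, same as np.matmul(A, B)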


Most Python data science offloads the data crunching to Numpy, an optimized array-based processing library.

Array programming makes it feasible to reduce interpreter overhead considerably:

    c = a + b  # This is valid Python code.

where a and b are two same-sized numpy arrays. Numpy typically handles the add in an optimized SIMD function.
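
A minimal sketch of that contrast (array size is illustrative):

    import numpy as np

    a = np.arange(1_000_000, dtype=np.int64)
    b = np.arange(1_000_000, dtype=np.int64)

    c = a + b                          # one dispatch into an optimized C loop
    d = [x + y for x, y in zip(a, b)]  # a million interpreter-level iterations, far slower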


For me, I have known Python since the 90s but only started using it as a daily driver a few years ago because there were big Microsoft and C/Java groups within my org and choosing one would alienate the other. So I used Python because it was good enough and non-threatening (and free and tons of libraries and healthy community, etc). I would have chosen javascript if there were as many packages. Also considered R, but hadn’t used it at the time.

I think Python is a good example of how being good enough is better than being awesome. And then it builds inertia through use and packages and friend of a friend recommendations.


I know it's not super relevant to node vs python3, but your fib(n) has a time complexity of O(2^n)... with a dp approach it can be solved in O(n). Also, your space complexity can be reduced to O(1).


With a better chosen dp approach you can even solve it in O(log(n)). Though at that point the extra cost of multiplying large integers becomes non-negligible.
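
For reference, a sketch of the O(log n) fast-doubling approach (stated for the standard F(0)=0, F(1)=1 indexing; the thread's fib(n), with fib(0)=fib(1)=1, equals F(n+1)):

    def fib_pair(n):
        """Return (F(n), F(n+1)) using F(2k) = F(k)*(2*F(k+1) - F(k))
        and F(2k+1) = F(k)**2 + F(k+1)**2."""
        if n == 0:
            return (0, 1)
        a, b = fib_pair(n // 2)
        c = a * (2 * b - a)    # F(2k)
        d = a * a + b * b      # F(2k+1)
        return (c, d) if n % 2 == 0 else (d, c + d)

    print(fib_pair(36)[0])  # 14930352, matching the thread's fib(35)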


I'm having trouble coming up with this approach on my own. could you share it please?


The JavaScript runtime is built around efficiently processing callback functions, so it's no big surprise that it does well on a recursive algorithm, which is all function calls.


This is a pretty terrible implementation of fib(n).


It's a common approach to measuring function spawn cost.


Maybe, but it's also why synthetic benchmarks are not convincing. Maybe node better handles a bad pattern, but if performance is a concern to this level we're more concerned about functionally equivalent good patterns even if the code is very different.


I think GP meant how the fib function was written (and not why it was chosen for the measurement). The `if n == 1 or n == 0` thing hurts my eyes too.


Regardless, Node IS definitively one of the fastest interpreted platforms around. It is well known that it is faster than Python. Python wins in other areas, including being a much, much better designed language.


Probably comes from V8 (the JS engine in Node), which is likely the most heavily optimized language runtime ever made.


Agreed. They have some of the smartest people working on that.


Yeah, it's definitely because of that.


Isn't time.time() specifically not accurate to that degree, and isn't that why you should use time.perf_counter_ns()?
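
For reference, a sketch of the same measurement done with the monotonic high-resolution counter instead (fib written as in TFA):

    import time

    def fib(n):
        if n == 1 or n == 0:
            return 1
        return fib(n - 1) + fib(n - 2)

    t0 = time.perf_counter_ns()
    fib(35)
    t1 = time.perf_counter_ns()
    print(f"{(t1 - t0) / 1e6} ms")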


Now optimize it with PyO3 or PyOxidizer.


Now try the tail recursive version! :)



