There are plenty of examples of Python libraries that can be performant such as ...

aldanor · on Nov 26, 2023

NumPy is a C library with a Python frontend; moreover, lots of its functionality is based on other existing C libraries like BLAS etc.

PyTorch, quoting themselves, is a Python binding into a monolithic C++ framework; also optionally depending on existing libs like mkl etc.

> You can write performant code in any language if you try.

Unfortunately, only to a certain extent. Sure, if you just need to multiply a handful of matrices and you want your blas ops to be blas'ed where the sheer size of data outweighs any of your actual code, it doesn't really matter. Once you need to implement lower-level logic, i.e. traversing and processing the data in some custom way, especially without eating extra memory, you're out of luck with Python/numpy and the rest.
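
To make that distinction concrete, here is a minimal sketch (the function name `capped_running_sum` and the reset rule are hypothetical, chosen only for illustration). A plain cumulative sum maps onto a single NumPy call that runs in compiled code, while a sum that resets whenever it exceeds a cap carries state from one element to the next, so the straightforward way to write it is an element-by-element loop in the interpreter:

```python
import numpy as np


def capped_running_sum(values, cap):
    """Running sum that resets to zero whenever it would exceed `cap`.

    Hypothetical example of "custom traversal": each step depends on the
    state left by the previous one, so there is no single NumPy call for it.
    """
    out = np.empty(len(values), dtype=np.float64)  # one preallocated output buffer
    total = 0.0
    for i, v in enumerate(values):  # every iteration goes through the interpreter
        total += v
        if total > cap:
            total = 0.0
        out[i] = total
    return out


data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

plain = np.cumsum(data)                       # bulk op, executed in compiled code
capped = capped_running_sum(data, cap=5.0)    # custom logic, executed in Python
```

The loop is correct, but its cost grows with the number of elements times the interpreter overhead per iteration, which is the situation the comment describes.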

benrutter · on Nov 26, 2023

> NumPy is a C library with Python frontend

I guess this is a pretty legitimate take, but in that case VectorDB looks like (from the git repo) it makes huge use of libraries like pytorch and numpy.

If numpy is fast but "doesn't count" because the operations aren't happening in python, then I guess VectorDB isn't in python either by that logic?

On the other hand, if it is in Python despite shipping operations out to C/C++ code, then I guess numpy shows that can be an effective approach?

bee_rider · on Nov 26, 2023

BLAS can be implemented in any language. In terms of LOC, most BLAS might be C libraries, but the best open source BLAS, BLIS, is totally structured around the idea of writing custom, likely assembly, kernels for a platform. So, FLOPs-wise it is probably more accurate to call it an assembly library.

LAPACK and other ancillary stuff could be Fortran or C.

Anyway, every language calls out to functions and runtimes, and compiles (or jits or whatever) down to lower level languages. I think it is just not that productive to attribute performance to particular languages. Numpy calls BLAS and LAPACK code, sure, but the flexibility of Python also provides a lot of value.

How does Numba fit into this hierarchy?

mgl · on Nov 26, 2023

This is unfortunately not correct once you start pushing the boundaries and need careful allocation of memory, CPU cache and the CPU itself, see this table:

https://stratoflow.com/efficient-and-environment-friendly-pr...

_a_a_a_ · on Nov 26, 2023

I don't accept that. In the referenced article you're pulling in stuff which I believe is written in a different language (probably C). If you use native python, I'm sure you would accept it would be much slower and take up much more memory. So we have to disagree here.

dmezzetti · on Nov 26, 2023

Where do you draw the line? Most of CPython is written in C, including the array module (https://docs.python.org/3/library/array.html) mentioned in that article.

Yes, pure Python is slower and takes up more memory. But that doesn't mean it can't be productive and performant using these types of strategies to speed up where necessary.

_a_a_a_ · on Nov 26, 2023

With respect, I think you're clouding things by trying to defend what is really indefensible. Okay then.

> Where do you draw the line?

Drawing the line at native python, not pulling in packages that are written in another language. Packages written in python only are acceptable in this argument.

> But that doesn't mean it can't be productive and performant using these types of strategies to speed up where necessary.

No one said it couldn't. What we're saying is that pure Python is 'slow' and you need to escape from pure Python to get the speedups.

dmezzetti · on Nov 26, 2023

I agree that pure Python isn't as fast as other options. Just comes down to a productivity tradeoff for developers. And it doesn't have to be one or the other.

_a_a_a_ · on Nov 26, 2023

Agreed, then!

iopq · on Nov 26, 2023

So to make Python fast you just need to write a library in another language, brilliant

dmezzetti · on Nov 26, 2023

If you read the article referenced, I discussed a number of ways to write performant Python such as using this package (https://docs.python.org/3/library/array.html).
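
As a rough illustration of what the standard-library `array` module offers (a sketch only, not taken from the article; the size `n` and the variable names are arbitrary), the snippet below stores the same floats as a list and as a typed array and compares how many bytes each needs:

```python
import sys
from array import array

n = 100_000
as_list = [float(i) for i in range(n)]
as_array = array("d", as_list)  # typecode "d": packed C doubles

# Bytes occupied by the array's packed values.
array_bytes = as_array.itemsize * len(as_array)

# A list holds pointers, and every float is a separate Python object,
# so both parts are counted here.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)

print(f"array: {array_bytes} bytes, list: {list_bytes} bytes")
```

The array keeps its values in one contiguous buffer, which is where the memory saving comes from; reading an element from Python still creates a float object, so this mainly helps memory use rather than loop speed.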