Hacker News new | past | comments | ask | show | jobs | submit login

Yes, you can use the out arguments, but doing so 1) complicates your code, and 2) isn't necessarily a win. You have to do the same number of memory write bus cycles whether you write to a specified output array or to some new array.



I agree with point (1), wholeheartedly - by default the syntax is ugly unless you wrap it in a utility. (2) isn't true as far as I can tell - you can keep reusing the same output buffer, or double buffer if the output can't be the same as the input.

Either way see https://news.ycombinator.com/item?id=22786958 where I figured out the syntax to do it in place, which shouldn't result in any allocations at all and should solve both your concerns.


Regarding #2: whether you use the same buffer or a new buffer, you still have to actually write to the memory. The bottleneck at numpy scale isn't the memory allocation (mmap or whatever) but the actual memory writes (saturating the memory bus), and you need to perform the same number of writes no matter which destination array you use.


It should still be a perf gain:

1. Memory allocation isn't free 2. Doing multiple loops over the buffer (once per operation) is going to be slower than doing all the operations at once, both because of caching and the opportunity to do the operations on a register and write at the end (though who knows if the python interpreter will do that).


The Python interpreter is too stupid to do operation coalescing. That's the whole point of my initial comment.




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: