1. Have gotten similar performance boosts elsewhere, meaning you wouldn't have needed to refactor this function in the first place. (The implied 10,000x speedup suggests that may not be true, though I can absolutely see the potential for 100x speedups in this code, depending on exactly what the input data is.)
2. It's likely that there are much more natural, idiomatic ways to implement your function in pandas. These would be both clearer and likely just as fast, possibly faster. (Heck, there are even ways to refactor the code you have to make it look a lot like the direct-from-the-paper implementation.)
In other words, this isn't (necessarily) a case of Python having weak performance, it's a case of unidiomatic Python having weak performance. This is true in any language, though. You can write unidiomatic code anywhere, and more often than not it will be slower than the idiomatic equivalent (e.g. repeatedly applying `foldl` in Haskell). I'm not enough of an expert in pandas multi-level indexes to say this for certain, but I'd bet there are more efficient ways to do what you're doing from within pandas that look a lot less ugly and run similarly fast.
Granted, there's an argument to be made that the idiomatic way should be more obvious. But "uncommon pandas indexing tools should be more discoverable" is not the same as "python is unworkably slow".
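To make the contrast concrete, here's a minimal sketch of the kind of rewrite being suggested. The data and the per-label sum are made up (the actual function from the article isn't shown here), so treat this as illustrative only:

```python
import pandas as pd

# Hypothetical data with duplicated index labels, as in the discussion.
df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=["a", "a", "b", "b", "c"],
)

# Unidiomatic: iterate row by row, accumulating per-label sums in a dict.
totals = {}
for idx, row in df.iterrows():
    totals[idx] = totals.get(idx, 0.0) + row["value"]

# Idiomatic: let pandas do the grouping in vectorized code.
grouped = df.groupby(level=0)["value"].sum()

assert totals == grouped.to_dict()  # same result, far less Python-level work
```

The loop pays Python-interpreter overhead per row; the `groupby` version does the same aggregation in compiled code, which is where the large constant-factor differences come from.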
1. No, that function was the bottleneck, by far, and I can tell you that >10,000x was what we got between the initial version and the final one.
2. I don't care about faster at this point. The function is fast enough. Maybe there is some magic incantation of pandas that will be readable and compute the same values, but I will believe it when I see it. What I thought was more idiomatic was much slower.
I think this is more of a case of "the problem does not fit numpy/pandas' structure (because of how the duplicated indices need to be handled), so you end up with ugly code."
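For what it's worth, even when rows sharing a duplicated index label depend on each other, pandas often has a vectorized escape hatch via `groupby`. A hedged sketch, using a running sum within each duplicated label as a stand-in for the real function (which isn't shown in this thread):

```python
import pandas as pd

df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0]},
    index=["x", "x", "y", "y"],
)

# A per-row loop that carries state across rows with the same label...
out = []
state = {}
for idx, v in df["value"].items():
    state[idx] = state.get(idx, 0.0) + v
    out.append(state[idx])

# ...can often be replaced with a grouped cumulative operation.
vectorized = df.groupby(level=0)["value"].cumsum()

assert out == vectorized.to_list()
```

Whether the actual function admits such a rewrite depends on exactly how the duplicated indices interact, which is the crux of the disagreement above.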
1. You don't get 10,000x speedups by changing languages. It's likely that this optimization would have been necessary in any case.
2. You don't care about improving the code, but you did care enough to write an article saying that the language didn't fit your needs, without actually doing the due diligence to check whether it did. That's the part that gets me.