Sometimes it's about optimizing wall time, not algorithmic complexity.
If you have a batch SLA of 1 hour, you're currently spending 50-70 minutes to complete the batch, and 20 minutes of that time goes to date parsing, then cutting the parsing down to 5 minutes is a big win: the worst case drops from 70 minutes to 55, back inside the SLA.
No doubt, but if your date parsing saves you 1 second per date parsed while each call into the faster library costs 2 seconds of overhead, then your performance actually suffers. The only way around that is to make a batched call, so the overhead is paid once per batch, i.e. amortized O(1).
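To put numbers on it, here's a toy model in Python using the figures above (illustrative numbers from this comment, not measurements of any real library):

```python
# Toy model of per-call overhead vs. batching; the 1s/2s figures are the
# illustrative numbers from the comment above, not measurements.
n = 1_000_000                # dates to parse
saving_per_parse = 1.0       # seconds saved per date by the faster parser
overhead_per_call = 2.0      # fixed cost of each call into the native library

# One native call per date: the overhead is paid n times and eats the saving.
per_call_net = n * (overhead_per_call - saving_per_parse)
print(per_call_net)          # 1000000.0 -> a million seconds SLOWER overall

# One batched call: the overhead is paid once, amortized O(1) per batch.
batched_net = overhead_per_call - n * saving_per_parse
print(batched_net)           # -999998.0 -> nearly the full saving realized
```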
I’m not going to install it to check, but when someone writes “Fast datetime parser for Python written in Rust. Parses 10x-15x faster than datetime.strptime.”, it seems reasonable to assume the per-call overhead doesn’t swamp the savings; a 10x-15x claim is presumably measured across real calls, overhead included.
In a language like Java, where you mostly stay inside the VM and only occasionally jump into native code, that might be true. But in Python a huge part of the runtime already consists of exactly this kind of native call, so I would not expect this approach to add any new overhead.
Your conclusion might be right, but your reasoning is certainly wrong. Calling native functions in Python is often quite expensive because you need to marshal between PyObjects and the native types (probably allocating memory as well). This doesn’t “feel” so slow in Python because, well, everything in Python is slow. But you really start to notice it when you’re optimizing.
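You can see that boundary cost from pure Python with ctypes (which exaggerates it relative to a hand-written extension, but the shape is the same). A POSIX-only sketch; libc's strlen is just a stand-in for any native function:

```python
import ctypes
import timeit

libc = ctypes.CDLL(None)  # POSIX-only: use symbols already linked into the process
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

b = b"2024-01-15 10:30:00"

# Both calls end up in C, but the ctypes path pays to convert the argument
# and to box the native size_t result back into a Python int on every call.
ffi_time = timeit.timeit(lambda: libc.strlen(b), number=1_000_000)
builtin_time = timeit.timeit(lambda: len(b), number=1_000_000)
print(f"ctypes strlen: {ffi_time:.3f}s  builtin len: {builtin_time:.3f}s")
```

The strlen itself takes nanoseconds; nearly everything you measure on the ctypes side is the marshalling.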
Of course "it depends", but in my experience that kind of thing is rare. Either you're passing in a str and can just grab the char* out of the existing PyObject, or you have some more complicated thing that was wrapped in a PyObject once and doesn't need to be converted, etc. But sure, if you have some dict with a lot of data and need to convert it into a std::map, you'll have a bad time.
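Same setup as above, contrasting the cheap path (the char* already lives inside the existing bytes object) with a per-call conversion (re-encoding a str on every call, as a small stand-in for the dict-to-std::map case):

```python
import ctypes
import timeit

libc = ctypes.CDLL(None)  # POSIX-only, as above; strlen is just a stand-in
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

s = "2024-01-15"
b = s.encode()

# Cheap path: ctypes passes a pointer to the buffer already inside the
# existing bytes object; no copy, no allocation per call.
passthrough = timeit.timeit(lambda: libc.strlen(b), number=1_000_000)

# Conversion path: allocate a fresh bytes object on every call before
# crossing the boundary.
converting = timeit.timeit(lambda: libc.strlen(s.encode()), number=1_000_000)
print(f"pointer pass-through: {passthrough:.3f}s  per-call encode: {converting:.3f}s")
```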