I recently refactored some underperforming production code using Cython. After spending ~6 hours rewriting 2 modules in Cython's "Python superset" syntax, I'm left with a 33% performance boost and virtually no additional project complexity.
I've also been replacing some rather excessive struct.unpack usage in my code with Cython's C struct pointer casting syntax, and uncovering _massive_ performance gains. 45 seconds of parsing now takes 3 seconds.
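The pattern is roughly this (a minimal sketch: the record layout and field names are made up for illustration, not taken from my actual code):

    # parse.pyx -- hypothetical 12-byte records: a 4-byte int id, an 8-byte double
    cdef packed struct Record:
        int id
        double value

    def parse(bytes data):
        # reinterpret the bytes buffer as an array of C structs: no per-record
        # struct.unpack call, no format-string parsing inside the hot loop
        cdef char *buf = data              # bytes -> char*, no copy
        cdef Record *recs = <Record *>buf
        cdef Py_ssize_t i, n = len(data) // sizeof(Record)
        out = []
        for i in range(n):
            out.append((recs[i].id, recs[i].value))
        return out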
I'm pretty much convinced there's no reason to learn CPython's C API, given Cython's maturity and PyPy's improbable, scintillating ascendancy. Viz.: RPython may be Python's performance future, but Cython is ready now.
No additional complexity was introduced into those two modules, aside from their being written in a different language, one that's easily grasped by anyone who knows Python and C. LOC-wise, the modules came out about the same as the Python versions.
I've done some integration of C code using ctypes, which works quite well and offers the obvious speed boost, but it feels less coherent and ultimately less maintainable, project-wise, than a well-coded Cython module. Writing a full-on CPython module from scratch would probably offer better performance than Cython if you know the quirks and are disciplined. But to someone who doesn't already write CPython C modules in their sleep, Cython is a godsend.
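For comparison, the ctypes route looks something like this (a sketch against a hypothetical libfoo.so exporting one C function; the names are made up):

    # ctypes sketch -- libfoo.so and sum_array are hypothetical
    import ctypes

    lib = ctypes.CDLL("./libfoo.so")
    # C signature: double sum_array(const double *xs, size_t n);
    lib.sum_array.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
    lib.sum_array.restype = ctypes.c_double

    xs = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)
    total = lib.sum_array(xs, len(xs))

It works, but the argtypes/restype bookkeeping lives in Python and can drift out of sync with the C header silently, which is part of why it feels less maintainable to me.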
Ultimately, there are five commonly used ways (CPython's C API, Boost::Python, SWIG, Cython, ctypes) to integrate C into Python, and right now you'd be crazy not to give Cython a shot if that's your need. It's very easy to learn for anyone familiar with both C and Python.
Then you're lucky. When I optimized an algorithm for my diploma thesis, the code grew significantly: my Python functions suddenly had to deal with different type combinations, and, as part of the optimization, I had to switch from my short numpy-based code to manual for-loops that combined several operations which had been simple (but less efficient) numpy arithmetic before. I think it depends heavily on what you want to do, whether you can still work with Python objects, and how much static typing gets in the way when you try to solve your particular problem.
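To make that concrete (a toy sketch, not my thesis code): a one-liner like a * b + c becomes a typed loop, so several passes over temporary arrays fuse into one:

    # fused.pyx -- toy example of replacing numpy arithmetic with a typed loop
    import numpy as np
    cimport numpy as np

    def fused(np.ndarray[np.float64_t, ndim=1] a,
              np.ndarray[np.float64_t, ndim=1] b,
              np.ndarray[np.float64_t, ndim=1] c):
        cdef Py_ssize_t i, n = a.shape[0]
        cdef np.ndarray[np.float64_t, ndim=1] out = np.empty(n)
        for i in range(n):
            # one pass over the data, no temporaries for a*b and (a*b)+c
            out[i] = a[i] * b[i] + c[i]
        return out

Faster, but three lines of numpy became a function, and now it only accepts float64 arrays; supporting other dtypes means more variants.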
Was the performance gain worth it in the end? In my experience numpy is pretty tight on its own, but I've seen some excellent speed gains from using Cython + numpy.
No, it's a custom implementation of a simple compiler. It's nothing complicated. It converts Python to C++ and compiles that with nvcc. It also supports numpy arrays. It doesn't do any complex optimization steps like a full compiler. It's more like Cython, actually (with type annotations via an @gpu decorator). This allowed me to take my Python image processing code almost literally and annotate it with @gpu. The code isn't released yet.
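To give a flavor of the usage (purely illustrative: the decorator below is a stand-in stub, since the real @gpu and its type annotation syntax aren't public):

    # stub: the real @gpu translates the function to C++ and builds it with nvcc
    import numpy as np

    def gpu(func):
        return func  # stand-in; the actual tool compiles and runs on the GPU

    @gpu
    def brighten(img):
        # plain numpy-style image code, taken almost literally
        return np.clip(img * 1.2, 0.0, 1.0)

    out = brighten(np.random.rand(64, 64).astype(np.float32))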
I originally wanted to use Copperhead and got in contact with the developer a year ago, but it was too early even for "private" beta testing, so I never got access to their code. Also, my compiler is specialized for image processing, so Copperhead probably wouldn't have worked anyway. I'm only jealous of Copperhead's type inferencer. :) But then again, I have to finish my thesis, and a type inferencer wouldn't help with that goal. ;)