Monday, October 23, 2017

Speeding Up Python With Cython

Cython is a superset of Python that lets you significantly improve the speed of your code. You can add optional type declarations for even greater benefits. Cython translates your code to optimized C/C++ that gets compiled to a Python extension module. 

In this tutorial you'll learn how to install Cython, get an immediate performance boost for your Python code for free, and then really take advantage of Cython by adding types and profiling your code. Finally, you'll learn about more advanced topics, such as integration with C/C++ code and NumPy, that you can explore further for even greater gains.

Counting Pythagorean Triples

Pythagoras was a Greek mathematician and philosopher. He is famous for the Pythagorean theorem, which states that in a right-angled triangle, the sum of the squares of the legs is equal to the square of the hypotenuse. Pythagorean triples are any three positive integers a, b and c such that a² + b² = c². Here is a program that finds all the Pythagorean triples whose members are not greater than the provided limit.
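A minimal sketch of such a program, assuming a brute-force triple loop over a < b < c and wall-clock timing with the time module. The LIMIT value is a hypothetical choice, kept small so the script finishes quickly, so its count and timing will differ from the figures quoted in this post.

```python
import time


def count(limit):
    """Count triples a < b < c <= limit with a*a + b*b == c*c."""
    result = 0
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            for c in range(b + 1, limit + 1):
                # c only grows from here, so no later c can match
                if c * c > a * a + b * b:
                    break
                if c * c == a * a + b * b:
                    result += 1
    return result


LIMIT = 200  # hypothetical limit for this sketch, not a value from the post

start = time.time()
triples = count(LIMIT)
duration = time.time() - start
print(triples, duration)
```

The early break is what keeps the innermost loop from always running to the limit; everything else is deliberately naive.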

Apparently there are 881 triples, and it took the program a little less than 14 seconds to find that out. That's not too long, but long enough to be annoying. If we want to find more triples up to a higher limit, we should find a way to make it run faster.

It turns out that there are substantially better algorithms, but today we're focusing on making Python faster with Cython, not on the best algorithm for finding Pythagorean triples. 

Easy Boosting With pyximport

The easiest way to use Cython is through pyximport. It installs an import hook that compiles your Cython code on the fly and lets you enjoy the benefits of native optimization without too much trouble.

You need to put the code you want to cythonize in its own module, write one line of setup in your main program, and then import the module as usual. Let's see what it looks like. I moved the function to its own file called pythagorean_triples.pyx. The .pyx extension is important for Cython. The line that activates Cython is import pyximport; pyximport.install(). The main program then imports the module with the count() function and later invokes it in the main function.
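The main program might look roughly like the sketch below. The pure Python fallback in the except branch is an addition made for this illustration, so that the script still runs when Cython, a C compiler, or the .pyx file is missing; it is not part of the setup described above.

```python
import time

try:
    import pyximport
    pyximport.install()  # registers the import hook for .pyx files
    # Compiled on first import if pythagorean_triples.pyx is on the path
    from pythagorean_triples import count
except ImportError:
    # Assumption for this sketch only: a pure Python copy as a fallback
    def count(limit):
        result = 0
        for a in range(1, limit + 1):
            for b in range(a + 1, limit + 1):
                for c in range(b + 1, limit + 1):
                    if c * c > a * a + b * b:
                        break
                    if c * c == a * a + b * b:
                        result += 1
        return result


def main():
    start = time.time()
    result = count(200)  # hypothetical limit, kept small for the sketch
    duration = time.time() - start
    print(result, duration)


main()
```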

The pure Python function ran 50% longer than the cythonized version. We got this boost by adding a single line. Not bad at all.

Build Your Own Extension Module

While pyximport is really convenient during development, it works only on pure Python modules. Often when optimizing code you want to reference native C libraries or Python extension modules. 

To support those, and also to avoid dynamically compiling on every run, you can build your own Cython extension module. You need to add a little setup.py file and remember to build it before running your program whenever you modify the Cython code. Here is the setup.py file:
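A minimal setup.py along these lines, assuming setuptools and Cython are installed and that pythagorean_triples.pyx sits next to it, could be:

```python
# setup.py: build pythagorean_triples.pyx into an extension module.
# Assumes setuptools and Cython are installed in the build environment.
from setuptools import setup
from Cython.Build import cythonize

setup(
    # cythonize() translates the .pyx file to C and returns the
    # extension modules that setuptools should compile.
    ext_modules=cythonize("pythagorean_triples.pyx"),
)
```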

Then you need to build it by running python setup.py build_ext --inplace.

During the build, Cython generates a C file called pythagorean_triples.c and compiles it into a platform-specific .so file, which is the extension module that Python can now import like any other native extension module.

If you're curious, take a peek at the generated C code. It is very long (2789 lines), hard to read, and contains a lot of extra stuff needed to work with the Python API. Let's drop the pyximport and run our program again:

The result is pretty much the same as with pyximport. However, note that I'm measuring only the runtime of the cythonized code. I'm not measuring how long it takes pyximport to compile the cythonized code on the fly. In big programs, this can be significant.

Adding Types to Your Code

Let's take it to the next level. Cython is more than Python and adds optional typing. Here, I just define all the variables as integers, and the performance skyrockets:
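As a sketch of what typing the variables can look like, the version below uses Cython's pure Python mode annotations rather than the cdef int declarations usually written in a .pyx file. That substitution is an assumption made here so that the same file also runs unchanged under the plain interpreter; this sketch does not check whether it reproduces the timings quoted in this post.

```python
try:
    import cython  # provides the C type names when Cython compiles this file
except ImportError:
    # Fine for uncompiled runs: annotations on local names are never
    # evaluated inside a function body.
    cython = None


def count(limit):
    # When compiled, Cython treats these four names as C ints, so the
    # loops and the arithmetic no longer go through Python objects.
    result: cython.int = 0
    a: cython.int
    b: cython.int
    c: cython.int
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            for c in range(b + 1, limit + 1):
                if c * c > a * a + b * b:
                    break
                if c * c == a * a + b * b:
                    result += 1
    return result


print(count(100))
```

One thing to keep in mind with C integers is that c * c can overflow for very large limits, which Python's arbitrary-precision integers never do.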

Yes. That's correct. By defining a couple of integers, the program runs in less than 57 milliseconds, compared to more than 13 seconds with pure Python. That's almost a 250X improvement.

Profiling Your Code

I used Python's time module, which measures wall time and is pretty good most of the time. If you want more precise timing of small code fragments, consider using the timeit module. Here is how to measure the performance of the code using timeit:

The timeit() function takes a statement to execute, a setup code that is not measured, and the number of times to execute the measured code.
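A small sketch of those three arguments in use. The pure Python count() defined here is a stand-in so the example runs without a compiled module, and the limit and repeat count are hypothetical values. With the extension module built, the setup string would typically import count from pythagorean_triples instead.

```python
import timeit


def count(limit):
    """Pure Python stand-in for the function being measured."""
    result = 0
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            for c in range(b + 1, limit + 1):
                if c * c > a * a + b * b:
                    break
                if c * c == a * a + b * b:
                    result += 1
    return result


RUNS = 5  # hypothetical number of repetitions

elapsed = timeit.timeit(
    stmt="count(limit)",       # the statement that is timed
    setup="limit = 100",       # runs once and is not included in the timing
    globals={"count": count},  # makes count() visible to stmt
    number=RUNS,               # how many times stmt is executed
)
print(f"{elapsed / RUNS:.6f} seconds per call")
```

Note that timeit() returns the total time for all runs, so divide by the number of runs to get a per-call figure.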

Advanced Topics

I just scratched the surface here. You can do a lot more with Cython. Here are a few topics that can further improve the performance of your code or allow Cython to integrate with other environments:

  • calling C code
  • interacting with the Python C API and the GIL
  • using C++ in Python
  • porting Cython code to PyPy
  • using parallelism
  • Cython and NumPy
  • sharing declarations between Cython modules

Conclusion

Cython can produce two orders of magnitude of performance improvement for very little effort. If you develop non-trivial software in Python, Cython is a no-brainer. It has very little overhead, and you can introduce it gradually to your codebase.
