= Cython

:Author: Seth Kenlon
:Email: [email protected]

Cython is an optimizing compiler for the Python programming language and for the extended Cython language built on top of it.
The Cython language is a superset of Python that also supports calling C functions and declaring C types on variables and class attributes.
This makes it easy to wrap external C libraries, to embed C into existing applications, or to write C extensions for Python in a syntax as easy as Python itself.

Cython is commonly used to create C modules that speed up the execution of Python code.
This is important in complex applications for which an interpreted language isn't efficient.

== Installing Cython

You can install Cython on Linux, BSD, Windows, or Mac using Python itself:

[source,bash]
----
$ python -m pip install Cython
----

Once installed, it's ready to use.
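
As a quick sanity check, a short script along these lines can confirm that the `Cython` package is visible to your interpreter.
This is only a sketch using Python's standard library, and the helper name `cython_available` is made up for this illustration:

```python
import importlib.util


def cython_available():
    """Return True if this interpreter can find the Cython package."""
    # find_spec returns None when a top-level package cannot be located
    return importlib.util.find_spec("Cython") is not None


if cython_available():
    print("Cython is installed")
else:
    print("Cython is not installed; try: python -m pip install Cython")
```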

== Python like C

A good way to start with Cython is with a simple "hello world" application.
It's not the best demonstration of Cython's advantages, but it shows what happens during Cython use.

First, create this simple Python script in a file called `hello.pyx` (the extension `.pyx` isn't magical and could technically be anything, but it's the extension Cython uses by default):

[source,python]
----
print("hello world")
----

Next, create a Python setup script.
A `setup.py` file is like Python's version of a Makefile, and Cython can use it to process your Python code:

[source,python]
----
from setuptools import setup
from Cython.Build import cythonize

setup(
   ext_modules = cythonize("hello.pyx")
)
----

Finally, use Cython to transform your Python script into C code:

[source,bash]
----
$ python setup.py build_ext --inplace
----

You can see the results in your project directory.
Cython's `cythonize` function transforms `hello.pyx` into a `hello.c` file, which the build step then compiles into a `.so` library (a `.pyd` file on Windows).
The C code is 2,648 lines, so it's quite a lot more text than the single line of `hello.pyx` source.
The `.so` library is similarly over 2000 times larger than its source (54,000 compared to 20 bytes).
Then again, Python itself is required to run a single Python script, so there's a lot of code propping up that single-line `hello.pyx` file.
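
If you want to check the size difference on your own system, a small helper like the one below can do the arithmetic.
It's a minimal sketch using only the standard library, and the function name `size_ratio` is hypothetical:

```python
from pathlib import Path


def size_ratio(source, built):
    """Return how many times larger the built file is than its source."""
    source_bytes = Path(source).stat().st_size
    built_bytes = Path(built).stat().st_size
    return built_bytes / source_bytes
```

For example, `size_ratio("hello.pyx", "hello.c")` compares the generated C file with its source.
The compiled library's exact file name varies by platform and Python version, so that path is left for you to fill in.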

To use the C code version of your Python "hello world" script, open a Python prompt and import the new `hello` module you've created:

[source,python]
----
>>> import hello
hello world
----

== Integrating C code into Python

A good generic test of computational power is the calculation of prime numbers.
A prime number is a whole number greater than 1 that can be divided evenly only by 1 and by itself.
It's simple in theory, but as numbers get larger the calculation requirements also increase.
In pure Python, it can be done in under 10 lines of code:

[source,python]
----
import sys

number = int(sys.argv[1])
if not number <= 1:
   for i in range(2, number):
       if (number % i) == 0:
           print("Not prime")
           break
else:
   print("Integer must be greater than 1")
----

This script is silent upon success, and prints a message only when the number given to it is not prime:

[source,bash]
----
$ python ./prime.py 3
$ python ./prime.py 4
Not prime
----
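
To make the growing cost concrete, the sketch below counts how many trial divisions this kind of loop performs before it stops.
The function name `count_divisions` is hypothetical and not part of the original script:

```python
def count_divisions(number):
    """Count the trial divisions the loop performs before it stops."""
    divisions = 0
    for i in range(2, number):
        divisions += 1
        if (number % i) == 0:
            break  # a divisor was found, so the number is not prime
    return divisions


print(count_divisions(8))     # 1: stops at the first divisor, 2
print(count_divisions(7919))  # 7917: a prime forces every candidate to be tried
```

A prime input is the worst case for this approach, because the loop has to try every candidate from 2 up to the number itself.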

Converting this to Cython requires a little work, partly to make the code appropriate for use as a library, and partly for performance.

=== Scripts and libraries

Many users learn Python as a scripting language: you tell Python the steps you want it to perform, and it does the work.
As you learn more about Python, and indeed open source programming in general, you learn that much of the most powerful code out there is in libraries that can be harnessed by other applications.
The _less_ specific your code is, the more likely it is to be repurposed by a programmer (yourself included) for other applications.
It can be a little more work to decouple computation from workflow, but in the end it's usually worth the effort.
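
As a minimal sketch of that decoupling, the computation can return a value and leave the messaging to a separate function.
The names `is_prime` and `report` are illustrative only and don't appear in the original script:

```python
def is_prime(number):
    """Computation only: return True if number is prime (trial division)."""
    if number <= 1:
        raise ValueError("Integer must be greater than 1")
    for i in range(2, number):
        if (number % i) == 0:
            return False
    return True


def report(number):
    """Workflow only: turn the result into a message for the user."""
    return "Prime" if is_prime(number) else "Not prime"


print(report(4))  # Not prime
print(report(7))  # Prime
```

Because `is_prime` neither prints nor reads input, other code can reuse it without inheriting this script's choices about messages.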

In the case of this simple prime number calculator, converting it to Cython begins with a setup script:

[source,python]
----
from setuptools import setup
from Cython.Build import cythonize

setup(
   ext_modules = cythonize("prime.py")
)
----

Transform your script into C:

[source,bash]
----
$ python setup.py build_ext --inplace
----

Everything appears to be working well so far, but when you attempt to import and use your new module, you get an error:

[source,python]
----
>>> import prime
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "prime.py", line 3, in init prime
   number = int(sys.argv[1])
IndexError: list index out of range
----

The problem here is that the script expects to be run from a terminal with an argument (in this case, an integer to test as a prime number).
When the code is imported as a module, no argument is supplied, so `sys.argv[1]` doesn't exist.
You need to modify your script so that it can be used as a library instead.

=== Writing a library

Libraries don't use system arguments, and instead accept arguments from other code.
Instead of using `sys.argv` to bring in user input, make your code a function that accepts an argument called `number` (or `num` or whatever variable name you prefer).

[source,python]
----
def calculate(number):
   if not number <= 1:
       for i in range(2, number):
           if (number % i) == 0:
               print("Not prime")
               break
   else:
       print("Integer must be greater than 1")

----

This admittedly makes your script somewhat difficult to test, because when you run the code in Python, the `calculate` function is never executed.
However, Python programmers have devised a common, if not intuitive, workaround for this problem.
When a Python script is executed by the Python interpreter, there's a special variable called `__name__` that gets set to `__main__`, but when it's imported as a module, `__name__` is set to the module's name.
By leveraging this, you can write a library that is both a Python module and a valid Python script.

[source,python]
----
import sys

def calculate(number):
   if not number <= 1:
       for i in range(2, number):
           if (number % i) == 0:
               print("Not prime")
               break
   else:
       print("Integer must be greater than 1")

if __name__ == "__main__":
   number = sys.argv[1]
   calculate(int(number))
----

Now you can run the code as a command:

[source,bash]
----
$ python ./prime.py 4
Not prime
----

And you can convert it to Cython for use as a module:

[source,python]
----
>>> import prime
>>> prime.calculate(4)
Not prime
----
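
One way to check whether a compiled module is actually faster is to time the same call before and after building.
The sketch below uses only the standard `timeit` module.
The function `check` is a hypothetical stand-in for `prime.calculate` that returns a result instead of printing, so terminal output doesn't distort the measurement:

```python
import timeit


def check(number):
    """Stand-in for prime.calculate; assumes number is greater than 1."""
    for i in range(2, number):
        if (number % i) == 0:
            return False
    return True


def best_time(func, argument, repeat=5):
    """Return the fastest of several single-call timings, in seconds."""
    return min(timeit.repeat(lambda: func(argument), number=1, repeat=repeat))


print(f"{best_time(check, 7919):.6f} seconds")
```

Passing a function from the compiled module to `best_time` gives a second number to compare against.
No particular speedup is assumed here, and results vary by machine.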

== C Python

Converting code from pure Python to C with Cython can be useful, and this article has demonstrated only the basics of how it's done.
There are Cython features to help you optimize your code before conversion, options to analyze your code to find when Cython interacts with C, and much more.
If you're using Python but you're looking to enhance your code with C code, or you're looking to further your understanding of how libraries provide better extensibility than scripts, or you're just curious about how Python and C can work together, then start experimenting with Cython.