Python is one of the most popular programming languages today. However, when working with Python, you will sooner or later run into, or be told about, one of its well-known weaknesses: Python is slow.
There are several ways to speed up your Python code. You may have read about some of them:
- Using multiprocessing libraries
- Using asynchronous programming
As listed above, there are two common ways to speed up Python code: parallel programming and asynchronous programming. In this article, I will introduce a different approach: Cython.
What’s Cython?
Cython can be understood as an intermediate step between Python and C/C++. It lets you write almost pure Python with some minor modifications, which is then translated into C and compiled.
Cython gives you the combined power of Python and C:
- Python code can call back and forth into native C or C++ code at any time.
- Python code can be tuned to C-like performance simply by adding static type declarations, still in Python-like syntax.
- Interact efficiently with large data sets.
- Integrate natively with existing code and with low-level or high-performance libraries and applications.
…
Some other information:
- Core Developers: Stefan Behnel, Robert Bradshaw, Lisandro Dalcín, Mark Florisson, Vitja Makarov, Dag Sverre Seljebotn.
- Repository: https://github.com/cython/cython
- Homepage: https://cython.org
- License: Apache License 2.0
You can easily install Cython via pip:
```shell
pip install cython
```
Compared to plain Python code, the main change is adding type information to your variables. Declaring a Python variable is very simple:
```python
x = 1
```
With Cython, you add the type for that variable:
```cython
cdef int x = 1
```
Unlike in C, type declarations in Cython are optional, since untyped code still compiles, but most of the speedup comes from adding them.
Types in Cython
Cython adds new declaration forms for both variables and functions.
For variables:
```cython
cdef int a, b, c
cdef char *s
cdef float x = 0.5     # single precision
cdef double y = 60.4   # double precision (renamed from x to avoid a duplicate declaration)
cdef list images
cdef dict user
cdef object card_deck
```
The numeric and pointer types come from C/C++, while list, dict, and object are Python types that Cython lets you declare statically.
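To get a feel for what a fixed-width C type means, here is a small sketch that uses Python's standard `ctypes` module (not Cython itself, and not part of the original article) to inspect the sizes behind `float` and `double`. The value 60.4 is taken from the declarations above; the variable names are only for illustration.

```python
import ctypes

# Size in bytes of the C types behind `cdef float` and `cdef double`.
float_size = ctypes.sizeof(ctypes.c_float)
double_size = ctypes.sizeof(ctypes.c_double)

# Round-trip a value through each C type to see how much precision survives.
as_float = ctypes.c_float(60.4).value
as_double = ctypes.c_double(60.4).value

print(f"float:  {float_size} bytes, 60.4 stored as {as_float!r}")
print(f"double: {double_size} bytes, 60.4 stored as {as_double!r}")
```

The single-precision value comes back slightly different from 60.4, while the double-precision value matches Python's own float, which is itself a C double.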
For functions:
```cython
def function1...
cdef function2...
cpdef function3...
```
Where:
- def: A pure Python function, callable from Python.
- cdef: A Cython-only function, callable only from Cython or C code, not from Python.
- cpdef: A hybrid function, callable from both C and Python.
How to speed up your code with Cython
First, I will write a pure Python function with a for loop:
```python
# run_test_python.py
def test_python(x):
    y = 1
    for i in range(1, x + 1):
        y *= i
    return y
```
Applying the declarations introduced above, I will write the same function in Cython:
```cython
# run_test_cython.pyx
cpdef int test_cython(int x):
    cdef int y = 1
    cdef int i
    for i in range(1, x + 1):
        y *= i
    return y
```
When writing Cython, make sure that all of your variables have declared types, including loop counters such as i.
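One thing to keep in mind with `cdef int`: a C `int` is typically 32 bits wide, so the running product above only fits for small inputs. The sketch below is plain Python, not Cython. It uses `ctypes.c_int32` to imitate wrapping arithmetic, on the assumption that `int` is 32 bits and that overflow wraps around (the behavior GCC's `-fwrapv` flag requests). The helper name `factorial_int32` is hypothetical.

```python
import ctypes
import math


def factorial_int32(x):
    """Factorial computed as if y were a wrapping 32-bit signed C int."""
    y = 1
    for i in range(1, x + 1):
        # c_int32 keeps only the low 32 bits, like a wrapping C int.
        y = ctypes.c_int32(y * i).value
    return y


for n in (12, 13, 34):
    wrapped = factorial_int32(n)
    print(n, wrapped, wrapped == math.factorial(n))
```

Under these assumptions, 12 is the largest input whose result still matches `math.factorial`, and from 34 onward the wrapped result is 0. If exact large results matter, a wider C type or a plain Python object for `y` would be the options to look at.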
Next, we need a setup file that translates the Cython code into C and builds it:
```python
# setup.py
# setuptools is used here because distutils was removed in Python 3.12.
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize('run_test_cython.pyx'))
```
After putting run_test_cython.pyx and setup.py in the same directory, we start compiling:
```console
% python setup.py build_ext --inplace
Compiling run_test_cython.pyx because it changed.
[1/1] Cythonizing run_test_cython.pyx
running build_ext
building 'run_test_cython' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/ha.hao.minh/.pyenv/versions/viblo-venv/include -I/home/ha.hao.minh/.pyenv/versions/3.6.8/include/python3.6m -c run_test_cython.c -o build/temp.linux-x86_64-3.6/run_test_cython.o
gcc -pthread -shared -L/home/ha.hao.minh/.pyenv/versions/3.6.8/lib -L/home/ha.hao.minh/.pyenv/versions/3.6.8/lib build/temp.linux-x86_64-3.6/run_test_cython.o -o /home/ha.hao.minh/workspace/viblo/112019/run_test_cython.cpython-36m-x86_64-linux-gnu.so
```
Result:
```console
% ls
build  run_test_cython.c  run_test_cython.cpython-36m-x86_64-linux-gnu.so  run_test_cython.pyx  run_test_python.py  setup.py
```
This folder now contains everything needed to run the compiled code. If you are curious about the C code that Cython generated from your function, you can print the file with cat:
```shell
cat run_test_cython.c
```
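The long name of the `.so` file is not arbitrary: Python only imports extension modules whose file names end in a suffix it recognizes for the running interpreter. As a small sketch (standard-library Python only, not from the original article), the snippet below lists those suffixes and checks whether a module can be located. The helper name `is_importable` is hypothetical.

```python
import importlib.machinery
import importlib.util

# File-name suffixes this interpreter accepts for compiled extension modules.
# On the build shown above, one of them would be ".cpython-36m-x86_64-linux-gnu.so".
print(importlib.machinery.EXTENSION_SUFFIXES)


def is_importable(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# True only when run from the directory that contains the built extension.
print(is_importable("run_test_cython"))
```

Running this from the build directory is a quick way to confirm that the extension was built for the same interpreter that will import it.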
Now it is time to show the power of C code. The following script compares the speed of pure Python and Cython:
```python
# speedtest.py
import run_test_python
import run_test_cython
import time


def speedtest_python(number):
    start = time.time()
    run_test_python.test_python(number)
    end = time.time()
    py_time = end - start
    print(f"Python time = {py_time}")
    return py_time


def speedtest_cython(number):
    start = time.time()
    run_test_cython.test_cython(number)
    end = time.time()
    cy_time = end - start
    print("Cython time = {}".format(cy_time))
    return cy_time


if __name__ == "__main__":
    for number in [10, 100, 1000, 10000, 100000]:
        print(f"Speedtest with number = {number}")
        py_time = speedtest_python(number)
        cy_time = speedtest_cython(number)
        print("Speedup = {}".format(py_time / cy_time))
```
Results after running speedtest.py:
```console
% python speedtest.py
Speedtest with number = 10
Python time = 3.5762786865234375e-06
Cython time = 7.152557373046875e-07
Speedup = 5.0
Speedtest with number = 100
Python time = 7.867813110351562e-06
Cython time = 2.384185791015625e-07
Speedup = 33.0
Speedtest with number = 1000
Python time = 0.0002810955047607422
Cython time = 9.5367431640625e-07
Speedup = 294.75
Speedtest with number = 10000
Python time = 0.02144336700439453
Cython time = 7.62939453125e-06
Speedup = 2810.625
Speedtest with number = 100000
Python time = 3.1171438694000244
Cython time = 8.630752563476562e-05
Speedup = 36116.70994475138
```
Configuration of the machine used for my test:
- CPU: Intel® Core™ i5-4460 CPU @ 3.20GHz × 4
- RAM: 8 GB
Here are the results in a table for easier viewing (times in seconds):
Number | Python time | Cython time | Speedup |
---|---|---|---|
10 | 3.5762786865234375e-06 | 7.152557373046875e-07 | 5.0 |
100 | 7.867813110351562e-06 | 2.384185791015625e-07 | 33.0 |
1000 | 0.0002810955047607422 | 9.5367431640625e-07 | 294.75 |
10000 | 0.02144336700439453 | 7.62939453125e-06 | 2810.625 |
100000 | 3.1171438694000244 | 8.630752563476562e-05 | 36116.70994475138 |
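A speedup of about 36,116 times seems almost unthinkable. Part of it comes from the types themselves: a C int has a fixed width, typically 32 bits, so for larger inputs the Cython version overflows and no longer computes the same value as the arbitrary-precision Python loop. Even so, Cython clearly delivers very good performance for loops like this, and it is a viable option if you want to speed up your code.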
Source: https://cython.org/
Thanks for reading!