Performance Optimization for Python¶
Basics¶
Interfacing with C/C++¶
Interfacing with C from Scipy lecture notes
- Very nice overview and examples of four approaches.
- Python/C API
- ctypes - A foreign function library for Python
- It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.
- SWIG
- SWIG is an interface compiler that connects programs written in C and C++ with scripting languages such as Python
- Cython - C-Extensions for Python
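As a minimal sketch of the ctypes approach listed above, the snippet below calls the C library's `strlen` from pure Python. It assumes a POSIX-like system (Linux or macOS) where the C standard library can be found. The choice of `strlen` is purely illustrative and is not taken from the linked notes.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If find_library returns None,
# CDLL(None) exposes the symbols already loaded into this process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Python bytes objects are passed as C char pointers.
n = libc.strlen(b"hello")
print(n)
```

Declaring `argtypes` and `restype` explicitly is optional for simple cases, but it lets ctypes check arguments and avoids silently truncating return values.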
Only for C++¶
- Boost.Python
- Boost.Python, a C++ library which enables seamless interoperability between C++ and the Python programming language.
- pybind11 - Seamless operability between C++11 and Python
- pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.
- pybind11 is used by the LSST developers; see the DM Pybind11 style guide for details.
Using Just-in-Time (JIT) Compiler¶
- pypy - a fast, compliant alternative implementation of the Python language
- numba - makes Python code fast
- Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Numba only supports LLVM as a compilation backend.
- Numba offers a range of options for parallelizing your code for CPUs and GPUs, often with only minor code changes.
- NumPy support in Numba
- Numba's vectorize and guvectorize decorators, which compile Python functions into NumPy ufuncs and generalized ufuncs, can be very useful.
- hope - A Python Just-In-Time compiler for astrophysical computations
- hope is a specialized method-at-a-time JIT compiler written in Python that translates Python source code into C++ and compiles it at runtime.
- Has not been updated for three years.
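To make the Numba entry above concrete, here is a minimal sketch of JIT-compiling a plain Python loop with `numba.njit`. The function `sum_of_squares` is a hypothetical example. The `ImportError` fallback is an addition for illustration only, so that the snippet still runs (uncompiled) when Numba is not installed.

```python
try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, fall back to a no-op
    # decorator so the function simply runs as ordinary Python.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # A tight numeric loop: the kind of code Numba compiles to machine code.
    total = 0
    for i in range(n):
        total += i * i
    return total


# With Numba present, the first call triggers compilation;
# later calls with the same argument types reuse the compiled code.
result = sum_of_squares(10)
print(result)
```

Because compilation happens on the first call, any timing comparison should exclude that call.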
Tutorial and notes¶
Making Numpy faster¶
- jax - GPU- and TPU-backed NumPy with differentiation and JIT compilation by Google
- JAX is Autograd and XLA, brought together for high-performance machine learning research.
- autograd - Efficiently computes derivatives of numpy code
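The combination of differentiation and JIT compilation described for JAX above can be sketched as follows. This assumes `jax` is installed. The function `loss` and its input values are hypothetical and chosen only to make the gradient easy to check by hand.

```python
import jax
import jax.numpy as jnp


def loss(w):
    # Simple quadratic: sum of w_i^2, whose gradient is 2 * w.
    return jnp.sum(w ** 2)


# jax.grad builds the gradient function; jax.jit compiles it with XLA.
grad_loss = jax.jit(jax.grad(loss))

w = jnp.array([1.0, 2.0, 3.0])
g = grad_loss(w)
print(g)
```

The same NumPy-style code runs on CPU, GPU, or TPU depending on which backend JAX finds at runtime.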
Other packages¶
Parallel computing in Python¶
Tutorial¶
Software¶
Common tools:¶
More “Big Data” approach:¶
- Dask - Parallel computing with task scheduling
- Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love
- Dask is open source and freely available. It is developed in coordination with other community projects like Numpy, Pandas, and Scikit-Learn.
- Official Dask tutorial using Jupyter notebook
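As a rough illustration of the task scheduling Dask provides, the sketch below builds a small task graph with `dask.delayed` and then executes it. It assumes `dask` is installed. The `square` function is a hypothetical stand-in for a more expensive computation.

```python
import dask


@dask.delayed
def square(x):
    # Stand-in for an expensive, independent unit of work.
    return x * x


# Nothing runs yet: these calls only record tasks in a graph.
squares = [square(i) for i in range(5)]
total = dask.delayed(sum)(squares)

# compute() hands the graph to a scheduler, which can run
# the independent square() tasks in parallel.
result = total.compute()
print(result)
```

Building the graph lazily lets the scheduler see all tasks up front and decide how to distribute them.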