Write a Python package¶
Developer Guide¶
- LSST DM Developer Guide
- A very valuable reference for developing and organizing code for a large research project.
- DM Python style guide
- Similarly, see the SKA developer portal (still under construction…)
Structure¶
- “Dress for the job you want, not the job you have.”
- Structuring Your Project
- From organizing files to the structure of the code, very good for beginners.
- How To Package Your Python Code
- Aims to put forth an opinionated and specific pattern to make trouble-free packages for community use
- Cookiecutter - A logical, reasonably standardized, but flexible project structure for doing and sharing data science work
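As a rough illustration of the kind of layout these guides recommend, the sketch below creates an empty project skeleton. The file list and the package name `mypackage` are assumptions made for this example, not a layout prescribed by any of the guides above; adapt them to your project.

```python
from pathlib import Path
import tempfile

# Hypothetical layout for illustration only; "mypackage" is a placeholder name.
LAYOUT = [
    "README.md",
    "LICENSE",
    "setup.py",
    "mypackage/__init__.py",
    "mypackage/core.py",
    "tests/test_core.py",
    "docs/index.rst",
]


def create_skeleton(root, layout=LAYOUT):
    """Create an empty file for each entry in ``layout`` under ``root``."""
    root = Path(root)
    created = []
    for relative in layout:
        path = root / relative
        # Make sure the parent directory exists before touching the file.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_skeleton(tmp):
            print(path.relative_to(tmp))
```

Keeping the importable code, the tests, and the documentation in separate top-level directories is the common thread in the references above; tools such as Cookiecutter generate a fuller version of this for you.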
Code Format¶
- It is good practice to follow a well-established code format. Not only does it help you write code that looks clean and is easy to maintain, it also helps others read and contribute to the code.
- For Python, the PEP 8 style guide is the most important one. Some of its rules may feel unnecessary and annoying, but there are good reasons behind them.
- autopep8: A tool that automatically formats Python code to conform to the PEP 8 style guide
- black: The uncompromising Python code formatter
- Blackened code looks the same regardless of the project you’re reading. Formatting becomes transparent after a while and you can focus on the content instead.
- yapf: A formatter for Python files from Google
Python setup.py file¶
- A Human’s Ultimate Guide to setup.py
- This is a very good template for writing a setup.py file.
Readme¶
- Art of README
- This is basically the only thing you need to study about writing a good readme file.
- Chinese version (中文版)
- readme-md-generator - CLI that generates beautiful README.md files
- readme-md-generator will suggest you default answers by reading your package.json and git configuration.
Document¶
General instructions¶
- Writing change-controlled documentation
- Manual provided by the LSST DM team
- LSST DM's Documenting Python APIs with Docstrings
- Also a very good example by LSST DM. LSST adopts the Numpydoc format.
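For readers who have not seen the Numpydoc format, here is a minimal sketch of a docstring written in that style. The function `weighted_mean` is a hypothetical example invented for this page, not taken from the LSST guide; only the section headings (`Parameters`, `Returns`, `Raises`, `Examples`) follow the Numpydoc convention.

```python
def weighted_mean(values, weights=None):
    """Compute the weighted mean of a sequence of numbers.

    Parameters
    ----------
    values : sequence of float
        Input data.
    weights : sequence of float, optional
        Weight for each value. If None (default), all values are
        weighted equally.

    Returns
    -------
    mean : float
        The weighted mean of ``values``.

    Raises
    ------
    ValueError
        If ``values`` is empty, or if ``weights`` has a different length.

    Examples
    --------
    >>> weighted_mean([1.0, 2.0, 3.0])
    2.0
    >>> weighted_mean([1.0, 3.0], weights=[3.0, 1.0])
    1.5
    """
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    if weights is None:
        weights = [1.0] * len(values)
    weights = list(weights)
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

Because the sections are underlined headings in plain reStructuredText, the numpydoc Sphinx extension listed under Tools can render them directly into API documentation.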
Tools¶
- sphinx - Python documentation generator
- Sphinx is a tool that makes it easy to create intelligent and beautiful documentation.
- Sphinx uses reStructuredText as its markup language, and many of its strengths come from the power and straightforwardness of reStructuredText and its parsing and translating suite, the Docutils.
- First steps with sphinx
- On Markdown vs. reStructuredText: Markdown is easy to use; reStructuredText is more extensible and powerful.
- Brandon’s Sphinx Tutorial from PyCon 2013
- Sphinx Tutorial by Eric Holscher is the best place to start. The GitHub repo itself is a very good example.
- Sphinx Themes
- pandoc - A universal document converter
- If you need to convert files from one markup format into another, pandoc is your Swiss Army knife. For example, it can convert reStructuredText to and from Markdown.
- rinohtype - The Python document processor
- Rinohtype is a document processor in the style of LaTeX. It renders structured documents to PDF based on a document template and a style sheet.
- A Simple Tutorial on How to document your Python Project using Sphinx and Rinohtype
- numpydoc – Numpy’s Sphinx extensions
- Doxygen - Generate documentation from source code
- Doxygen is the de facto standard tool for generating documentation from annotated C++ sources, but it also supports other popular programming languages such as C, Objective-C, C#, PHP, Java, Python, IDL.
- The Doxygen documentation site for GalSim is a very good example.
- Read the Docs - Technical documentation lives here
- Read the Docs simplifies software documentation by automating building, versioning, and hosting of your docs for you.
Test¶
- Testing Your Code from the Hitchhiker’s Guide to Python
- A nice summary of multiple approaches to unit testing in Python.
- Getting Started With Testing in Python from RealPython
- Another very nice introduction, covering unittest, pytest, and nose.
- LSST DM: Python Unit Testing Guide
- The LSST DM standard is a very good example: LSST tests should be written using the unittest framework, with default test discovery, and should support being run using the pytest test runner.
- unittest - Unit testing framework
- Basic unit testing in Python. The list of assertion methods is here
- pytest - helps you write better programs
- The pytest framework makes it easy to write small tests, yet scales to support complex functional testing for applications and libraries.
- Examples and customization tricks for pytest: this is very useful.
- nose2 - Nicer testing for Python
- nose2’s purpose is to extend unittest to make testing nicer and easier to understand.
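The LSST DM convention quoted above (tests written with unittest, runnable under pytest) can be sketched as follows. The function `clip` and the test case names are hypothetical and exist only to give the tests something to check.

```python
import unittest


def clip(value, lower, upper):
    """Hypothetical function under test: restrict value to [lower, upper]."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(lower, min(value, upper))


class ClipTestCase(unittest.TestCase):
    """Each method whose name starts with ``test`` is collected as one test."""

    def test_inside_range(self):
        self.assertEqual(clip(5, 0, 10), 5)

    def test_outside_range(self):
        self.assertEqual(clip(-3, 0, 10), 0)
        self.assertEqual(clip(42, 0, 10), 10)

    def test_invalid_bounds(self):
        # assertRaises checks that the error path is actually exercised.
        with self.assertRaises(ValueError):
            clip(1, 10, 0)
```

Saved in a file whose name starts with `test`, for example a hypothetical `test_clip.py`, this can be run with either `python -m unittest` or `pytest` from the directory containing the file, since pytest collects `unittest.TestCase` subclasses as well.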
Code Coverage¶
In computer science, test coverage is a measure used to describe the degree to which the source code of a program is executed when a particular test suite runs. A program with high test coverage, measured as a percentage, has had more of its source code executed during testing, which suggests it has a lower chance of containing undetected software bugs compared to a program with low test coverage. – Wikipedia
- Coverage.py - Code coverage testing for Python
- Coverage.py measures code coverage, typically during test execution. It uses the code analysis tools and tracing hooks provided in the Python standard library to determine which lines are executable, and which have been executed.
- Quick start guide
- pytest has a pytest-cov plugin
- Codecov - Empower developers with tools to improve code quality and testing
- It is a web service that improves your code review workflow and quality. Free for open source. Plans starting at $2.50/month per repository. You can log in with your GitHub or Bitbucket account.
- Here is a Python example for Codecov
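To make the idea of "tracing hooks" concrete, here is a toy sketch that records which lines of a function run, using `sys.settrace` from the standard library. It is not how Coverage.py is implemented and is no substitute for it; the helper `run_with_line_trace` and the function `sign_label` are hypothetical names made up for this illustration.

```python
import sys


def run_with_line_trace(func, *args, **kwargs):
    """Call func and return (result, set of executed line offsets in func)."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Record "line" events only for the function being measured.
        if frame.f_code is code and event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Restore whatever trace function was active before.
        sys.settrace(previous)
    return result, executed


def sign_label(x):
    if x >= 0:
        return "non-negative"
    return "negative"


# Each call takes a different branch, so each one misses a return line.
_, hit_positive = run_with_line_trace(sign_label, 1)
_, hit_negative = run_with_line_trace(sign_label, -1)
print("offsets hit with x=1: ", sorted(hit_positive))
print("offsets hit with x=-1:", sorted(hit_negative))
```

A test suite that only calls `sign_label(1)` never executes the last line of the function, which is exactly the kind of gap a coverage report exposes. In practice, use Coverage.py or the pytest-cov plugin rather than a hand-written tracer.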
Optimization¶
- Optimizing Python Code - Scipy Lecture Notes
- 1: Make it work; 2: Make it work reliably; 3: Optimize
- No optimization without measuring: profiling and timing
- Moving computation or memory allocation outside a for loop; vectorizing for loops; broadcasting; using in-place operations; being easy on the memory: use views, not copies.
- LSST DM Python performance profiling
- Very good guide.
- The Python Profilers
- Python comes with a series of profiling tools. The most useful ones are cProfile, profile, and pstats (convert profiling results into a report)
- Profiling Python using cProfile: a concrete case
- cProfile is very helpful for finding the bottlenecks in a program.
- line_profiler and kernprof - Line-by-line profiling for Python
- line_profiler is a module for doing line-by-line profiling of functions. kernprof is a convenient script for running either line_profiler or the Python standard library’s cProfile or profile modules, depending on what is available.
- You can use cProfile to identify the “hotspot” (the function that is the bottleneck), then use line_profiler to examine the issue carefully.
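The "measure first, then optimize" workflow above can be sketched with cProfile and pstats from the standard library. The two `normalize_*` functions and the `profile_report` helper are hypothetical examples written for this page; the slow version deliberately recomputes a loop-invariant sum so that the profiler has a hotspot to reveal.

```python
import cProfile
import io
import math
import pstats


def normalize_slow(values):
    """Recompute the loop-invariant total for every element."""
    return [v / math.fsum(values) for v in values]


def normalize_fast(values):
    """Compute the total once, outside the loop."""
    total = math.fsum(values)
    return [v / total for v in values]


def profile_report(func, *args, limit=5):
    """Profile one call; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


data = [float(i) for i in range(1, 2001)]
slow_result, slow_report = profile_report(normalize_slow, data)
fast_result, fast_report = profile_report(normalize_fast, data)

# Both versions give the same answer; only the number of fsum calls differs.
print(slow_report)
print(fast_report)
```

In the first report `math.fsum` is called once per element, while in the second it is called once in total. Once a hotspot like this is identified, line_profiler can be used to look inside the offending function line by line.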
Visualization¶
- gprof2dot - Converts profiling output to a dot graph
- A general tool to convert different profiling software output to a dot graph.
- SnakeViz - An in-browser Python profile viewer
- SnakeViz is a viewer for Python profiling data that runs as a web application in your browser.
- pycallgraph - Python module that creates call graphs for Python programs
- No longer maintained by the original author, but still available through a fork: pycallgraph2
Upload Your Package to PyPI¶
- PyPI is the Python Package Index. It is a repository of software for the Python programming language.
- “It helps you find and install software developed and shared by the Python community”
- Basically, once you upload your project to PyPI, people can use pip install to install it.
- It is pretty straightforward to upload your project. Please read this tutorial