Writing Packages in Python (Part 1)

Tram Ho

A package is basically a collection of Python modules. Packages provide a way to structure both modules and other packages, resulting in a well-organized hierarchy that makes folders and modules easy to access. This article focuses on the process of writing and releasing Python packages. We will see how to reduce the time required to set things up before starting the real work. Along with that, we will explore how to set up a standardized process for writing packages and an easy-to-use test-driven development method.

Technical Requirements

Before diving into the actual process, let’s first download the code file we will use in this article. It can be downloaded from ( ).

The Python packages mentioned in this article can be downloaded from PyPI as follows:

Create a Package

Python packaging may be a little confusing at first. The main reason is uncertainty about which tools are the right ones for creating Python packages. But once you have created your first package, you will find it is not as difficult as it looks. Knowing the appropriate modern packaging tools also helps a lot.

You should know how to create packages even if you are not interested in distributing your code as open source. Knowing how to create your own packages will help you better understand the packaging ecosystem and work with the third-party code available on PyPI that you may be using.

In addition, making a closed-source project or its components available as source distribution packages can help you deploy code in different environments. Here, we will focus on the appropriate tools and techniques to create such distributions.

Recommended tools

The Python Packaging User Guide gives a few suggestions about recommended tools for working with packages. They can usually be divided into the following two groups:

Utilities proposed by PyPA:

The Python Packaging User Guide recommends the following tools for creating and distributing packages:

Project configuration

The easiest way to organize code for large applications is to divide them into packages. This makes the code simpler, easier to understand, maintain, and change. It also maximizes the possibility of reusing your code. Separate packages act as components that can be used in different programs.

setup.py

The root directory of a package that is to be distributed must contain a setup.py script. It defines all the metadata described in the distutils module. The package metadata is expressed as arguments in a call to the standard setup() function. Although distutils is the standard library module provided for code packaging, it is a good idea to use setuptools instead. The setuptools package provides a number of improvements over the standard distutils module.

Therefore, the minimum content for this file is as follows:
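As a rough sketch, a minimal setup.py could contain nothing more than a setup() call with a name argument. The package name mypackage is a placeholder, and the small helper around it is only an illustration that inspects the script text without running a build:

```python
import ast

# Text of a minimal setup.py. "mypackage" is a placeholder name (an assumption).
MINIMAL_SETUP_PY = '''\
from setuptools import setup

setup(
    name="mypackage",
)
'''


def setup_keywords(source):
    """Return the keyword names passed to setup(), without executing the script."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "setup":
            return [keyword.arg for keyword in node.keywords]
    return []


print(setup_keywords(MINIMAL_SETUP_PY))
```

Saving the text of MINIMAL_SETUP_PY as setup.py in the project root would give the script described here.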

The name argument gives the full name of the package. From there, the script provides a number of commands that can be listed with the --help-commands option, as shown in the following code:

The actual list of commands is longer and may vary depending on the available setuptools extensions. It has been truncated to show only the commands that are most important and relevant to this article. Standard commands are the built-in commands provided by distutils, while extra commands are provided by third-party packages, such as setuptools or any other package that defines and registers a new command. Here, one such extra command registered by another package is bdist_wheel, provided by the wheel package.

setup.cfg

The setup.cfg file contains default options for the commands of the setup.py script. This is useful when the process of building and distributing the package is complex and requires many optional arguments to be passed to those commands. The setup.cfg file lets you store such default parameters alongside your source code on a per-project basis. This makes your distribution flow independent of the project and also provides transparency about how your package is built and distributed to users and other team members.

The syntax of the setup.cfg file is the same as that provided by the built-in configparser module, so it is similar to the common Microsoft Windows INI files. Here is an example of a setup.cfg configuration file that provides some defaults for the global, sdist, and bdist_wheel commands:

The configuration in this example ensures that source distributions (the sdist section) are always created in two formats (ZIP and TAR) and that built wheel distributions (the bdist_wheel section) are created as universal wheels that are independent of the Python version. In addition, most output will be suppressed for every command by the global --quiet switch. Note that this option is included here for demonstration purposes only, and suppressing the output of every command by default may not be a reasonable choice.
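A configuration matching this description might look like the sketch below. The section names come from the text, while the exact option values are reconstructed and should be treated as assumptions. Because the file uses INI syntax, the standard configparser module can read it back:

```python
import configparser

# Reconstructed from the description above; the exact values are assumptions.
SETUP_CFG = """\
[global]
quiet = 1

[sdist]
formats = zip,tar

[bdist_wheel]
universal = 1
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

# Each section holds the default options for one setup.py command.
sdist_formats = parser.get("sdist", "formats").split(",")
universal_wheel = parser.getboolean("bdist_wheel", "universal")
quiet = parser.getboolean("global", "quiet")

print(sdist_formats, universal_wheel, quiet)
```

The content of SETUP_CFG is what would be saved as setup.cfg next to setup.py.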

MANIFEST.in

When building a distribution with the sdist command, the distutils module browses the package directory to look for files to include in the archive. By default, distutils will include the following:

Also, if your package is versioned with a version control system such as Subversion, Mercurial, or Git, it is possible to automatically include all version-controlled files using setuptools extensions such as setuptools-svn, setuptools-hg, and setuptools-git. Integration with other version control systems is also possible through other custom extensions. Regardless of whether the default built-in collection strategy or one defined by a custom extension is used, sdist will create a MANIFEST file that lists all the files and will include them in the final archive.

Suppose you are not using any additional extensions and you need to include some files that are not captured by default in your package distribution. You can define a template named MANIFEST.in in your package root (the same directory as the setup.py file). This template tells the sdist command which files to include.

The MANIFEST.in template defines one inclusion or exclusion rule per line:

The full list of MANIFEST.in commands can be found in the official documentation of distutils.
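To make the one-rule-per-line layout concrete, here is a hypothetical template together with a tiny parser. The file names are placeholders, and the parser only splits each rule into its command and arguments; it does not reimplement the file matching that sdist performs:

```python
# Hypothetical template; the file names are placeholders.
MANIFEST_IN = """\
include README.txt
include LICENSE
recursive-include docs *.rst
global-exclude *.pyc
prune build
"""


def parse_manifest_template(text):
    """Split each non-empty, non-comment line into (command, arguments)."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        command, *arguments = line.split()
        rules.append((command, arguments))
    return rules


for command, arguments in parse_manifest_template(MANIFEST_IN):
    print(command, arguments)
```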

The most important metadata

Besides the name and version of the package being distributed, the most important arguments that the setup() function accepts are as follows:

Trove Classifiers

PyPI and distutils provide a solution for categorizing applications with a set of classifiers called trove classifiers. All trove classifiers form a tree-like structure. Each classifier string defines a list of nested namespaces, where every namespace is separated by the :: substring. The list of classifiers is passed to the package definition as the classifiers argument of the setup() function.

The following is a sample list of classifiers taken from the solrq project available on PyPI:

Trove classifiers are completely optional in the package definition, but they provide a useful extension to the basic metadata available in the setup() interface. Among other things, trove classifiers can provide information about supported Python versions, supported operating systems, the development stage of the project, or the license under which the code is released. Many PyPI users search and browse the available packages by category, so proper classification helps packages reach their target audience.

Trove classifiers serve an important role in the entire packaging ecosystem and should never be overlooked. No organization verifies package classification, so it is your responsibility to provide appropriate classifiers for your packages and not to introduce confusion into the package index.

At the time of writing, there are 667 classifiers available on PyPI, grouped into the following nine main categories:

This list keeps growing as new classifiers are added over time, so the total may be different by the time you read this. The full list of trove classifiers is available on PyPI.
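The nested-namespace structure can be illustrated with a short sketch. The classifiers below are a generic illustrative selection, not the list used by any particular project, and the helper names are made up for this example:

```python
# Illustrative classifiers; not the list used by any particular project.
CLASSIFIERS = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: BSD License",
    "Operating System :: OS Independent",
    "Programming Language :: Python :: 3",
    "Topic :: Software Development :: Libraries",
]


def split_namespaces(classifier):
    """Break a classifier string into its nested namespaces."""
    return [part.strip() for part in classifier.split("::")]


def top_level_categories(classifiers):
    """Collect the distinct top-level categories, in order of first appearance."""
    seen = []
    for classifier in classifiers:
        category = split_namespaces(classifier)[0]
        if category not in seen:
            seen.append(category)
    return seen


print(split_namespaces(CLASSIFIERS[2]))
print(top_level_categories(CLASSIFIERS))
```

A list like CLASSIFIERS is what would be passed as classifiers=CLASSIFIERS in the setup() call.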

Popular patterns

Creating a distribution package can be a tedious task for inexperienced developers. Most of the metadata that setuptools or distutils accept in the setup() call can be provided manually, ignoring the fact that this metadata may also be available in other parts of the project. Here is an example:

Several metadata elements are often found in different places in a typical Python project. For example, the content of the long description is commonly included in the project's README file, and it is a good convention to place a version specifier in the __init__ module of the package. Hard-coding such metadata as setup() arguments adds redundancy to the project, which invites mistakes and inconsistencies later. Neither setuptools nor distutils can automatically pick up metadata from the project sources, so you need to provide it yourself. The Python community has some common patterns for solving the most frequent problems, such as dependency management and the inclusion of the version or README. It is worth knowing at least a few of them, because they are so widespread that they can be considered packaging idioms.
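One way to avoid that redundancy is sketched below: the version is parsed out of the package's __init__ module and the long description is read from the README. The helper names, the mypackage layout, and the README.rst file name are all hypothetical, and the demonstration runs on a throwaway directory rather than a real project:

```python
import re
import tempfile
from pathlib import Path


def read_version(init_path):
    """Extract __version__ from a module's source without importing it."""
    source = Path(init_path).read_text(encoding="utf-8")
    match = re.search(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", source, re.MULTILINE)
    if match is None:
        raise RuntimeError(f"no __version__ found in {init_path}")
    return match.group(1)


def read_long_description(readme_path):
    """Use the README contents as the long description."""
    return Path(readme_path).read_text(encoding="utf-8")


# Demonstration on a throwaway project layout (all names are hypothetical).
with tempfile.TemporaryDirectory() as root:
    package_dir = Path(root) / "mypackage"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text('__version__ = "0.1.0"\n', encoding="utf-8")
    (Path(root) / "README.rst").write_text("My package\n==========\n", encoding="utf-8")

    metadata = {
        "name": "mypackage",
        "version": read_version(package_dir / "__init__.py"),
        "long_description": read_long_description(Path(root) / "README.rst"),
    }

print(metadata["version"])
```

In a real setup.py, the resulting values would be passed to setup() as the version and long_description arguments, so each piece of metadata lives in exactly one place.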


Source : Viblo