Packaging a python library

25 May 2014 (updated 30 September 2019)


This is about packaging libraries, not applications.


All the advice here is implemented in a project template (with full support for C extensions): cookiecutter-pylibrary (introduction).

I think the packaging best practices should be revisited; there are lots of good tools nowadays that are either unused or underused. It's generally a good thing to re-evaluate best practices all the time.

I assume here that your package is to be tested on multiple Python versions, with different combinations of dependency versions, settings etc.

And a few principles that I like to follow when packaging:

  • If there's a tool that can help with testing, use it. Don't waste time building a custom test runner if you can just use py.test or nose. They come with a large ecosystem of plugins that can improve your testing.
  • When possible, prevent issues early. This is mostly a matter of strictness and exhaustive testing. Design things to prevent common mistakes.
  • Collect all the coverage data. Record it. Identify regressions.
  • Test all the possible configurations.

The structure *

This is fairly important, everything revolves around this. I prefer this sort of layout:

├─ src
│  └─ packagename
│     ├─ __init__.py
│     └─ ...
├─ tests
│  └─ ...

The src directory is a better approach because:

  • You get import parity. The current directory is implicitly included in sys.path; but not so when installing & importing from site-packages. Users will never have the same current working directory as you do.

    This constraint has beneficial implications in both testing and packaging:

    • You will be forced to test the installed code (e.g.: by installing in a virtualenv). This will ensure that the deployed code works (it's packaged correctly) - otherwise your tests will fail. Early. Before you can publish a broken distribution.
    • You will be forced to install the distribution. If you ever uploaded a distribution on PyPI with missing modules or broken dependencies it's because you didn't test the installation. Just being able to successfully build the sdist doesn't guarantee it will actually install!
  • It prevents you from readily importing your code in the setup.py script. This is a bad practice because it will always blow up if importing the main package or module triggers additional imports for dependencies (which may not be available [5]). Best to not make it possible in the first place.

  • Simpler packaging code and manifest. It makes manifests very simple to write (e.g.: you package a Django app that has templates or static files). Also, zero fuss for large libraries that have multiple packages. Clear separation of code being packaged and code doing the packaging.

    Without src, writing a MANIFEST.in is tricky [6]. If your manifest is broken, your tests will fail. It's much easier with a src directory: just add graft src in your MANIFEST.in.

    Publishing a broken package to PyPI is not fun.
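    For illustration, a manifest for a src-layout project can be this small (the file names are assumptions, not prescriptions):

```ini
# MANIFEST.in -- sketch for a src-layout project
graft src
graft tests

include README.rst
include CHANGELOG.rst

# keep compiled junk out of the sdist
global-exclude *.py[cod]
```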

  • Without src you get messy editable installs ("setup.py develop" or "pip install -e"). Having no separation (no src dir) will force setuptools to put your project's root on sys.path - with all the junk in it (e.g.: setup.py and other test or configuration scripts will unwittingly become importable).

  • There are better tools. You don't need to deal with installing packages just to run the tests anymore. Just use tox - it will install the package for you [2] automatically, zero fuss, zero friction.

  • Less chance for user mistakes - they will happen - assume nothing!

  • Less chance for tools to mix up code with non-code.

Another way to put it, flat is better than nested [*] - but not for data. A file-system is just data after all - and cohesive, well normalized data structures are desirable.
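The import-parity point is easy to demonstrate. A quick shell sketch (all paths and names here are made up for illustration) showing that, with a flat layout, the in-tree package is importable even though it was never installed:

```shell
# Create a flat-layout project with the package in the project root.
mkdir -p /tmp/flat-demo/mypkg
touch /tmp/flat-demo/mypkg/__init__.py
cd /tmp/flat-demo

# This "works" because the current directory is on sys.path -- but it
# exercises the in-tree copy, not the code users get from site-packages.
python3 -c "import mypkg; print(mypkg.__file__)"
```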

You'll notice that I don't include the tests in the installed packages. Because:

  • Module discovery tools will trip over your test modules. Strange things usually happen in test modules. The help builtin does module discovery. E.g.:

    >>> help('modules')
    Please wait a moment while I gather a list of all available modules...
    __future__          antigravity         html                select
  • Tests usually require additional dependencies to run, so they aren't useful on their own - you can't run them directly.

  • Tests are concerned with development, not usage.

  • It's extremely unlikely that the user of the library will run the tests; that's the library developer's job. E.g.: you don't run the tests for Django while testing your apps - Django is already tested.

Alternatives *

You could use src-less layouts, a few examples:

Tests in package:

├─ packagename
│  ├─ __init__.py
│  ├─ ...
│  └─ tests
│     └─ ...

Tests outside package:

├─ packagename
│  ├─ __init__.py
│  └─ ...
├─ tests
│  └─ ...

These two layouts became popular because packaging had many problems a few years ago, so it wasn't feasible to install the package just to test it. People still recommend them [4] even though they're based on outdated assumptions.

Most projects use them incorrectly, as all the test runners except Twisted's trial have incorrect defaults for the current working directory - you're going to test the wrong code if you don't test the installed code. trial does the right thing by changing the working directory to something temporary, but most projects don't use trial.

The setup script *

Unfortunately with the current packaging tools, there are many pitfalls. The setup.py script should be as simple as possible:

#!/usr/bin/env python
# -*- encoding: utf-8 -*-
from __future__ import absolute_import
from __future__ import print_function

import io
import re
from glob import glob
from os.path import basename
from os.path import dirname
from os.path import join
from os.path import splitext

from setuptools import find_packages
from setuptools import setup

def read(*names, **kwargs):
    with io.open(
        join(dirname(__file__), *names),
        encoding=kwargs.get('encoding', 'utf8')
    ) as fh:
        return fh.read()


setup(
    name='nameless',
    version='1.0.0',
    license='BSD-2-Clause',
    description='An example package. Generated with cookiecutter-pylibrary.',
    long_description='%s\n%s' % (
        re.compile('^.. start-badges.*^.. end-badges', re.M | re.S).sub('', read('README.rst')),
        re.sub(':[a-z]+:`~?(.*?)`', r'``\1``', read('CHANGELOG.rst'))
    ),
    author='Ionel Cristian Mărieș',
    packages=find_packages('src'),
    package_dir={'': 'src'},
    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    include_package_data=True,
    zip_safe=False,
    classifiers=[
        # complete classifier list:
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: BSD License',
        'Operating System :: Unix',
        'Operating System :: POSIX',
        'Operating System :: Microsoft :: Windows',
        'Programming Language :: Python',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: Implementation :: CPython',
        'Programming Language :: Python :: Implementation :: PyPy',
        # uncomment if you test on these interpreters:
        # 'Programming Language :: Python :: Implementation :: IronPython',
        # 'Programming Language :: Python :: Implementation :: Jython',
        # 'Programming Language :: Python :: Implementation :: Stackless',
        'Topic :: Utilities',
    ],
    project_urls={
        'Changelog': '',
        'Issue Tracker': '',
    },
    keywords=[
        # eg: 'keyword1', 'keyword2', 'keyword3',
    ],
    python_requires='>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*',
    install_requires=[
        # eg: 'aspectlib==1.1.1', 'six>=1.7',
    ],
    extras_require={
        # eg:
        #   'rst': ['docutils>=0.11'],
        #   ':python_version=="2.6"': ['argparse'],
    },
    entry_points={
        'console_scripts': [
            'nameless = nameless.cli:main',
        ]
    },
)
What's special about this:

  • No exec or import trickery.
  • Includes everything from src: packages or root-level modules.
  • Explicit encodings.
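As a side note, the no-exec/no-import rule still leaves you a single source of truth for the version: a plain regex over the file contents. A hedged sketch of that technique (the `find_version` helper name and the paths are illustrative, not part of the setup script above):

```python
import io
import re


def read(path):
    # Read a file with an explicit encoding - never rely on the locale.
    with io.open(path, encoding='utf8') as fh:
        return fh.read()


def find_version(path):
    # Extract __version__ = '...' from a module without importing it,
    # so setup.py never triggers the package's own imports.
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]*)['\"]", read(path), re.M)
    if not match:
        raise RuntimeError('Unable to find version string.')
    return match.group(1)
```

With something like this, `version=find_version('src/nameless/__init__.py')` works even when the package's dependencies aren't installed yet.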

Running the tests *

Again, it seems people fancy the idea of running `python setup.py test` to run the package's tests. I think that's not worth doing - the test command is a failed experiment to replicate some of CPAN's test system. Python doesn't have a common test result protocol, so it serves no purpose to have a common test command [1]. At least not for now - we'd need someone to build specifications and services that make this worthwhile, and champion them. I think it's important in general to recognize failure and go back to the drawing board when that's necessary - there are absolutely no services or tools that use the test command in a way that brings added value. Something is definitely wrong here.

I believe it's too late now for PyPI to do anything about it; Travis is already a solid, reliable, extremely flexible and free alternative. It integrates very well with GitHub - builds will be run automatically for each Pull Request.

To test locally, tox is a very good way to run all the possible testing configurations (each configuration will be a tox environment). I like to organize the tests into a matrix with these additional environments:

  • check - check package metadata (e.g.: if the restructured text in your long description is valid)
  • clean - clean coverage
  • report - make coverage report for all the accumulated data
  • docs - build sphinx docs

I also like to have environments with and without coverage measurement, and to run them all the time. Race conditions are usually timing-sensitive, and you're unlikely to catch them if everything runs slowly under coverage measurement.

The test matrix *

Depending on dependencies, you'll usually end up with a huge number of combinations of Python versions, dependency versions and different settings. Generally people just hard-code everything in tox.ini or only in .travis.yml. They end up with incomplete local tests, or test configurations that run serially in Travis. I've tried that, didn't like it. I've tried duplicating the environments in both tox.ini and .travis.yml. Still didn't like it.


This technique is a bit outdated now. It still works fine, but for simple matrices you can use a tox generative envlist (it was implemented after I wrote this blog post, unfortunately).


See python-nameless for an example using that.
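For reference, a generative envlist for a hypothetical Python/Django matrix might look like this (the version numbers are illustrative):

```ini
[tox]
envlist =
    py{27,35,36,37}-django{111,20}-{cover,nocov},
    report
```

tox expands the braces into the cross product of environments: py27-django111-cover, py27-django111-nocov, py27-django20-cover, and so on.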

As there were no readily usable alternatives to generate the configuration, I've implemented a generator script that uses templates to generate tox.ini and .travis.yml. This is way better, it's DRY, you can easily skip running tests on specific configurations (e.g.: skip Django 1.4 on Python 3) and there's less work to change things.

The essentials (full code):

setup.cfg *

The generator script uses a configuration file (setup.cfg for convenience):


[tool:pytest]
python_files =
    test_*.py
addopts =
testpaths =
    tests

[isort]
force_single_line = True
line_length = 120
known_first_party = nameless
default_section = THIRDPARTY
forced_separate = test_nameless
skip = .tox,.eggs,ci/templates,build,dist

[matrix]
# This is the configuration for the `./ci/bootstrap.py` script.
# It generates `.travis.yml`, `tox.ini` and `.appveyor.yml`.
# Syntax: [alias:] value [!variable[glob]] [&variable[glob]]
# alias:
#  - is used to generate the tox environment
#  - it's optional
#  - if not present the alias will be computed from the `value`
# value:
#  - a value of "-" means empty
# !variable[glob]:
#  - exclude the combination of the current `value` with
#    any value matching the `glob` in `variable`
#  - can use as many you want
# &variable[glob]:
#  - only include the combination of the current `value`
#    when there's a value matching `glob` in `variable`
#  - can use as many you want

python_versions =
    py27
    py35
    py36
    py37
    py38
    py39
    pypy
    pypy3

dependencies =
#    1.4: Django==1.4.16 !python_versions[py3*]
#    1.5: Django==1.5.11
#    1.6: Django==1.6.8
#    1.7: Django==1.7.1 !python_versions[py26]
# Deps commented above are provided as examples. That's what you would use in a Django project.

coverage_flags =
    cover: true
    nocov: false
environment_variables =
    -

ci/bootstrap.py *

This is the generator script. You run this whenever you want to regenerate the configuration:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import absolute_import
from __future__ import print_function
from __future__ import unicode_literals

import os
import subprocess
import sys
from os.path import abspath
from os.path import dirname
from os.path import exists
from os.path import join

base_path = dirname(dirname(abspath(__file__)))

def check_call(args):
    print("+", *args)
    subprocess.check_call(args)


def exec_in_env():
    env_path = join(base_path, ".tox", "bootstrap")
    if sys.platform == "win32":
        bin_path = join(env_path, "Scripts")
    else:
        bin_path = join(env_path, "bin")
    if not exists(env_path):
        print("Making bootstrap env in: {0} ...".format(env_path))
        try:
            check_call([sys.executable, "-m", "venv", env_path])
        except subprocess.CalledProcessError:
            try:
                check_call([sys.executable, "-m", "virtualenv", env_path])
            except subprocess.CalledProcessError:
                check_call(["virtualenv", env_path])
        print("Installing `jinja2` into bootstrap environment...")
        check_call([join(bin_path, "pip"), "install", "jinja2", "tox", "matrix"])
    python_executable = join(bin_path, "python")
    if not os.path.exists(python_executable):
        python_executable += '.exe'

    print("Re-executing with: {0}".format(python_executable))
    print("+ exec", python_executable, __file__, "--no-env")
    os.execv(python_executable, [python_executable, __file__, "--no-env"])


def main():
    import jinja2
    import matrix

    print("Project path: {0}".format(base_path))

    jinja = jinja2.Environment(
        loader=jinja2.FileSystemLoader(join(base_path, "ci", "templates")),
        trim_blocks=True,
        lstrip_blocks=True,
        keep_trailing_newline=True,
    )

    tox_environments = {}
    for (alias, conf) in matrix.from_file(join(base_path, "setup.cfg")).items():
        deps = conf["dependencies"]
        tox_environments[alias] = {
            "deps": deps.split(),
        }
        if "coverage_flags" in conf:
            cover = {"false": False, "true": True}[conf["coverage_flags"].lower()]
            tox_environments[alias].update(cover=cover)
        if "environment_variables" in conf:
            env_vars = conf["environment_variables"]
            tox_environments[alias].update(env_vars=env_vars.split())

    for name in os.listdir(join("ci", "templates")):
        with open(join(base_path, name), "w") as fh:
            fh.write(jinja.get_template(name).render(tox_environments=tox_environments))
        print("Wrote {}".format(name))


if __name__ == "__main__":
    args = sys.argv[1:]
    if args == ["--no-env"]:
        main()
    elif not args:
        exec_in_env()
    else:
        print("Unexpected arguments {0}".format(args), file=sys.stderr)
        sys.exit(1)

ci/templates/.travis.yml *

This has some goodies in it, like the very useful LD_PRELOAD trick.

It basically just runs tox.

language: python
dist: xenial
virt: lxd
cache: false
env:
  global:
    - LD_PRELOAD=/lib/x86_64-linux-gnu/
    - LANG=en_US.UTF-8
matrix:
  include:
    - python: '3.6'
      env:
        - TOXENV=check
{%- for env, config in tox_environments|dictsort %}{{ '' }}
    - env:
        - TOXENV={{ env }}{% if config.cover %},codecov,extension-coveralls,coveralls{% endif %}
{%- if env.startswith('pypy3') %}{{ '' }}
        - TOXPYTHON=pypy3
      python: 'pypy3'
{%- elif env.startswith('pypy') %}{{ '' }}
      python: 'pypy'
{%- else %}{{ '' }}
      python: '{{ '{0[2]}.{0[3]}'.format(env) }}'
{%- endif %}{{ '' }}
{%- endfor %}{{ '' }}
before_install:
  - python --version
  - uname -a
  - lsb_release -a || true
install:
  - python -mpip install --progress-bar=off tox -rci/requirements.txt
  - virtualenv --version
  - easy_install --version
  - pip --version
  - tox --version
script:
  - tox -v
after_failure:
  - cat .tox/log/*
  - cat .tox/*/log/*
notifications:
  email:
    on_success: never
    on_failure: always

ci/templates/tox.ini *

[tox]
envlist =
    clean,
    check,
{% for env in tox_environments|sort %}
    {{ env }},
{% endfor %}
    report

[testenv]
basepython =
    {bootstrap,clean,check,report,codecov,coveralls,extension-coveralls}: {env:TOXPYTHON:python3}
setenv =
passenv =
deps =
    pytest
commands =
    python setup.py clean --all build_ext --force --inplace
    {posargs:pytest -vv --ignore=src}

[testenv:bootstrap]
deps =
    jinja2
    matrix
skip_install = true
commands =
    python ci/bootstrap.py --no-env

[testenv:check]
deps =
    docutils
    check-manifest
    isort
skip_install = true
commands =
    python setup.py check --strict --metadata --restructuredtext
    check-manifest {toxinidir}
    isort --verbose --check-only --diff --filter-files .

[testenv:coveralls]
deps =
    coveralls
skip_install = true
commands =
    coveralls --merge=extension-coveralls.json []

[testenv:extension-coveralls]
deps =
    cpp-coveralls
skip_install = true
commands =
    coveralls --build-root=. --include=src --dump=extension-coveralls.json []

[testenv:codecov]
deps =
    codecov
skip_install = true
commands =
    codecov --gcov-root=. []

[testenv:report]
deps = coverage
skip_install = true
commands =
    coverage report
    coverage html

[testenv:clean]
commands = coverage erase
skip_install = true
deps = coverage
{% for env, config in tox_environments|dictsort %}

[testenv:{{ env }}]
basepython = {env:TOXPYTHON:{{ env.split("-")[0] if env.startswith("pypy") else "python{0[2]}.{0[3]}".format(env) }}}
{% if config.cover or config.env_vars %}
setenv =
{% endif %}
{% for var in config.env_vars %}
    {{ var }}
{% endfor %}
{% if config.cover %}
usedevelop = true
commands =
    python clean --all build_ext --force --inplace
    {posargs:pytest --cov --cov-report=term-missing -vv}
{% endif %}
{% if config.cover or config.deps %}
deps =
{% endif %}
{% if config.cover %}
    pytest-cov
{% endif %}
{% for dep in config.deps %}
    {{ dep }}
{% endfor -%}
{% endfor -%}

ci/templates/.appveyor.yml *

For Windows-friendly projects:

version: '{branch}-{build}'
build: off
environment:
  matrix:
    - TOXENV: check
      TOXPYTHON: C:\Python36\python.exe
      PYTHON_HOME: C:\Python36
      PYTHON_VERSION: '3.6'
      PYTHON_ARCH: '32'
{% for env, config in tox_environments|dictsort %}
{% if env.startswith(('py2', 'py3')) %}
    - TOXENV: {{ env }}{% if config.cover %},codecov,coveralls{% endif %}{{ "" }}
      TOXPYTHON: C:\Python{{ env[2:4] }}\python.exe
      PYTHON_HOME: C:\Python{{ env[2:4] }}
      PYTHON_VERSION: '{{ env[2] }}.{{ env[3] }}'
      PYTHON_ARCH: '32'
{% if 'nocov' in env %}
      WHEEL_PATH: .tox/dist
{% endif %}
    - TOXENV: {{ env }}{% if config.cover %},codecov,coveralls{% endif %}{{ "" }}
      TOXPYTHON: C:\Python{{ env[2:4] }}-x64\python.exe
      PYTHON_HOME: C:\Python{{ env[2:4] }}-x64
      PYTHON_VERSION: '{{ env[2] }}.{{ env[3] }}'
      PYTHON_ARCH: '64'
{% if 'nocov' in env %}
      WHEEL_PATH: .tox/dist
{% endif %}
{% if env.startswith('py2') %}
{% endif %}
{% endif %}{% endfor %}
init:
  - ps: echo $env:TOXENV
  - ps: ls C:\Python*
install:
  - '%PYTHON_HOME%\python -mpip install --progress-bar=off tox -rci/requirements.txt'
  - '%PYTHON_HOME%\Scripts\virtualenv --version'
  - '%PYTHON_HOME%\Scripts\easy_install --version'
  - '%PYTHON_HOME%\Scripts\pip --version'
  - '%PYTHON_HOME%\Scripts\tox --version'
test_script:
  - cmd /E:ON /V:ON /C .\ci\appveyor-with-compiler.cmd %PYTHON_HOME%\Scripts\tox
on_failure:
  - ps: dir "env:"
  - ps: get-content .tox\*\log\*

### To enable remote debugging uncomment this:
# on_finish:
#   - ps: $blockRdp = $true; iex ((new-object net.webclient).DownloadString(''))

If you've been patient enough to read through that you'll notice:

  • The Travis configuration uses tox for each item in the matrix. This makes testing in Travis consistent with testing locally.
  • The environment order for tox is clean, check, 2.6-1.3, 2.6-1.4, ..., report.
  • The environments with coverage measurement run the code without installing (usedevelop = true) so that coverage can combine all the measurements at the end.
  • The environments without coverage will sdist and install into virtualenv (tox's default behavior [2]) so that packaging issues are caught early.
  • The report environment combines all the runs at the end into a single report.

Having the complete list of environments in tox.ini is a huge advantage:

  • You can run everything in parallel locally (if your tests don't need strict isolation) with detox. And you can still run everything in parallel if you want to use something other than Travis.
  • You can measure cumulative coverage for everything (merge the coverage measurements from all the environments into a single report) locally.
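Merging measurements across environments is mostly a matter of coverage configuration. A sketch of a .coveragerc that supports it (the package name and paths are illustrative):

```ini
[run]
# Each environment writes its own .coverage.* data file.
parallel = true
source = nameless

[paths]
# Map installed copies (site-packages) back onto src so `coverage combine`
# can merge runs from different virtualenvs into one report.
source =
    src
    */site-packages
```

After the test environments finish, `coverage combine` followed by `coverage report` produces the merged report.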

Test coverage *

There's Coveralls - a nice way to track coverage over time and over multiple builds. It will automatically add comments on GitHub Pull Requests about changes in coverage.

tl;dr *


  • Put code in src.
  • Use tox and detox.
  • Test both with coverage measurements and without.
  • Use a generator script for tox.ini and .travis.yml.
  • Run the tests in Travis with tox to keep things consistent with local testing.

Too complicated? Just use a python package template.

Not convincing enough? Read Hynek's post about the src layout.

Also worth checking out this short list of packaging pitfalls.

[1] There's subunit and probably others, but they are not widely used.
[2](1, 2) See example.
[3] There is a feature specification/proposal in tox for multi-dimensional configuration, but it still doesn't solve the problem of generating the .travis.yml file. There's also tox-matrix but it's not flexible enough.
[4] cookiecutter-pypackage is acceptable at the surface level (tests outside, correct MANIFEST) but still has the core problem (lack of src separation) and gives the wrong idea to glancing users.

[5] It's a chicken-and-egg problem: how can pip know what dependencies to install if running the setup.py script requires unknowable dependencies?

There are so many weird corners you can get into by having the power to run arbitrary code in the setup.py script. This is why people tried to move to pure metadata.

[6] Did you know the order of the rules in MANIFEST.in matters?
[*]PEP-20's 5th aphorism: Flat is better than nested.

This entry was tagged as django packaging python src testing