Python packaging pitfalls

Wed 25 June 2014

Note

Use cookiecutter-pylibrary to avoid all the pain.

Just a short list of packaging blunders ...



Forgetting to clean the build dir
why:

distutils and setuptools reuse what is already in the build dir and skip steps they consider done, even if you have changed options in your setup.py.

fix:

Add rm -rf build/ to your build pipeline.

Better (clean everything): rm -rf dist build */*.egg-info *.egg-info
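
If the pipeline is driven from Python rather than a shell, the same cleanup can be sketched with the standard library. This is a minimal sketch, not part of any packaging tool: the clean() helper and CLEAN_PATTERNS names are made up for illustration, and the patterns simply mirror the rm -rf command above.

```python
import glob
import os
import shutil

# Same targets as: rm -rf dist build */*.egg-info *.egg-info
CLEAN_PATTERNS = ["build", "dist", "*.egg-info", "*/*.egg-info"]


def clean(root="."):
    """Remove build leftovers under root and return the removed paths."""
    removed = []
    for pattern in CLEAN_PATTERNS:
        for path in glob.glob(os.path.join(root, pattern)):
            if os.path.isdir(path):
                shutil.rmtree(path)
            else:
                # An .egg-info can also be a plain file.
                os.remove(path)
            removed.append(path)
    return removed
```

Calling clean() before every build gives the same guarantee as the shell one-liner: nothing from a previous run is left to be picked up.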

Forgetting to specify package data
why:

Distutils and Setuptools don't include your non-Python files by default.

fix:

Create a MANIFEST.in and use include_package_data=True. Don't use package_data as it will override your manifest, and it's a less flexible approach anyway. [4]

Make sure your data files are inside packages (they are called package data after all). An example of how to use data files.
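
A rough sketch of how the two pieces fit together, assuming a src/ layout. The project name, version and the MANIFEST.in rules are placeholders, not taken from a real project. In a real setup.py you would call setup() at module level; it is wrapped in a hypothetical main() here only so the snippet can be loaded without triggering a build.

```python
# MANIFEST.in lives next to setup.py. Assumed contents, shown as comments:
#
#   graft src
#   global-exclude *.py[cod]

SETUP_KWARGS = dict(
    name="example-package",     # placeholder name
    version="0.1.0",            # placeholder version
    package_dir={"": "src"},    # packages live under src/
    include_package_data=True,  # ship the data files MANIFEST.in matches
)


def main():
    # Imported lazily only to keep this sketch importable anywhere.
    from setuptools import find_packages, setup

    setup(packages=find_packages("src"), **SETUP_KWARGS)
```

Note that include_package_data only picks up files that sit inside a package directory, which is why the data files have to live there.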

Fine grained MANIFEST.in

Listing a few file types in MANIFEST.in [4], then adding some webfonts or templates, only to find out the release you published on PyPI doesn't include them.

why: You duplicated information you already have in the filesystem [1].
fix: Just recursive-include or graft the whole dir. Separate sources from build and temporary files (e.g. use different directories).
Using package_data, or worse: fine grained package_data

Listing a few file types in package_data [6], then adding some webfonts, only to find out the release you published on PyPI doesn't include them.

why: You duplicated information you already have in the filesystem [1]. You have better options.
fix: Use MANIFEST.in and include_package_data=True. See "Forgetting to specify package data" above.
Listing excludes/prunes before includes/grafts

Your excludes/prunes get overridden if includes/grafts that match the same files come later.

why: Last rule wins [4].
fix: Use correct rule ordering: put the excludes/prunes last in MANIFEST.in so that nothing re-adds the files afterwards.
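
To see why ordering matters, here is a toy model of the "last rule wins" behaviour. It is not how distutils implements MANIFEST.in. It only handles graft and prune, and the apply_rules() helper and the file names are invented for the illustration.

```python
def _under(path, directory):
    """True if path is located somewhere below directory."""
    return path.startswith(directory.rstrip("/") + "/")


def apply_rules(files, rules):
    """Apply (command, directory) rules in order, like a tiny MANIFEST.in."""
    selected = []
    for command, directory in rules:
        if command == "graft":
            # graft adds every file under the directory
            selected.extend(
                f for f in files if _under(f, directory) and f not in selected
            )
        elif command == "prune":
            # prune drops every already-selected file under the directory
            selected = [f for f in selected if not _under(f, directory)]
        else:
            raise ValueError("unsupported command: %r" % command)
    return selected


FILES = ["docs/index.rst", "docs/_build/index.html"]
PRUNE_FIRST = [("prune", "docs/_build"), ("graft", "docs")]
GRAFT_FIRST = [("graft", "docs"), ("prune", "docs/_build")]
```

With PRUNE_FIRST the prune runs while nothing is selected yet, so the later graft pulls docs/_build/index.html back in. With GRAFT_FIRST only docs/index.rst survives.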
Hardcoding packages list in setup.py
why: You duplicated information you already have in the filesystem [1].
fix: Use setuptools.find_packages. Or, if you want everything to be static, have good testing [3].
Hardcoding py_modules list in setup.py
why:

You duplicated information you already have in the filesystem [1].

fix:

Discover the modules, e.g.:

py_modules=[splitext(basename(i))[0] for i in glob.glob("src/*.py")]

Or, if you want everything to be static, have good testing [3].
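
The discovery expression above relies on a few imports that are easy to forget. A self-contained sketch follows; the discover_py_modules() wrapper and the default "src" directory are illustrative choices, not a setuptools API.

```python
import glob
from os.path import basename, join, splitext


def discover_py_modules(src_dir="src"):
    """Return the names of the top-level modules found in src_dir."""
    return sorted(
        splitext(basename(path))[0]
        for path in glob.glob(join(src_dir, "*.py"))
    )
```

In setup.py this would be passed as py_modules=discover_py_modules("src"), together with package_dir={"": "src"}.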

Importing your package in setup.py
why: It's risky. If your package imports dependencies they might not be available and your package becomes uninstallable. pip/easy_install might need to run your setup.py to discover dependencies. [2]
fix: If you need to extract the version then read the file instead and parse out the version. [5]
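
One common way to do that is a small regular expression over the file contents. The sketch below assumes the version is stored as a plain string assignment such as __version__ = "1.2.3"; the read_version() name is just an example.

```python
import io
import re


def read_version(path):
    """Parse __version__ out of a source file without importing it."""
    with io.open(path, encoding="utf-8") as fh:
        source = fh.read()
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", source, re.MULTILINE
    )
    if match is None:
        raise RuntimeError("Unable to find __version__ in %s" % path)
    return match.group(1)
```

setup.py can then use version=read_version("src/mypackage/__init__.py") (the path is hypothetical) without ever importing the package or its dependencies.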
Importing unavailable tools in setup.py
why: They might not be installed at the time setup.py is run.
fix: Use setup_requires and delay the imports: import inside the methods of your custom command class.
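
A sketch of the delayed import. To keep it free of third-party imports the class below has no base class. In a real setup.py it would subclass setuptools.Command and be registered through cmdclass. The hypothetical_doc_tool module and its build() function are made up and stand in for whatever build-time tool you need.

```python
class BuildDocs(object):
    """Custom command sketch; would subclass setuptools.Command in practice."""

    description = "build the documentation"
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        # Imported here, not at the top of setup.py, so that merely
        # loading setup.py never requires the tool to be installed.
        import hypothetical_doc_tool  # made-up module name

        hypothetical_doc_tool.build("docs")
```

Defining the class does not touch the tool at all. The import, and any ImportError, happens only when someone actually runs the command.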
Messing with the environment

Example: the infamous from distribute_setup import use_setuptools; use_setuptools() pattern expected superuser privileges in order to upgrade setuptools.

why: Users won't always have the same environment as you. If your setup.py tries to modify theirs the way you modify your own, horrible failures will occur.
fix: Just import setuptools. Modern Python installations, pip and virtualenv already provide setuptools. Document your requirements.
Your tests do not test the installed code

Example: the source packages are in your current working directory.

why:

Because the current working directory is often implicitly on sys.path (for instance when you run python -m or an interactive interpreter), your package will be imported from the current directory, not from whatever was installed in site-packages.

This is a problem because the installed package can be incomplete and you won't know it.

fix:

Change the current working directory when you test, or move your code into a directory that's not a package (doesn't have any __init__.py), like a src directory. [3]
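
A quick way to check which copy your tests would pick up is to ask the import system where it would load the package from. This is a minimal sketch using only the standard library; the imported_from_directory() helper is a made-up name.

```python
import importlib.util
import os


def imported_from_directory(module_name, directory=None):
    """Return True if module_name would be loaded from under directory.

    directory defaults to the current working directory.
    """
    directory = os.path.realpath(directory or os.getcwd())
    spec = importlib.util.find_spec(module_name)
    # Built-in, frozen and namespace modules have no file location.
    if spec is None or not spec.has_location:
        return False
    origin = os.path.realpath(spec.origin)
    return origin.startswith(directory + os.sep)
```

A test run could start with assert not imported_from_directory("mypackage") (the package name is a placeholder) to fail fast when the source checkout shadows the installed copy.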


[1](1, 2, 3, 4) It ain't gonna update itself and you're going to forget to. Happens to the best.
[2]Except when using wheels or eggs. But you should always upload the sdist (hard to make binary available for every imaginable platform) - which already relies on running setup.py.
[3](1, 2, 3) Test installed code. How else would you know the code users will run actually works? More about testing and file layout.
[4](1, 2, 3) https://docs.python.org/2/distutils/sourcedist.html#the-manifest-in-template
[5]https://packaging.python.org/en/latest/tutorial.html#version
[6]https://docs.python.org/2/distutils/setupscript.html#distutils-installing-package-data

Surely there are other pitfalls too - that's just the silly stuff I've fallen into in the past. Share your pain below :)
