
The "Python Distribute" guide tells me to list doc/txt files in MANIFEST.in, and says that .py files do not go in it.

The sourcedist documentation tells me that only sdist uses MANIFEST.in, that it only includes the files you specify, and that I should include .py files. It also tells me to use python setup.py sdist --manifest-only to generate a MANIFEST, but Python tells me this option doesn't exist.

I appreciate these are from different versions of Python and the distribution system is in a complete mess, but assuming I am using Python 3 and setuptools (the new one that includes distribute but is now called setuptools, not the old setuptools that was deprecated in favour of distribute, only to be brought back into distribute and distribute renamed to setuptools...)

and I'm following the 'standard' folder structure and setup.py file,

  1. Do I need a MANIFEST.in?
  2. What should be in it?
  3. When will all these different package systems and methods be made into one single simple process?
Melebius
Neil Walker

2 Answers


Re: "Do I need a MANIFEST.in?"

No, you do not have to use MANIFEST.in. Both distutils and setuptools include in the source distribution package all the files mentioned in setup.py: modules, package Python files, README.txt and test/test*.py. If this is all you want to have in the distribution package, you do not have to use MANIFEST.in.

If you want to add to or remove from the default set of included files, you have to use MANIFEST.in.

Re: What should be in it?

The procedure is simple:

  1. Make sure that your setup.py includes (by means of setup arguments) all the files that are important for the program to run (modules, packages, scripts ...).

  2. Decide whether there are any files to add or to exclude. If neither is needed, there is no need for MANIFEST.in.

  3. If MANIFEST.in is needed, create it. Typically you add tests*/*.py files, README.rst if you do not use README.txt, docs files and possibly some data files for the test suite, if necessary.

For example:

include README.rst
include COPYING.txt

To test it, run python setup.py sdist, and examine the tarball created under dist/.
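
Examining the tarball can also be scripted. The following is only a minimal sketch using the standard tarfile module; the function name list_sdist_files and the path in the usage note are made up for illustration, and it assumes the sdist is a gzip-compressed tarball with a single top-level directory:

```python
import tarfile


def list_sdist_files(sdist_path):
    """Return the sorted file paths inside an sdist tarball.

    Assumes a .tar.gz archive whose entries all sit under one
    top-level "<name>-<version>/" directory; that prefix is dropped.
    """
    with tarfile.open(sdist_path, "r:gz") as archive:
        names = [m.name for m in archive.getmembers() if m.isfile()]
    # Strip the leading "<name>-<version>/" component from each path.
    return sorted(name.split("/", 1)[1] for name in names if "/" in name)
```

Calling list_sdist_files("dist/mypackage-0.1.tar.gz") (a hypothetical path) returns a list you can check for README.rst, COPYING.txt or whatever else you expected MANIFEST.in to pull in.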

When will all these different package systems ...

Comparing the situation today with two years ago, things are much, much better: setuptools is the way to go. You can ignore the fact that distutils is a bit broken and is the low-level base for setuptools, as setuptools takes care of hiding these things from you.

EDIT: For my last few projects I have used pbr to build distribution packages, with a three-line setup.py and the rest in setup.cfg and requirements.txt. No need to care about MANIFEST.in and other strange stuff, even though the package would deserve a bit more documentation. See http://docs.openstack.org/developer/pbr/

Flimm
Jan Vlcinsky
  • In my limited experience it seems that if you want to include files not inside of a python module (dir with __init__.py), you have to use MANIFEST.in and use the `sdist` (means: **source distribution**) command. If you consider that `bdist` and `bdist_wheel` are **binary** and only intended to be installed in your python path, this makes sense. (Where would these non-module files and directories go? In `/usr/local/lib/python2.7/dist-packages/`? Surely not.) But it's worth mentioning since it's confusing to see the archive created and then not include the files. – Bruno Bronosky Mar 17 '15 at 20:19
  • @BrunoBronosky First advice: do not care about distribution format (wheel, sdist, egg...), just make it work well with one (e.g. sdist). It will work well with the others. Where do the files go? Try installing `pytest` (using virtualenv named jingtrang) and check `/home/javl/Envs/jingtrang/lib/python2.7/site-packages/pytest-2.6.1-py2.7.egg-info/top_level` file and `/home/javl/Envs/jingtrang/lib/python2.7/site-packages/pytest` and `/home/javl/Envs/jingtrang/lib/python2.7/site-packages/_pytest` directories. And thanks for your corrections of my text, they improved it. – Jan Vlcinsky Mar 17 '15 at 22:53
  • To head off the inevitable `package_data` and `data_files` recommendations, which are out of scope, I'll continue. `package_data` lists files that get installed with your package into `dist-packages/yourpackage` which would have been skipped because they don't have a *.py name. `data_files` lists files that get installed outside of your package. Each entry specifies a target path that is prefixed with `sys.prefix` if it is relative or created directly (permissions permitting) if it begins with a `/`. – Bruno Bronosky Mar 18 '15 at 16:55
  • @JanVlcinsky it is important to know what is and [more importantly] **is not included** in different distribution formats. I have a public project that I only distribute via source distribution because I include a boto.sample.cfg file (which contains a fake AWS IAM credential) outside of the package (at the root) and the binary distributions will not include it. I make private binary builds for deploying to production that have data_files=[('/etc/', ['boto.cfg'])]. If you want to distribute non-py files, you have to know how these things work. – Bruno Bronosky Mar 18 '15 at 17:16
  • @BrunoBronosky Application configuration and package development have completely different lifecycles, so mixing installation of config files with package installation is simply an antipattern; package installation is only for the package, not for config data. Good options are e.g. some `my-quick-start` script installed by the app, which puts the config somewhere (but reads it from inside). For maintaining boto configs check [Recommended way to manage credentials with multiple AWS accounts](http://stackoverflow.com/a/21345540/346478). – Jan Vlcinsky Mar 18 '15 at 17:58
  • @MichaelGoerz Honestly, they shouldn't. This answer is ancient, and suggesting `pbr` is a bad idea, too. – Arne Dec 26 '18 at 22:37
  • @Arne I agree, things moved on. Currently I am converting most of my projects from pbr to [poetry](https://pypi.org/project/poetry/) – Jan Vlcinsky Dec 28 '18 at 00:57
  • @BrunoBronosky Thanks a lot, that was super useful. I highly suggest you post that as an updated answer to this old question, as most probably lots of people might not need MANIFEST.in at all, me included; I thought that was the only way to get files somewhere! Your important note on the distinction between data_files and package_data was really, really helpful. – Hossein Apr 20 '20 at 06:07
  • @JanVlcinsky Thanks for your answer, but I guess it'd be a good idea to reflect this in your answer with an edit. – Hossein Apr 20 '20 at 06:07

Old question, new answer:

No, you don't need MANIFEST.in. However, to get setuptools to do what you (usually) mean, you do need to use setuptools_scm, which takes the role of MANIFEST.in in 2 key places:

  • It ensures all relevant files are packaged when running the sdist command (where "all relevant files" is defined as "all files under source control").
  • It determines which files count as package data when using include_package_data to include package data as part of the build or bdist_wheel (again: files under source control).

The historical understanding of MANIFEST.in is: when you don't have a source control system, you need some other mechanism to distinguish between "source files" and "files that happen to be in your working directory". However, your project is under source control (right??) so there's no need for MANIFEST.in. More info in this article.
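
If you want to check the "all files under source control" claim against your own project, something like the following could help. It is a rough sketch, not a description of how setuptools_scm works internally: it assumes the project is a git repository with the git command-line tool available, and the function names tracked_files and missing_from_sdist are invented for this example.

```python
import subprocess


def tracked_files(repo_dir="."):
    """Return the set of paths git tracks in repo_dir (needs the git CLI)."""
    result = subprocess.run(
        ["git", "ls-files"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return set(result.stdout.splitlines())


def missing_from_sdist(tracked, sdist_files):
    """Return, sorted, the tracked paths that are absent from the sdist."""
    return sorted(set(tracked) - set(sdist_files))
```

Pass missing_from_sdist the result of tracked_files() and a list of the paths found inside your sdist (relative to the project root). An empty result means every tracked file made it into the archive.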

Klaas van Schelven