I'm faced with a Docker build process that's pretty slow, in part because of all the Python packages we're building and installing over and over. I'd very much like to speed it up.

I've downloaded the packages from PyPI so I can get a good look at them. I've also put them in a local pypiserver (two, actually), and confirmed I can install them from there.

The packages' extensions are:

87 .whl    ............................................................
23 .tar.gz ................
 2 .zip    .

I'm thinking some of those .tar.gz's (and .zip's? and source .whl's?) would be much faster to install if I converted them to manylinux wheels and put them in a local pypiserver instance with the same version number. In fact, one package tends to fail to compile at random, so the build process should be a little more reliable too if this works out.

Is there a (relatively?) straightforward process for doing such a thing? That is, to take a .tar.gz from PyPI (not an arbitrary .tar.gz, only a handful from PyPI) and convert it to a binary manylinux .whl?

For example, probably the most time-consuming package in our Docker build is pycapnp (https://pypi.org/project/pycapnp/). It's a .tar.gz, and it takes about 80 seconds to build and install on my Linux Mint 19.1 laptop.

Thanks!

Dustin Ingram
dstromberg

2 Answers

I believe it should be pretty straightforward with pip wheel.
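
As a rough illustration of that suggestion, here is a minimal sketch that runs `pip wheel` over a folder of downloaded source archives. The directory names `sdists` and `wheelhouse` and the helper names are assumptions made up for this example, not anything from the question. Note that wheels built this way are tagged for the machine they were built on (for example `linux_x86_64`), not as manylinux.

```python
import subprocess
import sys
from pathlib import Path


def pip_wheel_command(sdist, wheel_dir="wheelhouse"):
    """Return the argv that builds one wheel from a source archive."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-deps",                    # build only this package, not its dependencies
        "--wheel-dir", str(wheel_dir),  # where the resulting .whl is written
        str(sdist),
    ]


def build_wheels(sdist_dir="sdists", wheel_dir="wheelhouse"):
    """Run `pip wheel` for every .tar.gz and .zip found in sdist_dir (hypothetical layout)."""
    source_dir = Path(sdist_dir)
    archives = sorted(source_dir.glob("*.tar.gz")) + sorted(source_dir.glob("*.zip"))
    for archive in archives:
        # check=True stops at the first archive that fails to build
        subprocess.run(pip_wheel_command(archive, wheel_dir), check=True)


if __name__ == "__main__":
    build_wheels()
```

The resulting `.whl` files could then be copied into the pypiserver package directory in place of the source archives.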

Arne
sinoroc
  • Wow, that's nice. It's really close. The fly in the ointment is that I need something that'll work on both Debian-derivatives and Fedora-derivatives - so I think I need manylinux. The "pip wheel whatever.tar.gz" produces something that installs and imports on one, but installs and fails to import on the other. – dstromberg Nov 21 '19 at 00:20
  • Well this part doesn't seem to be straightforward. See a [short intro here](https://opensource.com/article/19/2/manylinux-python-wheels) and more technical details by following from the [pypa/manylinux repository](https://github.com/pypa/manylinux) and so on. But... Are there not wheels being built in your containers anyway? If yes, why not cache them and reuse them to cut the build time? – sinoroc Nov 21 '19 at 09:30

The way to do this is to download and extract the source distribution (.tar.gz) inside a compatible manylinux Docker image (which image to use depends on the architecture you want to target) and then build the wheel with python setup.py bdist_wheel.

There's a demo project at https://github.com/pypa/python-manylinux-demo which shows how to do this automatically with CI. You could probably adapt that, or even contribute it back to the projects which aren't publishing built distributions (wheels).
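
To make the steps above concrete, here is a minimal sketch that assembles a `docker run` command for building one source archive inside a manylinux image. It follows the general pattern of the demo project, so it uses `pip wheel` instead of calling `setup.py` directly, and then runs `auditwheel repair`, which is the step that produces a wheel with a manylinux tag. The image name, the interpreter path under `/opt/python`, the `sdists`/`wheelhouse` layout, and the function name are all assumptions for illustration; adjust them to the architecture and Python version you target.

```python
import shlex
import subprocess
from pathlib import Path

# Assumptions: x86_64 target, and this interpreter directory exists in the image.
# Listing /opt/python inside the image shows which interpreters are available.
IMAGE = "quay.io/pypa/manylinux2014_x86_64"
PYTHON_BIN = "/opt/python/cp38-cp38/bin"


def manylinux_build_command(sdist_name, project_dir=".", image=IMAGE, python_bin=PYTHON_BIN):
    """Return the argv for building and repairing one wheel inside the container.

    Expects the archive at <project_dir>/sdists/<sdist_name> (hypothetical layout)
    and writes the repaired wheel to <project_dir>/wheelhouse.
    """
    inner = (
        # Build a platform wheel from the source archive
        f"{python_bin}/pip wheel --no-deps -w /tmp/wheels "
        f"/io/sdists/{shlex.quote(sdist_name)}"
        # Bundle external shared libraries and apply the manylinux tag
        " && auditwheel repair /tmp/wheels/*.whl -w /io/wheelhouse"
    )
    return [
        "docker", "run", "--rm",
        "-v", f"{Path(project_dir).resolve()}:/io",  # mount the project at /io
        image,
        "bash", "-c", inner,
    ]


if __name__ == "__main__":
    # Hypothetical file name; substitute the archive you downloaded.
    subprocess.run(manylinux_build_command("example-1.0.tar.gz"), check=True)
```

One caveat: `auditwheel repair` only applies to wheels containing compiled code, so pure-Python packages would need to skip that step and can simply use the wheel that `pip wheel` produced.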

Dustin Ingram