
One can share a Python package as a source distribution (sdist, .tar.gz format) or as a built distribution (bdist, wheel format).

As I understand it, the point of built distributions is:

  • Save time: Compilation can be time-consuming. It can be done once on a build server and the result shared with many users.
  • Reduce requirements: The user does not need to have a compiler installed.

However, those two arguments for bdist files do not seem to hold for pure-Python packages. Still, I see that natsort comes as both an sdist and a bdist. Is there any advantage to sharing a pure-Python package in bdist format?

martineau
Martin Thoma

1 Answer


From pythonwheels.com:

Advantages of wheels

  1. Faster installation for pure Python and native C extension packages.
  2. Avoids arbitrary code execution for installation. (Avoids setup.py)
  3. Installation of a C extension does not require a compiler on Linux, Windows or macOS.
  4. Allows better caching for testing and continuous integration.
  5. Creates .pyc files as part of installation to ensure they match the Python interpreter used.
  6. More consistent installs across platforms and machines.

For a pure Python package, I think the first and second points are the most meaningful. A wheel is smaller, installs faster, and is also more secure.
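To illustrate the second point, here is a minimal sketch of what a pure Python wheel looks like on disk. It assumes a hypothetical package called `demo` and builds a wheel-like zip archive in memory; it leaves out the `RECORD` file a real wheel carries, so it is not an installable wheel. The idea it shows is that a wheel is a zip archive of ready-to-copy files plus metadata, so an installer can unpack it without running a `setup.py`.

```python
import io
import zipfile


def build_demo_wheel():
    """Return the bytes of a zip archive laid out like a pure Python wheel.

    'demo' and its contents are hypothetical; RECORD is omitted for brevity,
    so this is only an illustration of the layout.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The importable package, exactly as it will land in site-packages.
        zf.writestr("demo/__init__.py", "VALUE = 42\n")
        # Static metadata that replaces what setup.py would compute at build time.
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
        )
        zf.writestr(
            "demo-1.0.dist-info/WHEEL",
            "Wheel-Version: 1.0\nRoot-Is-Purelib: true\nTag: py3-none-any\n",
        )
    return buf.getvalue()


def list_members(wheel_bytes):
    """List the file names stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as zf:
        return zf.namelist()


members = list_members(build_demo_wheel())
has_setup_py = any(name.endswith("setup.py") for name in members)

print(members)
print("contains setup.py:", has_setup_py)
```

Running `python -m zipfile -l` on a real `.whl` file downloaded from PyPI lists its contents in the same way.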

Sraw
  • I don't really see the point of (2): in the end, you want to run the package, so you run arbitrary code anyway. Why should it add any security if you avoid that at installation, but still do it at runtime? – Martin Thoma Jan 07 '20 at 05:54
  • (1) is interesting. I wasn't aware that wheels might be smaller in size and thus faster to download. – Martin Thoma Jan 07 '20 at 05:56
  • @MartinThoma, one doesn't necessarily have the same privileges at build-time and runtime. A build system will often have private keys allowing access to other source code in a corporate SCM, f/e. Anyhow, if one *can* lock down one kind of compromise without preventing another, no reason to let the perfect be the enemy of the good. (Similarly, a production runtime may have better monitoring of outbound connections &c. than a build system does; and code in the installed library is harder to hide than code that's only transiently invoked at build time but not installed, and thus not packaged). – Charles Duffy Jan 07 '20 at 15:52
  • @MartinThoma, ...sure, there are other countermeasures against build-time attacks (I'm a big fan of Nix, which runs all builds in a sandbox with no network connectivity, requiring downloads to be done ahead-of-time, hashed, and recorded in a read-only immutable store); but that's still a niche tool without wide adoption. – Charles Duffy Jan 07 '20 at 15:56
  • @MartinThoma In other words, for 1) you avoid setup-time requirements (for example, if the package uses [pbr](https://pypi.org/project/pbr/), pbr is needed to install an sdist, but not a bdist), and for 2) you exclude the possibility that the package maintainer was malicious or bad at writing a `setup.py`. Using a bdist and leveraging standard tools and their config to install the package is, if you only care about the package being installed, strictly better than installing via sdist. – Arne Jan 13 '20 at 11:07
  • @Arne I don't understand the security concern. When you install a Python package, you add code to your machine. Do people sandbox the running Python environment, but then not sandbox the installation? – Martin Thoma Jan 13 '20 at 13:11
  • Also, if there is a bdist and an sdist on PyPI, what does pip install? And why? – Martin Thoma Jan 13 '20 at 13:12
  • It tries to install a bdist if one exists for your platform, which can be determined from the metadata the bdist provides. You're probably right about the security concerns regarding trusting the maintainer... but, from a different point of view, if I were malicious and wanted to run arbitrary code on your system I'd put it in the install code, because that's the only part of a package that will always be executed. But yeah, I'd say security is a minor point; higher speed and no build/install dependencies are more important. – Arne Jan 13 '20 at 13:50
  • I can't really get my head around the point that bdists are supposedly smaller!? – Bennimi Jan 16 '22 at 19:04
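On the question of how an installer can tell whether a bdist fits the platform: a wheel's filename carries a Python tag, an ABI tag, and a platform tag. The sketch below splits a wheel filename into those parts. The helper names and both filenames are illustrative assumptions, and it ignores compressed tag sets such as `py2.py3`; real tools use a dedicated library for this. By convention, a pure Python wheel has the ABI tag `none` and the platform tag `any`, which is why one file can serve every platform.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components.

    Layout: name-version[-build]-python-abi-platform.whl
    Hypothetical helper for illustration only.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def looks_pure_python(filename):
    """Return True if the tags say 'no ABI dependency, any platform'."""
    info = parse_wheel_filename(filename)
    return info["abi"] == "none" and info["platform"] == "any"


# Illustrative filenames, not taken from a real index listing.
pure = "natsort-7.0.0-py3-none-any.whl"
compiled = "example_pkg-1.0-cp38-cp38-manylinux1_x86_64.whl"

print(parse_wheel_filename(pure))
print(looks_pure_python(pure), looks_pure_python(compiled))
```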