I went to download PyPDF2 from conda forge:

    conda install -c conda-forge pypdf2

and got this message:

    The following packages will be UPDATED:
      ca-certificates    anaconda::ca-certificates-2020.10.14-0 --> conda-forge::ca-certificates-2021.10.8-h033912b_0
      certifi            pkgs/main::certifi-2021.5.30-py39hecd~ --> conda-forge::certifi-2021.10.8-py39h6e9494a_0

    The following packages will be SUPERSEDED by a higher-priority channel:
      openssl              pkgs/main::openssl-1.1.1l-h9ed2024_0 --> conda-forge::openssl-1.1.1l-h0d85af4_0

It looks suspicious to me that installing a PDF reader tries to update security-related packages.

If a bad actor has uploaded something malicious to conda, this might be a risk to the wider Python ecosystem. This particular conda-forge package is the top result on Google and DuckDuckGo for "conda PyPDF2" and has been downloaded a lot:

 175141 total downloads
 Last upload: 2 years and 10 months ago
Peter

1 Answer

Why does Conda Forge take precedence?

Conda aggressively updates security-related packages. In particular, see

    $ conda config --describe aggressive_update_packages
    # # aggressive_update_packages (sequence: primitive)
    # #   env var string delimiter: ','
    # #   A list of packages that, if installed, are always updated to the
    # #   latest possible version.
    # # 
    # aggressive_update_packages:
    #   - ca-certificates
    #   - certifi
    #   - openssl

This means that whenever the user requests a mutating operation on an environment (install, update, remove), Conda checks whether a newer or higher-priority build of any of those packages is available. The --channel (-c) flag places the named channel at the highest priority for that command. Hence, the command

    conda install -c conda-forge ...

run on an environment that had previously only ever used the defaults channel will cause some packages to switch to conda-forge as their source.
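
To make the interaction between the two rules concrete, here is a minimal Python sketch. It is a toy model of the behaviour described above, not Conda's actual solver: the function names, the channel names, and the package data are all made up for illustration, and version comparison is ignored entirely.

```python
# Toy model only: this is NOT how Conda's solver is implemented.
# All names and data below are hypothetical.

AGGRESSIVE = ("ca-certificates", "certifi", "openssl")


def effective_channels(configured, cli_channels=()):
    """Channels passed with -c are placed ahead of the configured ones."""
    return list(cli_channels) + [c for c in configured if c not in cli_channels]


def packages_that_switch(installed, available, channels, aggressive=AGGRESSIVE):
    """Return {package: (old_channel, new_channel)} for aggressively updated
    packages whose highest-priority provider differs from the installed source.

    installed: {package name: channel it was installed from}
    available: {channel: set of package names that channel provides}
    channels:  channel names, highest priority first
    """
    switches = {}
    for name in aggressive:
        if name not in installed:
            continue  # the list only applies to packages already installed
        # First channel, in priority order, that provides the package.
        best = next((ch for ch in channels if name in available.get(ch, ())), None)
        if best is not None and best != installed[name]:
            switches[name] = (installed[name], best)
    return switches


installed = {
    "ca-certificates": "anaconda",
    "certifi": "defaults",
    "openssl": "defaults",
    "python": "defaults",
}
available = {
    "conda-forge": {"ca-certificates", "certifi", "openssl", "pypdf2"},
    "defaults": {"ca-certificates", "certifi", "openssl", "python"},
}

channels = effective_channels(["defaults"], cli_channels=["conda-forge"])
print(packages_that_switch(installed, available, channels))
```

With conda-forge placed first, the sketch reports all three security packages as switching source, which mirrors the transaction in the question. Without the -c channel, only ca-certificates (installed from the anaconda channel in this made-up data) would be reported.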

Is Conda Forge trustworthy?

Yes. However, every user/organization must assess their own risk and it is beyond the scope of this forum to provide a full security analysis of Conda Forge.

In lieu of that, it may be worth outlining the relevant Conda Forge procedures that guard against compromised packages.

Security overview

While Conda Forge is open to community submissions from anyone, the following are some practices that help ensure the safety of the channel:

  • Gatekeeping. New packages must pass a review by trusted team members before being accepted and distributed by Conda Forge. For most packages, users are encouraged to use recipe-generating scripts that work downstream of other trusted repositories (e.g., PyPI, CRAN).

    Once approved, only submitters and core members have maintainer rights. Arbitrary users can submit Pull Requests to any feedstock, but again these must be reviewed and accepted by maintainers.

  • Transparent Supply Chain. All recipes are open-source and all builds are performed on CI infrastructure with open logs. Any package build on the conda-forge channel can be fully audited back to the URL that the recipe used to build it.

  • Siloed Feedstocks. Each feedstock only has an ability to upload packages for that feedstock. This is enforced by using a cf-staging channel where builds are first sent. A bot then assesses that the submitting feedstock has permission to build the package it has submitted, and only then will it relay the build to the conda-forge channel.

    This helps guard against a bad actor gaining access to an inconspicuous feedstock and then trying to push a build with malicious code into essential infrastructure packages (e.g., openssl or python).
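
The siloed-feedstock check can be pictured with a short Python sketch. This is only an illustration of the idea, not Conda Forge's actual bot code: the permission table, the feedstock names, and both function names are hypothetical.

```python
# Illustration only: NOT the real Conda Forge validation bot.
# Hypothetical permission table: feedstock name -> packages it may publish.
FEEDSTOCK_OUTPUTS = {
    "pypdf2-feedstock": {"pypdf2"},
    "openssl-feedstock": {"openssl"},
}


def may_relay(feedstock, package, permissions=FEEDSTOCK_OUTPUTS):
    """True if `feedstock` is registered as a producer of `package`."""
    return package in permissions.get(feedstock, set())


def relay_from_staging(uploads, permissions=FEEDSTOCK_OUTPUTS):
    """Split staged uploads into (accepted, rejected) lists.

    uploads: iterable of (feedstock, package) pairs sitting in the staging channel.
    Only accepted pairs would be copied on to the main channel.
    """
    accepted, rejected = [], []
    for feedstock, package in uploads:
        target = accepted if may_relay(feedstock, package, permissions) else rejected
        target.append((feedstock, package))
    return accepted, rejected


staged = [
    ("pypdf2-feedstock", "pypdf2"),   # a feedstock publishing its own package
    ("pypdf2-feedstock", "openssl"),  # a compromised feedstock targeting openssl
]
accepted, rejected = relay_from_staging(staged)
print("accepted:", accepted)
print("rejected:", rejected)
```

In this sketch the second upload is rejected, because the pypdf2 feedstock is not registered as a producer of openssl. That is the scenario the staging step is meant to stop.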

I am sure more could be said, but hopefully that is a sufficient start.

Anaconda trusts Conda Forge

It may also be worth pointing out that many Anaconda package recipes are forks of Conda Forge feedstocks. This includes the recipes for each of the security packages (certifi, openssl, ca-certificates).

merv