68

It was so handy for getting an idea of whether a package is popular or not (even if its popularity only comes from being imported by another popular package). But now I don't see this info for some reason.

An example: https://pypi.python.org/pypi/blist

Why did they turn off this useful thing?

d-d
  • [Looks like dodgy downloads stats are a known bug which has been marked as `wontfix` - maybe they were removed because of that?](https://bitbucket.org/pypa/pypi/issues/396/download-stats-have-stopped-working-again) – Aaron Christiansen Jun 29 '16 at 14:20

4 Answers

120

I just released https://pepy.tech/ to view the downloads of a package. I used the data from BigQuery so you will get the same result :-)

petrusqui
59

As can be seen in this mail.python.org article, download stats were removed because they weren't updating and would be too difficult to fix.

Donald Stufft, the author of the article, listed these reasons:

There are numerous reasons for their removal/deprecation, some of which are:

  • Technically hard to make work with the new CDN
    • The CDN is being donated to the PSF, and the donated tier does not offer any form of log access
    • The work around for not having log access would greatly reduce the utility of the CDN
  • Highly inaccurate
    • A number of things prevent the download counts from being accurate, some of which include:
      • pip download cache
      • Internal or unofficial mirrors
      • Packages not hosted on PyPI (for comparison's sake)
      • Mirrors or unofficial grab scripts causing inflated counts (Last I looked 25% of the downloads were from a known mirroring script).
  • Not particularly useful
    • Just because a project has been downloaded a lot doesn't mean it's good
    • Similarly just because a project hasn't been downloaded a lot doesn't mean it's bad
Aaron Christiansen
  • 11
    The accepted answer is correct in that downloads have been disabled, and the reasons in Donald Stufft's email from 2013 are probably still pretty much valid. But since 2013, downloads had been re-enabled and were disabled only recently (~June 2016?) again. A bit more detail can be found in the [pypi-legacy issue #396](https://github.com/pypa/pypi-legacy/issues/396#issuecomment-232373133). – orbeckst Jul 27 '16 at 19:21
  • Yes, I agree. hub.docker.com shows download and like counts, and it's a tragedy: the trashiest images have the most downloads and likes. They have not implemented anything like a "dislike". Such statistics only do harm. Better to have none than to have them malformed. – Jurass Jun 04 '20 at 06:14
26

Recently I found out that you can query PyPI's BigQuery dataset, which is made available to the PSF, through this link.
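
As the comments below note, the `the-psf:pypi.downloads` table can no longer be found. The following is a minimal sketch of a download-count query, assuming the current public table is `bigquery-public-data.pypi.file_downloads` (the name used in the Python Packaging User Guide). The helper names `build_download_query` and `count_downloads` are hypothetical, and running the query assumes the `google-cloud-bigquery` package plus Google Cloud credentials, neither of which is part of the original answer.

```python
import re

# Assumption: the public table that replaced `the-psf:pypi.downloads`.
TABLE = "bigquery-public-data.pypi.file_downloads"


def build_download_query(package: str, days: int = 30) -> str:
    """Build a standard-SQL query counting downloads over the last `days` days."""
    # Reject anything that does not look like a package name, since the
    # name is interpolated directly into the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", package):
        raise ValueError(f"unexpected package name: {package!r}")
    if days < 1:
        raise ValueError("days must be positive")
    return (
        "SELECT COUNT(*) AS num_downloads\n"
        f"FROM `{TABLE}`\n"
        f"WHERE file.project = '{package.lower()}'\n"
        "  AND DATE(timestamp) BETWEEN\n"
        f"      DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY) AND CURRENT_DATE()"
    )


def count_downloads(package: str, days: int = 30) -> int:
    """Run the query; needs google-cloud-bigquery and configured credentials."""
    # Imported lazily so that building the query works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    rows = client.query(build_download_query(package, days)).result()
    return next(iter(rows)).num_downloads


if __name__ == "__main__":
    # Printing the SQL costs nothing; calling count_downloads() is billed by BigQuery.
    print(build_download_query("blist"))
```

Restricting the query to a date range matters because BigQuery bills by the amount of data scanned, so a narrower window should be cheaper.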

Dharman
  • Awesome! I would like to know more about the accuracy of this data – nemesisdesign Jun 08 '17 at 17:09
  • @nemesisdesign I believe that it's updated daily. You can try to analyze all 19099214 rows for legitimacy/accuracy. –  Jun 08 '17 at 17:34
  • 1
    @kiran.koduru I've tried the instructions from your blog post, but I'm getting an error message from Google saying the table does not exist. Is this method of retrieving package metadata still working, or has the table name perhaps changed? – toske Jan 28 '18 at 22:41
  • hi @toske, please try again. The table still exists. I just queried it. Maybe we can take it offline. Add a comment to the blogpost, I can walk you through it. –  Jan 30 '18 at 01:02
  • 1
    The table appears to be empty now. – Alex S Feb 21 '18 at 03:27
  • 6
    At the link it says "Unable to find table: the-psf:pypi.downloads". Has anyone gotten it to work? – ale5000 Mar 10 '18 at 00:50
  • 1
    Indeed this doesn't work for me either. Thankfully petrusqui's answer is all I wanted and more. – GravityWell Dec 11 '18 at 19:03
13

pypinfo is a Python 3 command-line program that queries BigQuery and is installable via pip. If you set up the credentials (a JSON file), you should be able to write:

$ pypinfo -d 1825 blist year
Served from cache: False
Data processed: 250.31 GiB
Data billed: 250.31 GiB
Estimated cost: $1.23

| download_year | download_count |
| ------------- | -------------- |
|         2,017 |        443,067 |
|         2,016 |        391,816 |
|         2,018 |         57,689 |

Some information about the data collection is available at https://packaging.python.org/guides/analyzing-pypi-package-downloads/
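
The table above is ordered by download count and prints years with thousands separators, so a small post-processing step may help. This is a rough sketch that parses the table text exactly as shown and sorts it chronologically. `parse_pypinfo_table` is a hypothetical helper, not part of pypinfo, and it assumes every cell is an integer formatted like the ones above.

```python
def parse_pypinfo_table(text):
    """Parse the pipe-delimited table shown above into a list of dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the dashed separator line
    return [
        # Assumption: all values are integers with "," as thousands separator.
        {name: int(value.replace(",", "")) for name, value in zip(header, row)}
        for row in body
    ]


# The table copied verbatim from the output above.
OUTPUT = """\
| download_year | download_count |
| ------------- | -------------- |
|         2,017 |        443,067 |
|         2,016 |        391,816 |
|         2,018 |         57,689 |
"""

by_year = sorted(parse_pypinfo_table(OUTPUT), key=lambda r: r["download_year"])
for row in by_year:
    print(row["download_year"], row["download_count"])
# prints:
# 2016 391816
# 2017 443067
# 2018 57689
```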

Finn Årup Nielsen