69

I'm not sure this is possible. Google does not seem to have any answers.

On Debian Linux, can I list all installed pip packages together with their size (amount of disk space used)?

i.e. List all pip packages with size on disk?

Prometheus

11 Answers

64

Modified for pip version 18 and above:

pip list | tail -n +3 | awk '{print $1}' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -hr

This command lists the installed pip packages sorted by size, largest first.
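For comparison, here is a rough Python sketch of a different technique: instead of guessing that each package lives in Location/name, it sums the files that each distribution's own metadata (the RECORD file in its dist-info directory) says were installed. It uses the standard-library importlib.metadata module rather than the shell pipeline. The function name installed_sizes and its search_path parameter are made up for illustration, and a package without a RECORD file will simply show up as 0 bytes.

```python
import os
from importlib import metadata


def installed_sizes(search_path=None):
    """Return {distribution name: total bytes} from each distribution's file list."""
    # With no search_path, importlib.metadata scans sys.path.
    kwargs = {"path": search_path} if search_path is not None else {}
    sizes = {}
    for dist in metadata.distributions(**kwargs):
        name = dist.metadata.get("Name") or "unknown"
        total = 0
        # dist.files is None when the metadata has no file listing.
        for entry in dist.files or []:
            path = dist.locate_file(entry)
            if os.path.isfile(path):
                total += os.path.getsize(path)
        sizes[name] = sizes.get(name, 0) + total
    return sizes


if __name__ == "__main__":
    ranked = sorted(installed_sizes().items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ranked:
        print(f"{size / 1_000_000:8.2f} MB  {name}")
```

Because the sizes come from the recorded file list, a project whose directory name differs from its project name is still counted under the project name.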

jerrymouse
  • Just add LANG=C at the very beginning if your terminal isn't originally in English, because "Location:|Name:" wouldn't match otherwise... Thus `LANG=C pip list | tail -n +3 | awk '{print $1}' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -hr` and voilà! – Johannes Lemonde Jun 16 '21 at 09:36
  • 3
    this is the correct answer for current pip/python3. (answer marked as correct isn't very representative of total size of pkg) – josh May 24 '22 at 04:48
  • 1
    I noticed that some packages would not appear in the results because their physical directory names differ from their claimed project names (for example, `beautifulsoup4` would be installed as `bs4`.) Looks like currently we don't have a perfect solution unless we do a deep and serious scan (of `dist-info` or something like that). – 一年又一年 Aug 07 '23 at 12:20
  • Command chaining at its finest! :) – Unknown Aug 12 '23 at 11:22
38

Could you please try this one (it's a bit long; maybe there are better solutions):

$ pip list | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null

The output should look like this:

80K     /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/blinker
3.8M    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/docutils
296K    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/ecdsa
340K    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/execnet
564K    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/fabric
1.4M    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/flask
316K    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/httplib2
1.9M    /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/jinja2
...

This should work if the package is installed in Location/Name (Location and Name come from pip show <package>).


pip show <package> will show you the location:

---
Metadata-Version: 2.0
Name: Flask
Version: 0.10.1
Summary: A microframework based on Werkzeug, Jinja2 and good intentions
Home-page: http://github.com/mitsuhiko/flask/
Author: Armin Ronacher
Author-email: armin.ronacher@active-4.com
License: BSD
Location: /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages
Requires: itsdangerous, Werkzeug, Jinja2

We join the Location and the Name to get the package directory, then use du -sh to get the package size.
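As an illustration of that join step, here is a minimal Python sketch that parses pip show output and builds the same Location/name path that the awk stage prints. The helper name package_dir_from_show is hypothetical, and the sample text is a shortened copy of the Flask output above.

```python
import posixpath


def package_dir_from_show(show_output):
    """Join the Location and lowercased Name fields of `pip show` output."""
    fields = {}
    for line in show_output.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines such as '---' that have no 'Key: value' shape
            fields[key] = value.strip()
    return posixpath.join(fields["Location"], fields["Name"].lower())


sample = """\
---
Name: Flask
Version: 0.10.1
Location: /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages
Requires: itsdangerous, Werkzeug, Jinja2
"""

# prints /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/flask
print(package_dir_from_show(sample))
```

The resulting path is what gets handed to du -sh, which is why the approach only works when the directory really is named after the lowercased project name.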

Skippy le Grand Gourou
lord63. j
  • 2
    works great. to sort by size, we can add: | sort -h to the above pip list | xargs pip show.... command – faustus Sep 28 '16 at 17:46
  • or `gsort` on Mac OS X from homebrew, because standard sort on Mac does not have the `-h` flag – Gerard Dec 19 '16 at 21:31
  • i corrected this command for last python version on my answer – intika Apr 15 '18 at 06:16
  • 15
    everything here mostly worked for me. I'm using `pip 18.0` which outputs a header, so I added in a `tail -n +3 | awk '{print $1}'` in between the `pip list` and `pip show` – abest Oct 30 '18 at 14:28
  • I replaced both `pip` commands with `pip3` as I'm on a Mac where pip is used for Python 2 and pip3 for Python 3; then (similar to what @abest did) I used `| sed '1,2d'` between `pip3 list` and `xargs pip3 show` to remove the 2 header rows in the `pip3 list` output; then to chop off the full path, I added `| sed -E 's/\/Library\/Frameworks\/Python.framework\/Versions\/3.7\/lib\/python3.7\/site-packages\///g'`; then for reverse sort and size in bytes I added `| sed -E 's/([0-9]).([0-9])M/\1\200000/g ; s/ +([0-9]+)M/\1000000/g ; s/([0-9]).([0-9])K/\1\200/g ; s/ +([0-9]+)K/\1000/g' | sort -rn` – jshd Sep 20 '20 at 14:41
  • 1
    I just noticed that this ignores some packages like PyAudio, although I haven't figured out what causes this yet. – jshd Sep 20 '20 at 15:02
  • I just noticed that the original command string unfortunately filters out installed packages for which there is just a .py file installed by pip and no directory, like PyAudio. – jshd Sep 20 '20 at 15:13
28

New version for new pip list format:

pip2 list --format freeze|awk -F = {'print $1'}| xargs pip2 show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null|sort -h
Petr Mach
  • 19
    This also works with pip3: `pip3 list --format freeze|awk -F = {'print $1'}| xargs pip3 show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null|sort -h` – Nbfour Aug 02 '19 at 13:27
20

Go to the package site to find the size e.g. https://pypi.python.org/pypi/pip/json

Then expand releases, find the version, and look up the size (in bytes). Note that this is the size of the downloadable archive, not the installed size on disk.
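The same lookup can be sketched in Python. The snippet assumes the JSON document has a "releases" mapping from version to a list of file entries, each with "filename" and "size" keys; the helper names and the sample data are invented for illustration, and fetch_pypi_json is shown but not called because it needs network access.

```python
import json
from urllib.request import urlopen


def fetch_pypi_json(package):
    """Download the JSON document for a package (requires network access)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        return json.load(resp)


def release_file_sizes(pypi_json, version):
    """Return {filename: size in bytes} for one version of a package."""
    files = pypi_json["releases"].get(version, [])
    return {f["filename"]: f["size"] for f in files}


# Hypothetical, trimmed-down document in the assumed shape.
sample = {
    "releases": {
        "1.0": [
            {"filename": "example-1.0.tar.gz", "size": 12345},
            {"filename": "example-1.0-py3-none-any.whl", "size": 23456},
        ]
    }
}

print(release_file_sizes(sample, "1.0"))
```

For a real package, something like release_file_sizes(fetch_pypi_json("pip"), "<version>") would give the archive sizes for that release.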

wisbucky
JMzance
15

There is a simple Pythonic way to find it out though.

Here is the code. Let's call this file pipsize.py.

import os
import pkg_resources


def calc_container(path):
    """Return the total size in bytes of all files under path."""
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size


dists = [d for d in pkg_resources.working_set]

for dist in dists:
    try:
        # Assumes the package lives in <location>/<project_name>
        path = os.path.join(dist.location, dist.project_name)
        size = calc_container(path)
        # Only report packages larger than 1 KB
        if size / 1000 > 1.0:
            print(f"{dist}: {size/1000} KB")
            print("-" * 40)
    except OSError:
        print(f"{dist.project_name} no longer exists")

When run with python pipsize.py this will print out something like,

pip 21.1.2: 8651.906 KB
----------------------------------------
numpy 1.20.3: 25892.871 KB
----------------------------------------
numexpr 2.7.3: 1627.361 KB
----------------------------------------
zict 2.0.0: 48.54 KB
----------------------------------------
yarl 1.6.3: 1395.888 KB
----------------------------------------
widgetsnbextension 3.5.1: 4609.962 KB
----------------------------------------
webencodings 0.5.1: 54.768 KB
----------------------------------------
wcwidth 0.2.5: 452.214 KB
----------------------------------------
uvicorn 0.14.0: 257.515 KB
----------------------------------------
tzlocal 2.1: 67.11 KB
----------------------------------------
traitlets 5.0.5: 800.71 KB
----------------------------------------
tqdm 4.61.0: 289.412 KB
----------------------------------------
tornado 6.1: 2898.264 KB

Tirtha
  • 3
    I like this. I did some modifications for mine(eg. KB to MB, sort by alphabet), it was a lot of help. – HyeonPhil Youn Aug 08 '21 at 07:58
  • 2
    Here's the modified code for showing MB instead of KB and sort by size in descending order: https://gist.github.com/AnsonH/fd634ba4298376f2abd8e00f99b01be8 – AnsonH Mar 05 '22 at 04:22
7

None of the above solutions list packages with dashes in their names, because pip converts the dashes to underscores in the folder names:

pip list --format freeze | awk -F = {'print $1'} | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{gsub("-","_",$1); print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -h

And for Mac users:

pip3 list --format freeze | awk -F = {'print $1'} | xargs pip3 show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{gsub("-","_",$1); print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -h
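The name conversion done by the awk stage (lowercase, dashes to underscores) can be sketched in Python as follows. The helper name find_install_path is hypothetical; as an extra assumption beyond the command above, it also tries a single-file module named after the project, since some packages install only a .py file.

```python
from pathlib import Path


def find_install_path(location, project_name):
    """Guess where a project was installed under a site-packages directory."""
    # Same normalisation as gsub("-", "_", ...) plus tolower(...) in the awk stage.
    base = project_name.lower().replace("-", "_")
    for candidate in (Path(location) / base, Path(location) / f"{base}.py"):
        if candidate.exists():
            return candidate
    # The import name may be unrelated to the project name (e.g. bs4).
    return None
```

A None result corresponds to the packages that the du pipeline silently drops because of 2> /dev/null.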
Synthesis
5

Here's how,

  1. pip3 show numpy | grep "Location:"
  2. this will return path/to/all/packages
  3. du -h path/to/all/packages
  4. the last line will contain the total size of all packages, in human-readable units (e.g. MB)

Note: You may put any package name in place of numpy
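A rough Python counterpart of these steps, assuming the packages of interest sit in the current interpreter's site-packages directory as reported by sysconfig. The function name tree_size is made up. It sums apparent file sizes, so the figure can differ somewhat from du, which reports disk blocks used.

```python
import sysconfig
from pathlib import Path


def tree_size(root):
    """Sum the sizes in bytes of all regular files below root (symlinks skipped)."""
    return sum(
        p.stat().st_size
        for p in Path(root).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


if __name__ == "__main__":
    # 'purelib' is the site-packages directory of the running interpreter.
    site_packages = sysconfig.get_paths()["purelib"]
    print(f"{site_packages}: {tree_size(site_packages) / 1_000_000:.1f} MB")
```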

Samir Kape
3

History :

There is no command or application developed for this purpose at the moment, so we need to check manually.

Manual Method I :

du /usr/lib/python3.5/ --max-depth=2 | sort -h
du /usr/lib64/python3.5/ --max-depth=2 | sort -h

This does not include packages or files installed outside those directories; that said, these two simple commands cover roughly 95% of cases.

Also, if you have another version of Python installed, you need to adapt the directory.

Manual Method II :

pip list | sed '/Package/d' | sed '/----/d' | sed -r 's/\S+//2' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | while read -r name loc; do find "$loc" -maxdepth 1 -iname "$name"; done | xargs du -sh | sort -h

This searches the install directory for the package name, case-insensitively.

Manual Method II Alternative I :

pip list | sed '/Package/d' | sed '/----/d' | sed -r 's/\S+//2' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - -| awk '{print $2 "/" tolower($1)}' | xargs du -sh | sort -h

This looks in the install directory for the lowercased package name.

Manual Method II Alternative II :

pip list | sed '/Package/d' | sed '/----/d' | sed -r 's/\S+//2' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - -| awk '{print $2 "/" $1}' | xargs du -sh | sort -h

This looks in the install directory for the package name exactly as pip reports it.

Note :

For methods using du, output lines starting with du: cannot access need to be checked manually. The command takes the install directory and appends the package name to it, but sometimes the package name and the directory name differ.

To keep it simple:

  • Use the first method, then
  • Use the second method and manually check only the packages outside the standard Python directories
intika
3

How

 $ du -h -d 1 "$(pip -V | cut -d ' ' -f 4 | sed 's/pip//g')" | grep -vE "dist-info|_distutils_hack|__pycache__" | sort -h

Pros

No need to convert these:
case (Django:django)
hyphen (django-q:django_q)
naming (djangorestframework-gis:rest_framework_gis)

Cons

Dependencies and some unrelated directories are listed as well.
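Here is a small Python sketch of the same idea: measure every top-level directory in site-packages and skip the names that the grep -vE stage filters out. The names SKIP_MARKERS, tree_size and top_level_sizes are invented for this example. Like du without -a, it reports directories only, and it sums apparent file sizes rather than disk blocks.

```python
import os

# Same substrings that the grep -vE stage filters out.
SKIP_MARKERS = ("dist-info", "_distutils_hack", "__pycache__")


def tree_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            fp = os.path.join(dirpath, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total


def top_level_sizes(site_packages):
    """Return [(bytes, name)] for each top-level directory, smallest first."""
    rows = []
    with os.scandir(site_packages) as entries:
        for entry in entries:
            if any(marker in entry.name for marker in SKIP_MARKERS):
                continue
            if entry.is_dir(follow_symlinks=False):
                rows.append((tree_size(entry.path), entry.name))
    return sorted(rows)
```

Calling top_level_sizes on the directory reported by pip -V gives one row per import package, with the same pros and cons as the du command.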

yellowsoar
1

You can also run Step 1 by itself: python tool-size.py will total up all the currently installed packages for you.

If you want to know the exact size of a particular pip package including all its dependencies, I've created a small bash and Python combo to achieve this

( based off the excellent package walking code answer above https://stackoverflow.com/a/67914559/3248788 )

Steps :

  1. create a python script to check all currently installed pip packages
  2. create a shell script to create a brand new python environment and install package to test, and run the script from step 1
  3. run shell script
  4. profit :)

Step 1

create a python script called tool-size.py

#!/usr/bin/env python

import os
import pkg_resources


def calc_container(path):
    """Return the total size in bytes of all files under path."""
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size


def calc_installed_sizes():
    """Print the size of every installed package and the overall total."""
    dists = [d for d in pkg_resources.working_set]

    total_size = 0
    print("Size of Dependencies")
    print("-" * 40)
    for dist in dists:
        # ignore pre-installed pip and setuptools
        if dist.project_name in ["pip", "setuptools"]:
            continue
        try:
            # Assumes the package lives in <location>/<project_name>
            path = os.path.join(dist.location, dist.project_name)
            size = calc_container(path)
            total_size += size
            # Only report packages larger than 1 KB
            if size / 1000 > 1.0:
                print(f"{dist}: {size/1000} KB")
                print("-" * 40)
        except OSError:
            print(f"{dist.project_name} no longer exists")

    print(f"Total Size (including dependencies): {total_size/1000} KB")


if __name__ == "__main__":
    calc_installed_sizes()

Step 2

create a bash script called tool-size.sh

#!/usr/bin/env bash

# uncomment to debug
# set -x

rm -rf ~/.virtualenvs/tool-size-tester
python -m venv ~/.virtualenvs/tool-size-tester
# Scripts/activate is the Windows layout (e.g. Git Bash); on Linux/macOS use bin/activate
source ~/.virtualenvs/tool-size-tester/Scripts/activate
pip install -q "$1"
python tool-size.py
deactivate

Step 3

run script with package you want to get the size of

tool-size.sh xxx

say for truffleHog3

$ ./tool-size.sh truffleHog3

Size of Dependencies
----------------------------------------
truffleHog3 2.0.6: 56.46 KB
----------------------------------------
smmap 4.0.0: 108.808 KB
----------------------------------------
MarkupSafe 2.0.1: 40.911 KB
----------------------------------------
Jinja2 3.0.1: 917.551 KB
----------------------------------------
gitdb 4.0.7: 320.08 KB
----------------------------------------
Total Size (including dependencies): 1443.81 KB
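The shell wrapper could also be written in Python, which avoids the difference between Scripts/ and bin/ across platforms. This is only a sketch under the same assumptions as the steps above (tool-size.py in the current directory, network access for pip); the function names env_python and measure_package are made up.

```python
import os
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Path of the interpreter inside a virtual environment."""
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def measure_package(package, env_dir, script="tool-size.py"):
    """Create a fresh environment, install the package, run the size script in it."""
    # clear=True wipes any previous environment, like the rm -rf in the shell script.
    venv.EnvBuilder(clear=True, with_pip=True).create(env_dir)
    python = str(env_python(env_dir))
    subprocess.run([python, "-m", "pip", "install", "-q", package], check=True)
    subprocess.run([python, script], check=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    measure_package(sys.argv[1], Path.home() / ".virtualenvs" / "tool-size-tester")
```

Calling the environment's interpreter directly means no activate or deactivate step is needed.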

aqm
0

On Mac, I navigate to the site-packages folder and do

du -h -d 1 | sort -rh | grep -v "dist-info"

On Linux, older versions of du need --max-depth 1 instead of -d 1; recent GNU coreutils accept -d 1 as well.