I have a main directory (root) that contains 6 subdirectories. I would like to count the number of files in each subdirectory and collect the counts in a simple Python list.

The desired result: mylist = [497643, 5976, 3698, 12, 456, 745]

I'm stuck on this code:

import os, sys
list = []
# Open a file
path = "c://root"
dirs = os.listdir( path )

# This would print all the files and directories
for file in dirs:
   print (file)

#fill a list with each sub directory number of elements
for sub_dir in dirs:
    list = dirs.append(len(sub_dir))

My attempt to fill the list doesn't work, and I've reached the limit of what I can figure out on my own.

Finding a way to iterate over the subdirectories of a main directory, and to fill a list by applying a function to each subdirectory, would greatly speed up my current data science project!

Thanks for your help

Abel

  • Does this answer your question? [Return number of files in directory and subdirectory](https://stackoverflow.com/questions/16910330/return-number-of-files-in-directory-and-subdirectory) – sushanth Aug 27 '20 at 13:44
  • Using `os.walk()` will help a lot. It recursively drills down sub directories. – adamkgray Aug 27 '20 at 14:00
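
A minimal sketch of the `os.walk()` idea from the comment above, assuming the goal is one total per subdirectory that also includes files in nested folders. The helper name `count_files_recursive` and the temporary demo tree are hypothetical and exist only for illustration.

```python
import os
import tempfile


def count_files_recursive(directory):
    """Return the number of files in directory and in everything below it."""
    # os.walk yields (dirpath, dirnames, filenames) for each directory visited
    return sum(len(filenames) for _, _, filenames in os.walk(directory))


# Build a small throwaway tree so the sketch runs anywhere (hypothetical layout)
with tempfile.TemporaryDirectory() as root:
    for sub, n_files in (("a", 2), ("b", 1)):
        os.makedirs(os.path.join(root, sub, "nested"))
        for i in range(n_files):
            open(os.path.join(root, sub, f"file{i}.txt"), "w").close()
    # One extra file inside a nested folder of "a"
    open(os.path.join(root, "a", "nested", "deep.txt"), "w").close()

    counts = [
        count_files_recursive(os.path.join(root, name))
        for name in sorted(os.listdir(root))
        if os.path.isdir(os.path.join(root, name))
    ]

print(counts)
```

Sorting the names only makes the order of the counts predictable; `os.listdir` itself returns entries in arbitrary order.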

4 Answers


You need to call os.listdir on each subdirectory. The current code simply takes the length of each subdirectory's name, which is a string.

import os

counts = []
# Directory whose subdirectories are counted
path = "c://root"
dirs = os.listdir(path)

# Print every entry (files and directories) in path
for entry in dirs:
    print(entry)

# Fill the list with the number of entries in each subdirectory
for sub_dir in dirs:
    # os.listdir returns bare names, so join them with path first
    temp = os.listdir(os.path.join(path, sub_dir))
    counts.append(len(temp))

The added os.listdir call lists the contents of each subdirectory, so its length is the number of entries in that subdirectory.

davetherock
  • If you have subdirectories in your subdirectories, you'll need to use the solution linked by @sushanth – davetherock Aug 27 '20 at 13:50
  • `FileNotFoundError: [WinError 3] Le chemin d’accès spécifié est introuvable: 'AMARYLLIDACEAE'` (in English: "The system cannot find the path specified"), raised at line 13, `temp = os.listdir(sub_dir)`. – Abel Aug 28 '20 at 09:37
  • I get that error with this line of code added. 'AMARYLLIDACEAE' is the first subdirectory. I'm working in a Jupyter notebook. – Abel Aug 28 '20 at 09:40
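
The `FileNotFoundError` quoted above is consistent with `os.listdir(path)` returning bare entry names: a bare name passed to another `os` call is resolved against the current working directory, not against `path`. A minimal sketch of the difference, using a throwaway temporary directory (the folder name is taken from the traceback; the file names are hypothetical):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as path:
    # Recreate one subdirectory with two placeholder files
    sub_path = os.path.join(path, "AMARYLLIDACEAE")
    os.mkdir(sub_path)
    for file_name in ("img1.jpg", "img2.jpg"):
        open(os.path.join(sub_path, file_name), "w").close()

    names = os.listdir(path)
    # The returned names are relative: they carry no directory component
    is_absolute = os.path.isabs(names[0])
    # Joining with path gives a location that os.listdir can actually open
    joined_count = len(os.listdir(os.path.join(path, names[0])))

print(names, is_absolute, joined_count)
```

In a Jupyter notebook the working directory is usually the notebook's own folder, which would explain why the bare name is not found there.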

You were almost there:

import os

counts = []

# Directory whose subdirectories are counted
path = "c://root"
dirs = os.listdir(path)

# Print every entry (files and directories) in path
for entry in dirs:
    print(entry)

for sub_dir in dirs:
    sub_path = os.path.join(path, sub_dir)
    # isdir needs the full path, not just the bare name
    if os.path.isdir(sub_path):
        counts.append(len(os.listdir(sub_path)))

print(counts)
Gustave Coste

You can use os.path.isfile and os.path.isdir

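import os

path = "c://root"  # root directory from the question

# sum() counts the True values, i.e. the entries that are regular files
res = [sum(os.path.isfile(os.path.join(path, name, f)) for f in os.listdir(os.path.join(path, name))) for name in os.listdir(path) if os.path.isdir(os.path.join(path, name))]
print(res)

Using a for loop:

res = []
for name in os.listdir(path):
    dir_path = os.path.join(path, name)
    if os.path.isdir(dir_path):
        # isfile needs the full path of each entry, not just its name
        res.append(sum(os.path.isfile(os.path.join(dir_path, f)) for f in os.listdir(dir_path)))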
deadshot
  • With both of these solutions, I get this error message: `TypeError: 'list' object is not callable`, raised at line 8, `res = [len(list(map(os.path.isfile, os.listdir(os.path.join(path, name))))) for name in os.listdir(path) if os.path.isdir(os.path.join(path, name))]`, with `list = []` on line 7. – Abel Aug 28 '20 at 09:48
  • You have used `list` as a variable name somewhere in your code, which is why you are getting the error. Don't use `list = []`; change the name to something else and it will work. – deadshot Aug 28 '20 at 09:55
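
A small sketch of the shadowing problem described in the comment above. The function name `shadowing_demo` is hypothetical; the shadowing is kept inside a function so the built-in `list` stays usable in the rest of the script.

```python
def shadowing_demo():
    """Reproduce the TypeError caused by naming a variable `list`."""
    list = []  # from here on, `list` is this empty list, not the built-in type
    try:
        list(range(3))  # attempts to call the list object
    except TypeError as exc:
        return str(exc)


message = shadowing_demo()
print(message)

# Outside the function the built-in is untouched and works as usual
counts = list(range(3))
print(counts)
```

In a notebook where `list = []` was already executed at the top level, running `del list` removes the shadowing name so the built-in becomes visible again; restarting the kernel has the same effect.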

As an alternative, you can also use the glob module for this and other related tasks. I have created a test directory containing 3 subdirectories, l, m and k, each containing 3 test files.

import os, glob

counts = []
path = "test"  # use "." to scan the current directory

# os.walk visits path and every directory below it
for root, dirs, files in os.walk(path, topdown=True):
    for name in dirs:
        # glob's '*' matches every non-hidden entry in the subdirectory
        counts.append(len(glob.glob(os.path.join(root, name, '*'))))

print(counts)

Output:

[3, 3, 3]
Grayrigel
  • Your solution is the closest to what I need :) It creates two lists inside a list: the first with the full path and name of each file, and the second with the number of files in each directory. I only need the second list, and I know how to delete the first. But how can I directly create a simple list with just the number of files in each subdirectory (without paths and directory names)? – Abel Aug 28 '20 at 09:54
  • @Abel I have updated the code. You need `path` because you need to start somewhere. You can leave it to be `"."` for the current directory. Then, `os.walk` will do the job for you. – Grayrigel Aug 28 '20 at 11:33
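
For the follow-up in the comments (a flat list with one count per immediate subdirectory), here is one possible sketch. It swaps in `os.scandir` instead of the `os.walk` and `glob` combination used in this answer, so it is an alternative rather than the answer's own method. The function name `count_files_per_subdir` and the demo tree are hypothetical.

```python
import os
import tempfile


def count_files_per_subdir(path):
    """Return one file count per immediate subdirectory of path, sorted by name."""
    counts = []
    with os.scandir(path) as entries:
        subdirs = sorted(entry.path for entry in entries if entry.is_dir())
    for subdir in subdirs:
        with os.scandir(subdir) as entries:
            # is_file() skips nested directories, so only regular files count
            counts.append(sum(1 for entry in entries if entry.is_file()))
    return counts


# Throwaway tree: each subdirectory also holds a nested folder that is ignored
with tempfile.TemporaryDirectory() as root:
    for name, n_files in (("k", 3), ("l", 1), ("m", 2)):
        os.makedirs(os.path.join(root, name, "nested"))
        for i in range(n_files):
            open(os.path.join(root, name, f"test{i}.txt"), "w").close()
    mylist = count_files_per_subdir(root)

print(mylist)
```

Because only the first level is scanned, the result contains no paths and no counts for deeper folders, which matches the plain `mylist` format shown in the question.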