5

I am trying to print each file extension found in a certain directory, along with the number of files that have that extension.

This is what I have so far...

import os 
import glob

os.chdir(r"C:\Python32\test")
x = glob.glob("*.*")
for i x:
    print(i)

>>> file1.py
    file2.py
    file3.py
    file4.docx
    file5.csv

I am stuck at this point. I need my overall output to be:

py    3
docx  1
csv   1

I have tried to use something like i.split("."), but I get stuck. I think I need to put the extension in a list and then count the list, but that is where I am running into problems.

Thanks for the help.

Trying_hard
  • Make a new empty dictionary. If the extension isn't in it yet, add a new entry and set the value to 1; if it already exists, increment it by 1 – TheZ Oct 22 '12 at 16:42
  • Are you sure you don't get a `SyntaxError` running the above code? – Joel Cornett Oct 22 '12 at 16:43
  • possible duplicate of [Count number of files with certain extension in Python](http://stackoverflow.com/questions/1320731/count-number-of-files-with-certain-extension-in-python) – dbn Feb 28 '14 at 02:02
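
The dictionary approach described in TheZ's comment could be sketched roughly as follows. The helper name count_extensions and the sample filename list are assumptions made for illustration; in the question the names would come from glob.glob("*.*").

```python
import os

def count_extensions(filenames):
    """Count file extensions with a plain dict (hypothetical helper)."""
    counts = {}
    for filename in filenames:
        # splitext returns (root, ext); ext keeps its leading dot, e.g. ".py"
        ext = os.path.splitext(filename)[1]
        if ext in counts:
            counts[ext] += 1   # extension seen before: increment
        else:
            counts[ext] = 1    # first time we see this extension
    return counts

# Assumed sample data mirroring the listing in the question.
sample = ["file1.py", "file2.py", "file3.py", "file4.docx", "file5.csv"]
counts = count_extensions(sample)
for ext, n in counts.items():
    # Strip the leading dot to match the format asked for in the question.
    print(ext.lstrip("."), n)
```

The if/else can be shortened to counts[ext] = counts.get(ext, 0) + 1, which is essentially what collections.Counter does for you.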

4 Answers

11

Use os.path.splitext to find the extension, and use collections.Counter to count the types of extensions.

import os 
import glob
import collections

dirpath = r"C:\Python32\test"
os.chdir(dirpath)
cnt = collections.Counter()
for filename in glob.glob("*"):
    name, ext = os.path.splitext(filename)
    cnt[ext] += 1
print(cnt)
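
As a side note on what ends up as the Counter keys, here is a small sketch of how os.path.splitext treats a few kinds of names. The filenames are made up for illustration and do not come from the question.

```python
import os

# Ordinary name: the extension keeps its leading dot.
print(os.path.splitext("file1.py"))        # ('file1', '.py')

# Several dots: only the last suffix counts as the extension.
print(os.path.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')

# No dot at all: the extension is the empty string.
print(os.path.splitext("README"))          # ('README', '')

# A leading dot is treated as part of the name, not as an extension.
print(os.path.splitext(".bashrc"))         # ('.bashrc', '')
```

So with glob.glob("*"), files without an extension (and any subdirectories, which the pattern also matches) are counted under the empty-string key.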
unutbu
2

You could use collections.Counter, where your_list is your list of filenames:

from collections import Counter
import os
ext_count = Counter((ext for base, ext in (os.path.splitext(fname) for fname in your_list)))
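
For a quick check, the same idea can be run against a hard-coded list. The contents of your_list below are an assumption that mirrors the listing in the question, and the generator is written in a slightly shorter form than the one above.

```python
from collections import Counter
import os

# Assumed sample input; normally this would come from glob.glob("*.*").
your_list = ["file1.py", "file2.py", "file3.py", "file4.docx", "file5.csv"]

# Take the extension (second element of splitext) of every filename
# and let Counter tally how often each one occurs.
ext_count = Counter(os.path.splitext(fname)[1] for fname in your_list)

print(ext_count[".py"])    # 3
print(ext_count[".docx"])  # 1
print(ext_count[".txt"])   # 0, missing keys count as zero
```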
Jon Clements
2
import collections
import os

def get_file_format_count():
    # Walk the current directory and all of its subdirectories,
    # counting how many files have each extension.
    cnt = collections.Counter()
    for root_dir, sub_dirs, files in os.walk("."):
        for filename in files:
            name, ext = os.path.splitext(filename)
            cnt[ext] += 1
    return cnt

print(get_file_format_count())
Mubarak
0

This implementation counts the occurrences of each extension and stores the result in the variable c. Iterating over the counter's most_common method yields the most frequent extensions first, as in your example output.

from os.path import join, splitext
from glob import glob
from collections import Counter

path = r'C:\Python32\test'

c = Counter([splitext(i)[1][1:] for i in glob(join(path, '*'))])
for ext, count in c.most_common():
    print(ext, count)

Output:

py 3
docx 1
csv 1
Marwan Alsabbagh