3

I have a folder that contains 4 different kinds of files. For example:

Type 1: 00001_a.png

Type 2: 00231_b.mat

Type 3: 00001_c.jpg

Type 4: 00001_c.png

How can I filter these files into 4 lists? My current solution can only filter based on file extension.

import os

all_file = next(os.walk(input_path))[2]  # get files only
list_one = [fi for fi in all_file if fi.endswith("*.png")]  # "*_a.png" won't work
PM 2Ring
trminh89

2 Answers

2

Consider a regex solution that lists the directory with the os module's listdir():

import os, re

# Directory of the running script (or enter a path manually)
cd = os.path.dirname(os.path.abspath(__file__))

# Raw strings with an escaped dot, so "." matches only a literal period
a_pngfiles = [f for f in os.listdir(cd) if re.match(r"^.*_a\.png$", f)]
b_matfiles = [f for f in os.listdir(cd) if re.match(r"^.*_b\.mat$", f)]
c_jpgfiles = [f for f in os.listdir(cd) if re.match(r"^.*_c\.jpg$", f)]
c_pngfiles = [f for f in os.listdir(cd) if re.match(r"^.*_c\.png$", f)]
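In a regex, `.` matches any character unless it is escaped, so a pattern such as `_a.png` would also accept a name like `00002_axpng`. As a minimal sketch (not part of the original answer), the four scans could also be folded into a single pass with one compiled pattern. The sample names, the `pattern` and `groups` variables, and the key format are assumptions made for illustration:

```python
import re

# Hypothetical sample names standing in for os.listdir(cd)
names = ["00001_a.png", "00231_b.mat", "00001_c.jpg", "00001_c.png", "00002_axpng"]

# Capture the suffix letter and the extension; the dot is escaped so it
# matches only a literal period
pattern = re.compile(r"^.*_([a-z])\.(png|mat|jpg)$")

groups = {}
for name in names:
    m = pattern.match(name)
    if m:
        # Build a key such as "a_png" from the two captured groups
        key = "{}_{}".format(m.group(1), m.group(2))
        groups.setdefault(key, []).append(name)
```

Here `groups["c_png"]` holds the `_c.png` files, and the malformed `00002_axpng` is skipped because it contains no literal dot.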
Parfait
1

Just omit the asterisk (*) in endswith() and it will work as expected, e.g. fi.endswith('_a.png').
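A short sketch of that fix, assuming a hypothetical `all_file` list in place of the directory listing. `str.endswith` compares literal text and also accepts a tuple of suffixes; if shell-style wildcards are preferred, the standard library's `fnmatch.filter` understands `*`:

```python
import fnmatch

# Hypothetical sample names standing in for the asker's all_file list
all_file = ["00001_a.png", "00231_b.mat", "00001_c.jpg", "00001_c.png"]

# endswith compares literal text, so the asterisk must be left out
list_a_png = [fi for fi in all_file if fi.endswith("_a.png")]

# endswith also accepts a tuple of suffixes
list_c_images = [fi for fi in all_file if fi.endswith(("_c.jpg", "_c.png"))]

# fnmatch.filter understands shell-style wildcards such as "*"
list_b_mat = fnmatch.filter(all_file, "*_b.mat")
```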

A better solution, which avoids hard-coding the supported types, is to group the files by the suffix after the last underscore plus the extension:

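import os
from collections import defaultdict

def get_file_type(filename):
    # "00001_a.png" -> "a.png": text after the last underscore, plus extension
    base, ext = os.path.splitext(filename)
    # [-1] avoids an IndexError when the name contains no underscore
    return base.rsplit('_', 1)[-1] + ext

# Map each file type to the list of file names of that type
files_by_type = defaultdict(list)
for filename in os.listdir(input_path):
    filetype = get_file_type(filename)
    files_by_type[filetype].append(filename)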
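To illustrate what the grouping produces, here is a self-contained sketch that runs the same idea over a hypothetical list of names instead of a real directory. The `group_by_type` helper and the `sample` list are assumptions for illustration, and indexing with `[-1]` keeps names without an underscore from raising an IndexError:

```python
import os
from collections import defaultdict

def group_by_type(filenames):
    """Group names by the text after the last underscore plus the extension."""
    groups = defaultdict(list)
    for filename in filenames:
        base, ext = os.path.splitext(filename)
        groups[base.rsplit('_', 1)[-1] + ext].append(filename)
    return groups

# Hypothetical sample names modeled on the question
sample = ["00001_a.png", "00002_a.png", "00231_b.mat", "00001_c.jpg", "00001_c.png"]
files_by_type = group_by_type(sample)

print(sorted(files_by_type))   # ['a.png', 'b.mat', 'c.jpg', 'c.png']
print(files_by_type["a.png"])  # ['00001_a.png', '00002_a.png']
```

The four lists the question asks for are then `files_by_type["a.png"]`, `files_by_type["b.mat"]`, and so on, and a new file type shows up as a new key without any code change.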
taleinat