I am trying to group the .xls files in a list, infiles, based on strings in their file names.

The file names are formatted like "type_d_cross_profile_glacier_name_A-Z", where type_d is the type of glacier environment, glacier_name is the name of the glacier, and A-Z is a letter identifying the cross profile (there are multiple cross profiles per glacier in each type, and there are not always 26 of them).

I would like to group the files first by type (type_a to type_d) and then by glacier name, so that the A-Z cross profiles for each glacier are all grouped together. I think I have to use groupby, but I can't work out how to use its key and group values when I want to group by two different strings.

I have used a longhand version to group by type:

type_a = [a for a in infiles if "type_a" in a]
type_b = [b for b in infiles if "type_b" in b]
type_c = [c for c in infiles if "type_c" in c]
type_d = [d for d in infiles if "type_d" in d]

This works fine, but I am sure there is a more elegant way to group by type and then by glacier. P.S. I'm relatively new to Python and have ADHD, so I find multi-level things really difficult to comprehend; I really appreciate any help!

1 Answer

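Use a dict that maps each type prefix (the text before the second underscore, e.g. type_a) to the list of files that start with it.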

types = {}

for f in infiles:
    prefix = '_'.join(f.split('_', 2)[:2]) # could also use regex
    types.setdefault(prefix, []).append(f)
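
The same idea can probably be extended to the second level (glacier name) with a dict of dicts. This is only a sketch: it assumes every file name contains the literal text "_cross_profile_" and ends in "_<letter>.xls", and the file names and the function name group_by_type_and_glacier below are made up for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Assumed separator between the type and the glacier name.
MARKER = "_cross_profile_"


def group_by_type_and_glacier(infiles):
    """Return {type: {glacier: [files sorted by name]}}."""
    groups = defaultdict(lambda: defaultdict(list))
    for f in infiles:
        stem = Path(f).stem  # drop any directory and the ".xls" extension
        # "type_a_cross_profile_north_glacier_B" -> "type_a", "north_glacier_B"
        glacier_type, _, rest = stem.partition(MARKER)
        # "north_glacier_B" -> "north_glacier", "B" (split at the LAST underscore)
        glacier, _, letter = rest.rpartition("_")
        if not glacier or not letter:
            continue  # name does not match the assumed pattern; skip it
        groups[glacier_type][glacier].append(f)
    # Sort each list so the cross profiles come out in A-Z order.
    for glaciers in groups.values():
        for files in glaciers.values():
            files.sort()
    return groups


# Hypothetical file names, for illustration only.
infiles = [
    "type_a_cross_profile_north_glacier_B.xls",
    "type_a_cross_profile_north_glacier_A.xls",
    "type_a_cross_profile_south_glacier_A.xls",
    "type_d_cross_profile_east_glacier_A.xls",
    "notes.xls",
]

groups = group_by_type_and_glacier(infiles)

for glacier_type, glaciers in groups.items():
    for glacier, files in glaciers.items():
        print(glacier_type, glacier, files)
```

With this layout, groups["type_a"]["north_glacier"] would hold all cross profiles for that one glacier. Splitting at the last underscore is what lets glacier names contain underscores themselves.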
   
timgeb
  • Thanks! Just to make sure I understand: this splits on the underscores and keeps everything up to the second underscore? And then, using the dict 'type', you separate the infiles into lists organised by type (identified by the split)? (I just want to make sure I understand so I can improve how I approach problems.) – debris_glaciers Mar 21 '22 at 14:09
  • @debris_glaciers correct, except that the dict is named `types` – timgeb Mar 21 '22 at 14:32