Is there a way to find file names with numbers that are not consecutive? More specifically, I'm looking to list filenames with these numbers included:

path +'*.s201701*.nc'
path +'*.s201801*.nc'
path +'*.s201901*.nc'
path +'*.s201702*.nc'
path +'*.s201802*.nc'
path +'*.s201902*.nc'
path +'*.s201712*.nc'
path +'*.s201812*.nc'
path +'*.s201912*.nc' 

I can get the changes in '2017' to '2019' since the numbers are consecutive, but not the '01', '02', '12', because these aren't. This doesn't work:

glob.glob(path +'*.s201[7-9][01,02,12]*.nc')

And this works,

glob.glob(path +'*.s201[7-9][0-1][1-2]*.nc')

but also gives me files in s201*11*.nc, which I don't want. Any tips?

Cynthia GS

2 Answers

You can't do this with a single glob - the language just isn't sophisticated enough - but you can do it with two:

glob.glob(path +'*.s201[7-9]0[1-2]*.nc') + glob.glob(path +'*.s201[7-9]12*.nc')
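
If the list of months grows, the same idea (one glob per pattern, results concatenated) can be written as a loop. This is only a sketch: `glob_year_months` is a hypothetical helper name, and it assumes `path` is a directory that can be joined with `os.path.join`, rather than a prefix string ending in a separator as in the question.

```python
import glob
import os


def glob_year_months(path, years=("2017", "2018", "2019"), months=("01", "02", "12")):
    """Run one glob per year/month pair and merge the results.

    Hypothetical helper: the default years and months are the ones
    listed in the question.
    """
    matches = set()  # a set avoids duplicates if patterns ever overlap
    for year in years:
        for month in months:
            pattern = os.path.join(path, f"*.s{year}{month}*.nc")
            matches.update(glob.glob(pattern))
    return sorted(matches)
```

Calling `glob_year_months(path)` should return the same files as the two-glob expression above, just sorted and without duplicates.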
Nathan Vērzemnieks

You could check for repeated digits by running a regex over the results from os.listdir. I made a sample file whose name contains repeated digits and put it in the same directory as the script. The first list comprehension below filters that file out, so it returns an empty list. Removing the 'not' returns the offending file name.

import os
import re

files = [f for f in os.listdir(path) if not re.search(r'(\d)\1+\b', f)]

print(files)
[]

Removing the 'not' to find repeat numbers:

files = [f for f in os.listdir(path) if re.search(r'(\d)\1+\b', f)]
print(files)
['s201911.txt']
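
The same listdir-plus-regex approach can also spell out the wanted months directly, since a regex supports alternation where glob does not. The following is a sketch, not a tested drop-in: `MONTH_PATTERN` and `filter_by_month` are hypothetical names, and the pattern assumes file names of the form shown in the question (`.s` followed by year and month, ending in `.nc`).

```python
import re

# Years 2017-2019, months 01, 02 or 12, anything after, then a .nc extension.
MONTH_PATTERN = re.compile(r"\.s201[7-9](?:01|02|12).*\.nc$")


def filter_by_month(filenames):
    """Keep only the names whose .sYYYYMM stamp matches the wanted months."""
    return [f for f in filenames if MONTH_PATTERN.search(f)]
```

It would be used as `filter_by_month(os.listdir(path))`. Note that os.listdir returns bare file names, so the directory has to be joined back on before opening the files.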
Chris