22

I have a folder with subfolders that are all in the pattern YYYYMMDDHHMMSS (timestamp).

I want to use glob to only select the folders that match that pattern.

mikec
  • 3,543
  • 7
  • 30
  • 34

1 Answers1

31

Since glob doesn't support regular expressions, you'll have to brute-force creating the match string. One way is to take advantage of the fact that character ranges in [] are expanded:

C:\temp\py>mkdir 12345678901234

C:\temp\py>C:\Python26\python.exe
Python 2.6.2 Stackless 3.1b3 060516 (release26-maint, Apr 14 2009, 21:19:36) [M
C v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob
>>> glob.glob('./' + ('[0-9]' * 14))
['.\\12345678901234']
>>>

I took advantage of the fact that in Python, multiplying a string with an integer n results in that string being repeated n times.

Of course, you might want to go ahead and put in a check to verify that the given path is actually a directory:

>>> [path for path in glob.iglob('./' + ('[0-9]' * 14))]
['.\\11223344556677', '.\\12345678901234']
>>> [path for path in glob.iglob('./' + ('[0-9]' * 14)) if os.path.isdir(path)]
['.\\12345678901234']
Mark Rushakoff
  • 249,864
  • 45
  • 407
  • 398
  • Thanks for the reply. For now I was using this: [0-9][0-9][0-9][0-9][0-1][0-9][0-3][0-9][0-2][0-9][0-2][0-9][0-6][0-9] Which basically has the rules for the format I described (limiting months, days, hours minutes to their respective ranges), I just wasn't sure if there was a better way of doing it. – mikec Apr 22 '10 at 16:51
  • 9
    @mikec: It might be simpler to stick with `'[0-9]' * 14` and then only accept the timestamps which can be successfully parsed with `datetime.strptime`, if you *really* need to ensure that all the timestamps are valid. – Mark Rushakoff Apr 22 '10 at 16:57