I'm using os.walk to select files from a specific folder which match a regular expression.
import os
import re
for dirpath, dirs, files in os.walk(str(basedir)):
    # keep only the files whose full path matches the regex
    files[:] = [f for f in files if re.match(regex, os.path.join(dirpath, f))]
    print(dirpath, dirs, files)
But this has to process all files and folders under basedir, which is quite time-consuming. I'm looking for a way to use the same regular expression that I use for the files to also filter out unwanted directories at each step of the walk, or a way to match only part of the regex.
For example, in a structure like
/data/2013/07/19/file.dat
using e.g. the following regular expression
/data/(?P<year>2013)/(?P<month>07)/(?P<day>19)/(?P<filename>.*\.dat)
I would like to find all .dat files without needing to look into e.g. /data/2012 at all.
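To illustrate the kind of pruning I mean, here is a rough sketch. It rests on assumptions of mine that may not hold in general: the pattern is written relative to basedir (so without the leading /data/), and no sub-pattern contains a '/', so the pattern can be split into one regex per directory level. The name walk_matching is hypothetical.

```python
import os
import re


def walk_matching(basedir, pattern):
    """Yield paths of files under basedir that match pattern.

    Assumptions (mine): pattern is relative to basedir and no
    sub-pattern contains '/', so it splits into one regex per level.
    """
    # One anchored regex per path component, e.g. year / month / day / file.
    parts = [re.compile("(?:%s)\\Z" % p) for p in pattern.strip("/").split("/")]
    last = len(parts) - 1
    basedir = str(basedir)

    for dirpath, dirs, files in os.walk(basedir):
        rel = os.path.relpath(dirpath, basedir)
        depth = 0 if rel == os.curdir else len(rel.split(os.sep))

        if depth < last:
            # Prune in place: os.walk only descends into what is left in dirs.
            dirs[:] = [d for d in dirs if parts[depth].match(d)]
        else:
            # Nothing deeper than the file level can match.
            dirs[:] = []

        if depth == last:
            for f in files:
                if parts[last].match(f):
                    yield os.path.join(dirpath, f)
```

With basedir set to /data and the pattern (?P<year>2013)/(?P<month>07)/(?P<day>19)/(?P<filename>.*\.dat), this would never descend into /data/2012, because the directory is removed from dirs before os.walk gets to it. It breaks down as soon as a group is allowed to span several path components, which is why I am asking whether there is a more general way.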