I am trying to develop a CNN for image processing. I have about 130 GB of data stored on a separate drive on my computer, and I'm having trouble getting a simple Python search program to search through that directory. I'm trying to have it find a bunch of XML files scattered across many levels of subdirectories on that drive. How do I specify, for just this one Python program, the directory it should search in, keeping the setting local to the program?

I've tried setting a variable Path = "B:\\MainFolder\SubFolder" and using os.walk, but it makes it through the first directory and then stops.

Eduardo Pascual Aseff
blit
  • glob won't even recognize the path for some reason. – blit Feb 21 '20 at 01:51

2 Answers

Can you try the following:

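import glob
import os

# Replace with the root directory you want to search.
base_dir = 'your/start/directory'
# '**' combined with recursive=True matches base_dir and every nested subdirectory.
req_files = glob.glob(os.path.join(base_dir, '**/*.xml'), recursive=True)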
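
A slightly fuller sketch of the same idea, in case it helps with the "glob won't recognize the path" symptom. The function name find_xml_files and the up-front directory check are assumptions added for illustration, not part of the snippet above. glob returns an empty list for a path it cannot find, so checking the directory first makes a mistyped or unavailable path show up as an error instead of a silent empty result.

```python
import glob
import os


def find_xml_files(base_dir):
    """Return paths of all .xml files under base_dir, at any depth."""
    # Fail early if the directory is misspelled or the drive is unavailable;
    # glob on its own would just return an empty list.
    if not os.path.isdir(base_dir):
        raise FileNotFoundError(f"Not a directory: {base_dir}")
    # '**' only descends into subdirectories when recursive=True is passed.
    pattern = os.path.join(base_dir, '**', '*.xml')
    return sorted(glob.glob(pattern, recursive=True))
```

On Windows, passing the path as a raw string, for example find_xml_files(r'B:\MainFolder\SubFolder'), keeps the backslashes from being read as escape sequences.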
Jeril

Jeril and Eduardo, thank you for the help. I took a shot at pathlib and it worked. I don't know what was wrong with my glob code; it looked basically the same as yours, Jeril:

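from pathlib import Path

# Raw string so the backslashes in the Windows path are taken literally.
root = Path(r'B:\CTImageDataset\LIDC-IDRI')

filelist = []
# rglob searches root and every nested subdirectory.
for path in root.rglob('*.xml'):
    filelist.append(path.name)  # file name only; append path itself to keep the full location

# Print once, after the search has finished.
print(filelist)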

Worked great, thanks again.
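
For comparison, here is a minimal sketch of how the os.walk approach mentioned in the question can cover every subdirectory. The original os.walk code was not posted, so this is only a guess at the cause: it assumes the early stop came from leaving the loop (a return or break) after the first tuple that os.walk yields. The function name walk_xml_files is hypothetical.

```python
import os


def walk_xml_files(root_dir):
    """Collect full paths of .xml files by walking root_dir top-down."""
    matches = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory,
    # so the loop has to run to completion to reach the nested folders.
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith('.xml'):
                matches.append(os.path.join(dirpath, name))
    return matches
```

Unlike the pathlib loop above, this keeps the full path of each file, which is usually what a later loading step needs.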

U13-Forward
blit