As explained in the documentation, this is because you need to provide tqdm with a progress indicator, i.e. a total to measure progress against. Depending on what you do with your files, you can use either the file count or the total file size.
Other answers suggested converting the os.walk() generator into a list so that it gets a __len__ method. However, this can cost you a lot of memory, depending on the total number of files you have.
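To make that trade-off concrete, here is a small sketch using only the standard library. The scratch directory and file names are made up for illustration. It shows that a generator has no length (which is why tqdm cannot infer a total from it), while a list does, at the price of holding every entry in memory:

```python
import os
import tempfile

# Hypothetical scratch tree, only here to keep the example self-contained.
root = tempfile.mkdtemp()
for name in ("a.txt", "b.txt", "c.txt"):
    with open(os.path.join(root, name), "w") as fh:
        fh.write("x")

walker = os.walk(root)
try:
    len(walker)  # generators do not implement __len__
    has_len = True
except TypeError:
    has_len = False

# Materialising the walk gives a length, but stores every tuple in memory.
entries = list(os.walk(root))
n_dirs = len(entries)  # one (dirpath, dirs, files) tuple per directory
n_files = sum(len(files) for _, _, files in entries)  # files across all directories
```

Note that the length of the list is the number of directories visited, not the number of files, so it is only a rough progress measure if you are working on files.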
Another possibility is to precompute: first walk your whole file tree once and count the files (keeping only the count, not the list of files), then walk it again and give tqdm the file count you precomputed:
import os
from time import sleep

from tqdm import tqdm

def walkdir(folder):
    """Walk through every file in a directory tree."""
    for dirpath, dirs, files in os.walk(folder):
        for filename in files:
            yield os.path.abspath(os.path.join(dirpath, filename))

# Precomputing files count (target_dir is the folder you want to scan).
# This bar has no total yet, so it only shows a running counter.
filescount = 0
for _ in tqdm(walkdir(target_dir)):
    filescount += 1

# Computing for real
for filepath in tqdm(walkdir(target_dir), total=filescount):
    sleep(0.01)
    # etc...
Notice that I defined a wrapper function over os.walk: since you are working on files and not on directories, it's better to define a function that progresses over files rather than over directories.
However, you can get the same result without the walkdir wrapper, but it is a bit more complicated, as you have to resume the progress bar from its last state after each subfolder is traversed:
# Precomputing
filescount = 0
for dirPath, subdirList, fileList in tqdm(os.walk(target_dir)):
    filescount += len(fileList)

# Computing for real
last_state = 0
for dirPath, subdirList, fileList in os.walk(target_dir):
    sleep(0.01)
    dirName = dirPath.split(os.path.sep)[-1]
    # A new bar is created per directory, starting where the last one stopped
    for fname in tqdm(fileList, total=filescount, initial=last_state):
        # do whatever you want here...
        pass
    # Update last state to resume the progress bar
    last_state += len(fileList)
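If you would rather track file sizes than the file count, the same two-pass idea applies. The following is only a sketch under a few assumptions: the helper names (iter_files, total_bytes, process_by_size) and the handle callback are hypothetical, every file is assumed to be readable, and the tree is assumed not to change between the two passes. It uses a single bar created with total= and advanced manually with update(), so there is no state to carry over between subfolders:

```python
import os

def iter_files(folder):
    """Yield the absolute path of every file under folder."""
    for dirpath, _dirs, files in os.walk(folder):
        for filename in files:
            yield os.path.abspath(os.path.join(dirpath, filename))

def total_bytes(folder):
    """First pass: sum the file sizes without storing the paths."""
    return sum(os.path.getsize(path) for path in iter_files(folder))

def process_by_size(folder, handle=lambda path: None):
    """Second pass: one bar for the whole tree, advanced by bytes handled."""
    # Third-party import kept local so the helpers above work without tqdm.
    from tqdm import tqdm

    done = 0
    with tqdm(total=total_bytes(folder), unit="B", unit_scale=True) as pbar:
        for path in iter_files(folder):
            handle(path)  # hypothetical per-file work
            size = os.path.getsize(path)
            pbar.update(size)  # advance the bar by this file's size
            done += size
    return done
```

Calling process_by_size(target_dir, handle=my_function) would then show progress in bytes, which tends to be a better estimate when the per-file work scales with file size.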