-3

The code can run normally, but I feel the "output_file" is too long, so any way improve it?

#load input files from input_dir(include child folders)
#and create output files to output_dir(include SAME child folders)
for root,dirs,files in os.walk(input_dir):
    for file in files:
        if file.endswith('.png'):
            input_file = os.path.join(root,file)
            output_file = output_dir+os.path.basename(root)+"\\"+file
print(input_file)  
print(output_file) 

output:
C:\savepic\input\a\1.png
C:\savepic\output\a\1.png
C:\savepic\input\a\2.png
C:\savepic\output\a\2.png
C:\savepic\input\b\1.png
C:\savepic\output\b\1.png
Junior
  • 27
  • 7
  • 1
    What do you mean? A shorter way to write `output_file = output_dir+os.path.basename(root)+"\\"+file`? What's the point, it's fairly clear the way it is? – Grismar Jan 24 '22 at 02:00
  • 1
    Welcome to Stack Overflow. Please take the [tour] and read about what's on-topic in the [help/on-topic]. We're not here to do open-ended code review. _Specifically_, what do you mean by 'I feel the "output_file" is too long'? – ChrisGPT was on strike Jan 24 '22 at 02:00
  • @Grismar Yes, I feel it is incorrect style... – Junior Jan 24 '22 at 02:07
  • Interesting the question uses the word "feel." Python is about "feel", see PEP20 the Zen of python. https://www.python.org/dev/peps/pep-0020/. The variable input_file code feels more pythonic than output_file code. Using os.path.join again for output_file may feel more pythonic. – Carl_M Jan 24 '22 at 02:10

2 Answers2

1

Split it up across multiple lines using os.path.join again:

os.path.join(
     output_dir,
     os.path.basename(root),
     file
)

Note: I can't tell from your original code if this will give exactly the desired output but you get the gist of the style

Pepsi-Joe
  • 447
  • 1
  • 10
1

Since your question is about picking the best style, it's somewhat opinion-based - which is dangerous territory for StackOverflow, but I think there's something to be said for following the conventions of the most current version of the language:

from pathlib import Path

# input_dir and output_dir are defined somewhere
input_dir = '.'
output_dir = 'c:/temp'

# the 'improved' code
for input_file in Path(input_dir).glob('**/*'):
    if input_file.is_file() and input_file.suffix.lower() == '.png':
        output_file = Path(output_dir) / input_file.parent.name / input_file.name
        # do something with input_file and output_file
        print(input_file, output_file)

For pure performance, your original solution may be a bit faster, but I think this beats it for readability and robustness.

Note that I also added a .lower() to the suffix check, as you would be missing files that have '.PNG' for an extension.

As @Pepsi-Joe correctly pointed out, if you are really only after .pngfiles, this is an optimisation that could be considered even cleaner:

for input_file in Path(input_dir).glob('**/*.png'):
    if input_file.is_file():
        output_file = Path(output_dir) / input_file.parent.name / input_file.name
        # do something with input_file and output_file
        print(input_file, output_file)

You still need to check for files, as a folder could also be named something.png of course. .glob() is case-insensitive, so it will still work for something.PNG as well.

The previous solution makes it a bit easier to add file formats:

for input_file in Path(input_dir).glob('**/*'):
    if input_file.is_file() and input_file.suffix.lower() in ['.png', '.jpg']:
        output_file = Path(output_dir) / input_file.parent.name / input_file.name
        # do something with input_file and output_file
        print(input_file, output_file)
Grismar
  • 27,561
  • 4
  • 31
  • 54
  • I'm sorry that the solution you provided didn't work well in my code because I didn't provide the full code, but I really appreciate your new solution. – Junior Jan 24 '22 at 03:32
  • Hey @Grismar, is there any reason to use your logic/style for finding a file over `for file in glob.glob('some/path/*.png')`? or even `Path(input_dir).glob('**/*.png')`? – Pepsi-Joe Jan 24 '22 at 03:48
  • 1
    @Pepsi-Joe I'd use the glob on Path, if Path is being used anyway; also, `glob.glob` treats names starting with `.` specially, which you may not want. https://docs.python.org/3/library/glob.html#module-glob I do agree that `Path(input_dir).glob('**/*.png')` is a lot nicer in case that truly only `.png` are matched. I considered it, but figured the code is easier to expand upon as it is. I may add a remark to the answer though, thanks. – Grismar Jan 24 '22 at 04:39