
My folder structure looks like this:

- 95000
- 95002
- 95009
- AR_95000.pdf
- AR_95002.pdf
- AR_95009.pdf
- BS_95000.pdf
- BS_95002.pdf
- BS_95009.pdf

[Note: 95000, 95002 and 95009 are folders.]


My goal is to move the files AR_95000.pdf and BS_95000.pdf to the folder named 95000, then AR_95002.pdf and BS_95002.pdf to the folder named 95002, and so on.

The PDFs are reports generated by a system, so I cannot control the naming.

  • check this: https://stackoverflow.com/questions/49893501/python-moving-files-to-folder-based-on-filenames or https://stackoverflow.com/questions/35510922/python-can-i-move-a-file-based-on-part-of-the-name-to-a-folder-with-that-name – Eugene Anufriev Aug 19 '20 at 13:51

1 Answer

Using pathlib this task becomes super easy:

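from pathlib import Path

root = Path("/path/to/your/root/dir")

for file in root.glob("*.pdf"):
    # "AR_95000" -> "95000": keep everything after the last underscore
    folder_name = file.stem.rpartition("_")[-1]
    # Move the file into the folder of that name (the folder must already exist)
    file.rename(root / folder_name / file.name)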

As you can see, one main advantage of pathlib over os/shutil (in this case) is the interface that Path objects provide directly to os-like functions. This way the actual moving (rename()) is done directly as an instance method.
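If some PDFs might not have a matching folder, a slightly more defensive variant could look like the sketch below. The function name `move_pdfs_into_folders` and the choice to skip unmatched files (rather than create folders for them) are assumptions for illustration, not part of the original answer.

```python
import tempfile
from pathlib import Path


def move_pdfs_into_folders(root):
    """Move each <PREFIX>_<ID>.pdf in root into root/<ID>; return the new paths."""
    moved = []
    # Take a snapshot of the matches so the directory is not modified mid-iteration
    for file in list(root.glob("*.pdf")):
        # "AR_95000" -> ("AR", "_", "95000"); [-1] is the part after the last "_"
        folder_name = file.stem.rpartition("_")[-1]
        target_dir = root / folder_name
        if not target_dir.is_dir():
            # Assumption: leave files without a matching folder where they are
            continue
        destination = target_dir / file.name
        file.rename(destination)
        moved.append(destination)
    return moved


if __name__ == "__main__":
    # Try it on a throwaway directory that mimics the layout in the question
    with tempfile.TemporaryDirectory() as tmp:
        demo_root = Path(tmp)
        (demo_root / "95000").mkdir()
        (demo_root / "AR_95000.pdf").touch()
        (demo_root / "BS_95000.pdf").touch()
        for path in sorted(move_pdfs_into_folders(demo_root)):
            print(path.relative_to(demo_root))
```

Note that `rename()` behaves differently when the destination file already exists: on Unix it is replaced silently, while on Windows a `FileExistsError` is raised.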


  • This worked as you described: it's fast, and all the files were moved into their respective folders. I was fiddling with shutil but to no avail, and I kept getting errors. Having said that, if you get the time, could you explain how the above code works? I get the sense that you split the file names on the underscore [ _ ], but what does the next part of the code, `file.stem.rsplit("_", 1)[-1]`, do? – Drp RD Aug 19 '20 at 14:18
  • @DrpRD if you are familiar with the [`split`](https://docs.python.org/3/library/stdtypes.html#str.split) method, then [`rsplit`](https://docs.python.org/3/library/stdtypes.html#str.rsplit) basically does the same, but once you give it the extra `maxsplit` argument, the splits are counted from the right. This way, because we only want the last part after `_` (assuming `AB_CD_9500` is also an option), we only split once from the right: `rsplit('_', 1)`. This returns `['AR', '95000']`, and by indexing with `[-1]` we take the last element (`95000`) – Tomerikoo Aug 19 '20 at 15:16
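
To make the difference concrete, here is a small sketch comparing `rsplit` with `rpartition` on the stems discussed above. The helper names `last_part_rsplit` and `last_part_rpartition` are made up for this illustration.

```python
def last_part_rsplit(stem):
    # Split at most once, counting from the right, and keep the last piece
    return stem.rsplit("_", 1)[-1]


def last_part_rpartition(stem):
    # rpartition always returns a 3-tuple: (before, separator, after)
    return stem.rpartition("_")[-1]


print("AR_95000".rsplit("_", 1))          # ['AR', '95000']
print("AR_95000".rpartition("_"))         # ('AR', '_', '95000')
print(last_part_rsplit("AB_CD_95000"))    # 95000
print("AB_CD_95000".split("_", 1)[-1])    # CD_95000 (split counts from the left)
print(last_part_rpartition("95000"))      # 95000 (no underscore: whole stem)
```

Both approaches give the same folder name here; `rpartition` simply avoids the extra `maxsplit` argument.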