1

I've got a folder/sub-directory structure as follows:

-main_folder
    -sub_1
         322.txt
         024.ops
    -sub_2
         977.txt
         004.txt
    -sub_3
         396.xml
         059.ops

I'm trying to iterate with os.walk through the folder and its sub-directories and collect the file names inside them. When a name matches a regex rule, I want to either store its path in a list or move the file directly into a new folder (created with mkdir).

I've already written the regexes that find the documents I want. For example:

find_000_099 = r'\b(0\d{2}.\w{1,4})'
find_300_399 = r'\b(3\d{2}.\w{1,4})'
find_900_999 = r'\b(9\d{2}.\w{1,4})'
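
As a quick sanity check, the sketch below runs these three patterns against the file names from the tree above. The `classify` helper and its labels are hypothetical names introduced only for illustration. It also shows one thing that may be worth knowing: the `.` in the patterns is unescaped, so it matches any character, not just a literal dot. Whether that matters depends on what other names live in the folders.

```python
import re

find_000_099 = r'\b(0\d{2}.\w{1,4})'
find_300_399 = r'\b(3\d{2}.\w{1,4})'
find_900_999 = r'\b(9\d{2}.\w{1,4})'

# File names taken from the example tree in the question
names = ["322.txt", "024.ops", "977.txt", "004.txt", "396.xml", "059.ops"]

def classify(name):
    """Return the label of the first pattern matching at the start of name, else None.

    Hypothetical helper, for illustration only.
    """
    rules = (
        ("000_099", find_000_099),
        ("300_399", find_300_399),
        ("900_999", find_900_999),
    )
    for label, pattern in rules:
        if re.match(pattern, name):
            return label
    return None

groups = {name: classify(name) for name in names}

# The unescaped '.' accepts any character, so a name without a dot can match too.
loose = re.match(find_000_099, "0244x")
# Assumed variant with the dot escaped: the same name no longer matches.
strict = re.match(r'\b(0\d{2}\.\w{1,4})', "0244x")
```

If only names of the form "three digits, dot, extension" should qualify, escaping the dot (`\.`) is one way to tighten the patterns.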

My expected result would look like this:

-main_folder
    -sub_from_000_099
         024.ops
         004.txt
         059.ops
    -sub_from_300_399
         322.txt
         396.xml
    -sub_from_900_999
         977.txt
Lucas Mengual
    Thanks, but I think we are here to help you solve your issue. We can point you in the right direction, but it is best if you work on the code and adapt it to your needs. If you are designing this kind of logic, you can easily add checks for whether directories exist and similar details. My aim is to help as fast as I can so that someone in need can get going. Happy to help! :) – Shubham Vaishnav Aug 29 '19 at 10:13

2 Answers

4

You can use the code below, which moves each matching file from its original directory to the desired one.

import os
import re
import shutil

find_000_099 = r'\b(0\d{2}.\w{1,4})'
find_300_399 = r'\b(3\d{2}.\w{1,4})'
find_900_999 = r'\b(9\d{2}.\w{1,4})'

count = 0

for roots, dirs, files in os.walk('Directory Path'):
    if count == 0:
        # First iteration: roots is the top-level folder, so create the target folders there.
        # Note: os.mkdir raises FileExistsError if a folder already exists.
        parent_dir = roots
        os.mkdir(os.path.join(parent_dir, "sub_from_000_099"))
        os.mkdir(os.path.join(parent_dir, "sub_from_300_399"))
        os.mkdir(os.path.join(parent_dir, "sub_from_900_999"))
        count += 1
    else:
        # Later iterations: roots is a sub-directory; move each file whose name matches a pattern.
        for file in files:
            if re.match(find_000_099, file):
                shutil.move(os.path.join(roots, file), os.path.join(parent_dir, "sub_from_000_099", file))
            elif re.match(find_300_399, file):
                shutil.move(os.path.join(roots, file), os.path.join(parent_dir, "sub_from_300_399", file))
            elif re.match(find_900_999, file):
                shutil.move(os.path.join(roots, file), os.path.join(parent_dir, "sub_from_900_999", file))

This is skeleton code that covers your requirements. You can add further checks as needed, for example verifying whether a directory already exists before creating it.
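
One possible shape for that directory check is sketched below. The `ensure_targets` helper and the `TARGETS` tuple are hypothetical names, not part of the code above. It relies on `os.makedirs(..., exist_ok=True)`, which creates a folder only if it is missing, so the script can be run more than once without raising `FileExistsError`.

```python
import os
import tempfile

# Target folder names taken from the expected result in the question
TARGETS = ("sub_from_000_099", "sub_from_300_399", "sub_from_900_999")

def ensure_targets(parent_dir, names=TARGETS):
    """Create each target folder under parent_dir, leaving existing ones alone.

    Hypothetical helper, for illustration only. Returns the folder paths.
    """
    paths = []
    for name in names:
        path = os.path.join(parent_dir, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder is already there
        paths.append(path)
    return paths

# Small demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as demo_root:
    ensure_targets(demo_root)
    ensure_targets(demo_root)  # second call does not raise
    demo_listing = sorted(os.listdir(demo_root))
```

In the loop above, the three `os.mkdir` calls could be swapped for a single `ensure_targets(parent_dir)` call, assuming that behaviour is what you want.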

Shubham Vaishnav
1

Here is a simpler way, using pathlib and shutil:

import re
import shutil
from pathlib import Path

new_path = Path("new_folder")
new_path.mkdir(exist_ok=True)  # no error if the folder already exists

# Recursively collect every entry under the main directory whose name contains a dot
files = Path("main_folder").rglob("*.*")

# Map each pattern to the name of the sub-folder its matches go into
regs = {
    r'\b(0\d{2}.\w{1,4})': "sub_1", # find_000_099
    r'\b(3\d{2}.\w{1,4})': "sub_2", # find_300_399
    r'\b(9\d{2}.\w{1,4})': "sub_3"  # find_900_999
}

for f in files:
    if not f.is_file():
        continue  # skip directories whose names happen to contain a dot
    for reg in regs:
        if re.search(reg, f.name):
            temp_path = new_path / regs[reg]
            temp_path.mkdir(exist_ok=True)

            # Change the following method to 'move' after testing it
            shutil.copy(f, temp_path / f.name)
            break
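
If you would rather store the paths in a list first (the other option mentioned in the question), a minimal sketch could look like the following. The `RULES` mapping and the `collect_matches` function are hypothetical names. The target names come from the expected result in the question, and the patterns are an assumed variant with the dot escaped and matched against the whole file name via `re.fullmatch`.

```python
import re
import tempfile
from pathlib import Path

# Assumed variant of the question's patterns (dot escaped), mapped to the
# target folder names from the expected result
RULES = {
    r'0\d{2}\.\w{1,4}': "sub_from_000_099",
    r'3\d{2}\.\w{1,4}': "sub_from_300_399",
    r'9\d{2}\.\w{1,4}': "sub_from_900_999",
}

def collect_matches(main_folder):
    """Return {target_name: [Path, ...]} for files whose names match a rule.

    Hypothetical helper: nothing is moved or copied, paths are only collected.
    """
    found = {target: [] for target in RULES.values()}
    for f in Path(main_folder).rglob("*"):
        if not f.is_file():
            continue
        for pattern, target in RULES.items():
            if re.fullmatch(pattern, f.name):
                found[target].append(f)
                break
    return found

# Demonstration on a throwaway copy of the tree from the question
with tempfile.TemporaryDirectory() as tmp:
    layout = {
        "sub_1": ["322.txt", "024.ops"],
        "sub_2": ["977.txt", "004.txt"],
        "sub_3": ["396.xml", "059.ops"],
    }
    for sub, names in layout.items():
        (Path(tmp) / sub).mkdir()
        for name in names:
            (Path(tmp) / sub / name).touch()
    demo = {
        target: sorted(p.name for p in paths)
        for target, paths in collect_matches(tmp).items()
    }
```

Collecting first makes it easy to inspect what would be moved before touching any files. The lists can then be fed to `shutil.copy` or `shutil.move` as in the loop above.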
Mahmoud Sagr
  • @shubham-vaishnav both answers worked for me, thanks! But this one checks whether each subfolder already exists, so on later runs only new files get added to those subfolders. Cheers! – Lucas Mengual Aug 29 '19 at 09:29