71

I've already read this thread but when I implement it into my code it only works for a few iterations.

I'm using Python to iterate through a directory (let's call it the move directory) and copy mainly PDF files (each named with a unique ID) into another directory (the base directory), placing each file in the folder with the corresponding unique ID. I started using shutil.copy, but if there are duplicates it overwrites the existing file.

I'd like to be able to search the corresponding folder to see if the file already exists, and iteratively name it if more than one occurs.

e.g.

  • copy file 1234.pdf to folder 1234 in the base directory,
  • if 1234.pdf already exists there, name the copy 1234_1.pdf,
  • if another pdf is copied as 1234.pdf, then it would be 1234_2.pdf.

Here is my code:

import arcpy
import os
import re
import sys
import traceback
import collections
import shutil

movdir = r"C:\Scans"
basedir = r"C:\Links"

try:
    #Walk through all files in the directory that contains the files to copy
    for root, dirs, files in os.walk(movdir):
        for filename in files:
            #find the name location and name of files
            path = os.path.join(root, filename)
            print path
            #file name and extension
            ARN, extension = os.path.splitext(filename)
            print ARN

            #Location of the corresponding folder in the new directory
            link = os.path.join(basedir,ARN)

            # if the folder already exists in new directory
            if os.path.exists(link):

                #this is the file location in the new directory
                file = os.path.join(basedir, ARN, ARN)
                linkfn = os.path.join(basedir, ARN, filename)

                if os.path.exists(linkfn):
                    i = 0
                    #if this file already exists in the folder
                    print "Path exists already"
                    while os.path.exists(file + "_" + str(i) + extension):
                        i+=1
                    print "Already 2x exists..."
                    print "Renaming"
                    shutil.copy(path, file + "_" + str(i) + extension)
                else:

                    shutil.copy(path, link)
                    print ARN + " " +  "Copied"
            else:
                print ARN + " " + "Not Found"
GISHuman
  • 1
    No, the structure is different. For instance, movdir holds scans of property information arranged by street name, and the PDFs are named with the unique ID, e.g. C:\Scans\Main St\1234.pdf. The basedir is a new structure that will arrange all information for a particular property by its unique ID, e.g. C:\Links\1234. There might be additional subfolders in the future, but for now I'm just hoping to copy it to C:\Links\1234\1234.pdf – GISHuman Aug 22 '13 at 15:14
  • 1
    check [`filename_fix_existing(filename)`](https://github.com/steveeJ/python-wget/blob/master/wget.py#L72) – Grijesh Chauhan Jan 02 '16 at 10:41

5 Answers

44

Sometimes it is just easier to start over... I apologize for any typos; I haven't had time to test it thoroughly.

from __future__ import print_function  # lets the print() calls run on Python 2.7 and 3

import os
import shutil

movdir = r"C:\Scans"
basedir = r"C:\Links"
# Walk through all files in the directory that contains the files to copy
for root, dirs, files in os.walk(movdir):
    for filename in files:
        # Use the absolute path, in case you want to move several dirs.
        old_name = os.path.join(os.path.abspath(root), filename)

        # Separate base from extension
        base, extension = os.path.splitext(filename)

        # Folder for this ID, and the initial new name inside it
        target_dir = os.path.join(basedir, base)
        new_name = os.path.join(target_dir, filename)

        # If folder basedir/base does not exist... You don't want to create it?
        if not os.path.exists(target_dir):
            print(target_dir, "not found")
            continue    # Next filename
        elif not os.path.exists(new_name):  # folder exists, file does not
            shutil.copy(old_name, new_name)
        else:  # folder exists, file exists as well
            ii = 1
            while True:
                new_name = os.path.join(target_dir, base + "_" + str(ii) + extension)
                if not os.path.exists(new_name):
                    shutil.copy(old_name, new_name)
                    print("Copied", old_name, "as", new_name)
                    break
                ii += 1
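
If the same renaming rule is needed in more than one place, the loop above can be factored into a small function. This is only a sketch: the name `copy_without_overwrite` and its return value are my own choices for illustration, and it is written for Python 3 (it should also run on 2.7).

```python
import os
import shutil


def copy_without_overwrite(src, dest_dir):
    """Copy src into dest_dir, appending _1, _2, ... if the name is taken.

    Returns the path the file was actually copied to.
    """
    base, extension = os.path.splitext(os.path.basename(src))
    candidate = os.path.join(dest_dir, base + extension)
    index = 1
    # Keep trying base_1.ext, base_2.ext, ... until a free name is found
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, "{}_{}{}".format(base, index, extension))
        index += 1
    shutil.copy(src, candidate)
    return candidate
```

Inside the os.walk loop this would be called as copy_without_overwrite(old_name, os.path.join(basedir, base)), after checking that the folder exists.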
Jblasco
  • 1
    Thanks for this, when I run this I get an error saying "ii" is not defined. Could be because I'm using 2.7 (it's compatible over 3.x with ArcGIS which may be integrated later with the code) – GISHuman Aug 22 '13 at 17:54
  • 3
    No, it's that I messed up. It should be ii = 1 (instead of zero, like we said in the other answer) instead of index = 0. – Jblasco Aug 22 '13 at 17:56
  • Inside the while loop, it should be "if not os.path.exists(new_name) ", right ? I mean, if the file doesn't exist then we should create it with the index "ii" – Lucaribou May 01 '20 at 06:46
  • Consider shutil.copy2 if you want to preserve metadata https://docs.python.org/3/library/shutil.html – JStrahl Jun 23 '20 at 15:51
43

I always use a timestamp in the file name, so it's not possible that the file already exists:

import os
import shutil
import datetime

now = str(datetime.datetime.now())[:19]
now = now.replace(":","_")

src_dir="C:\\Users\\Asus\\Desktop\\Versand Verwaltung\\Versand.xlsx"
dst_dir="C:\\Users\\Asus\\Desktop\\Versand Verwaltung\\Versand_"+str(now)+".xlsx"
shutil.copy(src_dir,dst_dir)
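
The same idea can be wrapped in a function that also refuses to overwrite, since a timestamp with one-second resolution can still collide if two copies are made within the same second. This is a rough sketch for Python 3; the name `timestamped_copy` and the `now` parameter (there only to make the function testable) are assumptions of mine, not part of shutil.

```python
import datetime
import os
import shutil


def timestamped_copy(src, now=None):
    """Copy src next to itself as <name>_<YYYY-mm-dd HH_MM_SS><ext>."""
    now = now or datetime.datetime.now()
    base, extension = os.path.splitext(src)
    stamp = now.strftime("%Y-%m-%d %H_%M_%S")  # same layout as above
    dst = "{}_{}{}".format(base, stamp, extension)
    if os.path.exists(dst):
        # Two copies within the same second would collide; do not overwrite
        raise FileExistsError(dst)
    shutil.copy(src, dst)
    return dst
```

Note that the existence check and the copy are two separate steps, so this still does not protect against two processes running at exactly the same time.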
user8924811
  • 7
    "so its not possible, that the file exists already" ... except if you run the code twice concurrently ;) – bers Mar 01 '19 at 11:31
  • 1
    While this is a decent solution, it does not answer my question, since there were duplicate file names that contained different information in the files, so it was a requirement to have an iterative naming scheme... the date of the files was irrelevant. – GISHuman Jul 09 '20 at 14:56
  • I like this solution; how to modify this to have only the most recent backup and delete the older backups? – FMFF Jun 29 '22 at 20:50
36

For me shutil.copy is the best:

import shutil

#make a copy of the invoice to work with
src="invoice.pdf"
dst="copied_invoice.pdf"
shutil.copy(src,dst)

You can change the path of the files as you want.
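
Keep in mind that shutil.copy replaces an existing destination file without any warning. If that is not what you want, a minimal guard could look like the sketch below (Python 3; `copy_if_absent` is a made-up helper name, not something shutil provides).

```python
import os
import shutil


def copy_if_absent(src, dst):
    """Copy src to dst only if dst does not exist yet; return True if copied."""
    if os.path.exists(dst):
        return False  # leave the existing file untouched
    shutil.copy(src, dst)
    return True
```

For example, copy_if_absent("invoice.pdf", "copied_invoice.pdf") copies on the first call and does nothing on later calls.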

Alex Seceleanu
  • 7
    This answer does not address the main OP's problem which is dealing with already existing files. Can you elaborate with a more complete answer on how to detect existing files and rename them? – Alejandro Piad May 07 '20 at 22:36
  • 1
    @AlejandroPiad but it does answer what I am looking for. It may not **directly** answer the OP's question, but it sure did simplify it a lot. There's an advantage to that: anyone, even the OP, can understand this answer and use it as a guide to solve their issue. – Ice Bear Mar 12 '21 at 05:49
  • @AlejandroPiad , Now I'm back here again to take a look at this answer and yet again it solves my issue. Thanks Alex – Ice Bear Jan 19 '23 at 06:48
6

I would say you have an indentation problem, at least as you wrote it here:

while not os.path.exists(file + "_" + str(i) + extension):
   i+=1
   print "Already 2x exists..."
   print "Renaming"
   shutil.copy(path, file + "_" + str(i) + extension)

should be:

while os.path.exists(file + "_" + str(i) + extension):
    i+=1
print "Already 2x exists..."
print "Renaming"
shutil.copy(path, file + "_" + str(i) + extension)

Check this out, please!
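
To make the point explicit: the while loop should only search for a free index, and the copy should happen once afterwards. A sketch of that search as a standalone function (Python 3; the name `first_free_name` and the `start` parameter are just for illustration):

```python
import itertools
import os


def first_free_name(folder, base, extension, start=1):
    """Return the first path folder/base_<n><extension> that does not exist yet."""
    for i in itertools.count(start):
        candidate = os.path.join(folder, "{}_{}{}".format(base, i, extension))
        if not os.path.exists(candidate):
            return candidate
```

The value of start decides how the first duplicate is named: with start=0 it becomes base_0, with start=1 it becomes base_1.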

Jblasco
  • 1
    It didn't change anything, unfortunately. – GISHuman Aug 22 '13 at 14:56
  • 1
    And if you could update the code in the question with the changes you have made, even if they didn't work, that would be great. – Jblasco Aug 22 '13 at 15:00
  • 1
    I think there should be no "not" in my correction, sorry about that! I will edit in a moment. – Jblasco Aug 22 '13 at 15:24
  • 1
    I removed the "not" and it works! However the duplicate file is named 1234_0 and I'd like it to be named 1234_1. I think it has something to do with where I placed the i+=1. I tried moving it around to no avail! – GISHuman Aug 22 '13 at 17:22
  • Happy it worked! If you start i=0 it is normal that it will try to build it first. Try i=1 instead. – Jblasco Aug 22 '13 at 17:34
-1
import glob
import os
import shutil

src = r"C:\Source"
dest = r"C:\Destination"
pattern = "*"

for path in glob.glob(os.path.join(src, pattern)):
    if not os.path.isfile(path):
        continue  # skip subdirectories; shutil.copy only handles files
    name = os.path.basename(path)
    base, ext = os.path.splitext(name)  # safe for names with several dots or none

    # Try file.ext first, then file_1.ext, file_2.ext, ... until a name is free
    target = os.path.join(dest, name)
    i = 1
    while os.path.exists(target):
        print("{} already exists in {}".format(os.path.basename(target), dest))
        target = os.path.join(dest, "{}_{}{}".format(base, i, ext))
        i += 1

    shutil.copy(path, target)
    print("copied {} to {}".format(os.path.basename(target), dest))
DILESH
  • 1
    Please provide additional details in your answer. As it's currently written, it's hard to understand your solution. – Community Sep 09 '21 at 13:07
  • Let's say we have two directories, S and D. If we want to copy files from S to D and the file to be copied, say file.ext, already exists in D, then we rename it to file_1.ext and copy it to D. Similarly, if file_1.ext exists, we make it file_2.ext. – DILESH Sep 13 '21 at 07:10