Related to my earlier question, "force doxygen to pick one from versioned files", I'm looking for a very simple, name-based kind of version control. It would certainly be easier to just use a VCS, but I want to use it on binary files too (about 1 GB per file). I just want to back up the latest version of each file.

In the end, I'm looking for something that creates a copy of the following directory tree:

rootDir
    -filename1
    -filename2
    -justName
    -otherName
    -dirA
        -dirAFile_ver01
        -dirAFile_ver02
    -dirB
        -dirBFile_01
        -dirBFile_02
        -dirBFile1
        -dirBFile2
    -dirC
        -dirCFile01
        -dirCFile02
        -dirD
            -dirDFile-01
            -dirDFile-02
            -dirDFile.0.1
            -dirDFile.0.2
            -dirDFile.1
    -dirE
        -file1.jpg
        -file2.jpg
        -file1.txt
        -file2.txt

and the output should look like this:

COPY_rootDir
    -filename2
    -justName
    -otherName
    -dirA
        -dirAFile_ver02
    -dirB
        -dirBFile_02
        -dirBFile2
    -dirC
        -dirCFile02
        -dirD
            -dirDFile-02
            -dirDFile.1
    -dirE
        -file2.jpg
        -file2.txt

Is there a ready-to-use module or tool that would help me here? I don't even know what this versioning approach is called. I wrote a simple Python script that duplicates a directory tree, keeping only the most recent file (by name) in each group, but it's not perfect: there are a lot of exceptions to consider and many possible version-naming conventions. The current Python script looks like this:

import os, shutil
#------
#[return list of words split by a list of characters]
def multisplit( splitStr , splitList ):
    for splitChar in splitList:
        splitStr = splitStr.replace( splitChar , " " )
    return splitStr.split()
#------
#[first split by multisplit and then remove any digit from the first word]
def dualSplit( splitStr, splitList):
    words = multisplit(splitStr,splitList)
    firstPass = words[0] if words else ""
    secondPass = ''.join([char for char in firstPass if not char.isdigit()])
    return secondPass
#------
#[be sure to use proper slashes]
def ensureSlashes( directoryPath ):
    # normpath keeps a leading separator and spaces in names intact
    return os.path.normpath( directoryPath.replace("\\", "/") )
#------
#[copy dirtree with latest files]
def copyLastVersions( source , destination ):
    source = ensureSlashes(source)
    destination = ensureSlashes(destination)
    for root, dirs, files in os.walk(source):
        similar = []
        for file in sorted(files):
            if file not in similar:
                # a file without an extension would make a plain rsplit raise ValueError
                if "." in file:
                    fname, fext = file.rsplit( "." , 1 )
                else:
                    fname, fext = file, ""
                fnameOnly = dualSplit( fname , ['_', '-', '.'] )
                # NOTE: matching is substring-based, so unrelated names can still land in one group
                similar = [fn for fn in sorted(files)   if  (fnameOnly in fn)   \
                                                        and (fext in fn)        \
                                                        and (len(fnameOnly) == len(dualSplit( fn , ['_', '-', '.']))) ]
                if not similar:
                    # nothing matched (e.g. digits in the middle of the name): keep the file itself
                    similar = [file]
                sourceFile = os.path.join(root, similar[-1])
                # path of the chosen file relative to the source root
                destinationFile = os.path.relpath(sourceFile, source)
                #LOG
                # print("--", file, " -- ", fnameOnly)
                # print(similar, similar[-1])
                # print("source-- ", sourceFile)
                # print("destin-- ", destinationFile)
                outPath = os.path.join(destination, destinationFile)
                print(outPath)
                # create intermediate directories too; os.mkdir fails on nested paths
                os.makedirs(os.path.dirname(outPath), exist_ok=True)
                shutil.copy2(sourceFile ,outPath )

copyLastVersions( r"ROOT_SOURCE_PATH" , r"ROOT_DESTINATION_PATH")
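For comparison, here is a minimal sketch of one possible way to do the grouping step with a regular expression instead of substring matching. It is not a ready-made module: the names `split_version` and `latest_versions` are made up for illustration, and it assumes that a version is a trailing run of dot-separated integers, optionally followed by an extension that does not start with a digit. Names that do not fit that pattern are treated as unversioned and always kept.

```python
import re
from collections import defaultdict

# Assumed convention: <stem><version>[<ext>], where version is e.g. "02" or "0.1"
# and ext is something like ".jpg" (first character after the dot is not a digit).
_VERSIONED = re.compile(
    r"^(?P<stem>.*?)(?P<version>\d+(?:\.\d+)*)(?P<ext>\.[^.\d][^.]*)?$"
)

def split_version(name):
    """Return (group_key, version_tuple) for a file name."""
    m = _VERSIONED.match(name)
    if m is None:
        # No trailing version found: the file forms its own group.
        return (name, None), ()
    version = tuple(int(part) for part in m.group("version").split("."))
    # The separator stays in the stem, so "file_01" and "file1" get different keys.
    return (m.group("stem"), m.group("ext") or ""), version

def latest_versions(names):
    """Pick the highest-versioned name from each group, sorted by name."""
    groups = defaultdict(list)
    for name in names:
        key, version = split_version(name)
        groups[key].append((version, name))
    # Versions are compared as integer tuples, so 10 > 9 and (1,) > (0, 2).
    return sorted(max(members)[1] for members in groups.values())

print(latest_versions(["dirDFile-01", "dirDFile-02",
                       "dirDFile.0.1", "dirDFile.0.2", "dirDFile.1"]))
```

Because the separator is part of the group key, `dirBFile_02` and `dirBFile2` end up in separate groups, and because the extension is part of the key, `file2.jpg` and `file2.txt` are both kept. Names such as `archive.7z`, where the digits are really part of the extension, fall through as unversioned under this assumption.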
  • I would suggest you just use the last modification date of each file to select the newest one. See [`os.path.getmtime`](https://docs.python.org/2/library/os.path.html#os.path.getmtime). As you're discovering, parsing file names is at best very error-prone. BTW, many VCS systems support binary files. – martineau May 24 '17 at 01:59
  • Thanks @martineau, I will add a modification-time check at the end, but first I need to find groups of files with similar names, e.g. `[filename1 filename2]` (pick `filename2` by name, then check the creation date). I'm more interested in something that would help me group files by name, e.g. `[filename1,filename2,filename3]`: something that would find the pattern in the file names, e.g. `[filename1_01,filename1_02,filename1_03]`, and check whether `filename1_03` is the newest. Keep in mind that `filename1_01`, `filename_01` and `filename.1` are 3 different files, and `filename.jpg` and `filename.ma` too. – bolek May 24 '17 at 06:05
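
The modification-time check discussed in these comments could look roughly like the sketch below. It uses `os.path.getmtime` as suggested; the helper names `newest_by_mtime` and `is_newest` are hypothetical, and the sketch assumes the files have already been grouped by name and that every path in the group exists.

```python
import os

def newest_by_mtime(paths):
    """Return the path with the most recent modification time."""
    return max(paths, key=os.path.getmtime)

def is_newest(picked, group):
    """Check that the file picked by name is also the most recently modified one."""
    return os.path.getmtime(picked) >= max(os.path.getmtime(p) for p in group)
```

A script could call `is_newest` on the name-based pick for each group and print a warning when the two criteria disagree, instead of silently trusting either one.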
