4

I'm very new to Python, but I thought it would be fun to write a program to sort all my downloads, and I'm having a little trouble with it. It works perfectly if the destination name has only one word, but if the destination has two or more words it goes wrong and the program gets stuck in a loop. Does anybody have a better idea for comparing the lists than mine?

>>>for i in dstdir:
>>>    print i.split()

['CALIFORNICATION']
['THAT', "'70S", 'SHOW']
['THE', 'BIG', 'BANG', 'THEORY']
['THE', 'OFFICE']
['DEXTER']
['SPAWN']
['SCRUBS']
['BETTER', 'OF', 'TED']

>>>for i in srcdir:
>>>    print i.split()
['Brooklyn.Nine-Nine.S01E16.REAL.HDTV.x264-EXCELLENCE.mp4']
['Revolution', '2012', 'S02E12', 'HDTV', 'x264-LOL[ettv]']
['Inequality', 'for', 'All', '(2013)', '[1080p]']

These are examples of the output for the two lists.

I have a destination directory that contains only folders, and a download directory. I want the program to look at each source file name and then at each destination folder name. If the destination name is in the source name, then I have the go-ahead to copy the downloaded file so it is sorted into my collection.

destination = '/media/mediacenter/SAMSUNG/SERIES/'
source = '/home/mediacenter/Downloads/'
dstdir = os.listdir(destination)
srcdir = os.listdir(source)

for i in srcdir:
    source = list(i.split())
    for j in dstdir:
        count = 0
        succes = 0
        destination = list(j.split())
        if len(destination) == 1:
            while (count < len(source)):
                if destination[0].upper() == source[count].upper():
                    print 'succes ', destination, ' ', source
                count = count + 1
        elif len(destination) == 2:
            while(count < len(source)):
                if (destination[0].upper() == source[count].upper()):
                    succes = succes + 1
                    count = len(source)
            count = 0
            while(count < len(source)):
                if (destination[1].upper() == source[count].upper()):
                    succes = succes + 1
                    count = len(source)
            count = 0
            if succes == 2:
                print 'succes ', destination, ' ', source

For now I'm happy with only "success" as output; I will figure out how to copy the files later, as that will be a totally different problem for me in the near future.

jww
  • 3
    You should explain what you are trying to do, possibly with an example of what you get from your program and what you expect to get. – hivert Feb 15 '14 at 14:54

4 Answers

2

Something like this, maybe. It checks whether every word in the destination folder name occurs in the file name.

dstdir = ['The Big Bang Theory', 'Dexter', 'Spawn' ]

srcdir = ['the.big.bang.theory s1e1', 'the.big.bang.theory s1e2', 'dexter s2e01']

for source in srcdir:
    for destination in dstdir:
        destinationWords = destination.split()

        if all(word.lower() in source.lower() for word in destinationWords):
            print 'succes ', destination, ' ', source

outputs:

succes  The Big Bang Theory   the.big.bang.theory s1e1
succes  The Big Bang Theory   the.big.bang.theory s1e2
succes  Dexter   dexter s2e01
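
One caveat with the substring check is that a short word such as "the" also matches inside a longer word such as "other". Here is a minimal sketch that compares whole words instead, assuming Python 3 and assuming that dots, underscores, and spaces separate the words in a file name. The names `tokens` and `find_folder` are made up for illustration. It also sidesteps the `while` loops from the question, which stall because `count` only changes when a word matches.

```python
import re

def tokens(name):
    """Split a name on dots, underscores, and whitespace; return lowercase words."""
    return {word.lower() for word in re.split(r"[._\s]+", name) if word}

def find_folder(filename, folders):
    """Return the first folder whose words all occur in the file name, else None."""
    file_words = tokens(filename)
    for folder in folders:
        if tokens(folder) <= file_words:  # subset test: every folder word is present
            return folder
    return None

folders = ["The Big Bang Theory", "Dexter", "The Office"]
print(find_folder("the.big.bang.theory s1e1", folders))
print(find_folder("Brooklyn.Nine-Nine.S01E16.mp4", folders))
```

The first call should find the matching folder and the second should return `None`. Hyphens are deliberately not treated as separators here, so `Nine-Nine` stays one word; that choice may need adjusting for a real collection.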
M4rtini
  • Wow, nice one! Simple code even I can understand. I'm a long-time user of open source and community-driven software, but this is the first time I asked the community a question, and the results have been amazing. Thanks, and this code works perfectly. – janssens jessica Feb 20 '14 at 18:01
2

My personal favorite for fuzzy string comparisons in Python is fuzzywuzzy. It has a number of good examples and a very liberal license.

Some examples that might be relevant to you:

> from fuzzywuzzy import fuzz, process
> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
> process.extract("new york jets", choices, limit=2)
  [('New York Jets', 100), ('New York Giants', 78)]
> process.extractOne("cowboys", choices)
  ("Dallas Cowboys", 90)

Or use token_sort_ratio if the word order should not matter.

> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
  90
> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
  100
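
If installing a third-party package is not an option, the standard library's `difflib` can give a rougher similarity score. The sketch below, assuming Python 3, only illustrates the idea of sorting the words before comparing; it is not fuzzywuzzy's implementation, and its scores run from 0.0 to 1.0 rather than 0 to 100. The helper names `sorted_tokens`, `plain_ratio`, and `token_sort_similarity` are made up for this example.

```python
from difflib import SequenceMatcher

def sorted_tokens(text):
    """Lowercase the text and put its words in alphabetical order."""
    return " ".join(sorted(text.lower().split()))

def plain_ratio(a, b):
    """Similarity of the two strings as written (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_sort_similarity(a, b):
    """Similarity after sorting the words, so word order no longer matters."""
    return SequenceMatcher(None, sorted_tokens(a), sorted_tokens(b)).ratio()

a = "fuzzy wuzzy was a bear"
b = "wuzzy fuzzy was a bear"
print(plain_ratio(a, b))            # below 1.0 because the word order differs
print(token_sort_similarity(a, b))  # 1.0 because both contain the same words
```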
placeybordeaux
0

The simple script below moves files from the source to the destination.

import os

src = "/home/mediacenter/Downloads"
dst = "/media/mediacenter/SAMSUNG/SERIES"
source =  os.listdir(src)
destination = os.listdir(dst)

for filename in source:

    file_src = src +"/"+ str(filename)
    file_dst = dst +"/"+ str(filename)

    if filename not in destination and os.path.isdir(file_src) is False:
        #download file
        os.system("mv %s %s" %(file_src, file_dst))
    elif filename not in destination and os.path.isdir(file_src) is True:
        #download directory
        os.system("mv %s %s" %(file_src, dst))

This seems to be what you are looking for. You just need to check whether the file name is missing from the destination list and, if so, move it. Did it work for you?
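
A note on the `mv` call: building a shell command with `"mv %s %s"` breaks as soon as a name contains a space, and it only works where `mv` exists. A minimal OS-neutral sketch using `shutil.move` could look like the following, assuming Python 3. The function name `move_new_items` is made up, and the demo runs on temporary directories that stand in for the real Downloads and SERIES paths.

```python
import os
import shutil
import tempfile

def move_new_items(src, dst):
    """Move every entry of src that dst does not already contain; return moved names."""
    existing = set(os.listdir(dst))
    moved = []
    for name in sorted(os.listdir(src)):
        if name not in existing:
            # shutil.move takes paths as plain arguments, so spaces need no escaping
            shutil.move(os.path.join(src, name), os.path.join(dst, name))
            moved.append(name)
    return moved

# Demo on throwaway directories (stand-ins for the real paths)
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()
open(os.path.join(src, "The Big Bang Theory s1e1.mp4"), "w").close()
os.mkdir(os.path.join(src, "Better Off Ted"))
print(move_new_items(src, dst))
```

Like the script above, this only checks for exact name clashes; it does not match a file name against a differently spelled folder name.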

Jack
  • I think the problem was to match file names against folders where there is not an exact match. For example, to match a downloaded file name like 'name.of.series.episode#.format' against the folder 'name of series'. And an OS-neutral method for moving files may be better. – M4rtini Feb 15 '14 at 16:39
  • Tested with .txt, .jpeg, .mp4, folders, .pdf ... extensions and no problem. It works! – Jack Feb 15 '14 at 17:04
  • 1
    I don't see where you make sure that files with names like `The.Big.Bang.Theory.***` from the source are matched against the `The big bang theory` folder in the destination. But that might not even be what the OP wanted. Hard to know when he/she doesn't stick around to answer questions. – M4rtini Feb 15 '14 at 17:14
  • Right. Paraphrasing him/her `I have a destination directory with only folders in it and a download directory. I want to make a program to automatically look at the source file name and then look at the destination name. if the destination name is in the source name then I have the yes to go ahead and copy the downloaded file so it is sorted in my collection.` – Jack Feb 15 '14 at 18:28
  • Folders whose names contain spaces or dots need special treatment. As the question is not clear, this was my interpretation. The solution is manageable anyway from the code above. – Jack Feb 15 '14 at 18:34
0

Building on the previous answer, I found re.sub to be a possible way to solve the problem. Replace this block:

# ...
import re

source =  os.listdir(src)
destination = os.listdir(dst)

with:

source = [re.sub(' ', '\\\\ ', w) for w in os.listdir(src)]
destination = [re.sub(' ', '\\\\ ', w) for w in os.listdir(dst)]

This does the trick for moving folders whose names contain spaces.

Instead of comparing strings to handle special characters, I think you should look at regular expressions. I was trying to use something like this (applied to source and destination), but without success.

# this snippet doesn't work; it is only here to illustrate the idea

pattern = "[a-zA-Z0-9]"
for i,w in enumerate(source):
    for ch in w:

        if not re.match(pattern, ch) :
            print source , ch

            source[i] = re.sub( ch,r"\\" + ch, source[i])
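
One reason the loop above fails is that each character is passed to `re.sub` as a pattern, so `.` matches any character and `[` raises an error. If the goal is only to make a name safe inside a shell command, `shlex.quote` from the standard library (Python 3.3 and later) does the quoting. The following is a minimal sketch under that assumption; the helper name `quoted_move_command` and the example file name are made up.

```python
import shlex

def quoted_move_command(file_src, file_dst):
    """Build an mv command line whose two paths are quoted for a POSIX shell."""
    return "mv {} {}".format(shlex.quote(file_src), shlex.quote(file_dst))

command = quoted_move_command(
    "/home/mediacenter/Downloads/That '70s Show s01e01.avi",
    "/media/mediacenter/SAMSUNG/SERIES/THAT '70S SHOW/",
)
print(command)
# shlex.split parses the command back into the original, unescaped paths
print(shlex.split(command))
```

The resulting string could be passed to `os.system`, although moving the files with a library call avoids the quoting question entirely.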

A question with a similar concern can be found at this link.

Jack