
We have a requirement in our project to detect, in Python, anything that is dropped into a directory.

The process is like this:

  • A Python script will be running almost all day (like a cron job), keeping watch on a directory.

  • When anybody puts a file into that directory, the file should be detected.

  • A dropped file will be in zip, xml, json, or ini format.
  • There is no fixed way in which a user will drop the file into the directory (i.e. a person could simply copy or move it from the console with the cp or mv command, do an FTP transfer from another computer, or upload it through our web interface).

I'm able to detect a file when it is dropped through the web interface, but not when it arrives any other way.

Can anyone suggest a way to detect a dropped file?

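import os

def detect_file(watch_folder_path):
    """Detect a file dropped into the watched folder."""
    watched_files = os.listdir(watch_folder_path)
    if len(watched_files) > 0:
        # Only the first entry is reported; os.listdir() order is arbitrary.
        filename = watched_files[0]
        print("File located: %s" % filename)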
Laxmikant

3 Answers


If this is a Linux system, I would suggest inotify (for example through the inotifywatch tool), since it can be configured per event, such as create, moved_to, and more.

There is a Python wrapper for it, pyinotify, which you can invoke like this:

python -m pyinotify -v /my-dir-to-watch
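
The same kind of watch can also be set up from inside a script. The following is a minimal sketch, assuming Linux, Python 3, and an installed pyinotify package. The names watch_directory, on_file, and is_interesting are illustrative and not part of pyinotify; the extension filter simply mirrors the formats listed in the question.

```python
import os

# Extensions taken from the question; adjust as needed.
INTERESTING_EXTENSIONS = {".zip", ".xml", ".json", ".ini"}


def is_interesting(filename):
    """Return True if the file name ends in one of the expected extensions."""
    return os.path.splitext(filename)[1].lower() in INTERESTING_EXTENSIONS


def watch_directory(path, on_file):
    """Block forever, calling on_file(full_path) for each matching new file."""
    import pyinotify  # third-party and Linux-only, so imported lazily here

    class Handler(pyinotify.ProcessEvent):
        def _report(self, event):
            # Skip sub-directories and files with other extensions.
            if not event.dir and is_interesting(event.name):
                on_file(event.pathname)

        def process_IN_CLOSE_WRITE(self, event):
            # A file that was open for writing has been closed (cp, uploads).
            self._report(event)

        def process_IN_MOVED_TO(self, event):
            # A file has been renamed or moved into the directory (mv).
            self._report(event)

    manager = pyinotify.WatchManager()
    manager.add_watch(path, pyinotify.IN_CLOSE_WRITE | pyinotify.IN_MOVED_TO)
    pyinotify.Notifier(manager, Handler()).loop()
```

Calling watch_directory("/my-dir-to-watch", print) would then print the path of each matching file as it arrives. Watching IN_CLOSE_WRITE rather than IN_CREATE is a design choice in this sketch: the handler runs once the writer has closed the file rather than when the file first appears.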
primero

How about:

import os

known_files = []

def detect_file(watch_folder_path):
    files = os.listdir(watch_folder_path)
    for filename in files:
        if filename not in known_files:
            # RAISE ALERT, e.g. send an email
            known_files.append(filename)

Add the file to the known_files list once the alert has been raised so that it does not keep alerting.

You will then want to run detect_file() repeatedly, at a frequency of your choosing. I recommend using a Timer (e.g. threading.Timer) to achieve this. Or, even more simply, call the function inside a while True: loop and add time.sleep(60) so that the detect_file() check runs every 60 seconds, for example.
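
The while True: variant described above could look roughly like this. It is a sketch using only the standard library and assuming Python 3; find_new_files and poll_forever are illustrative names, a set replaces the list for faster membership checks, and the print call stands in for whatever alert you actually want to raise.

```python
import os
import time


def find_new_files(watch_folder_path, known_files):
    """Return names not seen before and record them in known_files (a set)."""
    current = set(os.listdir(watch_folder_path))
    new_files = sorted(current - known_files)
    known_files.update(new_files)
    return new_files


def poll_forever(watch_folder_path, interval_seconds=60):
    """Check the directory every interval_seconds and report new files."""
    known_files = set()
    while True:
        for name in find_new_files(watch_folder_path, known_files):
            print("File located:", name)  # replace with the real alert
        time.sleep(interval_seconds)
```

Because the check is a plain directory listing, it does not matter whether the file arrived via cp, mv, FTP, or the web interface; the trade-off is a detection delay of up to interval_seconds.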

cbeeson

If you don't want to add any dependency to your project, you can rely on a script that computes the changes to your files itself. Assuming this script will always be running, you can write the following code:

import os
import time

def is_interesting_file(f):
    interesting_extensions = ['zip', 'json', 'xml', 'ini']
    file_extension = f.split('.')[-1]
    return file_extension in interesting_extensions

watch_folder_path = '.'  # placeholder: set this to the directory to watch
previous_watched_files = set()

while True:
    watched_files = set(os.listdir(watch_folder_path))
    new_files = watched_files.difference(previous_watched_files)
    interesting_files = [filename for filename in new_files if is_interesting_file(filename)]
    # Do something with your interesting files
    previous_watched_files = watched_files  # remember this listing for the next pass
    time.sleep(60)  # polling interval in seconds; tune as needed

If you want to run this script from cron or something similar with this approach, you can save the directory listing in a file or a simple database such as SQLite and load it into the previous_watched_files variable. Each run then makes one pass over the directory looking for changes, clears the db/file records, and recreates them from the updated listing.
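
One way to sketch the cron variant is to keep the previous listing in a small JSON file. This assumes Python 3; check_once and state_path are illustrative names, and the state file is assumed to live outside the watched directory so that it is not reported as a new file itself.

```python
import json
import os


def check_once(watch_folder_path, state_path):
    """Return files added since the last run and save the new listing."""
    try:
        with open(state_path) as state_file:
            previous = set(json.load(state_file))
    except FileNotFoundError:
        previous = set()  # first run: nothing has been recorded yet

    current = set(os.listdir(watch_folder_path))
    # Overwrite the stored listing with the current one for the next run.
    with open(state_path, "w") as state_file:
        json.dump(sorted(current), state_file)
    return sorted(current - previous)
```

Each cron invocation would call check_once() a single time and handle the returned names, for example by filtering them with is_interesting_file() first.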

avenet