I'm sure this is a common question, but I'm such a Python newbie that I don't know how to search for it.

I have a script with a function like so:

import sys

firstarg = sys.argv[1]
secondarg = sys.argv[2]

def examplefunc(firstarg):
    # does something
    # returns something
    pass

def example2(firstarg, secondarg):
    # also does something and returns something
    pass

I have a folder with a huge number of txt files. For the first example function, I want to cycle through all these txt files and print the answers into one new file (i.e. pass in each txt file as firstarg). Similarly, for the second function, I want to fix the first argument as one particular txt file and cycle through all the remaining txt files for the second argument.

The only way I know how to run python commands with arguments like this would be to run in my terminal something of the form:

python myscript.py ./txtfile1 ./txtfile2

And then change the arguments accordingly. I'm sure there's a better way. Can anyone help?

prply

3 Answers

If you want to get a list of all of the files in a directory, there are two ways you can do it.

First, from outside of Python, just do this:

python myscript.py txtfile1 *

Now, your sys.argv will be something like ['myscript.py', 'txtfile1', 'txtfile1', 'txtfile2', 'txtfile3', 'txtfile4']. So, if you want to run examplefunc on every file, and then run example2 on txtfile1 against every file:

special_file = sys.argv[1]
everything_else = sys.argv[2:]
for filename in everything_else:
    examplefunc(filename)
for filename in everything_else:
    example2(special_file, filename)
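
To collect the answers in one new file, as the question asks, the first loop could write each result to a single output file. This is only a minimal sketch: the examplefunc body here is a hypothetical stand-in (it counts lines), and the run_all helper and its out_path parameter are assumed names, not part of the original script.

```python
def examplefunc(path):
    # Hypothetical stand-in for the asker's function: count the lines in a file.
    with open(path) as handle:
        return sum(1 for _ in handle)

def run_all(filenames, out_path):
    # Call examplefunc on each file and write one tab-separated result per line.
    with open(out_path, "w") as out:
        for filename in filenames:
            out.write("{}\t{}\n".format(filename, examplefunc(filename)))
```

In the script this might be called as run_all(everything_else, 'results.out'). Giving the output file a name that the shell glob does not match keeps it from being picked up as an input on the next run.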

But you can also do this from inside Python. Pass in a directory instead of a bunch of files, like this: python myscript.py txtfile1 . (the last argument is a single dot, meaning the current directory). Then your sys.argv will be ['myscript.py', 'txtfile1', '.'], and you can do this:

import os
import sys
special_file = sys.argv[1]
everything = os.listdir(sys.argv[2])  # bare names, not full paths
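
Note that os.listdir returns bare names rather than full paths, and it lists subdirectories as well as files. A sketch of how the rest of this approach might look, assuming only .txt files are wanted; list_txt_files and others are hypothetical helper names, not from the original answer.

```python
import os

def list_txt_files(directory):
    # os.listdir yields bare names, so join each one with the directory
    # and keep only regular files whose name ends in .txt.
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.endswith(".txt"):
            paths.append(path)
    return paths

def others(paths, special_file):
    # Every path except the one fixed as the first argument of example2.
    special = os.path.abspath(special_file)
    return [p for p in paths if os.path.abspath(p) != special]
```

With these, the first loop would iterate over list_txt_files(sys.argv[2]) and call examplefunc on each path, and the second would iterate over others(...) and call example2(special_file, filename).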
abarnert

For a start, don't pass the file names as arguments. Use os.walk to get a list of all the .txt files in your directory; then you can easily read the files and cycle through them in your Python code.
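
A sketch of what this could look like, assuming files in subdirectories should be collected too; find_txt_files is a hypothetical helper name, not something from the original script.

```python
import os

def find_txt_files(root):
    # Walk root and every subdirectory, collecting paths that end in .txt.
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".txt"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

The returned paths already include the directory, so each one can be passed straight to a function that opens the file.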

jazdev

Far better is to iteratively step through each file in the directory from the program:

import os

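def file_iterator(dir_to_traverse):
    # Yield the full path of every file directly inside dir_to_traverse.
    for path, dirs, files in os.walk(dir_to_traverse):
        for f_name in files:
            yield os.path.join(path, f_name)
        break  # stop after the top-level directory; remove this to recurse

for file_name in file_iterator('.'):
    print(file_name)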
Moose
  • '.' refers to the current directory. You could specify any directory, like 'C:\\Documents and Settings\\xx\\My Documents\\Some Dir' if you are on Windows. Note the double backslashes: a single backslash starts an escape sequence in a string literal. – Moose Sep 28 '14 at 22:24
  • Using raw strings or forward slashes is generally preferred to double backslashes on Windows. – Wooble Sep 28 '14 at 22:31
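
The three spellings mentioned in these comments can be compared directly. A small sketch, using the illustrative path from the comment above:

```python
# The same Windows path written three ways.
escaped = 'C:\\Documents and Settings\\xx\\My Documents\\Some Dir'
raw = r'C:\Documents and Settings\xx\My Documents\Some Dir'
forward = 'C:/Documents and Settings/xx/My Documents/Some Dir'

# Doubled backslashes and the raw string produce the identical string.
same_string = (escaped == raw)

# The forward-slash form is a different string that names the same location;
# Python on Windows generally accepts it when opening files.
same_after_swap = (forward.replace('/', '\\') == raw)
```

Both comparisons hold, which is why the raw-string and forward-slash forms are usually easier to read and harder to get wrong.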