writing output into files within subdirectory locations according to a list of input filename paths

Question

I have a txt file that contains a list of filenames with various subdirectories in this name format:

./A_blurb/test.txt
./B_foo/bar.txt
./B_foo/bric.txt
etc..

I also have a script that loops through the lines in the filenames list and produces an appropriate output.

What I want is to save each output under a different name in the directory that corresponds to the path given in the filenames list.

The code I wrote puts all the outputs (one per loop iteration) in the directory from which the script is run at the command line, like this: shell$ python script.py inputfilelist.txt

This is my script:

import sys

with open(sys.argv[1]) as f:
    for filename in f:
        with open(filename.strip(), 'r') as f1:
            #print f1
            output = []
            outfilename = filename.strip("\n").lstrip("./").replace("/", "__") + "out.txt"
            #print outfilename
            with open(outfilename, 'a') as outfile:
                line = f1.readline()
                while line and not line.startswith('GO-ID'):
                    line = f1.readline()
                data = f1.readlines()
                for line in data: 
                    line = line.split("\t")
                    GOnr = line[0].lstrip()
                    pvalue = line[1].strip()
                    corrpval = float(line[2].strip())
                    if corrpval <= 0.05:
                        outstring = "GO:"+"%s %s" % (GOnr, str(corrpval))
                        outfile.write(outstring + "\n")
                        #print outstring

I'm looking for the most straightforward approach to have each loop save its outfile in the location identical to the filename's input path.

I suppose I have to use the sys module, but after reading the Python documentation I still don't quite understand how to use sys.stdin and sys.stdout.

Instead I've been trying this approach by defining a function upfront that reformats the input directories from the filelist, generating a full path for each new out.txt file.

def output_name(input_file):
    file_line=input_file.strip()
    line_as_list=file_line.split("/")
    line_as_list.append("out.txt")     # file name
    line_as_list.remove(line_as_list[-2])  # remove file name of input file from path
    full_output_name="/".join(line_as_list) #join to add leading and intermittent `/` 
    return full_output_name

When I run this snippet interactively, it does what it needs to, e.g. output_name("./A_blurb/test.txt") == "./A_blurb/out.txt". However, when I run it at the command line I get this message: return full_output_name \n SyntaxError: 'return' outside function

I carefully checked the indentation but can't find the cause of this error message. Thanks.

score 1 · Answer 1 · answered Oct 12 '14 at 01:29

The code at the end of the question is actually working fine. So below is a working answer to my question.

Given this list of files to loop through

string = """
./A_blurb/test.txt
./B_foo/bar.txt
./B_foo/bric.txt
"""

The function below takes one path in that format, drops the input file name, and appends out.txt in its place:

def output_name(name_in):
    file_line = name_in.strip()
    line_as_list = file_line.split("/")
    line_as_list.append("out.txt")     ## generate file name
    del line_as_list[-2]  ## remove the input file name (by position, not by value)
    full_output_name="/".join(line_as_list) # join fields in the list with `/`
    return full_output_name # return the re-formatted file path

This is the output:

./A_blurb/out.txt
./B_foo/out.txt
./B_foo/out.txt

The main script then loops through the list and passes each reformatted path to open(outfilename, 'w'), so that every 'out.txt' file is written to the same directory as the input file it was derived from.
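
A minimal sketch of that main loop, assuming Python 3, forward-slash paths, and the tab-separated layout from the question (a header line starting with GO-ID, then rows of GO number, p-value, corrected p-value). The names filter_go_terms and process_file_list are hypothetical, and output_name here is a compact variant of the function above:

```python
def output_name(name_in):
    """Swap the file name at the end of a path for out.txt."""
    parts = name_in.strip().split("/")
    parts[-1] = "out.txt"
    return "/".join(parts)


def filter_go_terms(lines, cutoff=0.05):
    """Yield 'GO:<id> <corrected p-value>' for rows at or below the cutoff."""
    rows = iter(lines)
    # Skip everything up to and including the header line.
    for line in rows:
        if line.startswith("GO-ID"):
            break
    for line in rows:
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # blank or malformed row
        corrpval = float(fields[2].strip())
        if corrpval <= cutoff:
            yield "GO:%s %s" % (fields[0].strip(), corrpval)


def process_file_list(list_path):
    """Write one out.txt next to every input file named in list_path."""
    with open(list_path) as file_list:
        for name in file_list:
            name = name.strip()
            if not name:
                continue  # ignore empty lines in the list
            with open(name) as infile, open(output_name(name), "w") as outfile:
                for result in filter_go_terms(infile):
                    outfile.write(result + "\n")
```

Calling process_file_list(sys.argv[1]) at the bottom of the script would run it as in the question. Note that ./B_foo/bar.txt and ./B_foo/bric.txt both map to ./B_foo/out.txt, so with mode 'w' only the results of the last one survive; mode 'a' would accumulate both.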

score 0 · Answer 2 · answered Oct 11 '14 at 21:21

Your script saves the file to an output path deduced from the input path.

That's ok. You shouldn't try to read and rewrite a file at the same time. It's complicated. Creating another file and then moving it to overwrite the original is easier.

Try os.rename() (or perhaps shutil.move(), also in the standard library):

# After closing the output file and the input file
os.rename(temporary_output_path, input_path)
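
A minimal sketch of that write-then-replace pattern, assuming Python 3.3 or later. The helper name rewrite_in_place and its transform callback are hypothetical; os.replace is used here in place of os.rename because it also overwrites an existing destination on Windows:

```python
import os
import tempfile


def rewrite_in_place(input_path, transform):
    """Replace input_path with transform() applied to each of its lines."""
    directory = os.path.dirname(os.path.abspath(input_path))
    # Create the temporary file in the same directory so the final
    # replace stays on one filesystem.
    fd, temporary_output_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as outfile, open(input_path) as infile:
            for line in infile:
                outfile.write(transform(line))
        # Both files are closed at this point, so the swap is safe.
        os.replace(temporary_output_path, input_path)
    except Exception:
        # Clean up the temporary file if anything went wrong.
        os.remove(temporary_output_path)
        raise
```

For example, rewrite_in_place("./B_foo/bar.txt", str.upper) would upper-case the file's contents without ever reading and writing the same file handle.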

salezica

Hi, I'm not quite sure I understand. Contrary to what you're saying, the code saves the files in the directory from which I'm running the script, not in the output directories deduced from the input paths. – oaklander114 Oct 11 '14 at 21:25
I thought you wanted to rewrite the original files. If you just want to save them to a relative path, try using `os.path.join()` and the other functions in that module. – salezica Oct 11 '14 at 21:27
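
A minimal sketch of that suggestion, assuming the output should sit next to each input file. The helper name output_path_for and the _out.txt suffix are hypothetical; keeping the input's base name in the output name stops two inputs in the same directory from sharing one output file:

```python
import os.path


def output_path_for(input_path, suffix="_out.txt"):
    """Build an output path in the same directory as input_path."""
    input_path = input_path.strip()
    directory = os.path.dirname(input_path)      # e.g. "./B_foo"
    base_name = os.path.basename(input_path)     # e.g. "bar.txt"
    stem = os.path.splitext(base_name)[0]        # e.g. "bar"
    return os.path.join(directory, stem + suffix)


for name in ["./A_blurb/test.txt", "./B_foo/bar.txt", "./B_foo/bric.txt"]:
    print(output_path_for(name))
```

Because the path is taken apart and rebuilt with os.path functions, the code does not depend on splitting on "/" by hand.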
