
I have the following code:

inputActionFile = '../action.txt'
inputDaerahFile = '../daerah.txt'
inputStuffFile = '../stuff.txt'
inputTermsFile = '../terms.txt'

outputFile = 'hasil.out'

inputAction = open(inputActionFile, 'r')
inputDaerah = open(inputDaerahFile, 'r')
inputStuff = open(inputStuffFile, 'r')
inputTerms = open(inputTermsFile, 'r')

output = open(outputFile, 'w')

for actionLine in inputAction:
 for daerahLine in inputDaerah:
  for stuffLine in inputStuff:
   for termsLine in inputTerms:
    keyword = actionLine.strip() + ' ' + daerahLine.strip() + ' ' + stuffLine.strip() + ' ' + termsLine
    output.write(keyword)

inputAction.close()
inputDaerah.close()
inputStuff.close()
inputTerms.close()
output.close()

I expected this to loop through all four files and write every combination of their lines to the output file. However, only the innermost (fourth) loop actually iterates. I was doing a similar thing in Bash and want to see how to do it in Python. The Bash code is as follows:

#!/bin/sh
input1=$1
input2=$2
input3=$3
input4=$4
output=$5

echo "###START###" > $output
#old_IFS=$IFS
IFS='
'  # new field separator, EOL

for line1 in `cat $input1`;
do
 for line2 in `cat $input2`;
 do
  for line3 in `cat $input3`;
  do
   for line4 in `cat $input4`;
   do
    echo $line1 $line2 $line3 $line4 >> $output;
   done
  done
 done
done

unset IFS;
#IFS=$old_IFS
  • IMO it writes the first line of the first file X2*X3*X4 times (where Xn is the number of lines in file n). Is that what you want? Please correct me if I'm wrong. – Jahan Zinedine Dec 16 '10 at 08:42

3 Answers

Each loop goes through its file only once. After the innermost loop

for termsLine in inputTerms:

has run to completion the first time, inputTerms is at end-of-file, so every later pass that reaches this loop skips it immediately.
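
The exhaustion described above can be reproduced without any files on disk. This is a minimal sketch that assumes io.StringIO objects as stand-ins for the real input files (they iterate line by line the same way an open file does); the names outer, inner, and pairs are made up for illustration.

```python
import io

# Stand-ins for two small files; io.StringIO iterates line by line like a file object.
outer = io.StringIO("a\nb\n")
inner = io.StringIO("1\n2\n")

pairs = []
for outer_line in outer:
    # The inner object is fully consumed during the first outer pass,
    # so this loop body never runs again for later outer lines.
    for inner_line in inner:
        pairs.append(outer_line.strip() + " " + inner_line.strip())

print(pairs)  # ['a 1', 'a 2'] and nothing for 'b'
```

The second outer line produces no output at all, which is the same symptom as in the question.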

You need to either reopen each file inside its loop (or at least call seek(0) on it before iterating over it again), or read the files into lists in memory.

So, either:

inputAction = open(inputActionFile, 'r').readlines()
inputDaerah = open(inputDaerahFile, 'r').readlines()
inputStuff = open(inputStuffFile, 'r').readlines()
inputTerms = open(inputTermsFile, 'r').readlines()

Or:

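for actionLine in open(inputActionFile, 'r'):
 for daerahLine in open(inputDaerahFile, 'r'):
  for stuffLine in open(inputStuffFile, 'r'):
   for termsLine in open(inputTermsFile, 'r'):
    # termsLine keeps its trailing newline, which ends the output line
    keyword = actionLine.strip() + ' ' + daerahLine.strip() + ' ' + stuffLine.strip() + ' ' + termsLine
    output.write(keyword)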
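
The seek(0) option mentioned above is not shown in either snippet, so here is one way it might look. This is only a sketch: the function name write_combinations, the temporary file names, and the sample contents are all hypothetical. Unlike the code in the question, it strips every line and appends its own newline, so a last line without a trailing newline still ends up on its own output line.

```python
import os
import tempfile


def write_combinations(paths, out_path):
    """Write every combination of lines from four files, rewinding inner files with seek(0)."""
    action_path, daerah_path, stuff_path, terms_path = paths
    with open(action_path) as actions, open(daerah_path) as daerahs, \
            open(stuff_path) as stuffs, open(terms_path) as terms, \
            open(out_path, "w") as out:
        for action in actions:
            daerahs.seek(0)  # rewind before each new pass over this file
            for daerah in daerahs:
                stuffs.seek(0)
                for stuff in stuffs:
                    terms.seek(0)
                    for term in terms:
                        parts = (action, daerah, stuff, term)
                        out.write(" ".join(p.strip() for p in parts) + "\n")


# Demo with throwaway files (the contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    contents = ["buy\nsell\n", "bali\n", "car\nbike\n", "cheap\n"]
    paths = []
    for i, text in enumerate(contents):
        path = os.path.join(tmp, "in%d.txt" % i)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    out_path = os.path.join(tmp, "hasil.out")
    write_combinations(paths, out_path)
    with open(out_path) as f:
        result = f.read().splitlines()

print(result)
```

Each file is opened once and closed by the with statement, which avoids reopening the innermost file for every combination of the outer three.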
Lennart Regebro

Try:

inputAction = open(inputActionFile, 'r').readlines()
inputDaerah = open(inputDaerahFile, 'r').readlines()
inputStuff = open(inputStuffFile, 'r').readlines()
inputTerms = open(inputTermsFile, 'r').readlines()
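
Once the lines are in lists, the four nested loops can also be collapsed with itertools.product from the standard library. The following is a sketch rather than part of the answer above; the helper names read_lines and combine and the sample words are assumptions chosen for illustration.

```python
import itertools


def read_lines(path):
    """Return the lines of a file without trailing newlines, closing the file afterwards."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def combine(*line_lists):
    """Yield one space-joined string per combination; the last list varies fastest."""
    for combo in itertools.product(*line_lists):
        yield " ".join(combo)


# Made-up sample data standing in for the contents of the four input files.
sample = list(combine(["buy", "sell"], ["bali"], ["car"], ["cheap", "new"]))
print(sample)
```

With the real files, the call would presumably be combine(read_lines(inputActionFile), read_lines(inputDaerahFile), read_lines(inputStuffFile), read_lines(inputTermsFile)), writing each yielded string plus a newline to the output file.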
SingleNegationElimination
  • Oh my God. You guys are so quick and so helpful! I've just learned Python today and saw that in Python the task is taking .52s while in bash it's 14s and BaSH with for loop it takes 1:48. This is to create file with 250000 lines. Python rocks! – Jullian Gafar Dec 16 '10 at 08:00

This is your Bash version with some changes that may speed it up (plus a couple of other changes).

#!/bin/bash
# you had sh, but your question tag says "bash"
# if you really need Bourne shell portability, you should have tagged your
# question "sh" and not "bash"

input1=$1
input2=$2
input3=$3
input4=$4
output=$5

echo "###START###" > $output
#old_IFS=$IFS

IFS=$'\n'  # new field separator, EOL

while read -r line1
do
    while read -r line2
    do
        while read -r line3
        do
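            # prefix every line of input4 with the current triple (ENVIRON avoids escape processing)
            prefix="$line1 $line2 $line3" awk '{ print ENVIRON["prefix"], $0 }' "$input4"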
        done < "$input3"
    done < "$input2"
done < "$input1"   >> "$output"

By eliminating the inner for loop, this could be a lot faster than your version, depending on the size of the input4 file. Redirecting the output of the whole outer loop once, instead of appending to the file on every echo, may have additional speed benefits.

You could instead write while IFS=$'\n' read -r var on each loop, which limits the IFS change to the read command, so there would be nothing to save and restore (if that were necessary). Setting IFS once, as your original does and as I've reproduced in my revision, saves repeating it on every loop.

Dennis Williamson