I have this iteration:

with open("myFile.txt", "r") as landuse:
    next(landuse)
    for j in landuse:
        landuseList = j.split(";")
        clcKlasse = landuseList[2].strip()
        landusePlz = landuseList[3].strip("\"")
        landuseArea = landuseList[6].strip()
        landuseAreaFloat = float(landuseArea.replace("," , "."))
        if landusePlz in dictPlz:
            areaPlz = dictPlz.get(landusePlz)
            relativeShare = (landuseAreaFloat * 100) / areaPlz
            nf.write(str(clcKlasse) + "\t" + str(relativeShare) + "\t")
            prevAreaPlz = areaPlz
    print "Done"

I need this structure in my file (nf):

PLZ    "abc"    "def"    "ghi"    "jkl"    "mnl"    "opq"
1       7.54     1.20    9.98     19.57     8.68    2.15

PLZ     "abc"
2       10.17     

...

And this is the file I read from:

"CLCKlasse";"PLZ";"area"
"abc";"1";7.54
"def";"1";1.20
"ghi";"1";9.98   
"jkl";"1";19.57
"mnl";"1";8.68
"opq";"1";2.15
"abc";"2";10.17

...

As you can see, each line relates to a PLZ. However, I need each PLZ written to nf only once: a header line, followed by one line holding all of its corresponding values.

four-eyes
  • I'd recommend you split the task into two (maybe more?) parts (generating the header and the actual content), then merge them and write the result to the file. Keeping your data in memory and writing only the final result, instead of writing each line, would also be a better way to do this, I think. – konart May 18 '15 at 08:58
  • So there's a blank line between each group of PLZ in the input file? You also have two lines that start with "jkl" but none with "mnl" — yet you output values for the latter in the desired output file (`nf`). – martineau May 18 '15 at 10:10
  • @martineau sorry, that was a typo. – four-eyes May 18 '15 at 10:38
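
konart's suggestion, collecting everything in memory and writing the result once, could be sketched roughly as follows. This is only an illustration: the function name `format_by_plz` is hypothetical, and the three-column layout is assumed from the sample input shown in the question, not from the real file.

```python
from collections import OrderedDict


def format_by_plz(lines):
    """Group (class, value) pairs per PLZ and return the output text."""
    groups = OrderedDict()  # PLZ -> list of (class name, value), in input order
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # assumed layout: "CLCKlasse";"PLZ";area
        clc, plz, area = line.split(";")
        groups.setdefault(plz.strip('"'), []).append((clc, area.strip()))

    blocks = []
    for plz, pairs in groups.items():
        header = "\t".join(["PLZ"] + [clc for clc, _ in pairs])
        values = "\t".join([plz] + [area for _, area in pairs])
        blocks.append(header + "\n" + values + "\n")
    return "\n".join(blocks)  # blank line between PLZ blocks


sample = [
    '"abc";"1";7.54',
    '"def";"1";1.20',
    '"abc";"2";10.17',
]
result = format_by_plz(sample)
print(result)
```

Because the rows are keyed by PLZ in a dictionary, this variant does not depend on the input being ordered by PLZ; the whole text is built first and can then be written with a single `nf.write(result)`.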

1 Answer

from operator import itemgetter
from itertools import groupby

# read the input file, skipping the header line and blank lines
with open('mytxt', 'r') as f:
    next(f)
    lines = [i.strip().split(';') for i in f if i.strip()]

# group the rows by the PLZ column (index 1)
groups = [(k.strip('"'), list(g)) for k, g in groupby(lines, itemgetter(1))]

# write one header line and one value line per PLZ
with open('out', 'w') as f_out:
    for plz, rows in groups:
        final_headers = ['PLZ'] + [i[0] for i in rows]
        # use the PLZ from the data rather than a running counter
        final_values = [plz] + [i[2].strip() for i in rows]
        f_out.write('\t'.join(final_headers) + '\n')
        f_out.write('\t'.join(final_values) + '\n')
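
One caveat worth noting about `groupby`: it only merges rows that sit next to each other, so a PLZ that reappears later in the file would start a second group. A small sketch of that behaviour, using made-up rows; sorting by the PLZ column first is one possible workaround, assuming the data fits in memory:

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows in which PLZ "1" appears in two separate runs
rows = [["abc", "1", "7.54"], ["abc", "2", "10.17"], ["def", "1", "1.20"]]

# Without sorting, "1" shows up as two separate groups
unsorted_keys = [plz for plz, _ in groupby(rows, key=itemgetter(1))]

# sorted() is stable, so rows keep their relative order within each PLZ.
# Note: the keys are strings here, so the sort order is lexicographic.
sorted_rows = sorted(rows, key=itemgetter(1))
grouped = {plz: [r[0] for r in grp]
           for plz, grp in groupby(sorted_rows, key=itemgetter(1))}

print(unsorted_keys)
print(grouped)
```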
Ajay
  • Thanks, I did not know that module. But I think that does not solve the issue. The problem with the `plz` remains: it occurs as often as `clcKlasse` and `relativeShare` do. And if I collect the values in a list, how do I tell the program to write that list to the file once it is "full", meaning the plz changes? – four-eyes May 18 '15 at 09:18
  • @Stoffer Please post (1) part of your text file and (2) the expected output. Both are required to fully understand what you want. – Ajay May 18 '15 at 09:22
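
The question raised in the first comment, how to notice that the PLZ has changed and the collected list is complete, can be handled by remembering the previous PLZ while looping. A minimal sketch for Python 3, assuming the rows arrive ordered by PLZ; `write_blocks`, `_write_block` and the `(clc, plz, share)` tuples are hypothetical stand-ins for the values computed in the question's loop:

```python
import io


def _write_block(out, plz, names, values):
    """Write the header line and the value line for one PLZ."""
    out.write("\t".join(["PLZ"] + names) + "\n")
    out.write("\t".join([plz] + values) + "\n")


def write_blocks(rows, out):
    """Write one block per PLZ; rows must be ordered by PLZ."""
    prev_plz = None
    names, values = [], []
    for clc, plz, share in rows:
        if plz != prev_plz and names:
            # PLZ changed: the previous block is complete, so write it out
            _write_block(out, prev_plz, names, values)
            names, values = [], []
        prev_plz = plz
        names.append(clc)
        values.append(str(share))
    if names:
        # the last block is never followed by a change, so write it here
        _write_block(out, prev_plz, names, values)


buf = io.StringIO()  # stands in for the output file nf
write_blocks([('"abc"', "1", 7.54), ('"def"', "1", 1.2), ('"abc"', "2", 10.17)], buf)
print(buf.getvalue())
```

The final write after the loop is the easy part to miss: without it, the block for the last PLZ in the file would never be written.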