3

I have written code which writes to a CSV file, reading from another file. I want to write out particular columns from the input file, so I append them to a list, then separate them by a comma and add them to the row, but the output file shows that the individual words' characters are also separated by commas. I only want words to be separated, not the characters.

import csv
def csv_reader(file,path):
    with open(path, 'w') as f1, open(file, 'r') as f2:
        write = csv.writer(f1, delimiter=',')
        read  = csv.reader((line.replace('\0','') for line in f2), delimiter="\t")
        i=1
        for row in read:
            if(len(row)==0):
                continue
            if(row[3]=="Trade"):
                continue
            else:
                if(row[6]==""):
                    r = [row[0],row[0],'A',row[8],row[9],row[0]]
                    line = ','.join(r)
                    print(line)
                    write.writerow(line)
                else:
                    r = [row[0],row[0],'B',row[6],row[7],row[0]]
                    line = ','.join(r)
                    print(line)
                    write.writerow(line)
if __name__ == "__main__":
    path = "sales.csv"
    csv_path = "FlowEdge-TRTH-Time_Sales.csv"
    csv_reader(csv_path,path)

This shows output like:

    0,7,0,0,4,5,0,0,0,0,C,8,.,H,K,",",0,7,0,0,4,5,0,0,0,0,C,8,.,H,K,",",B,",",5,.,7,",",4,",",0,7,0,0,4,5,0,0,0,0,C,8,.,H,K

while it should be like:

    0700450000C8.HK,0700450000C8.HK,B,5.7,4,0700450000C8.HK

when I do the following modification

   write.writerow([line])

It shows the complete string in one column of excel file meaning there is only one column while I want 6 columns.

smci
  • 32,567
  • 20
  • 113
  • 146
abhihacker02
  • 51
  • 1
  • 9
  • first i suggest to split your code - one method doing everything is bad habbit - and when done that - split read and write in a smaller separate tasks using this advice: https://ericlippert.com/2014/03/05/how-to-debug-small-programs/ I believe you will easy find your mistake – Drako May 11 '18 at 06:28
  • 2
    The problem is that you're doing `line=','.join(r)`, which turns the list of columns into a single string with commas, and then passing that string to `writerow`, which will iterate the string, treating each character as its own column. Do one or the other, not both. – abarnert May 11 '18 at 06:29
  • @abarnert but when i pass [line] to row writer then the output file has only one column. And if i directly pass the r list then the output file shows weird texts which remains same even after encoding in unicode – abhihacker02 May 11 '18 at 06:34
  • @abhihacker02 Who told you to pass `[line]`? – abarnert May 11 '18 at 06:38
  • @abarnert okay i thought it myself but when i do as you told i get weird characters which remains same even after encoding in unicode – abhihacker02 May 11 '18 at 06:39
  • 1
    I don't know exactly what you mean by that, but (a) that sounds like an entirely new problem, which means you probably want to ask a new question, and (b) either way, you're going to have to give us a [mcve] rather than making us guess at what your input might be, what you might have gotten wrong in attempting to "encode in unicode", and what those "weird characters" might be. – abarnert May 11 '18 at 06:46
  • You could write all this in about 6 lines flat using pandas: `read_csv`, select desired columns, `to_csv`. Do you really want to create your own csv writer? – smci May 11 '18 at 06:53
  • It shows the following weird text -> 䥒ⱃ䥒ⱃⱂ楂⁤牐捩ⱥ楂⁤楓敺L勾䍉਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱂ⸵ⰷⰴ㜰〰㔴〰〰㡃䠮ോ《〷㐰〵〰䌰⸸䭈〬〷㐰〵〰䌰⸸䭈䈬㘬㘮ㄬ〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱂ⸷㌰ㄬ〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱁ㠱㌬〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱁ㜱㔬㐰〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〵〰䌰⸸䭈䄬ㄬ⸵㌱㔬ⰹ㜰〰㔴〰〰㡃䠮ോ《〷㐰〵〰䌰⸸䭈〬〷㐰〵〰䌰⸸䭈䈬ㄬ ... I had to delete some of it because it was too long – abhihacker02 May 11 '18 at 06:54
  • @smci But the columns are not always same .It depends on the file being readed so ultimately first i'll have to read it – abhihacker02 May 11 '18 at 06:55
  • @abhihacker02: then you can either articulate the criteria by which a column is kept/dropped, or parameterize the column-list as an argument into your function/executable. It's all very doable. I see you want to select between cols 6:7 and 8:9 on a per-row basis, but that can be done too. – smci May 11 '18 at 06:59

5 Answers5

2

The problem is here:

line=','.join(r)
print(line)
write.writerow(line)

The writerow method wants a list of columns. It will add the commas between the columns (and quote or escape anything that needs it, etc.).

But you're not giving it a list of columns; you're giving it a single string. That's what ','.join(r) does: turns a list of columns into a single comma-separated string.

When you give writerow a string, instead of a list of strings, it treats the string as a sequence of characters. (That's not specific to csv; in Python, a string is a sequence of characters.) So it treats each character as a column, and adds commas between them.

Just do this:

write.writerow(r)
abarnert
  • 354,177
  • 51
  • 601
  • 671
  • if i directly pass the r list then the output file shows weird texts which remains same even after encoding in unicode – abhihacker02 May 11 '18 at 06:36
1

To write rows in a csv file, you should contain each string in brackets. so if you're using csvwriter.writerrow() object if you have a string like 'first_name' you should use brackets around that string to avoid ',' between each character in the csv file.

csvwriter.writerow(['first_name'])

Also if you want to write more than one rows use nested brackets like below:

csvwriter.writerows([['first_name'],['last_name']])
pouya
  • 81
  • 5
0

If you just want to read particular set of columns from one .csv file and output them to another file, you might try pandas module.

import pandas as pd

data = pd.read_csv('to_load.csv', usecols=['col1', 'col2'])
data.to_csv('to_save.csv', index=False)

What it does it first loads csv to a variable named data. Only columns you define in usecols get imported. You can reference columns by their names or by index (eg. you can usecols=[0, 1, 2] - this will import first 3 columns).

Then the variable can be easily saved to csv file again thanks to .to_csv method.

umat
  • 607
  • 1
  • 13
  • 25
0

The csv.writerow() function takes a list of items and writes it to your file with the required delimiters (by default this is a comma). So for example:

The list ['a', 'b', 'c'] is written to the file as a,b,c

import csv

def csv_reader(file,path):
    with open(path, 'w', newline='') as f_output, open(file, 'r', newline='') as f_input:
        csv_output = csv.writer(f_output)
        csv_input = csv.reader((line.replace('\0', '') for line in f_input))

        for row in csv_input:
            if len(row) and row[3] != "Trade":
                if row[6] == "":
                    output_row = [row[0], row[0], 'A', row[8], row[9], row[0]]
                else:
                    output_row = [row[0], row[0], 'B', row[6], row[7], row[0]]

                print(','.join(output_row))
                csv_output.writerow(output_row)

if __name__ == "__main__":
    path="sales.csv"
    csv_path = "FlowEdge-TRTH-Time_Sales.csv"
    csv_reader(csv_path, path)

When using csv readers and writers, you should open the file with the newline='' argument.

Martin Evans
  • 45,791
  • 17
  • 81
  • 97
  • It shows the following weird text -> 䥒ⱃ䥒ⱃⱂ楂⁤牐捩ⱥ楂⁤楓敺L勾䍉਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱂ⸵ⰷⰴ㜰〰㔴〰〰㡃䠮ോ《〷㐰〵〰䌰⸸䭈〬〷㐰〵〰䌰⸸䭈䈬㘬㘮ㄬ〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱂ⸷㌰ㄬ〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱁ㠱㌬〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〰〰㡃䠮ⱋ㜰〰㔴〰〰㡃䠮ⱋⱁ㜱㔬㐰〬〷㐰〵〰䌰⸸䭈਍㜰〰㔴〵〰䌰⸸䭈䄬ㄬ⸵㌱㔬ⰹ㜰〰㔴〰〰㡃䠮ോ《〷㐰〵〰䌰⸸䭈〬〷㐰〵〰䌰⸸䭈䈬ㄬ ... I had to delete some of it because it was too long. – abhihacker02 May 11 '18 at 06:54
  • Could you edit the question to include some suitable sample lines from your CSV file? Or you could post a link to the whole file using something like pastebin. – Martin Evans May 11 '18 at 06:55
0

Thanks all but i found the error in my code

    read  = csv.reader((line.replace('\0','') for line in f2), delimiter="\t")

This was the error as i was using tab as the delimiter in a csv file.

Corrected to

    read  = csv.reader((line.replace('\0','') for line in f2), delimiter=",")
abhihacker02
  • 51
  • 1
  • 9