I have a CSV file. Here are the header and three sample rows from it:

c1,c2,c3,c4,c5
17939,2507974,11,DVD version has 1 hour of extras of 5 bonus matches including: - Stacy Keibler vs Torrie Wilson in a bikini contest. - A tour of Trish Stratus\' place. - Behind the scenes look at the WWE women division.,NULL
16641,2425413,11,"The Australian TV version had a scene included at the end where a cop car was driving in an alley way, narrowly missing someone walking. This scene was also used in the 1980 film, \"Alligator\".",NULL
127472,2130098,13,"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\\", which would be appropriate for a DOS/Windows system.",NULL

I would like to truncate column c4 to a maximum length (say 500 characters), keep everything else unchanged, and save the result to a new CSV file.

Here is my implementation.

import csv

with open("new_file_name.csv", 'w', newline='') as csvwriter:
    spamwriter = csv.writer(csvwriter, delimiter=',', quotechar='"', escapechar='\\')
    with open("old_file_name.csv", newline='') as csvreader:
        spamreader = csv.reader(csvreader, delimiter=',', quotechar='"', escapechar='\\')
        for row in spamreader:
            # c4 is the fourth column, i.e. index 3
            if len(row[3]) > 500:
                print("cut this line")
                row[3] = row[3][:500]
            spamwriter.writerow(row)

However, the CSV file that I obtained is

17939,2507974,11,DVD version has 1 hour of extras of 5 bonus matches including: - Stacy Keibler vs Torrie Wilson in a bikini contest. - A tour of Trish Stratus' place. - Behind the scenes look at the WWE women division.,NULL
16641,2425413,11,"The Australian TV version had a scene included at the end where a cop car was driving in an alley way, narrowly missing someone walking. This scene was also used in the 1980 film, ""Alligator"".",NULL
127472,2130098,13,"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\", which would be appropriate for a DOS/Windows system.",NULL

The backslash is missing in my new CSV file. What I want is:

17939,2507974,11,DVD version has 1 hour of extras of 5 bonus matches including: - Stacy Keibler vs Torrie Wilson in a bikini contest. - A tour of Trish Stratus\' place. - Behind the scenes look at the WWE women division.,NULL
16641,2425413,11,"The Australian TV version had a scene included at the end where a cop car was driving in an alley way, narrowly missing someone walking. This scene was also used in the 1980 film, \"Alligator\".",NULL
127472,2130098,13,"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\\", which would be appropriate for a DOS/Windows system.",NULL

I tried something like quoting=csv.QUOTE_ALL, but that also changes rows of my original CSV file whose c4 value is shorter than 500 characters. What I want is a new CSV file in which none of the original characters within the first 500 characters are changed.

Thanks.

Polaris
  • Not related, but there is no need to chain two `with open` statements. You can simply `with open('f1') as f1, open('f2', 'w') as f2:`, for example. That will improve the readability of your code, IMHO. – accdias Dec 01 '22 at 18:43
  • very good suggestion!!! Thank you so much! do you also know how to fix my issue? that will be great for me. – Polaris Dec 01 '22 at 18:47
  • I'm not sure, but I guess `csv.writer(csvwriter, quoting=csv.QUOTE_ALL)` can help you. – accdias Dec 01 '22 at 18:55

1 Answer

You can use doublequote=False in csv.writer:

import csv

# newline="" lets the csv module handle line endings itself
with open("input.csv", "r", newline="") as f_in, open("output.csv", "w", newline="") as f_out:
    reader = csv.reader(f_in, delimiter=",", quotechar='"', escapechar="\\")
    writer = csv.writer(
        f_out,
        delimiter=",",
        quotechar='"',
        escapechar="\\",
        doublequote=False,
    )

    writer.writerow(next(reader))
    for row in reader:
        row[3] = row[3][:500]
        writer.writerow(row)

The output.csv becomes:

c1,c2,c3,c4,c5
17939,2507974,11,DVD version has 1 hour of extras of 5 bonus matches including: - Stacy Keibler vs Torrie Wilson in a bikini contest. - A tour of Trish Stratus' place. - Behind the scenes look at the WWE women division.,NULL
16641,2425413,11,"The Australian TV version had a scene included at the end where a cop car was driving in an alley way, narrowly missing someone walking. This scene was also used in the 1980 film, \"Alligator\".",NULL
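
As for why `Trish Stratus\'` loses its backslash in the output above: the reader decodes escape sequences while parsing, and the writer only re-escapes characters it considers special. A minimal sketch of that round trip, using a shortened in-memory row made up for illustration (the helper name `round_trip` is mine, not part of the `csv` module):

```python
import csv
import io


def round_trip(line):
    """Parse one CSV line, then write it back with the same escape settings."""
    # The reader turns \' into a plain ' while decoding the field.
    row = next(csv.reader([line], delimiter=",", quotechar='"', escapechar="\\"))
    buf = io.StringIO()
    writer = csv.writer(
        buf, delimiter=",", quotechar='"', escapechar="\\", doublequote=False
    )
    # The writer sees a plain ' and has no reason to escape it again.
    writer.writerow(row)
    return row, buf.getvalue()


row, written = round_trip("1,2,3,Trish Stratus\\' place,NULL")
print(row[3])           # decoded field, backslash already gone
print(written, end="")  # re-serialized line, still without the backslash
```

So the backslash is already gone once the row is parsed. The decoded value no longer records that it was there, which is why no writer option can bring it back for this row.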
Andrej Kesely
  • @Polaris Please, edit your question and put the line there. – Andrej Kesely Dec 01 '22 at 19:14
  • Thank you so much!!! It works for this line. However, I have another line which is `127472,2130098,13,"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\\", which would be appropriate for a DOS/Windows system.",NULL` and it becomes `127472,2130098,13,"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\", which would be appropriate for a DOS/Windows system.",NULL` And it has one backslash missing (from `\"C:\\\"` to `\"C:\\"`). Sorry for this stupid question. – Polaris Dec 01 '22 at 19:17
  • Just to be clear: the original CSV is valid, but the modified one is ambiguous, and some systems reject it as invalid when I import it. For this line `127472,2130098,13,"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\", which would be appropriate for a DOS/Windows system.",NULL`, they think c4 is `"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\"`, instead of `"FACT: Dunn uploads a file from an Apple Powerbook in \"C:\\\", which would be appropriate for a DOS/Windows system."` – Polaris Dec 01 '22 at 19:25
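
One way to sidestep the lost backslash is to skip the `csv` writer and truncate the raw text of each line, so that every character that is kept is copied verbatim. A rough sketch, not a definitive implementation: it assumes each record sits on one physical line and that quotes inside fields are always backslash-escaped (never doubled). The helper names `truncate_raw_field` and `truncate_file` are hypothetical.

```python
def truncate_raw_field(line, index, limit,
                       delimiter=",", quotechar='"', escapechar="\\"):
    """Cut field `index` of a raw CSV line to `limit` logical characters.

    An escape pair such as \\" counts as one logical character and is
    either kept whole or dropped whole. Everything outside the target
    field, and the field's own quotes, are copied unchanged.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # preserve the original line terminator
    out = []
    field = 0                          # index of the field being scanned
    in_quotes = False
    kept = 0                           # logical characters kept in the target field
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == escapechar and i + 1 < len(body):
            token, is_data = body[i:i + 2], True   # escape pair
        elif ch == quotechar:
            in_quotes = not in_quotes
            token, is_data = ch, False             # structural quote
        elif ch == delimiter and not in_quotes:
            field += 1
            token, is_data = ch, False             # field separator
        else:
            token, is_data = ch, True              # ordinary character
        i += len(token)
        if field == index and is_data:
            if kept >= limit:
                continue                           # past the limit: drop it
            kept += 1
        out.append(token)
    return "".join(out) + ending


def truncate_file(src_path, dst_path, index=3, limit=500):
    """Apply truncate_raw_field to every line of src_path, writing dst_path."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            dst.write(truncate_raw_field(line, index, limit))


sample = 'a,b,c,"x\\"yz",e\n'
print(truncate_raw_field(sample, 3, 2), end="")   # prints a,b,c,"x\"",e
```

Rows whose c4 is within the limit come out character for character identical, including sequences like `\"C:\\\"`. Calling `truncate_file("old_file_name.csv", "new_file_name.csv")` would process a whole file under the same assumptions.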