Here's a sample of my data:

from io import StringIO

data = StringIO("""software,version
Visual C++ Minimum Runtime,11.0.61030
Visual C++ Minimum Runtime,11.0.61030
Visual C++ Minimum Runtime,11.0.61030.0.0.0.0""")

Notice that in the last record the version number has a trailing .0.0.0.0.

How can I keep only the first three components of the version (xx.yy.zz) and strip everything after them?

As an example: Visual C++ Minimum Runtime,11.0.61030.0.0.0.0 should be truncated to:

"Visual C++ Minimum Runtime,11.0.61030"

Is there an efficient way to accomplish this?

Alexander L. Hayes
  • You could use the `csv` and `re` modules. `csv.reader` will read the rows line by line; use a regular expression from `re` to truncate the string, and then `csv.writer` to write the result. – tdelaney Dec 14 '22 at 01:43
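
A minimal sketch of the approach suggested in the comment above, run against the sample data from the question. It assumes the version always sits in the second of exactly two columns and that "xx.yy.zz" means the first three dot-separated numeric components; the `truncate_version` helper and the `VERSION_PREFIX` pattern are illustrative names, not part of the original post.

```python
import csv
import re
from io import StringIO

# Matches up to the first three dot-separated numeric components (assumed xx.yy.zz format).
VERSION_PREFIX = re.compile(r"\d+(?:\.\d+){0,2}")

def truncate_version(version):
    """Return the leading xx.yy.zz part, or the input unchanged if it does not start with digits."""
    match = VERSION_PREFIX.match(version)
    return match.group(0) if match else version

data = StringIO("""software,version
Visual C++ Minimum Runtime,11.0.61030
Visual C++ Minimum Runtime,11.0.61030
Visual C++ Minimum Runtime,11.0.61030.0.0.0.0""")

out = StringIO()
reader = csv.reader(data)
writer = csv.writer(out, lineterminator="\n")
writer.writerow(next(reader))  # copy the header row unchanged
for software, version in reader:  # assumes exactly two columns per row
    writer.writerow([software, truncate_version(version)])

print(out.getvalue())
```

Versions that already have three or fewer components pass through unchanged, so the same loop can be applied to every row without checking for the trailing zeros first.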

1 Answer

You could use a generator to load the file row by row and then write the truncated rows to a separate file, e.g.:

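import csv

filename = "foo.csv"

def get_row(filename):
    # In Python 3, open CSV files in text mode with newline="" (not "rb").
    with open(filename, newline="") as csvfile:
        # Yield every row lazily, one at a time, not just the first one.
        yield from csv.reader(csvfile)

with open('truncated.csv', 'w', newline='') as truncatedcsv:
    writer = csv.writer(truncatedcsv, delimiter=',')
    for row in get_row(filename):
        truncated_row = row  # placeholder: apply your truncation logic here
        writer.writerow(truncated_row)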

Don't forget to rename the new file and delete the old one.
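
As one possible way to fill in the truncation step and the final rename, here is a hedged sketch that splits the version on dots instead of using a regular expression. The helper names `truncate_version` and `truncate_file`, the `.tmp` suffix, and the assumption that the version is in column index 1 are all illustrative. `os.replace` renames the new file over the old one, so no separate delete is needed.

```python
import csv
import os
import tempfile

def truncate_version(version, parts=3):
    """Keep only the first `parts` dot-separated components of a version string."""
    return ".".join(version.split(".")[:parts])

def truncate_file(filename, version_column=1):
    """Rewrite `filename` with the version column truncated (assumes a header row)."""
    tmp_name = filename + ".tmp"  # hypothetical temporary file name
    with open(filename, newline="") as src, open(tmp_name, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # header row, copied unchanged
        for row in reader:
            row[version_column] = truncate_version(row[version_column])
            writer.writerow(row)
    # Replace the original file with the truncated copy.
    os.replace(tmp_name, filename)

# Demonstration on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "foo.csv")
    with open(path, "w", newline="") as f:
        f.write("software,version\nVisual C++ Minimum Runtime,11.0.61030.0.0.0.0\n")
    truncate_file(path)
    with open(path, newline="") as f:
        result = f.read()

print(result)
```

The sketch does not handle empty files or blank rows; a real script would need to guard against both before indexing into a row.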