
I need help converting the file simple_line.txt to a CSV file using the pandas library. However, I can't work out how to split each line so that the image filename goes in one column and everything after the first space goes in a second column.

Here is the file (sample_list.txt), listed row by row:

Image            Label
doc_pres223.jpg Durasal
doc_pres224.jpg Tab Cefepime
doc_pres225.jpg Tab Bleomycin
doc_pres226.jpg Budesonide is a corticosteroid,
doc_pres227.jpg prescribed for inflammatory,

I want the CSV file to look like this: (image not included)

3 Answers

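txt_file = r"./example.txt"
csv_file = r"./example.csv"

separator = "; "

with open(txt_file) as f_in, open(csv_file, "w") as f_out:
    for line in f_in:
        # Split once on the first run of whitespace: filename, then the label.
        # This also copes with the padded header line.
        parts = line.split(maxsplit=1)
        if parts:  # skip blank lines
            f_out.write(separator.join(p.strip() for p in parts) + "\n")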
Sebastian Loehner

Try this:

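import pandas as pd

def write_file(filename, output):
    rows = []
    with open(filename, 'r') as f:
        next(f, None)  # skip the header line
        for line in f:
            # split once: the filename, then everything after it as the label
            arr = line.split(maxsplit=1)
            if len(arr) < 2:
                continue  # skip blank lines and lines without a label
            rows.append({'Image': arr[0], 'Label': arr[1].strip()})

    # DataFrame.append was removed in pandas 2.0, so build the frame in one go
    df = pd.DataFrame(rows, columns=['Image', 'Label'])
    # index=False keeps the row numbers out of the csv
    df.to_csv(output, index=False)


if __name__ == '__main__':
    write_file('example.txt', 'example.csv')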
Serhii Matvienko

If the filenames in the Image column are always the same length, you can treat the file as fixed width: the first column is 15 characters wide and the rest of the line is the second column. Then just add two empty columns and write the result to a new file.

# libraries
import pandas as pd

# set filename
filename = "simple_line.txt"

# read as fixed width
df = pd.read_fwf(filename, header=0, widths=[15, 100])

# add 2 empty columns
df.insert(1, 'empty1', '')
df.insert(2, 'empty2', '')

# save as a new csv file
filenew = "output.csv"
df.to_csv(filenew, sep=';', header=True, index=False)
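
The fixed-width approach depends on every filename having the same length, so it may be worth checking that first. The following is only a rough sketch: the helper names filenames_have_fixed_width and read_image_labels are made up for illustration, and it assumes the file has a single header row followed by lines of the form "filename label text".

```python
import pandas as pd


def filenames_have_fixed_width(path):
    """Return the common filename length, or None if the lengths differ."""
    with open(path) as f:
        next(f, None)  # skip the header row
        lengths = {len(line.split(maxsplit=1)[0]) for line in f if line.strip()}
    return lengths.pop() if len(lengths) == 1 else None


def read_image_labels(path):
    """Fallback: split each line once on whitespace, whatever the filename length."""
    rows = []
    with open(path) as f:
        next(f, None)  # skip the header row
        for line in f:
            parts = line.split(maxsplit=1)
            if parts:  # ignore blank lines
                label = parts[1].strip() if len(parts) > 1 else ""
                rows.append((parts[0], label))
    return pd.DataFrame(rows, columns=["Image", "Label"])
```

If filenames_have_fixed_width returns a number, that number could be used as the first width for read_fwf; if it returns None, read_image_labels gives the same two columns without relying on the width.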
BdR