
I am working with a text file that looks like this.

I am trying to append each line to a list and turn it into a clean DataFrame. I used line.split() on each line before appending it, but because the values in COL J are blank for some rows, those rows end up with fewer fields and the remaining values shift into the wrong columns.

My code looks like this now.

import pandas as pd

my_array = []
with open('usesample01.txt') as my_file:
    for line in my_file:
        if 'RECORD' not in line and 'JOURNAL' not in line and 'VERSION' not in line and 'TIME' not in line and '-' not in line:
            my_array.append(line.split())

df = pd.DataFrame(my_array, columns = ['A','B','C','D','E','F','G','H','I','J','K','L','M'])
print(df)
lenoob

1 Answer


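Your file seems to have fixed-width columns. If that is the case, you can work out the width of each column and then use slicing to extract the fields, so a blank field stays an empty string instead of vanishing as it does with split(). Consider the following simple example, where file.txt contains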

----- ------- -----
Able          Baker
      Charlie      

then

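import pandas as pd
with open("file.txt","r") as f:
    # slice each line at fixed character positions: columns 0-4, 6-12 and 14 onward;
    # strip() removes padding and the trailing newline, so blank fields become ''
    data = [(i[:5].strip(), i[6:13].strip(), i[14:].strip()) for i in f.readlines()]
df = pd.DataFrame(data,columns=['A','B','C'])
print(df)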

gives output

       A        B      C
0  -----  -------  -----
1   Able           Baker
2         Charlie       
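
If the dashed ruler line really does mark the column boundaries, the slice positions do not have to be counted by hand. The following is only a sketch under that assumption: it derives the positions from the ruler with a regular expression and passes them to pandas.read_fwf, which reads fixed-width files. The sample text is held in an in-memory io.StringIO purely so the snippet runs on its own, and the column names A, B, C are the ones from the example above, not from your real file.

```python
import io
import re

import pandas as pd

# Same sample as in the example above, kept in memory so the sketch is self-contained.
sample = (
    "----- ------- -----\n"
    "Able          Baker\n"
    "      Charlie      \n"
)

# Assumption: the first line is a ruler whose runs of dashes mark each column.
ruler = sample.splitlines()[0]
colspecs = [m.span() for m in re.finditer(r"-+", ruler)]  # half-open (start, end) pairs

df = pd.read_fwf(
    io.StringIO(sample),
    colspecs=colspecs,
    names=["A", "B", "C"],
    header=None,
    skiprows=1,  # drop the ruler line itself
)

# Blank fields are read as missing values; turn them back into empty strings.
df = df.fillna("")
print(df)
```

With a real file you would pass its path to read_fwf instead of the StringIO object, and adjust skiprows to skip whatever header lines come before the data.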
Daweo