I have a text file with 300,000 records. A sample is below:

{'AIG': 'American International Group', 'AA': 'Alcoa Corporation', 'EA': 'Electronic Arts Inc.'}

I would like to export the records into a csv file like this:

[Image: screenshot of the desired CSV layout]

I tried the following, but it does not put any line breaks between the records, so in Excel the 300,000 records end up in two rows. That does not fit, because Excel is limited to 16,384 columns per sheet.

import pandas as pd

read_file = pd.read_csv(r'C:\...\input.txt')
read_file.to_csv(r'C:\...\output.csv', index=None)
HTMLHelpMe

1 Answer

You can try:

import pandas as pd
from ast import literal_eval

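with open('your_file.txt', 'r', encoding='utf-8') as f_in:
    # The file holds a single Python dict literal; literal_eval parses it
    # without executing arbitrary code.
    data = literal_eval(f_in.read())

# One row per record: the key becomes 'Ticker', the value becomes 'Name'.
df = pd.DataFrame([{'Ticker': k, 'Name': v} for k, v in data.items()])
print(df)

# To save to CSV:
# df.to_csv('out.csv', index=False)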

Prints:

  Ticker                          Name
0    AIG  American International Group
1     AA             Alcoa Corporation
2     EA          Electronic Arts Inc.
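
If pandas is not needed for anything else, the same conversion can probably be done with the standard library alone. The following is a minimal sketch under that assumption: the helper name `dict_text_to_csv` is made up for illustration, the column names are the ones used above, and an in-memory buffer stands in for the real output file.

```python
import csv
import io
from ast import literal_eval


def dict_text_to_csv(text, out_file):
    """Write one CSV row per key/value pair of a dict literal (hypothetical helper)."""
    data = literal_eval(text)  # parses the literal without executing code
    writer = csv.writer(out_file)
    writer.writerow(["Ticker", "Name"])  # header row, same column names as above
    writer.writerows(data.items())  # each (key, value) pair becomes one row
    return len(data)


# Sample taken from the question; a real run would pass f_in.read() instead.
sample = "{'AIG': 'American International Group', 'AA': 'Alcoa Corporation', 'EA': 'Electronic Arts Inc.'}"

# StringIO stands in for open('out.csv', 'w', newline='', encoding='utf-8').
buffer = io.StringIO()
row_count = dict_text_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file, opening it with `newline=''` lets the `csv` module control the line endings itself, which avoids blank lines between rows on Windows.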
Andrej Kesely
  • I do get `UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 175109: character maps to <undefined>` probably because it is such a large file and there might be some missing ":", etc. – HTMLHelpMe Mar 09 '23 at 23:01
  • @HTMLHelpMe This is probably because of the `print(df)`. If you remove it, do you still get the error? Also try `df.to_csv('out.csv', index=False, encoding='utf-8')` – Andrej Kesely Mar 09 '23 at 23:02
  • Thank you. Yes, still there. Tried `df.to_csv('out.csv', index=False, encoding='utf-8')` as well. Here is additional log detail: `----> 5 data = literal_eval(f_in.read())`. `C:\ProgramData\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final) 21 class IncrementalDecoder(codecs.IncrementalDecoder): 22 def decode(self, input, final=False): ---> 23 return codecs.charmap_decode(input,self.errors,decoding_table)[0] 24 25 class StreamWriter(Codec,codecs.StreamWriter):` – HTMLHelpMe Mar 09 '23 at 23:10
  • @HTMLHelpMe Try to open the file `with open('your_file.txt', 'r', encoding='utf-8') as f_in:` – Andrej Kesely Mar 09 '23 at 23:13
  • Perfect! Works great with that last change. Thank you so much :) – HTMLHelpMe Mar 09 '23 at 23:38
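
The traceback in the comments points at `cp1252.py`, which suggests that `open()` without an `encoding` argument fell back to the Windows default code page, where byte 0x8d has no mapping. The sketch below reproduces that failure on a small scale, assuming the file is really UTF-8; the record in it is hypothetical and was chosen only because 'č' is encoded in UTF-8 as the bytes 0xC4 0x8D.

```python
# Hypothetical record, not taken from the original file.
raw = "{'XYZ': 'Společnost'}".encode("utf-8")

try:
    # Roughly what open() does implicitly when cp1252 is the default encoding.
    raw.decode("cp1252")
    cp1252_error = None
except UnicodeDecodeError as exc:
    cp1252_error = exc

# Decoding with the encoding the bytes were written in succeeds.
decoded = raw.decode("utf-8")

print(cp1252_error)
print(decoded)
```

Passing `encoding='utf-8'` to `open()` applies the second decoding path to the whole file, which matches the change that resolved the error in this thread.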