
I have a major issue with my CSV file. Can anyone suggest a possible Python solution to my problem?

In my CSV file, the 'remarks' text column contains newlines, so a single remark spills over onto the following lines and breaks the row structure. I tried reading the file as text and splitting it by newline and delimiter, but that is difficult because the newlines inside 'remarks' vary in number and position from row to row.

Below is a sample of the CSV file for reference. It is shown in text form with the delimiters written out (\t for tab, \n for newline) so that the format is easier to see. Any input would be appreciated.

Current File

key1\tkey2\tremarks\tdate_created\tprogram_type\n
1910-ASD3\tT342-1AE2\tJohan has applied for\n
this program on 2020-03-13, good application etc.\tprogram_A\n
9572-45A3\t823A-1T3C\tMary has applied for this program\n
on 2019-03-13, she has doubts about this program\n
so she switched her program on 2019-04-13 etc.\tprogram_B\n
842E-123A\t343D-6TYB\t\tnot enrolled\n

Desired Outcome

key1\tkey2\tremarks\tdate_created\tprogram_type\n
1910-ASD3\tT342-1AE2\tJohan has applied for this program on 2020-03-13, good application etc.\tprogram_A\n
9572-45A3\t823A-1T3C\tMary has applied for this program on 2019-03-13, she has doubts about this program so she switched her program on 2019-04-13 etc.\tprogram_B\n
842E-123A\t343D-6TYB\t\tnot enrolled\n
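
The transformation from the current file to the desired outcome could be sketched roughly as follows. This is a minimal sketch, not a tested solution for the real file. It assumes that every real record starts with two tab-separated keys shaped like `1910-ASD3` (a pattern inferred from the sample only), that the first non-empty line is the header, and that the remarks never contain a line that begins with such a key pair. The names `RECORD_START` and `merge_broken_rows` are made up for illustration.

```python
import re

# Assumption: a real record begins with two keys like "1910-ASD3", each
# followed by a tab. This pattern is guessed from the sample rows only.
RECORD_START = re.compile(r"^[0-9A-Z]{4}-[0-9A-Z]{4}\t[0-9A-Z]{4}-[0-9A-Z]{4}\t")


def merge_broken_rows(text):
    """Join continuation lines onto the record they belong to.

    `text` is the raw file content with real tab and newline characters.
    Returns a list of lines: the header followed by one line per record.
    """
    merged = []
    for line in text.splitlines():
        if not line.strip(" "):
            continue  # skip empty lines
        starts_record = RECORD_START.match(line) is not None
        # The first line is taken as the header, and the first line after
        # the header always opens a record, so nothing is glued to the header.
        if len(merged) < 2 or starts_record:
            merged.append(line)
        else:
            # Continuation of the previous remark: join with a single space.
            # Only spaces are trimmed so that tab delimiters are preserved.
            merged[-1] = merged[-1].rstrip(" ") + " " + line.lstrip(" ")
    return merged


if __name__ == "__main__":
    broken = (
        "key1\tkey2\tremarks\tprogram_type\n"
        "1910-ASD3\tT342-1AE2\tJohan has applied for\n"
        "this program on 2020-03-13, good application etc.\tprogram_A\n"
    )
    for row in merge_broken_rows(broken):
        print(row.split("\t"))
```

If the keys in the real file do not all follow this shape, the regular expression would need to be adjusted, and any remark line that happens to start with two key-like fields would be treated as a new record.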
Rootie
  • This is tricky because the number of lines a row can be split into seems to vary. Are you able to get a different format from the upstream producer of the data? If yes, I would ask for a new delimiter to be used in this file, one that is not `\n`. – gold_cy Sep 08 '20 at 17:34
  • Thanks for the prompt reply. The previous files had the remarks removed because of this issue, but a user has requested that the remarks column be shown for their daily work. If you know a workaround, do let me know, thanks! One thought I had was to insert quotes around the remarks column, but I'm not sure about the approach. – Rootie Sep 08 '20 at 17:38
  • Ask for the file to be created using a delimiter other than `\n`. That solution is straightforward and will save you any sort of wrangling. – gold_cy Sep 08 '20 at 17:59
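
The quoting idea raised in the comments could look roughly like the sketch below, assuming the upstream producer is able to write the file with Python's `csv` module. Quoting has to happen when the file is written, so this does not repair an already broken file. The sample rows are illustrative and not taken from the real data.

```python
import csv
import io

# Illustrative rows only; the remark deliberately contains a newline.
rows = [
    ["key1", "key2", "remarks"],
    ["1910-ASD3", "T342-1AE2", "Johan has applied for\nthis program"],
]

# newline="" leaves line endings untouched, as the csv module expects.
buffer = io.StringIO(newline="")

# QUOTE_MINIMAL wraps a field in quotes when it contains the delimiter,
# the quote character, or a line break.
writer = csv.writer(buffer, delimiter="\t", quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)

# Reading the data back keeps the multi-line remark in a single field.
buffer.seek(0)
round_tripped = list(csv.reader(buffer, delimiter="\t"))
print(round_tripped)
```

With the remarks quoted this way, a reader that understands CSV quoting sees one record per row even though the remark spans several physical lines.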

0 Answers