I have a ".csv" file with multiple rows. The data looks like this:
GS3;724330300294409;50;BRABT;00147;44504942;01;669063000;25600;0
GS3;724330300294409;50;BRABT;00147;44504943;01;669063000;25600;0
GS3;724330300294409;50;BRABT;00147;44504944;01;669063000;25600;00004
I receive the data as whole rows (each file has almost 300,000 rows). I'm sending this data to Kafka, but I need the lines split into columns. For example:
Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9 Column10
GS3 724330300294409 50 BRABT 00147 44504942 01 669063000 25600 0
GS3 724330300294409 50 BRABT 00147 44504943 01 669063000 25600 0
GS3 724330300294409 50 BRABT 00147 44504944 01 669063000 25600 00004
I know the size for each value. For example:
3 (GS3)
15 (724330300294409)
2 (50)
5 (BRABT)
5 (00147)
8 (44504943)
2 (01)
10 (669063000)
5 (25600)
5 (0 )
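Since the widths are known, the slice boundaries can be computed from them instead of hard-coding each one. A minimal sketch (the sample line below is one of the rows above, written without separators):

```python
from itertools import accumulate

widths = [3, 15, 2, 5, 5, 8, 2, 10, 5, 5]
# Running totals give the slice offsets: [0, 3, 18, 20, 25, 30, 38, 40, 50, 55, 60]
offsets = [0] + list(accumulate(widths))

line = "GS3" "724330300294409" "50" "BRABT" "00147" "44504942" "01" "669063000 " "25600" "0    "
fields = [line[a:b].strip() for a, b in zip(offsets, offsets[1:])]
print(";".join(fields))
# GS3;724330300294409;50;BRABT;00147;44504942;01;669063000;25600;0
```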
I'm trying to do this through ksql on my Kafka platform but I'm struggling. I'm new to Python, but it seems like an easier way to do this before I send the data to Kafka.
I've been using the Spooldir CSV connector to send data to Kafka, but each whole row ends up as a single column in the topic.
I've used this Python snippet to add ";" between the fields:
commatype = ";"
result = ""
first = True
for line in arquivo:
    if first:  # skip the header line
        first = False
        continue
    # slice offsets follow the field widths listed above (3, 15, 2, 5, 5, 8, 2, 10, 5, 5)
    result = (result + line[0:3].strip() + commatype + line[3:18].strip() + commatype
              + line[18:20].strip() + commatype + line[20:25].strip() + commatype
              + line[25:30].strip() + commatype + line[30:38].strip() + commatype
              + line[38:40].strip() + commatype + line[40:50].strip() + commatype
              + line[50:55].strip() + commatype + line[55:60].strip() + "\n")
arquivo.close()
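One caveat for files of this size: repeatedly concatenating onto `result` copies the whole string on every row, which gets slow around 300,000 rows. A streaming sketch that writes each converted line as it goes (the file names in the commented-out usage are hypothetical):

```python
from itertools import accumulate

WIDTHS = [3, 15, 2, 5, 5, 8, 2, 10, 5, 5]
OFFSETS = [0] + list(accumulate(WIDTHS))  # cumulative slice boundaries

def split_line(line):
    """Slice one fixed-width record into stripped fields."""
    return [line[a:b].strip() for a, b in zip(OFFSETS, OFFSETS[1:])]

def convert(infile, outfile):
    """Skip the header, then write one ';'-joined record per input line."""
    next(infile, None)  # skip the header row
    for line in infile:
        outfile.write(";".join(split_line(line)) + "\n")

# Hypothetical file names:
# with open("entrada.csv") as src, open("saida.csv", "w") as dst:
#     convert(src, dst)
```

The same `convert` loop could instead hand each joined record to a Kafka producer, one message per row, rather than building the whole output in memory.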