
I read my data from an S3 bucket into my Lambda function using:

tickets = s3_client.get_object(
    Bucket=bucket, Key=file_h
)["Body"].read().decode('utf-8').splitlines()

Then I used

tickets = pd.DataFrame(list(reader(tickets, delimiter=',')))

to convert it to a pandas DataFrame. I only want one header, but when I print the DataFrame it shows two. How can I keep only the second header, the one with the column names? The same goes for the indices: how can I keep only one?

   0             1  ...                 13                  14
0     kennzeichen1  ...                Lat                Long
1  0             M  ...         50.9347582           6.9456628
2  1            SU  ...         50.9350537           6.9439027
3  2            BM  ...  36.12707109999999  -79.86071799999999
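The printed output suggests that the file's own header row ended up as data row 0, and that the file's saved index ended up as column 0, while pandas added its default numeric labels on top. A minimal sketch of one way to promote them, assuming the first CSV row holds the column names and the first CSV column holds the saved index (the sample lines below are a hypothetical stand-in for the S3 data):

```python
import csv

import pandas as pd

# Hypothetical stand-in for the lines returned by .splitlines() on the S3 body.
lines = [
    ",kennzeichen1,Lat,Long",
    "0,M,50.9347582,6.9456628",
    "1,SU,50.9350537,6.9439027",
]

rows = list(csv.reader(lines, delimiter=","))

# Use the first parsed row as column names and the remaining rows as data.
tickets = pd.DataFrame(rows[1:], columns=rows[0])

# Assumption: the first column (empty header) is the index saved with the file.
tickets = tickets.set_index(tickets.columns[0])
tickets.index.name = None

print(tickets)
```

Note that csv.reader yields strings, so every value (including Lat and Long) stays a string unless it is converted afterwards.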
John Rotenstein
Jojo
  • what is the output of `list(reader(tickets, delimiter=','))`? – mozway Oct 20 '22 at 09:13
  • something like this: ```ln, ADOLPHSTR., 23', '50.9354076', '6.976208600000001'], ['9991', 'N', 'PKW', 'MERCEDES', 'GOTENRING', '8', '113140',``` – Jojo Oct 20 '22 at 09:41
  • I tried ```tickets.columns = tickets.columns.droplevel(-1) ``` but get: [ERROR] ValueError: Cannot remove 1 levels from an index with 1 levels: at least one level must be left. – Jojo Oct 20 '22 at 09:47
  • Since it's comma-separated data, you could use the pd.read_csv() function. Alternatively, there are a few nice ways of setting column headers in this answer: https://stackoverflow.com/questions/26147180/convert-row-to-column-header-for-pandas-dataframe – RodP Oct 23 '22 at 12:32
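The pd.read_csv() suggestion from the last comment could look roughly like the sketch below. It assumes the file's first row is the header and its first column is a saved index; csv_text is a hypothetical stand-in for the decoded body returned by get_object in the Lambda.

```python
import io

import pandas as pd

# Hypothetical stand-in for:
#   s3_client.get_object(Bucket=bucket, Key=file_h)["Body"].read().decode("utf-8")
csv_text = (
    ",kennzeichen1,Lat,Long\n"
    "0,M,50.9347582,6.9456628\n"
    "1,SU,50.9350537,6.9439027\n"
)

# read_csv treats the first row as the header by default;
# index_col=0 uses the first column as the index instead of adding a new one.
tickets = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(tickets)
```

Unlike the csv.reader route, read_csv also infers numeric types, so Lat and Long come back as floats rather than strings.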

0 Answers