
I'm pushing a dataframe to an s3 bucket using s3fs with the following code:

import s3fs

# Note: this rebinds the name s3fs from the module to the filesystem instance
s3fs = s3fs.S3FileSystem(anon=False)

with s3fs.open(f"bucket-name/csv-name.csv", 'w') as f:
    my_df.to_csv(f)

The upload completes successfully, but every other row in the resulting CSV is empty.

I'm sure this is not an issue with the dataframe, since I've also pushed the same CSV to S3 with a different method and it was properly formatted.

The code for it:

s3_res.Object(bucket_name, s3_object_name).put(Body=csv_buffer.getvalue())

Is there a setting I can use to fix or mitigate this?

    Reading around, it seems that a \r\n is added at the end of each line. Try adding newline=' ' to the .open() method such that `with s3fs.open(f"bucket-name/csv-name.csv",newline=' ','w') as f: my_df.to_csv(f)` – Trygvi Laksafoss Jul 12 '22 at 18:21
    I got it to work with your idea! I had to make two changes though: newline has to come after 'w', and newline=' ' is rejected as an illegal value, so changing it to '' makes it work. The working code looks like this: `with s3fs.open(f"bucket-name/csv-name.csv",'w',newline='') as f: my_df.to_csv(f)` You can submit that as an answer and I'll accept it. – ire Jul 13 '22 at 07:15

1 Answer


It seems the file object returned by s3fs in text mode adds an extra \r\n to the end of each line. Adding newline='' to the .open() call, after the mode argument, should solve it.

with s3fs.open(f"bucket-name/csv-name.csv", 'w', newline='') as f:
    my_df.to_csv(f)
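
If you want to see the effect without touching S3, here is a minimal local sketch. It assumes that s3fs opens text-mode files through a standard io.TextIOWrapper, which is my understanding and is not confirmed above. It passes newline='\r\n' explicitly to imitate the Windows default, and write_text is a hypothetical helper made up for the demonstration.

```python
import io


def write_text(data: str, newline) -> bytes:
    """Write data through a text wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(data)
    wrapper.flush()
    return raw.getvalue()


# A CSV row that already ends in \r\n, as a CSV writer may emit it.
row = "a,b\r\n"

# Assumption: newline="\r\n" stands in for the Windows default translation.
translated = write_text(row, newline="\r\n")

# newline="" switches the translation off, so the row is written unchanged.
untouched = write_text(row, newline="")

print(repr(translated))  # b'a,b\r\r\n'
print(repr(untouched))   # b'a,b\r\n'
```

The doubled carriage return in the first result is the kind of thing a spreadsheet may render as an empty row after every data row, which would match the symptom in the question.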