I'm running the to_csv command as follows, writing to an output file in an S3 bucket with ServerSideEncryption enabled:

to_csv("s3://mys3bucket/result.csv",
       storage_option={'s3_additional_kwargs':
           {'ServerSideEncryption': 'AES256'}})

I'm getting the following attribute error:

 File "/usr/lib/python2.7/site-packages/dask/dataframe/core.py", line 1091, in to_csv
    return to_csv(self, filename, **kwargs)
  File "/usr/lib/python2.7/site-packages/dask/dataframe/io/csv.py", line 577, in to_csv
    delayed(values).compute(get=get, scheduler=scheduler)
  File "/usr/lib/python2.7/site-packages/dask/base.py", line 156, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/lib/python2.7/site-packages/dask/base.py", line 400, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/lib/python2.7/site-packages/distributed/client.py", line 2159, in get
    direct=direct)
  File "/usr/lib/python2.7/site-packages/distributed/client.py", line 1562, in gather
    asynchronous=asynchronous)
  File "/usr/lib/python2.7/site-packages/distributed/client.py", line 652, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/distributed/utils.py", line 275, in sync
    six.reraise(*error[0])
  File "/usr/lib/python2.7/site-packages/distributed/utils.py", line 260, in f
    result[0] = yield make_coro()
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 1099, in run
    value = future.result()
  File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 260, in result
    raise_exc_info(self._exc_info)
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 1107, in run
    yielded = self.gen.throw(*exc_info)
  File "/usr/lib/python2.7/site-packages/distributed/client.py", line 1439, in _gather
    traceback)
  File "/usr/lib/python2.7/site-packages/dask/dataframe/io/csv.py", line 439, in _to_csv_chunk
    df.to_csv(f, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pandas/core/frame.py", line 1745, in to_csv
    formatter.save()
  File "/usr/lib64/python2.7/site-packages/pandas/io/formats/csvs.py", line 161, in save
    buf = f.getvalue()
  File "/usr/lib/python2.7/site-packages/dask/bytes/utils.py", line 136, in __getattr__
    return getattr(self.file, key)
AttributeError: 'S3File' object has no attribute 'getvalue'

I searched for this error but couldn't find a relevant solution. Do you have any idea what is causing it?

Jacob Tomlinson
Dhruv Kumar
  • I'm afraid the exact same code works for me. Perhaps check your versions? – mdurant Jun 26 '18 at 13:22
  • @mdurant, here is the code snippet: `res=df1.join(df2,lsuffix='_left',rsuffix='_right') res.to_csv("s3://mys3bucket/result.csv", storage_option= {'s3_additional_kwargs': {'ServerSideEncryption': 'AES256'}})` Hope this helps. – Dhruv Kumar Jun 27 '18 at 05:08
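
Going only by the last two frames of the traceback, the error seems to arise when pandas calls `getvalue()` on the file object it was handed, and a wrapper forwards that lookup through `__getattr__` to an underlying file that has no such method. The sketch below reproduces that mechanism in isolation. `FakeS3File` and `FileProxy` are hypothetical stand-ins written for illustration; they are not the dask or s3fs classes, and the sketch does not touch S3.

```python
class FakeS3File:
    """Hypothetical stand-in for a write-only remote file: write() but no getvalue()."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)
        return len(data)


class FileProxy:
    """Hypothetical wrapper that forwards unknown attributes to the wrapped file."""

    def __init__(self, file):
        self.file = file

    def __getattr__(self, key):
        # Only called when normal lookup on the proxy fails,
        # so every missing name is looked up on the wrapped file.
        return getattr(self.file, key)


proxy = FileProxy(FakeS3File())
proxy.write("a,b\n1,2\n")  # forwarded to FakeS3File.write, works

try:
    proxy.getvalue()  # the wrapped file has no getvalue(), so this raises
except AttributeError as exc:
    message = str(exc)
    print(message)
```

If this reading is right, the failure depends on which pandas code path asks for `getvalue()`, which would fit the suggestion in the comments to check the installed versions. Separately, the keyword that dask's `to_csv` documents is `storage_options` (plural); the snippets above spell it `storage_option`, which may simply be a typo in the post.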
