12

I have a Python script that reads an Excel file from S3, but I get an error when it is triggered in AWS Batch. The same code works fine on another Ubuntu box.

AttributeError: 'StreamingBody' object has no attribute 'seek'

The section of my code that reads the Excel file is below:

import boto3
import pandas as pd

session = boto3.Session(aws_access_key_id=config.access_key_id, aws_secret_access_key=config.secret_access_key)
client = session.client('s3')
obj = client.get_object(Bucket=s3_bucket, Key=s3_file)
df = pd.read_excel(obj['Body'], sheet_name=sheet_name, skiprows=1)

Any help is much appreciated.

mtryingtocode

2 Answers

24

It seems that read_excel has changed its requirements for the file-like object passed in: the object now has to have a seek method, which StreamingBody does not provide. I solved this by changing pd.read_excel(obj['Body']) to pd.read_excel(io.BytesIO(obj['Body'].read())), which also needs import io at the top of the script.
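A minimal sketch of that workaround, under some assumptions: to_seekable and read_excel_from_s3 are hypothetical helper names, and FakeStreamingBody is only a stand-in for the S3 response body so the buffering step can be tried without AWS credentials. The client argument is assumed to be a boto3 S3 client like the one in the question.

```python
import io


def to_seekable(body):
    """Read a non-seekable stream fully into memory and return a seekable buffer."""
    return io.BytesIO(body.read())


def read_excel_from_s3(client, bucket, key, **read_excel_kwargs):
    """Hypothetical helper: fetch an S3 object and parse it with pandas.read_excel."""
    import pandas as pd  # imported lazily so to_seekable can be used without pandas

    obj = client.get_object(Bucket=bucket, Key=key)
    return pd.read_excel(to_seekable(obj["Body"]), **read_excel_kwargs)


class FakeStreamingBody:
    """Stand-in for the S3 response body: it has read() but no seek()."""

    def __init__(self, data):
        self._data = data

    def read(self):
        # Hand back everything once, like a stream that has been consumed.
        data, self._data = self._data, b""
        return data


# The wrapped buffer supports seek(), which the raw body does not.
buffer = to_seekable(FakeStreamingBody(b"example bytes"))
buffer.seek(0)
print(buffer.read())
```

With the code from the question, the call would look something like df = read_excel_from_s3(client, s3_bucket, s3_file, sheet_name=sheet_name, skiprows=1). Note that this loads the whole object into memory, which is fine for typical spreadsheets but worth keeping in mind for very large files.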

Rory
    This also worked for me for a similar error, 'StreamingBody' object has no attribute 'readable', when using pd.read_csv. I had to replace the get_object call that worked on my laptop with "readableSeg = client.get_object(Bucket=credentials_data['BUCKET'], Key='filename.csv')['Body']" and the read_csv call with "df = pd.read_csv(io.BytesIO(readableSeg.read()), encoding=encoding, sep=separator)". – Jose Rondon Jan 08 '21 at 12:12
0

Changing the pandas version may also do the job:

pip install --upgrade pandas==1.0.1