Reading an Excel file from S3 using pandas in Lambda and converting it to CSV

Question

I'm trying to read an Excel file from an S3 bucket using Python in AWS Lambda, do some manipulation with pandas, convert it to CSV, and put it back in the same bucket.

    import os
    from urllib.parse import unquote_plus

    import boto3
    import numpy as np
    import pandas as pd

    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        if event:
            file_obj = event['Records'][0]
            bucket = file_obj['s3']['bucket']['name']
            key = file_obj['s3']['object']['key'].encode('utf-8')
            # The next line throws: "a bytes-like object is required, not 'str': TypeError"
            filename = unquote_plus(key)

            print("Filename:", filename)
            q = os.path.splitext(os.path.basename(filename))[0]  # file name without extension
            print(q)
            obj = s3.get_object(Bucket=bucket, Key=filename)
            print(obj['Body'])
            df = pd.read_excel(obj['Body'], index_col=False, header=5, usecols="A,C:M,U")
            df = df[:-1]                 # drop the last row
            df = df.replace(np.nan, '')  # replace NaN with empty strings

            print(df)
            dfcsv = df.to_csv('s3://bucket/sales.csv', sep='\t', encoding='utf-8', index=False)  # convert to CSV

This throws the error:

Install xlrd >= 1.0.0 for Excel support: ImportError

Also, this code works perfectly fine in my local environment but fails in Lambda.

I tried to import xlrd, but it throws a syntax error.

Also, is there a better way of writing the code for my requirement?
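
For reference, here is a minimal sketch of one way the handler could be structured. It is not a confirmed fix. It assumes the `TypeError` comes from passing bytes to `unquote_plus`, which expects a `str` in Python 3, so the `.encode('utf-8')` call is dropped. It also assumes the whole object fits in memory and that an Excel engine such as xlrd is packaged with the function. The helper names `decode_s3_key` and `csv_key_for`, and the rule of writing the output next to the input with a `.csv` extension, are hypothetical choices made for illustration.

```python
import io
import os
from urllib.parse import unquote_plus


def decode_s3_key(raw_key: str) -> str:
    """Decode the URL-encoded key from an S3 event (a str, so no .encode())."""
    return unquote_plus(raw_key)


def csv_key_for(key: str) -> str:
    """Hypothetical naming rule: same prefix and base name, .csv extension."""
    base, _ext = os.path.splitext(key)
    return base + ".csv"


def lambda_handler(event, context):
    # Imported here so the helpers above can be used without third-party packages.
    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = decode_s3_key(record["s3"]["object"]["key"])

    # Skip non-Excel objects so the CSV written below is not processed again.
    if not key.lower().endswith((".xls", ".xlsx")):
        return {"skipped": key}

    # Read the whole object into memory and hand pandas a seekable buffer.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    df = pd.read_excel(io.BytesIO(body), header=5, usecols="A,C:M,U")
    df = df.iloc[:-1].fillna("")  # drop the last row, blank out NaN values

    # Build the tab-separated output in memory, then upload it explicitly.
    buf = io.StringIO()
    df.to_csv(buf, sep="\t", index=False)
    out_key = csv_key_for(key)
    s3.put_object(Bucket=bucket, Key=out_key, Body=buf.getvalue().encode("utf-8"))
    return {"written": out_key}
```

Uploading with `put_object` avoids relying on pandas to resolve an `s3://` path, which needs an additional package (s3fs) in the deployment. Because the output goes to the same bucket, the suffix check (or a suffix filter on the trigger itself) keeps the function from being invoked in a loop by its own output.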

Possible duplicate of [Python: Pandas pd.read\_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support](https://stackoverflow.com/questions/48066517/python-pandas-pd-read-excel-giving-importerror-install-xlrd-0-9-0-for-excel) — petezurich, May 23 '19 at 05:44
I suggest you try this out locally and then package the libraries that work in your lambda along with your code — Ninad Gaikwad, May 23 '19 at 06:05
@petezurich i am running this on lambda, and when i import xlrd, it is throwing syntax error — Tejas, May 23 '19 at 08:59
@Tejas This is to be expected since you haven't installed xlrd for your lambda function. See Ninad Gaikwad comment. You need to package the libraries that aren't installed by default and upload these with your code (e.g. as a zip File). — petezurich, May 23 '19 at 09:32
@NinadGaikwad, tried packing libraries (including xlrd) with the code in aws lambda, but still gives ` No module named 'xlrd'` — pc_pyr, Aug 31 '20 at 12:14
@pc_pyr You should package the xlrd library separately in a layer and then attach that layer to your lambda. Also make sure to run the pip install command on a linux machine for best compatibility. Certain libraries are different on windows and linux. — Ninad Gaikwad, Aug 31 '20 at 13:38
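
To check whether a packaged library or layer is actually visible to the function, a small diagnostic along these lines may help. This is only a sketch: the helper name `missing_modules` and the list of package names are assumptions chosen for illustration, and the handler is meant to be deployed temporarily in place of the real one.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def lambda_handler(event, context):
    # Hypothetical diagnostic handler: report which packages the runtime can see.
    wanted = ["pandas", "numpy", "xlrd"]
    return {"missing": missing_modules(wanted)}
```

If `xlrd` shows up as missing even though a layer is attached, one thing to check is the layout of the layer zip: for Python runtimes the packages are expected under a top-level `python/` directory.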
