
I have an AWS Lambda function that is triggered whenever a CSV file is uploaded to an S3 bucket. I am using the Serverless Framework with Python 3.6, and I am getting this error message:

a bytes-like object is required, not 'str': TypeError

Traceback (most recent call last):
  File "/var/task/handler.py", line 33, in csvfile
    fichier = obj['Body'].read().split('\n')
TypeError: a bytes-like object is required, not 'str'

I have done some research on the net; the issue is that I am not using the open() method, because the file is read via the S3 event, so I don't know how to fix it.

Here is my code:

import logging
import boto3
from nvd3 import pieChart
import sys
import csv


xdata = []
ydata = []
xdata1 = []
ydata1 = []


logger = logging.getLogger()
logger.setLevel(logging.INFO)

def csvfile(event, context):

    s3 = boto3.client('s3')    
    # retrieve bucket name and file_key from the S3 event
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    file_key = event['Records'][0]['s3']['object']['key']
    logger.info('Reading {} from {}'.format(file_key, bucket_name))
    # get the object
    obj = s3.get_object(Bucket=bucket_name, Key=file_key)
    # get lines inside the csv
    fichier = obj['Body'].read().split('\n')
    # print lines
    for ligne in fichier:
        if len(ligne) > 1:
            logger.info(ligne.decode())
            liste = ligne.split(',')
            print(liste)
            if liste[2] == 'ByCateg':
                xdata.append(liste[4])
                ydata.append(liste[1])
            elif liste[2] == 'ByTypes':
                xdata1.append(liste[4])
                ydata1.append(liste[1])

    print(' '.join(xdata))

print('Function execution Completed')
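(For reference, the handler only pulls the bucket name and object key out of the S3 event, so it can be exercised locally with a hand-built event of the same shape - the bucket name is taken from the serverless.yml below, and the key is a made-up example value:)

fake_event = {
    'Records': [{
        's3': {
            'bucket': {'name': 'car2'},
            'object': {'key': 'report.csv'},
        }
    }]
}
csvfile(fake_event, None)  # context is not used by the handler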

And here is my serverless.yml code:

service: aws-python # NOTE: update this with your service name

provider:
  name: aws
  runtime: python3.6
  stage: dev
  region: us-east-1
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:*"
        - "ses:SendEmail"
        - "ses:SendRawEmail"
        - "s3:PutBucketNotification"
      Resource: "*"

functions:
  csvfile:
    handler: handler.csvfile
    description: send mail whenever a csv file is uploaded on S3
    events:
      - s3:
          bucket: car2
          event: s3:ObjectCreated:*
          rules:
            - suffix: .csv
– ner
1 Answer


The problem is that

fichier = obj['Body'].read()

returns a bytes object, not a string. This is because S3 hands you the raw bytes of the object - a text encoding such as UTF-8 may use more than one byte per character, so decoding is left to you. You are then calling split on a bytes object, but you can't split it with a string; you need to split it with another bytes object. Specifically,

fichier = obj['Body'].read().split(b'\n')

should fix your error, but depending on what you're expecting, decoding before the split may be more appropriate:

fichier = obj['Body'].read().decode("utf-8").split('\n')
– kabanus
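For example, a minimal sketch of the handler's core with this decode-then-split fix slotted in (assuming the same event shape and the xdata/ydata globals from the question; note the later ligne.decode() has to go, because the lines are already str at that point):

obj = s3.get_object(Bucket=bucket_name, Key=file_key)
# decode the raw bytes once, then work with str everywhere below
fichier = obj['Body'].read().decode('utf-8').split('\n')
for ligne in fichier:
    if len(ligne) > 1:
        logger.info(ligne)          # ligne is already str, no .decode() needed
        liste = ligne.split(',')    # str.split with a str separator is fine now
        if liste[2] == 'ByCateg':
            xdata.append(liste[4])
            ydata.append(liste[1])

Since the question already imports csv, csv.reader(fichier) would also parse the decoded lines and, unlike a plain split(','), handle quoted fields correctly.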
  • I already tried this before; I am getting this error message: 'str' object has no attribute 'decode': AttributeError Traceback (most recent call last): File "/var/task/handler.py", line 38, in csvfile logger.info(ligne.decode()) AttributeError: 'str' object has no attribute 'decode' – ner Nov 14 '17 at 10:24
    That's a different problem. You converted `fichier` to str with that last decode, and I'm guessing somewhere else in your code you passed it on as-is where a bytes object is expected - I'm not going to debug for you; the stack trace should help. You can also just stay with bytes throughout: use the first version, and use `bytes` everywhere (a sketch of this bytes-only variant follows the comments). – kabanus Nov 14 '17 at 10:27
  • @kabanus, this solution worked for me. I was reading data from an S3 file and getting an error when splitting it, because it was in bytes. – AbhinavVaidya8 Apr 04 '20 at 12:46
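A minimal sketch of the bytes-only variant described in the comments above, slotted into the loop from the question (assuming obj, logger, and the xdata/ydata lists as defined there): every separator, comparison, and literal stays bytes, and .decode() is only used for display.

fichier = obj['Body'].read().split(b'\n')    # split bytes with a bytes separator
for ligne in fichier:
    if len(ligne) > 1:
        logger.info(ligne.decode())          # ligne is bytes here, so .decode() works
        liste = ligne.split(b',')            # bytes.split also needs a bytes separator
        if liste[2] == b'ByCateg':           # compare against bytes literals
            xdata.append(liste[4])
            ydata.append(liste[1])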