Here is the Python code in my Lambda layer. Shout out to John R for some of the paginator code. From API Gateway I pass in a path parameter (bucket) and query string parameters (fmt and date), such as:
https://3snk9o61.execute-api.us-east-1.amazonaws.com/v1/br-candles?fmt=json&date=today
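With Lambda proxy integration, those parameters should land in the handler's event roughly like this (a sketch using the values from the URL above and the bucket name from my error message):

test_event = {
    "pathParameters": {"bucket": "br-candles"},
    "queryStringParameters": {"fmt": "json", "date": "today"},
}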
This code is probably overly convoluted, but it works. My problem is with this line:

raw_df = wr.s3.read_csv(path1, path2, use_threads=True)

The commented-out line above it is the original and works fine, but I don't want to parse the whole bucket's contents; I want the DataFrame limited to just the specific objects defined in object_list. The error I get, "no files found on s3://br-candles/br4.csv", implies that it isn't seeing multiple files: it finds only the first file, when it's supposed to parse a list of files. Probably a very simple fix, but I'd appreciate any advice. Thanks.
import json
import base64

import awswrangler as wr
import boto3


def lambda_handler(event, context):
    s3 = boto3.client('s3')
    object_list = []

    # Path and query string parameters passed in from API Gateway
    bucket_name = event['pathParameters']['bucket']
    format = event['queryStringParameters']['fmt']  # note: shadows the built-in format()
    day = event['queryStringParameters']['date']
    print(day)

    # Collect every .csv key in the bucket (John R's paginator code)
    paginator = s3.get_paginator("list_objects_v2")
    page_iterator = paginator.paginate(Bucket=bucket_name)
    for result in page_iterator:
        object_list += filter(lambda obj: obj['Key'].endswith('.csv'), result['Contents'])

    # Sort by LastModified; grab the newest and the fourth-newest keys
    object_list.sort(key=lambda x: x['LastModified'])
    A = object_list[-1]['Key']
    B = object_list[-4]['Key']

    full_path = f"s3://{bucket_name}"
    path1 = f"s3://{bucket_name}/{A}"
    path2 = f"s3://{bucket_name}/{B}"

    # Original line -- works, but parses the whole bucket:
    # raw_df = wr.s3.read_csv(path=full_path, path_suffix=['.csv'], use_threads=True)
    raw_df = wr.s3.read_csv(path1, path2, use_threads=True)  # <-- the failing line

    for df in raw_df:
        if day == 'today':
            # etc. etc. -- no issues below this point
            ...
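Update: I think I see what's going on, though I'd appreciate confirmation. In awswrangler, the second positional parameter of wr.s3.read_csv is path_suffix, not a second path, so path2 is apparently being consumed as a suffix filter and only path1 gets searched. If that's right, passing both keys as a list should limit the read to exactly those objects; a minimal sketch:

# Read just the two specific objects by passing a list of S3 paths.
# (Positional path2 was being swallowed by the path_suffix parameter.)
raw_df = wr.s3.read_csv(path=[path1, path2], use_threads=True)

Note that with a plain list and no chunking this returns one concatenated DataFrame; if the for df in raw_df loop needs a separate frame per file, my reading of the docs is that chunked=True returns one DataFrame per file.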