I create a dataframe in a Jupyter Notebook instance (AWS Sagemaker) by connecting to an AWS Athena table using a SQL connection like the example below:
I have made a paramter per the link (https://pypi.org/project/pyathena/#query-with-parameter). The parameter is allowing me to insert a single month into the query but fails if I try multiple months - can someone please assist me with how you can insert multiple months into the dictionary :
from pandasql import sqldf
from pyathena import connect
from pyathena.pandas.cursor import PandasCursor
curr_month = 202206
prev_month = 202203
conn = connect(work_group='prod_user',
region_name= 'eu-west-1')
sql_query = ('''
select
year_month,
id,
field_1,
field_2,
field_3
from
mart.table_xyz
WHERE
year_month = (%(param)s)
;
''')
cursor = connect(work_group='prod_user',
region_name= 'eu-west-1',
cursor_class=PandasCursor).cursor()
user_df = cursor.execute(sql_query,{"param": curr_month}).as_pandas()
I have tried a list like below to return two months but that causes an error. Any help would be appreciated
cursor = connect(work_group='prod_user',
region_name= 'eu-west-1',
cursor_class=PandasCursor).cursor()
user_df = cursor.execute(sql_query,{"param": [curr_month,prev_month]}).as_pandas()