Questions tagged [pyathena]

50 questions
2
votes
2 answers

Query a table/database in Athena from a Notebook instance

I have developed different Athena Workgroups for different teams so that I can separate their queries and their query results. The users would like to query the tables available to them from their notebook instances (JupyterLab). I am having…
1
vote
1 answer

SQL Datetime WHERE clause returning wrong month

I am extracting data from AWS Athena using the pyathena library and the following function: def import_ben_datalake(ACCESS_KEY, SECRET_KEY, S3_DIR, REGION, start, end): conn = pyathena.connect(aws_access_key_id = ACCESS_KEY, …
danimille
  • 350
  • 1
  • 12
1
vote
0 answers

Pyathena error: None type has no attribute get

I'm writing a Python script to import data from an AWS S3 bucket. On some machines, it fails with "None type has no attribute get". While debugging pyathena, I found that this error comes from a lambda expression in the util.py file, in the following…
1
vote
2 answers

Choosing data catalog in pyathena?

I'm trying to use pyathena (which looks simpler than the native boto3) to run some queries. However, I wasn't able to find out how to define which data catalog to use. For example, the query execution using boto3: athena_client =…
Nir99
  • 185
  • 3
  • 15
1
vote
2 answers

Unable to read data from AWS Glue Database/Tables using Python

My requirement is to use a Python script to read data from an AWS Glue database into a dataframe. When I researched, I found the library "awswrangler". I'm using the code below to connect and read data: import awswrangler as wr profile_name =…
PKV
  • 167
  • 3
  • 13
1
vote
1 answer

TypeError: No matching overloads found for java.util.Properties.setProperty(str,str)

I was trying to connect to an Athena database with PyAthenaJDBC. I was looking for some information about how to do this and I tried this code: import contextlib from urllib.parse import quote_plus # PY2: from urllib import quote_plus from…
1
vote
0 answers

Cannot query AWS Athena from Google Colaboratory

I want to run a query against AWS Athena with pyathena on Google Colaboratory, but it raises NoCredentialsError: Unable to locate credentials. Since the same code succeeds in SageMaker, I think the code and user…
michi kan
  • 51
  • 4
1
vote
2 answers

Store Amazon Athena Query Results into new Table

I need to store Amazon Athena query results in a new Amazon Athena table.
Tajinder
  • 2,248
  • 4
  • 33
  • 54
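
One common way to approach the question above is Athena's CREATE TABLE AS SELECT (CTAS) statement. The following is a minimal sketch, not a definitive answer: the helper `build_ctas`, the table name, the bucket and the sample query are all hypothetical, and the resulting string would still have to be submitted through whatever client is in use (for example a pyathena cursor).

```python
def build_ctas(new_table: str, select_sql: str, external_location: str,
               fmt: str = "PARQUET") -> str:
    """Wrap a SELECT in an Athena CREATE TABLE AS SELECT statement.

    All arguments are illustrative; nothing here is validated or escaped,
    so only pass trusted identifiers and locations.
    """
    # Drop surrounding whitespace and a trailing semicolon so the SELECT
    # can be embedded inside the larger statement.
    select_sql = select_sql.strip().rstrip(";").rstrip()
    return (
        f"CREATE TABLE {new_table}\n"
        f"WITH (format = '{fmt}', external_location = '{external_location}')\n"
        f"AS {select_sql}"
    )


# Hypothetical names, for illustration only.
statement = build_ctas(
    "analytics.daily_totals",
    "SELECT day, count(*) AS n FROM analytics.events GROUP BY day;",
    "s3://example-bucket/daily_totals/",
)
print(statement)
```

With a live connection, the string could then be passed to something like `cursor.execute(statement)`; that step is omitted here because it needs AWS credentials.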
0
votes
0 answers

Sagemaker Spark Session - Import aws athena table

I was hoping to import an AWS Athena database table within a Spark session. I have previously set up Notebook instances and used the pyathena library to connect to the Athena table and then work with Pandas dataframes. However, I would like to use PySpark…
0
votes
0 answers

Running Great Expectations in AWS Athena

Hello, I need some help running GX on AWS Athena. Here is my config: conn_str = f"awsathena+rest://:@athena.{region_name}.amazonaws.com/{schema_name}?s3_staging_dir={s3_staging_dir}" data = context.sources.add_sql( name="props",…
Muhammad Raihan Muhaimin
  • 5,559
  • 7
  • 47
  • 68
0
votes
0 answers

Getting the following error: AttributeError: 'Engine' object has no attribute '_instantiate_plugins', when connecting langchain.SQLDatabase to AWS Athena

I'm trying to connect LangChain to AWS Athena and am facing the following error: from langchain import SQLDatabase, SQLDatabaseChain from sqlalchemy import…
0
votes
1 answer

In PyAthena, can you insert multiple values (dates) into a SQL parameter?

I create a dataframe in a Jupyter Notebook instance (AWS SageMaker) by connecting to an AWS Athena table using a SQL connection like the example below: I have made a parameter per the link (https://pypi.org/project/pyathena/#query-with-parameter).…
Patty
  • 41
  • 1
  • 7
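
For the multiple-dates question above, one possible approach with the `%(name)s` (pyformat) parameter style shown at the linked page is to generate one named placeholder per value. This is only a sketch under that assumption: `expand_in_clause`, the table `my_table` and the column `event_date` are hypothetical names, and the code only builds the SQL text and the parameter dictionary without contacting Athena.

```python
from datetime import date


def expand_in_clause(prefix, values):
    """Return ("%(p0)s, %(p1)s, ...", {"p0": v0, "p1": v1, ...}).

    One named pyformat placeholder is produced per value, so each value
    is still passed as a parameter instead of being pasted into the SQL.
    """
    values = list(values)
    if not values:
        # "IN ()" is not valid SQL, so refuse an empty list up front.
        raise ValueError("need at least one value for an IN clause")
    params = {f"{prefix}{i}": v for i, v in enumerate(values)}
    placeholders = ", ".join(f"%({name})s" for name in params)
    return placeholders, params


# Hypothetical table and column names, for illustration only.
wanted = [date(2023, 1, 1), date(2023, 2, 1), date(2023, 3, 1)]
placeholders, params = expand_in_clause("d", wanted)
sql = f"SELECT * FROM my_table WHERE event_date IN ({placeholders})"
print(sql)
print(params)
```

The pair could then be handed to a DB-API style call such as `cursor.execute(sql, params)` or `pandas.read_sql(sql, conn, params=params)`; how each date is rendered in the final query depends on the driver's parameter formatter.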
0
votes
0 answers

PyAthena parses an ARRAY column correctly but the result is a string

Starting from a single column abc of type ARRAY with one row: SELECT ARRAY ['a', 'b', 'c'] AS abc If we execute the query with pyathena using the ArrowCursor: cursor = pyathena.connect(**AWS_PARAMETERS).cursor(ArrowCursor) execution_result…
Filippo Vitale
  • 7,597
  • 3
  • 58
  • 64
0
votes
1 answer

ModuleNotFoundError: No module named 'pyathena' when running AWS Glue Job

Despite setting the parameter for my Python AWS Glue Job like this: --additional-python-modules pyathena I still get the following error when I try to run the job: ModuleNotFoundError: No module named 'pyathena' I have also tried the following…
ChrisDanger
  • 1,071
  • 11
  • 10
0
votes
0 answers

Running synchronized aws athena queries

As part of benchmarking AWS Athena vs. serverless Redshift, I'm working on a load test script based on Locust so that I can compare the results later. I'm using Python 3.10.4. When I started working on the Athena client I'll use, I noticed that the…
mango
  • 1
  • 2