Questions tagged [aws-data-wrangler]

AWS Data Wrangler offers abstracted functions to perform common ETL tasks such as loading and unloading data from data lakes, data warehouses, and databases. It integrates with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server and S3 (Parquet, CSV, JSON and Excel).

Project: awswrangler · PyPI

69 questions
0
votes
1 answer

wr.redshift.to_sql failed in AWS Data Wrangler 2.12.1

awswrangler 2.12.1: I am able to write data.head() into the db, but get an error when trying to write all the data. The data was copied from another table and cleaned before to_sql. I also did data =…
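When `to_sql` works for `data.head()` but fails on the full frame, a common workaround is to write in smaller batches. A minimal sketch, assuming a hypothetical `chunk_frame` helper; the Glue connection name, table, and schema below are placeholders:

```python
import pandas as pd

def chunk_frame(df: pd.DataFrame, rows_per_chunk: int):
    """Yield successive row slices so each to_sql call stays small."""
    for start in range(0, len(df), rows_per_chunk):
        yield df.iloc[start:start + rows_per_chunk]

# Hypothetical usage against Redshift (names are placeholders):
# import awswrangler as wr
# con = wr.redshift.connect("my-glue-connection")
# for chunk in chunk_frame(data, 10_000):
#     wr.redshift.to_sql(df=chunk, con=con, table="my_table",
#                        schema="public", mode="append")
# con.close()

demo = pd.DataFrame({"x": range(25)})
sizes = [len(c) for c in chunk_frame(demo, 10)]
```

For very large frames, `wr.redshift.copy` (which stages data through S3 and issues a Redshift COPY) is usually more robust than row-wise `to_sql`.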
0
votes
1 answer

How to Use SSL Verify with AWS Wrangler

AWS Wrangler provides a convenient interface for consuming S3 objects as pandas dataframes. I want to use this instead of boto3 clients, resources, or sessions when getting objects. I also need to use SSL verification. The following boto3 client…
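Since awswrangler talks to S3 through boto3/botocore, one way to control SSL verification is the `AWS_CA_BUNDLE` environment variable, which botocore honors for all clients. A minimal sketch; the CA bundle path is a placeholder assumption:

```python
import os

# Point boto3/botocore (used under the hood by awswrangler) at a
# custom CA bundle; the path here is a placeholder.
os.environ["AWS_CA_BUNDLE"] = "/etc/ssl/certs/my-corp-ca.pem"

# Hypothetical usage: clients created after this honor the bundle.
# import boto3, awswrangler as wr
# session = boto3.Session()
# df = wr.s3.read_parquet("s3://bucket/key.parquet", boto3_session=session)
```

Setting the variable before the first client is created matters, because botocore reads it at client construction time.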
Wassadamo
  • 1,176
  • 12
  • 32
0
votes
1 answer

awswrangler: Can't start a new thread when trying to read table

I am trying to access a table in an AWS bucket. When I try to access it using awswrangler.read_parquet function I get an error saying that I am not able to access that file because I can't create new threads. I am usually able to access that file…
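`wr.s3.read_parquet` accepts a `use_threads` argument; passing `False` avoids spawning worker threads on hosts with tight thread limits. A sketch of the documented semantics via a hypothetical `effective_threads` helper (the helper itself is illustrative, not part of awswrangler):

```python
import os

def effective_threads(use_threads):
    """Mirror the documented use_threads semantics: False -> 1 worker,
    True -> os.cpu_count(), an int -> that many workers."""
    if use_threads is False:
        return 1
    if use_threads is True:
        return os.cpu_count() or 1
    return int(use_threads)

# Hypothetical call avoiding "can't start new thread" on constrained hosts:
# import awswrangler as wr
# df = wr.s3.read_parquet("s3://bucket/table/", dataset=True, use_threads=False)
```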
Miss.Saturn
  • 155
  • 2
  • 13
0
votes
1 answer

Visual Studio doesn't show help pop up with DataFrame from awswrangler

I am using VS Code with the Microsoft Python extension. If I create a Pandas dataframe and type the variable's name, VS Code pops up all kinds of help text. However, if I have a variable made using wr.athena.read_sql_query, I don't get any help text…
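When the language server cannot infer a function's return type, annotating the variable usually restores the pop-up help. A minimal sketch with a stand-in function; the names are illustrative, not awswrangler's:

```python
from typing import Any
import pandas as pd

def read_sql_query(sql: str) -> Any:
    """Stand-in for wr.athena.read_sql_query, whose return type the
    extension cannot infer here."""
    return pd.DataFrame({"x": [1, 2, 3]})

# Annotating the variable tells Pylance/IntelliSense it is a DataFrame:
df: pd.DataFrame = read_sql_query("SELECT * FROM my_table")
```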
0
votes
1 answer

Store Parquet files (in AWS S3) into a Spark dataframe using PySpark

I'm trying to read data from a specific folder in my s3 bucket. This data is in parquet format. To do that I'm using awswrangler: import awswrangler as wr # read data data = wr.s3.read_parquet("s3://bucket-name/folder/with/parquet/files/", dataset…
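`wr.s3.read_parquet` returns a pandas DataFrame, not a Spark one; to get a Spark DataFrame it is usually simpler to read the files with Spark directly. One wrinkle is that Spark's Hadoop S3 connector expects the `s3a://` scheme rather than `s3://`. A sketch with a hypothetical `to_s3a` helper:

```python
def to_s3a(path: str) -> str:
    """Spark's hadoop-aws connector expects s3a://, while
    awswrangler/boto3 use s3://; rewrite the scheme if present."""
    return "s3a://" + path[len("s3://"):] if path.startswith("s3://") else path

# Hypothetical Spark usage (the cluster must have the hadoop-aws jars):
# spark_df = spark.read.parquet(to_s3a("s3://bucket-name/folder/with/parquet/files/"))
```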
0
votes
0 answers

Why is s3.to_parquet switching data types on publish to AWS Glue?

I'm creating a dataframe like so: concatdatafile = pd.concat(datafile, axis=0, ignore_index=True, sort=False) then checking some of the field data types before publish: logger.info(" *** concatdatafile['FS Seal Time…
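A frequent cause of "switched" types is the `pd.concat` step itself: if any input frame is missing a column, the gaps become NaN, which silently promotes int64 to float64 before awswrangler ever sees the data. A small local demonstration, plus a hedged sketch of pinning the Glue type (table and column names below are placeholders):

```python
import pandas as pd

a = pd.DataFrame({"seal_time": [1, 2]})   # int64 column
b = pd.DataFrame({"other": [3.0]})        # no seal_time column
combined = pd.concat([a, b], axis=0, ignore_index=True, sort=False)
# Missing rows become NaN, which silently promotes int64 -> float64:
promoted = str(combined["seal_time"].dtype)

# Hypothetical fix: pin the Glue/Athena type explicitly on publish:
# import awswrangler as wr
# wr.s3.to_parquet(df=combined, path="s3://bucket/prefix/", dataset=True,
#                  database="my_db", table="my_table",
#                  dtype={"seal_time": "bigint"})
```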
0
votes
1 answer

Unable to properly install awswrangler on conda python 3.8 env (Connection Issue)

Here is the process I've followed so far. Create Env: conda create -n py38 python=3.6 anaconda Install awswrangler: conda install -c conda-forge awswrangler When I try to import it into my notebook, I get the following…
Madhav Thaker
  • 360
  • 2
  • 12
0
votes
1 answer

Having trouble installing the most recent version of awswrangler on conda environment

My current conda environment is running python 3.8.5. When I look at their documentation, it shows that the newest version is 2.5.0. For some reason, when I initially installed it via conda install -c conda-forge awswrangler, it installed version…
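Both conda questions above share the same likely pitfall: the interpreter version in the env does not match the one the latest conda-forge build was resolved against (the first question even creates an env named `py38` with `python=3.6`). A hedged sketch of a clean install pinning both versions; 2.5.0 is the version the question mentions:

```shell
# Create the env with the interpreter you actually intend to use
conda create -n py38 python=3.8 -y
conda activate py38
# Pin the awswrangler version so an older cached build isn't resolved
conda install -c conda-forge awswrangler=2.5.0 -y
python -c "import awswrangler as wr; print(wr.__version__)"
```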
Madhav Thaker
  • 360
  • 2
  • 12
0
votes
1 answer

Pandas merge two DF with rows replacement

I am facing an issue merging two DFs into one while keeping the second DF's rows wherever the same id appears in both. Example: df1 = pd.DataFrame({ 'id': ['id1', 'id2', 'id3', 'id4'], 'com': [134.6, 223, 0, 123], 'malicious': [False, False, True,…
4d61726b
  • 427
  • 5
  • 26