Amazon S3 Select enables applications to retrieve only a subset of data from an Amazon S3 object by using simple SQL expressions.
Questions tagged [amazon-s3-select]
91 questions
0
votes
1 answer
Empty Column not being listed in S3 select in databricks
I'm querying a JSON file in S3 with multiple columns:
SELECT a, b, c FROM json.`s3://my-bucket/file.json.gz`
And the file looks like this:
{a: {}, b: 0, c: 1}
{a: {}, b: 1, c: 2}
{a: {}, b: 2, c: 3}
The query above fails and…

tty
- 95
- 1
- 5
0
votes
0 answers
Amazon S3 Select: getting truncated data when querying large JSON object
I have a big json document stored in S3 with a structure like this:
{ "result": {
"id": "123",
"commits": ["comm1", "comm2", ..., "commN"]
}
}
There are other fields there too and the number of commits can go into thousands.
When I…

Juraj Martinka
- 3,991
- 2
- 23
- 25
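One thing worth ruling out when S3 Select output looks truncated (an assumption about the cause, since the question is cut off): the response is an event stream, and a single record can be split across several Records events. A minimal sketch that concatenates every payload before decoding is shown below. The event dictionaries mimic the shape boto3 yields from `response["Payload"]`; the sample events are made up for illustration.

```python
import json


def collect_records(events):
    """Concatenate every Records payload, then decode once at the end."""
    buffer = bytearray()
    saw_end = False
    for event in events:
        if "Records" in event:
            buffer.extend(event["Records"]["Payload"])
        elif "End" in event:
            saw_end = True
    if not saw_end:
        # Without an End event the result should be treated as incomplete.
        raise RuntimeError("stream ended without an End event")
    return buffer.decode("utf-8")


# Hypothetical events: one JSON record split across two chunks.
fake_events = [
    {"Records": {"Payload": b'{"id": "123", "commits": ["comm1", '}},
    {"Records": {"Payload": b'"comm2"]}\n'}},
    {"End": {}},
]
result = json.loads(collect_records(fake_events))
```

Decoding or parsing each chunk on its own would fail or drop data here, because neither chunk is valid JSON by itself.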
0
votes
0 answers
S3 select aws console
I have the following data in a file:
{"new_date":"2022-06-09","code":34,"value":33,"id":18}
{"new_date":"2022-06-09","code":34,"value":36,"id":19}
{"new_date":"2022-06-09","code":34,"value":35,"id":15}
In the AWS console I'm trying to execute the…

user1555190
- 2,803
- 8
- 47
- 80
0
votes
2 answers
botocore.exceptions.ClientError: An error occurred (ParseSelectMissingFrom) when calling the SelectObjectContent operation: Missing FROM after SELECT
Goal: Use S3 Select to extract columns from a .parquet on S3.
I've tried various queries. Including the key in the query makes no difference.
Code:
s3 = boto3.client('s3')
s3_uri = 's3://my-bucket/my-folder/'
bucket, prefix =…

DanielBell99
- 896
- 5
- 25
- 57
0
votes
2 answers
Getting partial json response for s3select with aws java sdk v2
I am trying to implement S3 Select in a Spring Boot app to query a Parquet file in an S3 bucket, but I am only getting a partial result from the S3 Select output. Please help me identify the issue; I am using the AWS Java SDK v2.
Upon checking the json…

Barani
- 33
- 6
0
votes
1 answer
How can we search the json file based on date field using S3 select
I am trying to query data in a JSON file using S3 Select. I am unable to filter based on the date field. I have tried using current_date, sysdate and a few CAST options too. I am planning to compute the current date and send it from the API. Before that, I…

Ashwin Kumar
- 101
- 1
- 1
- 9
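If the date is computed on the client and sent with the request, as the question plans to do, the expression could be built roughly like this. This is only a sketch: the field name `new_date` is a placeholder, and it assumes the field holds ISO-8601 date strings (YYYY-MM-DD) so that a plain string comparison works.

```python
from datetime import date


def date_filter_expression(field, day):
    """Build an S3 Select expression that matches one ISO-formatted date."""
    if not field.isidentifier():
        # Guard against injecting arbitrary SQL through the field name.
        raise ValueError(f"unexpected field name: {field!r}")
    return f"SELECT * FROM S3Object s WHERE s.{field} = '{day.isoformat()}'"


# Hypothetical field name; pass date.today() to filter on the current date.
expression = date_filter_expression("new_date", date(2022, 6, 9))
```

The resulting string can be passed as the `Expression` argument of a SelectObjectContent request.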
0
votes
1 answer
Getting OverMaxRecordSize when querying through S3 select
I'm getting the following error while running any query and trying to fetch any record: The character number in one record is more than our max threshold, maxCharsPerRecord: 1,048,576.
I've tried changing from JSON schema to CSV but that hasn't worked.…

fvnbab
- 1
- 1
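To see which records trip the limit quoted in the error message, one option is to scan a local copy of the file. The sketch below assumes newline-delimited records; if the object is a single JSON document on one line, or uses a different record delimiter, the whole file may count as one record.

```python
MAX_CHARS_PER_RECORD = 1_048_576  # value taken from the error message


def oversized_records(lines, limit=MAX_CHARS_PER_RECORD):
    """Return (line_number, length) for every record longer than the limit."""
    found = []
    for number, line in enumerate(lines, start=1):
        length = len(line.rstrip("\r\n"))
        if length > limit:
            found.append((number, length))
    return found


# Made-up sample: the second record is one character over the limit.
sample = ["a,b,c\n", "x" * (MAX_CHARS_PER_RECORD + 1) + "\n", "1,2,3\n"]
found = oversized_records(sample)
```

With a real file, pass the open file object instead of the sample list, for example `oversized_records(open("local-copy.csv", encoding="utf-8"))`, where the file name is a placeholder.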
0
votes
1 answer
AWS SDK2 java s3 select example - how to get result bytes
I am trying to use the AWS SDK for Java v2 for S3 Select operations but am not able to extract the final data. I'm looking for an example if someone has implemented it. I got some idea from [this post][1] but am not able to figure out how to get and read the full…

user1805280
- 251
- 1
- 5
- 14
0
votes
0 answers
AWS S3 Select CSV WHERE filtering not working on last column
I have a CSV file in S3 that looks like this:
name,status,age,loc
aaa,aaa,1,zz
bbb,bbb,2,yy
ccc,,3,pp
ddd,ddd,4,aaa
SELECT * FROM s3object s WHERE name ='aaa'
This query returns first row correctly.
SELECT * FROM s3object s WHERE loc ='aaa'
This…

lclankyo
- 221
- 3
- 10
0
votes
1 answer
how to read json.snappy file from athena
I have an input file in an S3 bucket with .json.snappy compression and I am trying to read it through an Athena table. I tried using different SerDes, 'org.apache.hive.hcatalog.data.JsonSerDe' and 'org.openx.data.jsonserde.JsonSerDe', but it didn't work, Athena…

PB22
- 31
- 4
0
votes
0 answers
using s3 select I need to query JSON file. need some examples code snippets
Using S3 Select, I need to query a JSON file. I need some example code snippets using boto3.
Thanks in advance
sundar

Sundar
- 95
- 1
- 13
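A minimal boto3 sketch for this kind of query might look like the following. The bucket, key, and expression are placeholders, and `{"Type": "LINES"}` assumes the file has one JSON object per line. `boto3` is imported inside `query_json` so the request builder can be tried without AWS access.

```python
import json


def build_select_request(bucket, key, expression):
    """Build keyword arguments for the S3 client's select_object_content call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Assumption: newline-delimited JSON; use "DOCUMENT" for one JSON document.
        "InputSerialization": {"JSON": {"Type": "LINES"}, "CompressionType": "NONE"},
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
    }


def query_json(bucket, key, expression):
    """Run the query and return the matching records as Python dicts."""
    import boto3  # requires AWS credentials to be configured

    s3 = boto3.client("s3")
    response = s3.select_object_content(**build_select_request(bucket, key, expression))
    chunks = []
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # Join all chunks before decoding; a record may span two events.
    text = b"".join(chunks).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


# Placeholder names, for illustration only.
request = build_select_request("my-bucket", "data.jsonl", "SELECT s.id FROM S3Object s")
```

Calling `query_json("my-bucket", "data.jsonl", "SELECT s.id FROM S3Object s")` would then return a list of dicts, one per matching record.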
0
votes
0 answers
S3 Select (python) does not return the header when using WHERE clause despite FileHeaderInfo=NONE
When I submit this query:
SELECT * FROM s3object with the FileHeaderInfo of the input serialization set to NONE, I get the records expected with their header.
As soon as I add a where clause like this:
SELECT * FROM s3object WHERE _3 =…

Jeff Saremi
- 2,674
- 3
- 33
- 57
0
votes
1 answer
ClientError: An error occurred (InvalidTextEncoding) when calling the SelectObjectContent operation: UTF-8 encoding is required. reading gzip file
I am getting the above error in my code. I think encoding=latin-1 needs to be included as a parameter somewhere in select-object-content, but since I am new to this, I am not sure where to add it.
Can anyone help me with this?
Code:
client =…

Beginner
- 143
- 1
- 12
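As the error message says, S3 Select requires UTF-8 input, and as far as I can tell the input serialization options do not include an encoding setting. One possible workaround is to re-encode the object before uploading it. The sketch below assumes the source is gzip-compressed Latin-1 text; the sample data is made up.

```python
import gzip


def transcode_gzip_to_utf8(data, source_encoding="latin-1"):
    """Decompress, re-encode the text as UTF-8, and compress again."""
    text = gzip.decompress(data).decode(source_encoding)
    return gzip.compress(text.encode("utf-8"))


# Made-up sample: a small Latin-1 CSV, gzip-compressed.
original = gzip.compress("name,city\nJosé,Málaga\n".encode("latin-1"))
converted = transcode_gzip_to_utf8(original)
```

The converted bytes can be uploaded in place of the original object and queried with `CompressionType` set to `GZIP`.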
0
votes
1 answer
JS : Getting datatype from CSV using Amazon S3 Select
I am trying to read a CSV from an Amazon S3 bucket (this could be any CSV, so I do not have the header/datatype info ahead of the read).
I am able to get the header info using:
const params = {
Bucket: 'mybucket',
Key: file,
ExpressionType:…

user14013917
- 149
- 1
- 10
0
votes
1 answer
S3-Select Pricing on JSON
I am confused about the S3 select pricing regarding data returned and data scanned. If I want to access something at an index in a json file, does it still scan the entire file and the data scanned counts for the entire file size? Suppose I use the…

Nock
- 3
- 1