Amazon S3 Select enables applications to retrieve only a subset of data from an Amazon S3 object by using simple SQL expressions.
Questions tagged [amazon-s3-select]
91 questions
1
vote
1 answer
Write an S3 Select query to exclude rows containing a carriage return (\r)
I have a CSV column whose data contains the \r character. How can I write a query to eliminate such data?
SELECT rv FROM s3object s
This gives me:
I don't want such rows; I want to eliminate them all.
This query still returns the same results:
SELECT rv…

Denil Parmar
1
vote
1 answer
Getting maxCharsPerRecord: 1,048,576 in S3 in AWS S3 SelectObjectContent
I am fetching records from a JSON file in S3 using S3 Select. Everything works when I fetch data from small JSON files, i.e. around 2 MB (with a record count of around 10,000).
The following is my query:
innerStart = 1
innerStop = 100
maximumLimit = 100
query =…

Dibish
1
vote
0 answers
How to use S3 Select for Nested Parquet Objects
I have dumped data into a Parquet file.
When I use
SELECT * FROM s3object s LIMIT 1
it gives me the following result.
{
"name": "John",
"age": "45",
"country": "USA",
"experience": [{
"company": {
…

Natasha Perera
1
vote
2 answers
javascript - convert string to json array
I was using S3 Select to fetch selected data and display it on my front end.
I converted the array of bytes to a buffer and then to a string, as shown below:
let dataString = Buffer.concat(records).toString('utf8');
The result I got was a string like…

sumit
1
vote
1 answer
Code Optimization on s3 read csv and ingest back to s3 bucket
ddict = defaultdict(set)
file_str = query_csv_s3(s3, BUCKET_NAME, filename, sql_exp, use_header)
# read CSV to dataframe
df = pd.read_csv(StringIO(file_str))
fdf = df.drop_duplicates(subset='cleverTapId',…

Dharmendra Yadav
1
vote
1 answer
AWS S3 Select get data for column with a / in the name
I am trying to use S3 Select to query some data from a CSV file on S3 using the following query:
aws s3api select-object-content \
--bucket \
--key \
--expression "select `lineItem/intervalUsageStart` from s3object limit 100"…
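
A possible cause, offered as a guess rather than a confirmed diagnosis: inside a double-quoted shell string, backticks trigger command substitution, so the column name may never reach S3 Select as written. S3 Select's SQL dialect quotes identifiers with double quotes, usually together with a table alias. The sketch below builds such an expression in Python. The helper names `quote_identifier` and `build_select` are hypothetical, and it assumes the CSV is read with its header row in use.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in double quotes so characters like '/' are kept literally."""
    if '"' in name:
        # Assumption: embedded double quotes are simply rejected here,
        # since how to escape them is not covered by this sketch.
        raise ValueError("column name contains a double quote")
    return f'"{name}"'


def build_select(column: str, limit: int, alias: str = "s") -> str:
    """Build a SELECT expression for one column, qualified by a table alias."""
    return (
        f"SELECT {alias}.{quote_identifier(column)} "
        f"FROM s3object {alias} LIMIT {int(limit)}"
    )


expression = build_select("lineItem/intervalUsageStart", 100)
print(expression)
# SELECT s."lineItem/intervalUsageStart" FROM s3object s LIMIT 100
```

Building the string in Python sidesteps shell quoting entirely; on the command line the same expression would need its double quotes escaped or the whole argument wrapped in single quotes.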

jobin
1
vote
2 answers
AWS S3 selectObjectContent by version ID
Is there a way to run SelectObjectContent (S3 Select) on a specific version of an S3 object using its version ID?
I cannot find any reference in the SelectObjectContent documentation to specifying the version ID, like the version ID field we have in get Object…

Mohit Hapani
1
vote
0 answers
Is it possible to consider the second row as the header for a .csv file in S3 Select?
Is it possible to consider the second row of a .csv file as the headers and skip the first row in S3 Select?
Example:
The structure of my file is as follows:
A B C
a b c d e f
1 2 3 4 5 6
Now I want to skip
A B C
And query on
a b c d e…

Pallav Doshi
1
vote
1 answer
S3 select query not recognizing data
I generate a dataframe, write the dataframe to S3 as a CSV file, and perform a select query on the CSV in the S3 bucket. Based on the query and data I expect to see '4' and '10' printed, but I only see '4'. For some reason S3 is not seeing the '10'.
It…

David Hurley
1
vote
1 answer
AWS S3 Select skips missing values in result set
I'm trying to read a parquet file using S3 Select, but running into issues when the data contains missing values - the results returned from S3 select skip all missing values, making it impossible to parse the output. A reproducible example with…

ytsaig
1
vote
2 answers
S3 Select Python error
I'm trying to get the data from an S3 object. I'm using the S3 Select feature as below:
boto3 version : 1.7.59
import boto3
s3 = boto3.client('s3')
r = s3.select_object_content(
Bucket="bucket",
Key="file.json",
ExpressionType='SQL',
…
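
For context, the response of `select_object_content` carries an event stream under `Payload`, and the selected data arrives in `Records` events as byte chunks. The following is a minimal sketch of reading such a stream, not the asker's code: `collect_records` is a hypothetical helper, and the events are stand-ins shaped like the real ones so the example runs without AWS access.

```python
from typing import Iterable, Mapping


def collect_records(events: Iterable[Mapping]) -> str:
    """Join the 'Records' payloads of a SelectObjectContent event stream into one string."""
    chunks = []
    for event in events:
        if "Records" in event:
            # Each Records event carries a bytes chunk; a chunk may end mid-record,
            # so decode only after all chunks are joined.
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# With boto3 the stream would come from the response, e.g.:
#   text = collect_records(r["Payload"])
# Stand-in events with the same shape (illustrative values only):
fake_events = [
    {"Records": {"Payload": b'{"id": 1}\n{"id"'}},
    {"Stats": {"Details": {"BytesScanned": 24}}},
    {"Records": {"Payload": b': 2}\n'}},
    {"End": {}},
]
print(collect_records(fake_events))
```

Joining the raw bytes before decoding matters because a multi-byte character or a record can be split across two chunks.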

Andres Urrego Angel
1
vote
3 answers
Querying rows by index in S3 Select
With MySQL, the following code:
SELECT * from TABLE limit 5, 10
would skip the first 5 rows and return the next 10 (rows 6 through 15). What is the equivalent for doing this through the SQL engine in S3 Select (PrestoDB I believe)? Is there a rownumber constructor or…

Ajjit Narayanan
1
vote
2 answers
s3 select to pandas Dataframe
I am using S3 Select to read a CSV file and output JSON. Now I want to load the JSON output from S3 Select into a pandas DataFrame. Is it possible to convert S3 Select JSON output to a pandas DataFrame?
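
One way to approach this, sketched under the assumption that the JSON output uses the default newline record delimiter (one JSON object per line): parse each line into a dict, then hand the list to pandas. `parse_json_lines` is a hypothetical helper and the sample text is made up for illustration.

```python
import json


def parse_json_lines(text: str) -> list:
    """Parse newline-delimited JSON (one object per line) into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample of what S3 Select might return with JSON output serialization.
sample = '{"name": "a", "value": "1"}\n{"name": "b", "value": "2"}\n'
rows = parse_json_lines(sample)
print(rows)
# If pandas is installed, the list of dicts can be passed to pandas.DataFrame(rows).
```

Values read from a CSV source come back as strings, so numeric columns would likely need an explicit conversion after loading.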

thotam
1
vote
1 answer
S3 Select with boto3 - internalerror
Has anyone got "S3 Select" (https://aws.amazon.com/blogs/aws/s3-glacier-select/ ,
https://aws.amazon.com/about-aws/whats-new/2018/04/amazon-s3-select-is-now-generally-available/) with boto3 (or even cli or another sdk) working? I am getting…

tooptoop4
0
votes
0 answers
How to use S3 Select for Nested Parquet Objects?
I am getting started with S3 Select and am trying to get the size of an array in the inner Parquet object. The following example is one entry from the Parquet file.
{
"id" : 12,
"date" : "2023-07-06",
"employee": {
"name": "stack…

Dipu