14

The document just says that it is a query service but not explicitly states that it can or cannot perform data update.

If Athena cannot do insert or update, is there any other aws service which can do like a normal DB?

Theo
  • 131,503
  • 21
  • 160
  • 205
kzfid
  • 688
  • 3
  • 10
  • 17

7 Answers7

22

Amazon Athena is, indeed, a query service -- it only allows data to be read from Amazon S3.

One exception, however, is that the results of the query are automatically written to S3. You could, therefore, use a query to generate results that could be used by something else. It's not quite updating data but it is generating data.

My previous attempts to use Athena output in another Athena query didn't work due to problems with the automatically-generated header, but there might be some workarounds available.

If you are seeking a service that can update information in S3, you could use Amazon EMR, which is basically a managed Hadoop cluster. Very powerful and capable, and can most certainly update information in S3, but it is rather complex to learn.

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470
  • Thanks for your reply. If I have read access only, can the result be written to s3 and what is the default path to store the result? – kzfid Jan 11 '18 at 15:52
  • 2
    **Update:** Amazon Athena can now `CREATE TABLE AS` to create new tables based on queries from existing tables. The data is stored back into Amazon S3 and can be in various formats including Parquet and ORC. – John Rotenstein Mar 11 '19 at 04:41
  • 4
    **Update**: Amazon Athena now supports `INSERT INTO` https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/ – Jerry Chi Nov 19 '19 at 01:37
13

Amazon Athena adds support for inserting data into a table using the results of a SELECT query or using a provided set of values

Amazon Athena now supports inserting new data to an existing table using the INSERT INTO statement.

https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/

https://docs.aws.amazon.com/athena/latest/ug/insert-into.html

Bucketed tables not supported

INSERT INTO is not supported on bucketed tables. For more information, see Bucketing vs Partitioning.

Hariprasad
  • 1,611
  • 2
  • 14
  • 19
6

AWS S3 is a object storage. Both Athena and S3 Select is for queries. The only way to modify a object(file) in S3 is to retrieve from S3, modify and upload back to S3.

Ashan
  • 18,898
  • 4
  • 47
  • 67
6

As of September 20, 2019 Athena also supports INSERT INTO: https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/

Theo
  • 131,503
  • 21
  • 160
  • 205
3

Finally there is a solution from AWS. Now you can perform CRUD (create, read, update and delete) operations on AWS Athena. Athena Iceberg integration is generally available now. Create the table with:

TBLPROPERTIES ( 'table_type' ='ICEBERG' [, property_name=property_value])

then you can use it's amazing feature.

For a quick introduction, you can watch this video. (Or search Insert / Update / Delete on S3 With Amazon Athena and Apache Iceberg | Amazon Web Services on Youtube)

Read Considerations and Limitations

  • Thanks for this, I just needed to remove the word EXTERNAL and add TBLPROPERTIES ( 'table_type' ='ICEBERG') to my CREATE TABLE command. Then, I was able to update. – ChrisDanger Mar 01 '23 at 17:41
1

Athena supports CTAS (create table as) statements as of October 2018. You can specify output location and file format among other options.

https://docs.aws.amazon.com/athena/latest/ug/ctas.html

To INSERT into tables you can write additional files in the same format to the S3 path for a given table (this is somewhat of a hack), or preferably add partitions for the new data.

Like many big data systems, Athena is not capable of handling UPDATE statements.

Kirk Broadhurst
  • 27,836
  • 16
  • 104
  • 169
0

We could use something known as Apache Iceberg in collaboration with Athena to perform CRUD operations on S3 data inside AWS itself.

The only caveat being that at the time of table creation we need to use extra parameter as table_type = 'ICEBERG'.

Eg: create table demo ( id string, attr1 string ) location 's3://path' TBLPROPERTIES ( 'table_type' = 'ICEBERG' )

For more details : https://www.youtube.com/watch?v=u1v666EXCJw