
I am looking for a way to query AWS DynamoDB data with SQL syntax using Amazon EMR.

I have my DynamoDB table set up and ready. How can I import/query the data using Hue? The table in DynamoDB has a size of around 8GB.

Hendrik

1 Answer


Please follow the steps below.

Using Hive to query non-live (exported) DynamoDB data:

1) Export Data from DynamoDB to Hive

Refer to the section "Exporting Data from DynamoDB" in the EMR Hive Commands link below.

2) Use Amazon EMR to query data stored in DynamoDB

Refer to the section "Querying Data in DynamoDB" in the EMR Hive Commands link below.

3) Use Hue to run the queries (i.e. run the Hive queries from the Hue workbench)

EMR Hive Commands

Hue Supported
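As a rough sketch of the export step, assuming a DynamoDB-backed Hive table such as hivetable1 from the sample code further down already exists: the data can be copied once into a native Hive table (or to S3), and later queries then run against that copy rather than against DynamoDB. The table name hive_snapshot and the S3 path are placeholders, not names from the AWS documentation.

```sql
-- Hypothetical native Hive table with the same columns as the DynamoDB-backed table.
CREATE TABLE hive_snapshot (col1 string, col2 bigint, col3 array<string>);

-- One-time copy: this scan reads from DynamoDB and consumes read capacity once.
INSERT OVERWRITE TABLE hive_snapshot
SELECT * FROM hivetable1;

-- Alternatively, export to S3 (placeholder bucket and path).
INSERT OVERWRITE DIRECTORY 's3://your-bucket/dynamodb-export/'
SELECT * FROM hivetable1;

-- Later queries hit the snapshot, not the live DynamoDB table.
SELECT col1, count(*) AS n
FROM hive_snapshot
GROUP BY col1;
```

The snapshot goes stale as the DynamoDB table changes, so the copy has to be rerun whenever fresher data is needed.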

Using Hive to query live DynamoDB data:

1) Create a Hive table that maps to the DynamoDB table

http://docs.aws.amazon.com/emr/latest/ReleaseGuide/EMR_Interactive_Hive.html

2) Once you have created the Hive table, any query you run on it reads its data from the live DynamoDB table

Disadvantage: each query consumes the DynamoDB table's read or write capacity units. In other words, you pay for every query execution.
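One way to limit that cost is to cap how much of the table's provisioned read capacity Hive may use. The sketch below assumes the EMR DynamoDB connector setting dynamodb.throughput.read.percent (documented default 0.5) and the hivetable1 table from the sample code; the value 0.25 is only an illustration.

```sql
-- Let Hive use roughly a quarter of the table's provisioned read throughput.
-- Lower values slow the query down but leave more capacity for the application.
SET dynamodb.throughput.read.percent=0.25;

-- This full scan is now rate-limited accordingly.
SELECT count(*) FROM hivetable1;
```

With a table of around 8GB, a full scan at a low percentage can take a long time, so it may be worth testing on a filtered query first.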

Sample code:

CREATE EXTERNAL TABLE hivetable1 (col1 string, col2 bigint, col3 array<string>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' 
TBLPROPERTIES ("dynamodb.table.name" = "dynamodbtable1", 
"dynamodb.column.mapping" = "col1:name,col2:year,col3:holidays"); 
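
Once the table is created, it can be queried with ordinary HiveQL from the Hue query editor. The queries below are only illustrative and assume the column meanings implied by the mapping above (col1 = name, col2 = year, col3 = holidays); the filter value is made up.

```sql
-- Filter on a mapped attribute; the scan still reads from DynamoDB.
SELECT col1, col2
FROM hivetable1
WHERE col2 >= 2015;

-- Aggregate over the array column using Hive's built-in size().
SELECT col1, size(col3) AS holiday_count
FROM hivetable1
ORDER BY holiday_count DESC
LIMIT 10;
```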
notionquest