8

I'm trying to work out how to search for a keyword within a string of text using AWS DynamoDB (e.g. search for "happy" and get back the item whose text is "I'm a very happy man"). Is there a way to query this?

As far as I know, Query only supports conditions such as "begins with" or "between", which doesn't really help me in this case.

Also, let's say I have a million records in table "A". Is it easy to migrate the data into different tables "B" and "C" if I break up table "A"?

Thanks in advance!

K.Liu

3 Answers

10

DynamoDB cannot efficiently run a "contains" query because it does not build an index that supports one. The only indexes it builds are on the primary key (hash, or hash and range), local secondary indexes, and global secondary indexes. Using the CONTAINS filter in a Scan makes DynamoDB perform a full table scan, which can consume a lot of your provisioned read throughput and cause other queries to be throttled. If that is not a concern for you, a Scan with a CONTAINS filter is an option.
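To make the Scan option concrete, here is a minimal sketch in Python. It only builds the request parameters and follows pagination. The table name `Comments`, the attribute name `text`, and both helper function names are assumptions made for illustration. The `client` argument is expected to behave like `boto3.client("dynamodb")`.

```python
def build_contains_scan(table_name, attribute, keyword):
    """Build low-level Scan parameters that filter on a substring match."""
    return {
        "TableName": table_name,
        # contains() does a substring match when the attribute is a String.
        "FilterExpression": "contains(#attr, :kw)",
        # A placeholder avoids clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {":kw": {"S": keyword}},
    }


def scan_all(client, params):
    """Yield every matching item, following LastEvaluatedKey pagination."""
    params = dict(params)  # do not mutate the caller's dict
    while True:
        page = client.scan(**params)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return
        params["ExclusiveStartKey"] = last_key


# Hypothetical usage (needs boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("dynamodb")
#   params = build_contains_scan("Comments", "text", "happy")
#   for item in scan_all(client, params):
#       print(item)
```

The filter is applied after items are read, so every item in the table still counts against read capacity even when only a few match.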

Amazon CloudSearch is more appropriate for full-text search queries. The CloudSearch documentation has a section on how data in DynamoDB may be searched: http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-dynamodb-data.html.

  • This isn't great, as CloudSearch needs to be synced with DynamoDB periodically (not as each item gets added to the DB). The AWS doc advises syncing your DB with CloudSearch periodically, for example at the end of each day. But in a lot of cases we want our data to be immediately available for search once saved. If anyone can point me to a workaround, please do! – conor909 Jun 13 '17 at 21:38
  • 1
    @conor909 you can write a simple Lambda which is triggered on every update to DynamoDB (using DynamoDB Streams) and updates the document in CloudSearch. This way you have almost real-time (~3s delay) search on DynamoDB. – Rajat Apr 30 '18 at 01:22
1

Amazon CloudSearch is probably what you're looking for:

You can specify a DynamoDB table as a source when configuring indexing options or uploading data to a search domain through the console or command line tools. This enables you to quickly set up a search domain to experiment with searching data stored in DynamoDB database tables.

http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-dynamodb-data.html

Dave Liepmann
Trevel
  • So far I haven't found a clean way to add an item to my DynamoDB table and have it immediately available for search in CloudSearch. The AWS doc advises syncing your DB with CloudSearch periodically, for example at the end of each day. But in a lot of cases we want our data to be immediately available for search once saved. – conor909 Jun 13 '17 at 21:32
  • 1
    My initial thought would be to set up a trigger connected to a Lambda function that'll add it to the search. That should let you keep it up to date within a minute or so. – Trevel Jun 14 '17 at 18:29
  • 1
    It's a good suggestion @Trevel, but in my case I'm just trying to set up a NoSQL DB with flexible search functionality. It sounds a bit overkill to have the DB + CloudSearch + Lambda functions just to get this going. I've started a discussion around this here: https://stackoverflow.com/q/44530846/1853114 – conor909 Jun 14 '17 at 18:38
0

It sounds like what you are looking for is the Contains condition:

If the target attribute of the comparison is of type String, then the operator checks for a substring match.

You didn't specify how you were querying DynamoDB, so unfortunately I can't give you a specific example. However, if you were using Java, you would probably use a QueryFilter.

Your second question seems to have already been answered.

Community
mnewton
  • I did see the "Contains" condition; however, it is only applicable to the Scan operation. I'm trying to store comments in the range attribute so that I can query all comments that contain the word "happy". Is there a better way to do this than using Scan? – K.Liu Jun 14 '15 at 08:53
  • Well, first, which SDK are you actually using? How are you accessing DynamoDB? – mnewton Jun 14 '15 at 21:37