4

In DynamoDB i have a table with the following structure.
The actions "field" contains all the info (and this is the field i would like to search into) and orderId it's the primary key

{
  "actions": [
    {
      "actionDescription": "8f23029def1d6baa4",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533730680,
      "user": {
        "fullName": "XXXXX",
        "userName": "xxxxx@xxxx.xxx",
      }
    },
    {
      "actionDescription": "21857e61037bc29ec",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731788,
      "user": {
        "fullName": "XXXXX",
        "userName": "xxxxx@xxxx.xxx",
      }
    },
    {
      "actionDescription": "cf10abd44e24cef56",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731788,
      "user": {
        "fullName": "XXXXX",
        "userName": "xxxxx@xxxx.xxx",
      }
    },
    {
      "actionDescription": "7787fe7a5bf4d22de",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731789,
      "user": {
        "fullName": "OOOOOO",
        "userName": "ooooo@oooo.ooo",
      }
    },
    {
      "actionDescription": "9528c439021f504bf",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731789,
      "user": {
        "fullName": "XXXXX",
        "userName": "xxxxx@xxxx.xxx",
      }
    },
    {
      "actionDescription": "bfba100e0e54934b2",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731789,
      "user": {
        "fullName": "XXXXX",
        "userName": "xxxxx@xxxx.xxx",
      }
    },
    {
      "actionDescription": "f789dc12f1dbe3be2",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731789,
      "user": {
        "fullName": "OOOOOO",
        "userName": "ooooo@oooo.ooo",
      }
    },
    {
      "actionDescription": "4cd6b68dfea7cf8ee",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731789,
      "user": {
        "fullName": "XXXXX",
        "userName": "xxxxx@xxxx.xxx",
      }
    },
    {
      "actionDescription": "1e3a0e95f8e5106d7",
      "actionTitle": "UNDEFINED_ACTION",
      "timestamp": 1533731790,
      "user": {
        "fullName": "OOOOOO",
        "userName": "ooooo@oooo.ooo",
      }
    }
  ],
  "orderId": "13aae31"
}

What i would like to do it's to make the scan terms in PHP to be able to search by userName. or by any field inside the actions array (timestamp, actionTitle, etc, etc).
Bellow it's one of the many terms i tried to use but i was unable to achieve any results

$params = [
 'TableName'                 => $this->tableName,
 'FilterExpression'          => "userName = :searchTerm",
 'ExpressionAttributeValues' => [
     ':searchTerm' => 'ooooo@oooo.ooo',
  ],
 'ReturnConsumedCapacity'    => 'TOTAL',
];
$results = $this->dynamoDbClient->scan($params);

Can you please guide my by telling me what i'm missing?
Also, please note: I don't want to get a specific orderId, i would like to get ALL orderIds containing the searchTerm (in this case userName)

tsompanis
  • 324
  • 1
  • 3
  • 15
  • Are you trying to add a "Search" feature to your application? If so, Scanning may not be the right approach. AWS CloudSearch may be a more scalable and fast solution – joseph.l.hunsaker Aug 24 '18 at 05:30

3 Answers3

4

Your best bet with this item schema is to filter the table items yourself. That is to say, scan the table with no filter expression and write your own code to filter the results. Scanning without the filter expression will consume the same amount of read capacity units.

You can set the filter expression to something like this, however this isn't scalable and only works if you have a fixed number of items in the actions list.

  actions[0].user.userName == :searchTerm OR actions[1].user.userName == :searchTerm OR actions[2].user.userName == :searchTerm OR ....

If you need complex search abilities you are probably better off using a dedicated search database. AWS provides two services around this, AWS CloudSearch and AWS ElasticSearch. You can use DynamoDB streams to keep your search indexes up to date.

If you are set on scanning the DynamoDB table with a filter you can refactor your structure to include additional attributes that have all the searchable information in a set (or concatenated string)

{
  "actions": [....],
  "actionsDescriptions": Set["8f23029def1d6baa4", "21857e61037bc29ec", "cf10abd44e24cef56", "7787fe7a5bf4d22de", "9528c439021f504bf", "bfba100e0e54934b2", "f789dc12f1dbe3be2", "4cd6b68dfea7cf8ee", "1e3a0e95f8e5106d7"],
  "actionTitles": Set["UNDEFINED_ACTION"],
  "timestamps": Set[1533730680, 1533731788, 1533731789, 1533731790],
  "user_fullNames": Set["XXXXX"],
  "user_userNames": Set["ooooo@oooo.ooo", "xxxxx@xxxx.xxx"],
  "orderId": "13aae31"
}

Notice you have to use a Set (or concatenate all the values into a string) since the contains functions only works on strings and sets.

Then you can use a filter expression like this

contains(user_userNames, :searchTerm)
cementblocks
  • 4,326
  • 18
  • 24
0

The DynamoDB QueryFilter and ScanFilter options do not currently support the CONTAINS operator for maps. You'll need to build another lookup table indexed by userName to avoid scanning the entire table.

E.g. new table schema:

{
  "userName": "xxxxx@xxxx.xxx"
  "orderId": "13aae31"
}

Where the hash key is userName and orderId is the ID of an order in the other table.

The closest you can get with the current schema is to use @cementblocks's suggestions to scan the whole table and filter application-side or query each element in the list individually.

Matt D.
  • 896
  • 6
  • 5
0

If you are adding a "Search" like feature to your application, then scanning may not be the best approach.

DynamoDB scan can be expensive and slow, especially when you have many rows.

So, if you intend on adding a "Search" feature you may consider using AWS CloudSearch. It is a scalable "Search" feature. You can quickly enable "Search" from a DynamoDB table.