1

I have id as the hash key of my table and returnItemId which is the GSI. The returnItemId is a string which contains values separated by commas. Given a number for the GSI, I want to be able to query and get the correct item that contains it by using contains

var params = {
    "AttributeDefinitions": [ // describbes the key schema of the table
    {
      "AttributeName": "id",
      "AttributeType": "S"
    },
    {
      "AttributeName": "returnItemId",
      "AttributeType": "S"
    }
  ],
  // Hash for Primary Table
  "KeySchema": [
    {
      "AttributeName": "id",
      "KeyType": "HASH"
    }
  ],

  "GlobalSecondaryIndexes": [
    {
      "IndexName": "ReturnItemIndex",
      "KeySchema": [
        {
          "AttributeName": "returnItemId", //must match one of attributedefinitions names
          "KeyType": "HASH"
        }
      ],
      "Projection": {
        "ProjectionType": "ALL"
      },
      "ProvisionedThroughput": {
        "ReadCapacityUnits": 5,
        "WriteCapacityUnits": 5
      }
    }
  ],

  "ProvisionedThroughput": {
    "ReadCapacityUnits": 5,
    "WriteCapacityUnits": 5
  },


  "TableName": "my-table"
};
dynamodb.createTable(params, function(err, data) {
    if (err) ppJson(err); // an error occurred
    else ppJson(data); // successful response

});

Then I am going to create 2 items

var params = {
    TableName: 'my-table',
    Item: { 
    
        "id": "the_first_item",
        "returnItemId": "123,456,789"
    },
};
docClient.put(params, function(err, data) {
    if (err) ppJson(err); // an error occurred
    else ppJson(data); // successful response
});

And the second item

var params = {
    TableName: 'my-table',
    Item: { 
    
        "id": "the_second_item",
        "returnItemId": "987,654,321"
    },
};
docClient.put(params, function(err, data) {
    if (err) ppJson(err); // an error occurred
    else ppJson(data); // successful response
});

The two items look like enter image description here

I am trying to run a query and get the correct item which contains 987 using the following query. Since my first item has 123,456,789 and the second item has 987,654,321 this method should return the second item.

var params = {
    TableName: 'my-table',
    IndexName: 'ReturnItemIndex', // optional (if querying an index)
    KeyConditionExpression: 'contains(returnItemId, :return_id)',
    //FilterExpression: 'contains(returnItemId, :return_id)', // a string representing a constraint on the attribute
    ExpressionAttributeValues: { ':return_id': '987' },
};
docClient.query(params, function(err, data) {
    if (err) ppJson(err); // an error occurred
    else ppJson(data); // successful response
});

But am getting errors about using contains in keyconditionexpression. Is this method possible?

NoSQLKnowHow
  • 4,449
  • 23
  • 35
MMM
  • 315
  • 1
  • 6
  • 19
  • https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#query-property – Atul Kumar Apr 12 '19 at 03:54
  • I've seen that link before but I don't think it answers my question. I've tried getting `contains` to work with `KeyConditions` but did not succeed – MMM Apr 12 '19 at 04:13
  • can you log the error also ? – Atul Kumar Apr 12 '19 at 04:13
  • It wasn't an error more so I don't understand if that was even a valid method. My stack overflow question is not about finding the correct resource but understanding if it's possible and if someone knows how to implement this solution – MMM Apr 12 '19 at 04:15
  • It is possible to to query, using `ConditionalOperator`, see how you can use this for your problem. – Atul Kumar Apr 12 '19 at 04:16
  • I figured out the solution without using ConditionalOperator. For those curious, use a dummy key value that is consistent across and use `contains` on the field you're interested in. – MMM Apr 12 '19 at 05:00
  • And it is not possible to use the conditional operator for key values in my particular scenario. Research is best advised before suggesting advice. – MMM Apr 12 '19 at 05:00

1 Answers1

1

contains can only be used in filters which means:
- query or scan operations will traverse all data to apply your filters
- your cost of read operations will include all data read not just matched data
- with contains "12", you would probably match "123" and "124" too
- better than comma separated is to use StringSet or NumberSet data type

I would suggest another layout
Keyschema:
Partiton Key: id
Sort Key: returnItemId
GSI
Partition Key: returnItemId

Data:

------------------------------------
| id                | returnItemId |
------------------------------------
| "the_first_item"  | "123"        |
| "the_first_item"  | "456"        |
| "the_first_item"  | "789"        |
| "the_second_item" | "987"        |
| "the_second_item" | "654"        |
| "the_second_item" | "321"        |
------------------------------------

then query GSI for key condition returnItemId = 987 ( no filter expression )

Adrian Praja
  • 402
  • 2
  • 5