4

I have a User table, and I want to fetch data using the following query, for education purposes, I'm using a SQL-like query.

SELECT * FROM User
WHERE (gender = "Male")
AND (age between 25-30 OR height between 5.4-5.9 OR city="India, US")

I'm trying to create the above query in AWS Lambda using Node.js

Any feedback will be appreciated.

jarmod
  • 71,565
  • 16
  • 115
  • 122
Nasreen Ustad
  • 1,564
  • 1
  • 19
  • 24

1 Answers1

13

Here is how you would do it with the awscli:

aws dynamodb scan \
    --table-name User \
    --filter-expression "(gender = :gender) AND ((age BETWEEN :age1 AND :age2) OR (height BETWEEN :h1 AND :h2) OR (city = :city))" \
    --expression-attribute-values '{":gender":{"S":"Male"}, ":age1":{"N":"25"}, ":age2":{"N":"30"}, ":h1":{"N":"5.4"}, ":h2":{"N":"5.9"}, ":city":{"S":"India, US"}}'

Here's how you would do it with the low-level AWS JavaScript SDK functions:

const AWS = require("aws-sdk");
AWS.config.update({region: 'us-east-1'});
const ddb = new AWS.DynamoDB();

const params = {
  TableName: 'User',
  FilterExpression: '(gender = :gender) AND ((age BETWEEN :age1 AND :age2) OR (height BETWEEN :h1 AND :h2) OR (city = :city))',
  ExpressionAttributeValues: {
    ':gender': {S: 'Male'},
    ':age1': {N: '25'},
    ':age2': {N: '30'},
    ':h1': {N: '5.4'},
    ':h2': {N: '5.9'},
    ':city' : {S: 'India, US'},
  },
};

ddb.scan(params, (err, data) => {
  if (err) {
    console.log('Error:', err);
  } else {
    for (const item of data.Items) {
      console.log(item);
    };
  }
});

Finally, here's how you would do it with the higher-level DynamoDB DocumentClient which makes attribute mapping much simpler:

const AWS = require("aws-sdk");
AWS.config.update({region: 'us-east-1'});
const dc = new AWS.DynamoDB.DocumentClient();

const params = {
  TableName: 'User',
  FilterExpression: '(gender = :gender) AND ((age BETWEEN :age1 AND :age2) OR (height BETWEEN :h1 AND :h2) OR (city = :city))',
  ExpressionAttributeValues: {
    ':gender': 'Male',
    ':age1': 25,
    ':age2': 30,
    ':h1': 5.4,
    ':h2': 5.9,
    ':city': 'India, US',
  }
};

dc.scan(params, (err, data) => {
  if (err) {
    console.log('Error:', err);
  } else {
    for (const item of data.Items) {
      console.log(item);
    };
  }
});

Note that these are table scans and consequently all items in the table are visited.

jarmod
  • 71,565
  • 16
  • 115
  • 122
  • 1
    thankyou so much, your answer helped me a lot. I really appreciate – Nasreen Ustad Jun 26 '19 at 09:10
  • just wondering, since scan is not recommended, is there a better way to do this? it seems like the answer is "not really" as it's a very SQL-like use case. perhaps creating a GSI with a PK: # or PK: gender, SK: birthday since those are probably the most important/constant, then filter on the rest of the attributes? good or bad idea? got the idea from: https://www.alexdebrie.com/posts/dynamodb-filter-expressions/ any ideas when a filter on multiple attributes like this starts to slow down? like at how many attributes and items does scan start falling apart? – atkayla Apr 20 '23 at 21:43
  • @atkayla agree that if you want arbitrary queries, NoSQL is not going to be able to optimize them beyond scan/filter. For the specific query in question, one option might to be use 3 GSIs (GSI1: pk=gender, sk=age, GSI2: pk=gender, sk=height, GSI3: pk=gender, sk=city or pk=gender#city) and then issue 3x queries in parallel and aggregate the results to accomplish the query: gender match AND (age match OR height match OR city match). The combination of query and server-side filter would probably work reasonably well. For query performance guidelines, I'd defer to Alex DeBrie. – jarmod Apr 21 '23 at 00:21