I have a table storedgames, which contains 2092 items.

[screenshot: storedgames table]

There is also an index on that table, which likewise lists 2092 items.

[screenshot: user-index]

When I fetch data, I use the index to obtain the items for one specific user:

const params = {
  TableName: "storedgames",
  IndexName: "user-index",
  KeyConditionExpression: "#usr = :usr",
  ExpressionAttributeNames: { "#usr": "user" },
  ExpressionAttributeValues: { ":usr": user }
};

const data = await new Promise((resolve, reject) => {
  docClient.query(params, (err, data) => {
    if (err) { reject(err); } else { resolve(data); }
  });
}).catch((err) => {
  console.error(err);
  return false;
});

However, the above code does not return all items. It finds only 42, and for today's items there is only one hit. When I check directly on the AWS webpage, I actually find more items for today.

[screenshot: more items on the webpage, table view]

Even when I do this on the webpage using the index, it finds more records.

[screenshot: more items on the webpage, using the index]

When I leave out the day filter entirely, the webpage finds over 130 items, while my JavaScript code still returns only 42.

[screenshot: 130 vs 42 items]


So my question is: why does the data of my index seem to be incomplete when I query it programmatically?

bvdb

1 Answer


The records actually contain a lot of data, and DynamoDB limits the amount of data that can be fetched per query. From the documentation:

A single Query operation can retrieve a maximum of 1 MB of data. This limit applies before any FilterExpression is applied to the results. If LastEvaluatedKey is present in the response and is non-null, you must paginate the result set (see Paginating the Results).

So one possible solution is to perform multiple fetches until you have the entire collection:

const queryAllItems = (params, callback) => {
  const fullResult = { Items: [], Count: 0, ScannedCount: 0 };

  const queryExecute = (callback) => {
    docClient.query(params, (err, result) => {
      if (err) {
        callback(err);
        return;
      }

      const { Items, LastEvaluatedKey, Count, ScannedCount } = result;

      fullResult.Items = [...fullResult.Items, ...Items];
      fullResult.Count += Count;
      fullResult.ScannedCount += ScannedCount;

      // No LastEvaluatedKey means this was the final page.
      if (!LastEvaluatedKey) {
        callback(null, fullResult);
        return;
      }

      // Continue the next query from where the previous page left off.
      params.ExclusiveStartKey = LastEvaluatedKey;
      queryExecute(callback);
    });
  };

  queryExecute(callback);
};

Unfortunately, this isn't a complete solution. In my situation, a query for a mere 130 items (which requires 4 actual fetches) takes about 15 seconds.
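For reference, the same pagination loop can also be written with async/await instead of a recursive callback. This is just a sketch, assuming the AWS SDK v2 DocumentClient, where query(params).promise() resolves to an object with Items, Count, ScannedCount, and LastEvaluatedKey; the client is passed in as a parameter to keep the example self-contained.

```javascript
// Sketch: paginate a DynamoDB query with async/await (AWS SDK v2 assumed).
const queryAllItemsAsync = async (docClient, params) => {
  const fullResult = { Items: [], Count: 0, ScannedCount: 0 };
  do {
    // Fetch one page (at most ~1 MB of data).
    const page = await docClient.query(params).promise();

    fullResult.Items.push(...page.Items);
    fullResult.Count += page.Count;
    fullResult.ScannedCount += page.ScannedCount;

    // LastEvaluatedKey is undefined once the final page has been read;
    // otherwise it becomes the ExclusiveStartKey of the next request.
    params = { ...params, ExclusiveStartKey: page.LastEvaluatedKey };
  } while (params.ExclusiveStartKey);
  return fullResult;
};
```

Note that the pages are still fetched sequentially, so this does not make the query any faster; it only makes the control flow easier to follow.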

bvdb
  • You can also use the eachPage method on the request. e.g. `docClient.query(params).eachPage((err, data, next) => {})` but it's a little harder to use with promises. https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Request.html#eachPage-property – cementblocks Aug 16 '19 at 18:06
  • @cementblocks is it faster? - my current solution is way too slow. I am seriously considering to ditch the dynamodb database. – bvdb Aug 17 '19 at 13:27
  • It would run at about the same speed. What is the provisioned read throughput on your table? Where are you calling it from? locally or in AWS? – cementblocks Aug 18 '19 at 20:23
  • @cementblocks It's "on demand" and it runs on an EC2 server of amazon. Could it just be that moving the 4MB of data through the network causes the delay ? – bvdb Aug 19 '19 at 08:46
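The eachPage approach mentioned in the comments can be wrapped in a promise. A sketch, assuming the AWS SDK v2 Request.eachPage semantics: the callback receives (err, data, done), data is null after the last page, returning false stops iteration, and calling done() requests the next page.

```javascript
// Sketch: collect all pages of a request via eachPage, promisified.
// `request` is expected to be e.g. the result of docClient.query(params).
const queryAllViaEachPage = (request) =>
  new Promise((resolve, reject) => {
    const items = [];
    request.eachPage((err, data, done) => {
      if (err) {
        reject(err);
        return false; // stop paginating on error
      }
      if (data === null) {
        resolve(items); // null data signals there are no more pages
        return false;
      }
      items.push(...data.Items);
      done(); // request the next page
      return true;
    });
  });
```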