I am working on a ledger with a table of Transactions. Each entry has a transaction_id
, account_id
, timestamp
and other metadata. I need to query for all Transactions for a given account_id
with a between operator on timestamp
My planned approach was to build an index on account_id
, transaction_id
and timestamp
. However I have noted a limitation on inequalities and indexes from the AWS Documentation and I had planned applying this to timestamp
Query performance is improved only when you use an equality predicate; for example, fieldName = 123456789.
QLDB does not currently honor inequalities in query predicates. One implication of this is that range filtered scans are not implemented.
...
Warning
QLDB requires an index to efficiently look up a document. Without an indexed field in the WHERE predicate clause, QLDB needs to do a table scan when reading documents. This can cause more query latency and can also lead to more concurrency conflicts.
Transactions would be generated and grow indefinetly over time, and I would need to be able to query a weeks worth of transactions at a time.
Current Query:
SELECT *
FROM Transactions
WHERE "account_id" = 'test_account' and "timestamp" BETWEEN `2020-07-05T00:00Z` AND `2020-07-12T00:00Z`
I know it is possible to stream the data to a database more suited for this query, such as dynamodb, but I would like to know if my performance concerns performing the above query is valid, and if it is, what is the recommended indexes and query to ensure this scales and does not result in a scan across all transactions for the given account_id
?