CAVEAT: DynamoDB is not a time-series database, and there may be better solutions out there. I'm taking a stab at a solution here, but any DDB-based design will involve some tradeoffs.
Assumptions (emphasis mine):
- The granularity you're looking for is at day level
- You want to find all matching windows across all campaigns that are active and that contain your target date (meaning the window starts before the target date and ends after it)
- There can be hundreds/thousands of windows per campaign, but not more than 10,000
- Window length is on the order of weeks/months, not years.
You can create a table with the following schema:
- partition key: cid (string) // campaign unique id - e.g. 'Campaign#1'
- sort key: wid (string) // window id (unique within campaign) - e.g. 'Window#1'
- attribute: wsm (string) // window start month - e.g. '202301'
- attribute: wed (string) // window end date - e.g. '20230128'
- attribute: ttl (number) // window expiration as a timestamp in epoch seconds

plus the full start/end timestamps and whatever window payload you need (the start, end, and data columns in the example below).
NOTE: Based on the assumption that you only care about active campaigns, you will want to set a TTL on window expiration so that expired windows are removed from the table on their expiration date. If you don't want to fully delete expired windows, you can either keep a secondary table for the lookup or insert duplicate records for each campaign window to be used for lookup.
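As a rough sketch, writing one window item could look like the following (boto3; the `campaign-windows` table name and the specific values are placeholders, not part of the original design):

```python
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("campaign-windows")  # hypothetical table name

# One item per campaign window. The ttl attribute lets DynamoDB delete the
# item shortly after the window ends, so the table only holds active windows.
window_end = datetime(2023, 1, 31, 23, 59, tzinfo=timezone.utc)

table.put_item(
    Item={
        "cid": "Campaign#1",                 # partition key: campaign id
        "wid": "Window#1",                   # sort key: window id within the campaign
        "wsm": "202301",                     # window start month (GSI partition key)
        "wed": "20230131",                   # window end date (GSI sort key)
        "start": "2023-01-01T00:00:00.000Z",
        "end": "2023-01-31T23:59:00.000Z",
        "ttl": int(window_end.timestamp()),  # epoch seconds; enable TTL on this attribute
        "data": {},
    }
)
```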
Example:
| cid | wid | wsm | wed | start | end | data |
| --- | --- | --- | --- | --- | --- | --- |
| Campaign#1 | Window#1 | 202301 | 20230131 | 2023-01-01T00:00:00.000Z | 2023-01-31T23:59:00.000Z | {} |
| Campaign#1 | Window#2 | 202302 | 20230228 | 2023-02-01T00:00:00.000Z | 2023-02-28T23:59:00.000Z | {} |
| Campaign#1 | Window#3 | 202303 | 20230331 | 2023-03-01T00:00:00.000Z | 2023-03-31T23:59:00.000Z | {} |
| Campaign#1 | Window#4 | 202304 | 20230430 | 2023-04-01T00:00:00.000Z | 2023-04-30T23:59:00.000Z | {} |
| Campaign#1 | Window#5 | 202305 | 20230531 | 2023-05-01T00:00:00.000Z | 2023-05-31T23:59:00.000Z | {} |
| Campaign#1 | Window#6 | 202306 | 20230630 | 2023-06-01T00:00:00.000Z | 2023-06-30T23:59:00.000Z | {} |
| Campaign#1 | Window#7 | 202307 | 20230731 | 2023-07-01T00:00:00.000Z | 2023-07-31T23:59:00.000Z | {} |
| Campaign#1 | Window#8 | 202308 | 20230831 | 2023-08-01T00:00:00.000Z | 2023-08-31T23:59:00.000Z | {} |
| Campaign#2 | Window#1 | 202302 | 20230314 | 2023-02-14T00:00:00.000Z | 2023-03-14T23:59:00.000Z | {} |
| Campaign#2 | Window#2 | 202302 | 20230601 | 2023-02-20T00:00:00.000Z | 2023-06-01T23:59:00.000Z | {} |
| Campaign#3 | Window#1 | 202305 | 20230628 | 2023-05-15T00:00:00.000Z | 2023-06-28T23:59:00.000Z | {} |
And create a GSI called `intervals-gsi` with PK: `wsm` and SK: `wed`.
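If it's useful, here is a sketch of adding that index and enabling TTL with boto3, assuming the base table already exists and uses on-demand billing (the `campaign-windows` name is a placeholder):

```python
import boto3

client = boto3.client("dynamodb")

# Add the intervals-gsi index keyed on window start month / window end date.
# With on-demand (PAY_PER_REQUEST) billing no ProvisionedThroughput is needed.
client.update_table(
    TableName="campaign-windows",
    AttributeDefinitions=[
        {"AttributeName": "wsm", "AttributeType": "S"},
        {"AttributeName": "wed", "AttributeType": "S"},
    ],
    GlobalSecondaryIndexUpdates=[
        {
            "Create": {
                "IndexName": "intervals-gsi",
                "KeySchema": [
                    {"AttributeName": "wsm", "KeyType": "HASH"},
                    {"AttributeName": "wed", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
)

# Let DynamoDB expire windows once their ttl (epoch seconds) has passed.
client.update_time_to_live(
    TableName="campaign-windows",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)
```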
To find all matching windows you will need to run a number of queries that depends on the maximum length of your campaign windows. You will also need to filter the results of the same-month query, but given the partitioning there shouldn't be many records to filter. The queries can be executed in parallel to speed things up (a sketch of the fan-out follows the example queries below).
For example, let's say that you know your campaigns can run for at most 6 months. Given a target date of `09/17/2023` you can run the following queries using your GSI:
QUERY intervals-gsi WHERE wsm="202309" AND wed > "20230917" FILTER start <= '2023-09-17'
QUERY intervals-gsi WHERE wsm="202308" AND wed > "20230917"
QUERY intervals-gsi WHERE wsm="202307" AND wed > "20230917"
QUERY intervals-gsi WHERE wsm="202306" AND wed > "20230917"
QUERY intervals-gsi WHERE wsm="202305" AND wed > "20230917"
QUERY intervals-gsi WHERE wsm="202304" AND wed>"20230917"
Another example: let's say that your campaigns run for at most 3 months and the target date is `02/05/2024`. You would run:
QUERY intervals-gsi WHERE wsm="202402" AND wed > "20240205" FILTER start <= '2024-02-05'
QUERY intervals-gsi WHERE wsm="202401" AND wed > "20240205"
QUERY intervals-gsi WHERE wsm="202312" AND wed > "20240205"
Final notes
I've made some assumptions about your data. Feel free to tweak the assumptions and adjust the design to fit your needs. For example, if your campaigns don't run for more than 2 months each, you may be better off with a different partitioning granularity (e.g. weekly); the idea being that you trade the number of queries for precision. Or, if those hundreds/thousands of windows are really mostly on the order of tens/hundreds, optimize for that and let the outlier cases be handled less efficiently.