I am trying to learn graphdb and compare its capabilities with relational databases. Consider the following problem:
I have two lists of date ranges: date-in and date-out:
Date-in date ranges:
- 1/1/2000-12/31/2025
- 1/1/2026-12/31/2030
- 2/1/2030-12/31/2033
Date-out date ranges:
- 2/1/2005-12/31/2020
- 1/1/2024-12/31/2026
The calculation that I want to do is to subtract all the Date-out date ranges from the Date-in date ranges, meaning that I want to know all the date ranges described in the Date-in ranges that are not described in the Date-out ranges. Note the date ranges may overlap.
The correct answer is:
- 1/1/2000-1/31/2005
- 1/1/2021-12/31/2023
- 1/1/2027-12/31/2030
- 2/1/2030-12/31/2023
I know how to solve this problem using a relational db such as Postgres. The relational db solution would be:
Use generate_series() to list all the days described in the Date-in ranges and Date-out date. There are ~33 years here, so about 12,000 days total.
Select all the days from the Date-in days list that are not present in the Date-out days list. This should be fast, because again, it's only ~33 years, so about 12,000 days total.
Use the "island" detection SQL query https://www.red-gate.com/simple-talk/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/ to find contiguous date ranges from the resulting list of days from Step 2.
I don't have an approach to solve this problem using a graphdb gremlin traversal. I tried the same approach in graphdb as in the relational db, but there is no generate_series() in the AWS Neptune implementation of graphdb that I'm aware of. Furthermore I didn't want to add nodes to a graphdb just to run a read type query like this one.
Is there a graphdb solution for this problem?