I am looking for a graph database to store a very large DAG on disk.
Many common database features are not required, which should leave room for optimization.
What I do need is:
- store a directed acyclic graph (no cycles, at most one edge per node pair), with each edge stored as fromID, toID, weight (can be INT, INT, FLOAT)
- return connected components efficiently and conveniently
- return all zero-indegree nodes efficiently and conveniently
- return all descendants of a node efficiently and conveniently
- manage sizes of up to 100 million nodes, with up to 10 billion edges
- modest resource requirements
- free / open-source
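
To make the three queries concrete, here is a minimal in-memory sketch of what I mean by them. It only illustrates the semantics on a toy edge list and says nothing about the on-disk scale. The sample edges and all function names are made up for illustration, and I am assuming "connected components" means weakly connected components (edge direction ignored), since a DAG has no non-trivial strongly connected ones.

```python
from collections import defaultdict, deque

# Hypothetical toy edge list in the (fromID, toID, weight) layout.
edges = [(1, 2, 0.5), (1, 3, 1.0), (3, 4, 2.0), (5, 6, 0.1)]


def build(edges):
    """Index the edge list: node set, child adjacency, nodes with a parent."""
    nodes, has_parent = set(), set()
    children = defaultdict(list)
    for src, dst, _weight in edges:
        children[src].append(dst)
        nodes.update((src, dst))
        has_parent.add(dst)
    return nodes, children, has_parent


def zero_indegree(nodes, has_parent):
    """Nodes that never appear as a toID."""
    return sorted(nodes - has_parent)


def descendants(children, start):
    """All nodes reachable from start, found by breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def connected_components(nodes, edges):
    """Weakly connected components via union-find (direction ignored)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst, _weight in edges:
        root_a, root_b = find(src), find(dst)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


nodes, children, has_parent = build(edges)
print(zero_indegree(nodes, has_parent))
print(descendants(children, 1))
print(connected_components(nodes, edges))
```

On the toy data, nodes 1 and 5 have no incoming edges, node 1 has descendants 2, 3 and 4, and there are two components. What I need is a store that answers the same questions when the edge list does not fit in memory.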
Do you have experience that allows you to give recommendations?