
I would like to store multiple Trees on disk (my trees represent hierarchies of files and folders) with key/value information on each node.

I want to be able to compare these trees (intersection, union, difference, ...). Each tree would have millions of nodes.

Which solution is best suited: a document-based store (MongoDB, ...) or a graph database (Neo4j, ...)?

yohm

1 Answer


A graph database would be a natural fit for a tree structure. In Neo4j you could represent files and folders very simply like this:

(:Folder)-[:CONTAINS]->(:File)
(:Folder)-[:CONTAINS]->(:Folder)

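If you then wanted to compare structures, you could run a Cypher query like this one once for each tree:

// Paths deeper than 15 levels below top_folder are not matched
MATCH path=(top_folder:Folder)-[:CONTAINS*1..15]->(leaf)
WHERE
  ID(top_folder) = $specified_id AND
  (leaf:File OR leaf:Folder) AND
  // keep only leaves: nodes that contain nothing
  NOT (leaf)-[:CONTAINS]->()
RETURN path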

This would give you back every path from the folder you specify down to each leaf of the directory structure. You could then compare the paths returned for one tree with those returned for the other.
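Once the paths for both trees are on the client side, the comparison itself can be done with ordinary set operations. The following is only a minimal sketch in Python: it assumes each returned path has already been reduced to a tuple of node names relative to the top folder (a `name` property on the nodes is an assumption, it is not part of the model above), and `compare_path_sets` is a hypothetical helper, not a Neo4j API.

```python
def compare_path_sets(paths_a, paths_b):
    """Compare two collections of leaf paths with plain set operations.

    Each path is assumed to be a tuple of node names, e.g. ("docs", "readme.txt").
    """
    a, b = set(paths_a), set(paths_b)
    return {
        "intersection": a & b,  # leaves present in both trees
        "union": a | b,         # leaves present in at least one tree
        "only_in_a": a - b,     # difference: in the first tree only
        "only_in_b": b - a,     # difference: in the second tree only
    }


# Hypothetical sample data standing in for the query results of two trees
tree_a = [("docs", "readme.txt"), ("docs", "img", "logo.png"), ("src", "main.py")]
tree_b = [("docs", "readme.txt"), ("src", "main.py"), ("src", "util.py")]

result = compare_path_sets(tree_a, tree_b)
```

With millions of nodes, holding both path sets in memory at once may or may not be practical, so treat this as an illustration of the idea rather than a sizing recommendation.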

Alternatively, if you're comfortable with Java (or some language with Java integration), you can use the built-in Neo4j APIs to build your own recursive algorithms to compare the structures.
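To give an idea of what such a recursive comparison might look like, here is a rough sketch. It does not use the Neo4j Java API at all: it is plain Python over nested dictionaries, purely to show the shape of the recursion. The representation (a folder is a dict of child name to child, a file is any non-dict value such as a hash) and the function name `diff_trees` are assumptions made for the example.

```python
def diff_trees(a, b, prefix=()):
    """Recursively compare two trees given as nested dicts.

    Returns three lists of paths (tuples of names):
    entries only in a, entries only in b, and entries whose values differ.
    """
    only_a, only_b, changed = [], [], []
    for name in sorted(set(a) | set(b)):
        path = prefix + (name,)
        if name not in b:
            only_a.append(path)
        elif name not in a:
            only_b.append(path)
        else:
            x, y = a[name], b[name]
            if isinstance(x, dict) and isinstance(y, dict):
                # Both sides are folders: descend and merge the sub-results
                sub_a, sub_b, sub_changed = diff_trees(x, y, path)
                only_a += sub_a
                only_b += sub_b
                changed += sub_changed
            elif x != y:
                # Different file values, or a file on one side and a folder on the other
                changed.append(path)
    return only_a, only_b, changed


# Hypothetical sample trees; the string values stand in for per-file metadata
left = {"docs": {"readme.txt": "h1", "old.txt": "h2"}, "src": {"main.py": "h3"}}
right = {"docs": {"readme.txt": "h1"}, "src": {"main.py": "h9", "util.py": "h4"}}

only_left, only_right, changed = diff_trees(left, right)
```

An advantage of recursing folder by folder is that a subtree missing on one side is reported once at its root instead of once per leaf. The same idea should carry over to traversing `CONTAINS` relationships inside the database, but that part is not shown here.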

Brian Underwood