6

I am building web application that should have high write load and thousands, even millions of hierarchical records representing user defined/constructed trees. I am not trying to build forum with threads but huge database with thousands of small-sized hierarchies (trees with up to 10-20 descendants)...

I am aware of many models for storing hierarchies - currently I am using Nested Sets but performance with huge data and load is issue. I am also doubtful that Adjacency Lists or something similar may resolve this.

I have been experimenting with Mongo database which is superfast key/value storage but I can use only MySQL.

I would like to hear about other people experiences with similar issues.

idenoq
  • 83
  • 5

2 Answers2

5

If you can install MySQL plugins, then OQGraph storage engine is what you need.

Mchl
  • 61,444
  • 9
  • 118
  • 120
  • 1
    +1 However installing a plugin is not open to everybody. That's why I've awarded the bounty to @barryhunter – Johan Nov 23 '11 at 19:51
  • Thanks anyway ;) Remember, the more people know about OQGraph, the sooner we might see it as part of default installations at hosting companies :) – Mchl Nov 23 '11 at 21:53
4

What is the problem with nested sets?

Is recomputing the lft/rgt values when you add/remove nodes?

Pretty sure with a bit of careful planning, you can tweak it so do only have to do rare recomputations. I've not actully tried it, but did do some planning for a system once (the client didnt want the system in the end!)

One, is multiplying the values, by say 1000, when first calculating them. Then if you add a node, you can just insert numbers between the values. Its only when there is a large number of insertions, do you start running out of numbers. A low priority batch process, could recompute the tree to free up numbers for fresh insertions.

Deleting can also be archived, with manipulating numbers. In fact a node without children is easy. No recomputation nedded. Gets more complicated if children, but I think should be doable.

barryhunter
  • 20,886
  • 3
  • 30
  • 43
  • +1 Joe Celko has some great posts on this somewhere. I believe his book "Joe Celko's SQL for smarties" has a section on it also. Certainly worth a Google search. – Mr Moose Nov 23 '11 at 05:41