12

I have a product catalog. Each category contains subcategories nested to a varying depth. The maximum depth is unknown, but I am fairly sure it will not exceed 5 or 6 levels. The data changes far less often than it is read.

The question is: which hierarchical data model is most suitable for this situation? The project is based on the Django framework, so its peculiarities (admin interface, model handling...) should be taken into account.

Many thanks!

S2201

5 Answers

6

Nested sets are better for performance, if you don't need frequent updates or hierarchical ordering.

If you need either frequent tree updates or hierarchical ordering, it's better to use the parent-child data model.

It's easily constructed in Oracle and SQL Server 2005+, and not so easily (but still possible) in MySQL.
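As a rough illustration of the parent-child model, here is a minimal sketch using Python's built-in `sqlite3` module and a recursive common table expression. The `category` table, its columns, and the helper names are assumptions made up for this example; SQLite merely stands in for the databases mentioned above, whose hierarchical query syntax differs.

```python
import sqlite3

# Hypothetical schema: every row stores only a pointer to its parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES category(id),"
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO category (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "Electronics"),
        (2, 1, "Computers"),
        (3, 2, "Laptops"),
        (4, 2, "Desktops"),
        (5, 1, "Phones"),
        (6, None, "Books"),
    ],
)


def descendants(conn, root_id):
    """Return (id, name, depth) for every node below root_id, ordered by id."""
    sql = """
        WITH RECURSIVE subtree(id, name, depth) AS (
            SELECT id, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, s.depth + 1
            FROM category AS c
            JOIN subtree AS s ON c.parent_id = s.id
        )
        SELECT id, name, depth FROM subtree WHERE depth > 0 ORDER BY id
    """
    return conn.execute(sql, (root_id,)).fetchall()


def move_category(conn, node_id, new_parent_id):
    """Re-parent a node: a single-row update (no cycle check in this sketch)."""
    conn.execute(
        "UPDATE category SET parent_id = ? WHERE id = ?",
        (new_parent_id, node_id),
    )


print(descendants(conn, 1))
```

The point of the sketch is the trade-off: moving a subtree touches one row, while reading a subtree needs a recursive query.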

Quassnoi
4

According to these articles:

http://explainextended.com/2009/09/24/adjacency-list-vs-nested-sets-postgresql/
http://explainextended.com/2009/09/29/adjacency-list-vs-nested-sets-mysql/

"MySQL is the only system of the big four (MySQL, Oracle, SQL Server, PostgreSQL) for which the nested sets model shows decent performance and can be considered to store hierarchical data."

jwfearn
Enrique
  • Gosh... compared to what? I've found that Nested Sets pretty much blow the doors off the competition. The exception would be the functionality of CONNECT BY in Oracle. – Jeff Moden Oct 11 '12 at 04:46
4

I would use the Modified Preorder Tree Traversal algorithm, MPTT, for this sort of hierarchical data. This allows great performance on traversing the tree and finding children, if you don't mind a bit of a penalty on changes to the structure.
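To make the trade-off concrete, here is a small pure-Python sketch of how left/right numbering answers tree questions with simple range comparisons, and why an insert is more expensive. The sample rows, the `lft`/`rght` naming, and all function names are assumptions for illustration; this is not how any particular library implements it.

```python
# Each row is (name, lft, rght), numbered by a preorder walk of the tree:
# Electronics > (Computers > (Laptops, Desktops), Phones)
rows = [
    ("Electronics", 1, 10),
    ("Computers", 2, 7),
    ("Laptops", 3, 4),
    ("Desktops", 5, 6),
    ("Phones", 8, 9),
]


def _bounds(rows, name):
    """Look up the (lft, rght) pair of the named node."""
    return next((l, r) for n, l, r in rows if n == name)


def descendants(rows, name):
    """Nodes whose interval lies strictly inside the named node's interval."""
    lft, rght = _bounds(rows, name)
    ordered = sorted(rows, key=lambda row: row[1])
    return [n for n, l, r in ordered if lft < l and r < rght]


def ancestors(rows, name):
    """Nodes whose interval strictly contains the named node's interval."""
    lft, rght = _bounds(rows, name)
    ordered = sorted(rows, key=lambda row: row[1])
    return [n for n, l, r in ordered if l < lft and rght < r]


def insert_last_child(rows, parent, name):
    """Add a leaf as the last child of parent.

    Returns the new rows and how many existing rows had to be renumbered.
    """
    _, p_rght = _bounds(rows, parent)
    new_rows, changed = [], 0
    for n, l, r in rows:
        # Everything at or to the right of the insertion point shifts by 2.
        new_l = l + 2 if l > p_rght else l
        new_r = r + 2 if r >= p_rght else r
        changed += (new_l, new_r) != (l, r)
        new_rows.append((n, new_l, new_r))
    new_rows.append((name, p_rght, p_rght + 1))
    return new_rows, changed


print(descendants(rows, "Computers"))
print(ancestors(rows, "Laptops"))
new_rows, changed = insert_last_child(rows, "Computers", "Tablets")
print(changed, "existing rows renumbered")
```

Reads are single range filters with no recursion, while adding one leaf renumbers every node to the right of the insertion point, which is the "penalty on changes" mentioned above.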

Luckily Django has a great library available for this, django-mptt. I've used this in a number of projects with a lot of success. There's also django-treebeard which offers several alternative algorithms, but I haven't used it (and it doesn't seem as popular as mptt anyway).

Daniel Roseman
1

The Adjacency List is much easier to maintain and Nested Sets are a lot faster to query.

The problem has always been that converting an Adjacency List to Nested Sets has taken way too long, thanks to a really nasty "push stack" method that's loaded with RBAR (row-by-agonizing-row processing). So people end up either doing some really difficult maintenance in Nested Sets or not using them at all.

Now, you can have your cake and eat it, too! You can do the conversion on 100,000 nodes in less than 4 seconds and on a million rows in less than a minute! All in T-SQL, by the way! Please see the following articles.

Hierarchies on Steroids #1: Convert an Adjacency List to Nested Sets

Hierarchies on Steroids #2: A Replacement for Nested Sets Calculations
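For readers unfamiliar with the conversion itself, the following is a minimal in-memory Python sketch of turning an adjacency list into nested-set bounds with an explicit stack. It is not the T-SQL method from the articles above; the function name and the `(node_id, parent_id)` input format are assumptions chosen for illustration.

```python
from collections import defaultdict


def adjacency_to_nested_sets(edges):
    """Convert (node_id, parent_id) pairs into {node_id: (lft, rght)}.

    parent_id is None for root nodes. Siblings keep their input order.
    """
    children = defaultdict(list)
    roots = []
    for node_id, parent_id in edges:
        if parent_id is None:
            roots.append(node_id)
        else:
            children[parent_id].append(node_id)

    lft = {}
    bounds = {}
    counter = 1
    # Each stack entry is (node, leaving): a node is pushed once to record
    # its left bound and once more to record its right bound after its subtree.
    stack = [(root, False) for root in reversed(roots)]
    while stack:
        node, leaving = stack.pop()
        if leaving:
            bounds[node] = (lft[node], counter)
        else:
            lft[node] = counter
            stack.append((node, True))
            stack.extend((child, False) for child in reversed(children[node]))
        counter += 1
    return bounds


edges = [(1, None), (2, 1), (3, 2), (4, 2), (5, 1)]
print(adjacency_to_nested_sets(edges))
```

In memory this visits each node twice, so it is linear in the number of nodes; doing the same work row by row inside the database is what makes the naive approach slow.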

Jeff Moden
  • Many thanks for those well written articles, good job! I had to convert an adjacency list into a nested-set representation in PostgreSQL and managed it well with the help of those. Cheers – VH-NZZ Apr 27 '20 at 22:05
  • @VH-NZZ - Thank you for the great feedback. Good to see that the method isn't locked in to T-SQL only. I'm going to have to learn PostgreSQL because of a new aspect of my job and so it's wonderful to know that things like this do have a "migration path". – Jeff Moden May 04 '20 at 12:05
  • I think you'll find that PSQL is one absolutely awesome database system. You can definitely look forward to adding that skill to your toolset. – VH-NZZ May 04 '20 at 21:38
1

http://www.sqlsummit.com/AdjacencyList.htm

Jayapal Chandran