0

I have a large self-referencing table Items. Items have parent Items, something like so:

ITEMS
Long id
String name
String type
Long parentId (fk. ref Items)

The use is non-circular (so it's a representation of a tree, not a graph).

Now, I need to try to find a certain Item. Let's say I need to find Item with type "TV" and name "Spongebob", with the parent with type "Movie" and name "42".

My question is: which way is better (speed is most important)?

  1. First select all Items with type "Movie" and name "42", and then select for the item with type "TV", name "Spongebob", and parent ID that matches the list from the first query

  2. A single HQL database interaction, which will look something like:

    SELECT i from ItemModel i WHERE i.type="TV" AND i.name="Spongebob" AND i.parent.type="TV" AND i.parent.name="42"
    

Now, #1 will require more database interactions (in this case 2 vs. 1, but there's a potential for 3 or 4 levels of depth), so I'm inclined to think that #2 would be more efficient. But, I'm unsure how mysql optimizes these queries, and I'm worried that the #2 query might actually be slower, do to the self-referencing joins.

Would appreciate any input.

  • MySQL doesn't support recursive functions, so it is generally poorly suited to this "adjacency list" model for hierarchical data. Have you considered the "nested sets" or "transitive closure" models instead? – eggyal Apr 29 '13 at 22:26

1 Answers1

0

First one is better like you thought.

You might try hash indexes on text fields if the query is too slow.

viljun
  • 370
  • 3
  • 12
  • I'm sorry, I guessed that the second would be better. Which did you think was better? And if it's the first, why? Thanks – user2133266 Apr 30 '13 at 05:24