0

Example table:

CREATE TABLE adj_list_model (
    id INT NOT NULL AUTO INCREMENT,
    parent_id INT,
    my_string varchar(255),//some random information
    PRIMARY KEY (id)
)

The tree I am creating will have multiple "root" users in the same table (when parent_id = NULL). These "root" users may at some point acquire a parent_id from some "leaf" user (user that has no one under them). The doubt I have is how to make sure that I don't create a "loop" similar to the following one:

Example Tree Design:

  • a
    • b
    • c
      • d
        • e
        • f
          • g

if user "a" acquires user "g" as his Parent, then the loop created would be:

a -> c -> d -> f -> g -> a -> c... and so on forever

QUESTION: what's a good way to check if user "g" is under user "a" when user "a" wants go to under user "g" in the tree? (so that in those specific cases the action can be prevented)

Key points to consider: Having two trees merging into one would happen very often. If the number of levels in the tree where hypothetically 80, the amount of time it might take to do the checking to prevent loops might be considerable, which is why I am looking for an efficient method.


EDITED: The current options I have had (though I am skeptical) are:

  1. Creating an extra column that shows the current "root" user for each user in the table. In those cases, every time a "root" user obtained a parent, everyone under him would have to be updated with the new "root" user, and what worries me is how much strain this might put on the server, especially if there are a lot of users and if there is a high frequency of "root" users obtaining a parent.

  2. Checking a "root" users path before giving him a parent. If in the case above user "g" had his path checked by looping through each user above g 1 by 1 (seeing what their parent was, over and over until getting to the root), and found that the root was user "a", then yes, the action could be prevented, though I am not sure how straining this would be on the server. If anyone has an idea, let me know please!

Webeng
  • 7,050
  • 4
  • 31
  • 59
  • You probably need a recursive query. What I would do is just to add one colum where I store this element's root. You also have to update these values when you change something in your structure but this can be done without recursion. – flowit Apr 19 '16 at 20:13
  • @flowit So I thought about doing that previously, though as you say, every time the "root" obtains a parent, everyone under that root would have to be updated, which might be extremely laborious on the server if the table had a lot of users, and if among those users the action of "root" users obtaining parents was frequent. Any other possibilities out there? – Webeng Apr 19 '16 at 20:18

1 Answers1

1

For the option with an additional root_id column, in MySql syntax:

CREATE PROCEDURE change_root() 
BEGIN

  # this is the leaf node id, which will be the new parent of a root user
  SET @new_parent = x;
  # this is the old root user id, which will be the new child of the former leaf node
  SET @old_root = y;
  # get the leaf's root
  SET @new_root = SELECT root_id FROM adj_list_model WHERE id=x;

  # updating the dataset is possible as long as the leaf is not a child of the root user
  IF @new_root <> y THEN

    # link the former leaf to its new child
    UPDATE adj_list_model SET parent_id=x WHERE id=y;

    # @old_root is no longer a root, so update all occurences to the new root
    UPDATE adj_list_model SET root_id=@new_root WHERE root_id=@old_root;

  END IF;

END;

This is really not that complicated and much faster than a recursive solution. But in the end it depends on your workload and needs.

flowit
  • 1,382
  • 1
  • 10
  • 36
  • Thank you for your answer! Would root_id be indexed in your example? If it were not indexed, wouldn't `UPDATE adj_list_model SET root_id=@new_root WHERE root_id=@old_root;` take very long for a big table considering that it will go through every single entry? – Webeng Apr 19 '16 at 22:06
  • As I said, depends on your workload. How often do these updates occur? Maybe this happens so rarely that a few ms more aren't worth the extra effort to keep an index of root_id. Try it out! – flowit Apr 19 '16 at 22:14
  • So if for instance the database had to update 80 different users, would you say all that updating would be done below 1 second? And more importantly, how many users would you guess would be needed to be updated for it to take 1 second for the code to finalize? – Webeng Apr 19 '16 at 22:21
  • How would I know that? I don't know your hardware, I don't know your DBMS, I don't know anything about your users. Try it out. You can also read into profiling and tuning, but only AFTER you found out that your system is too slow. – flowit Apr 20 '16 at 10:35