1

I'm facing a problem with the nested set model, with MySQL. I can insert, delete, move subtrees to another parent, everything works fine.

But I can't figure out how to order siblings. For example, I have these siblings :

A, B, C, D, E

And I want to move B after D, obtaining this :

A, C, D, B, E

I found tons of stored procedures for inserting, deleting, etc. but not a single one to order siblings. The only one I found is a procedure for swapping siblings, but that's not what I want to achieve.

I tried to write my own one, but it seems complicated and doesn't work in all cases.

If you know how to move a node before or after one of his siblings, this would be greatly appreciated.

PJP
  • 756
  • 3
  • 15

2 Answers2

5

So... I re-wrote everything and here is a stored procedure that works well to move one node after one of his siblings. If we want to move the node to the first position, just pass the parent id in place of the sibling id.

DELIMITER |
-- sibling parameter is either :
-- - the sibling id after which we want to put the page
-- - the parent id if we want to put the page on the first position
CREATE PROCEDURE move_after_sibling (IN to_move_id INT(10), IN parent_id INT(10), IN sibling_id INT(10))
LANGUAGE SQL
DETERMINISTIC
BEGIN
    DECLARE to_move_lft INT(10);
    DECLARE to_move_rgt INT(10);
    DECLARE parent_lft INT(10);
    DECLARE parent_rgt INT(10);
    DECLARE sibling_lft INT(10);
    DECLARE sibling_rgt INT(10);

    SET to_move_lft = (SELECT lft FROM pages WHERE id = to_move_id);
    SET to_move_rgt = (SELECT rgt FROM pages WHERE id = to_move_id);
    SET parent_lft = (SELECT lft FROM pages WHERE id = parent_id);
    SET parent_rgt = (SELECT rgt FROM pages WHERE id = parent_id);
    SET sibling_lft = (SELECT lft FROM pages WHERE id = sibling_id);
    SET sibling_rgt = (SELECT rgt FROM pages WHERE id = sibling_id);

    UPDATE pages 
        SET
            lft = 
                CASE 
                    WHEN sibling_id = parent_id THEN
                        CASE
                            WHEN lft BETWEEN parent_lft+1 AND to_move_lft-1 THEN
                                lft + (to_move_rgt - to_move_lft) + 1
                            WHEN lft BETWEEN to_move_lft AND to_move_rgt THEN 
                                lft - (to_move_lft - (parent_lft + 1))
                            ELSE
                                lft
                        END
                    ELSE
                        CASE 
                            WHEN to_move_lft > sibling_lft THEN
                                CASE
                                    WHEN lft BETWEEN sibling_rgt AND to_move_lft-1 THEN
                                        lft + (to_move_rgt - to_move_lft) + 1
                                    WHEN lft BETWEEN to_move_lft AND to_move_rgt THEN 
                                        lft - (to_move_lft - (sibling_rgt + 1))
                                    ELSE
                                        lft
                                END
                            ELSE
                                CASE
                                    WHEN lft BETWEEN to_move_rgt+1 AND sibling_rgt THEN
                                        lft - ((to_move_rgt - to_move_lft) + 1)
                                    WHEN lft BETWEEN to_move_lft AND to_move_rgt THEN 
                                        lft + (sibling_rgt - to_move_rgt)
                                    ELSE
                                        lft
                                END
                        END
                END,
            rgt = 
                CASE 
                    WHEN sibling_id = parent_id THEN
                        CASE
                            WHEN rgt BETWEEN parent_lft+1 AND to_move_lft-1 THEN
                                rgt + (to_move_rgt - to_move_lft) + 1
                            WHEN rgt BETWEEN to_move_lft AND to_move_rgt THEN 
                                rgt - (to_move_lft - (parent_lft + 1))
                            ELSE
                                rgt
                        END
                    ELSE
                        CASE 
                            WHEN to_move_rgt > sibling_lft THEN
                                CASE
                                    WHEN rgt BETWEEN sibling_rgt+1 AND to_move_lft-1 THEN
                                        rgt + (to_move_rgt - to_move_lft) + 1
                                    WHEN rgt BETWEEN to_move_lft AND to_move_rgt THEN 
                                        rgt - (to_move_lft - (sibling_rgt + 1))
                                    ELSE
                                        rgt
                                END
                            ELSE
                                CASE
                                    WHEN rgt BETWEEN to_move_rgt+1 AND sibling_rgt+1 THEN
                                        rgt - ((to_move_rgt - to_move_lft) + 1)
                                    WHEN rgt BETWEEN to_move_lft AND to_move_rgt THEN 
                                        rgt + (sibling_rgt - to_move_rgt)
                                    ELSE
                                        rgt
                                END
                        END
                END
        WHERE lft BETWEEN parent_lft+1 AND parent_rgt;
END
|
DELIMITER ;

Maybe that's not the most beautiful piece of code we've ever seen, but it works fine and is probably much more efficient than any kind of sorting algorithm, for example.

PJP
  • 756
  • 3
  • 15
0

I've found that, when dealing with nested sets using triggers it is better to:

  1. Use before triggers to lock rows as needed to avoid concurrency problems.
  2. Use after triggers to re-index, to work around mass-insert/mass-update related problems.
  3. Wait for the very last after trigger to fire before re-indexing anything.
  4. If at that point, you detect that more than one node was inserted/moved by a statement, re-index the applicable range altogether -- doing so is much faster than re-indexing entire chunks of the tree multiple times.

For your question in particular, you're moving nodes one by one using a stored procedure if I get things right. It's not very different from changing a node's parent: find its new lft/rgt index and shift it (and its child nodes) accordingly, to the left or to the right. Don't forget to offset the lft/rgt values if it's shifted to the right.

As an aside, I suspect you're only beginning to identify the potential issues you'll need to solve. Node permutations using a single update are the trickiest to deal with in my experience, e.g.:

A - B - C

D - E - F

To:

B - E - C

A - D - F
Denis de Bernardy
  • 75,850
  • 13
  • 131
  • 154
  • Thanks for your answer, I was thinking about the same kind of thing : shifting the lft/rgt values of the node we want to move, and of the other impacted nodes, of course. I re-writed my stored procedure from scratch, and I came up with something very similar to my previous attempt. But this time it works! I guess I had messed up with some +1 or -1 thing somewhere. I'm going to edit my message with this procedure (which is a bit large and maybe could be optimized). – PJP May 14 '11 at 10:20
  • About the potential issues: yes, using a single update to make multiple permutations seems to be really tricky... But in my case, I only need to move one node after one of his siblings, so in the end, it's pretty simple. I don't think there are any potential issues with this, or at least, I don't see them right now. Do you think I'm missing something here? – PJP May 14 '11 at 15:36
  • Not much: single node insert/update/delete are pretty easy cases. As a side note, consider dropping the (rgt - lft - 1) / 2 = num_children properties, and use rounded floats instead of integers. Doing so makes inserts and updates considerably faster (deletes are trivial: nothing to do); the rounding solves potential precision issues inherent to floats; doing so, you only need to locally reindex here and there when you run out of precision space. – Denis de Bernardy May 14 '11 at 15:49
  • I'm a bit confused... If we let the lft and rgt as-is after a delete, how do we know, for example, how many children a node has? – PJP May 16 '11 at 10:21
  • You can't, per my previous comment: "consider dropping the (rgt - lft - 1) / 2 = num_children properties, and use rounded floats instead of integers". – Denis de Bernardy May 16 '11 at 11:24
  • OK, I guess it's something we have to let go to achieve better performance. I was quite surprised you suggested this because I thought this was an essential feature, but is it? I'll have to think about this. – PJP May 16 '11 at 12:11
  • Well, with a tiny tree (1k nodes) it's irrelevant. But if you're managing millions of nodes, you'll be very happy to be able to insert/update/delete one in milliseconds instead of seconds or minutes. – Denis de Bernardy May 16 '11 at 12:14
  • OK, fine. In the present case, trees will probably be less than 500 nodes, 1k tops (and I would already be very surprised). Thanks for your advice, I'll keep that in mind in the future, if I have to create huge trees. – PJP May 16 '11 at 12:25