Sqlserver how to find leaf on text column

Question

I'm tring to get the leaf node from a SqlServer table like this:

Id       |Text
-----------------
01      |dasdasd
01.01   |asaasa
01.02   |dasdd
01.02.01|ddada
02      |sdad

I'd like as result this table with only leaf node...

Id      |Text
----------------
01.01   |asaasa
01.02.01|ddada
02      |sdad

Thanks for help

What is `Id` ? If it's a `hierarchyid` [this duplicate query](https://stackoverflow.com/questions/10694894/find-all-leaf-node-records-using-hierarchyid) shows how easy it is to get leaf nodes. If not, why not? The data is halfway there already. By using a hierarchical string though, it's not possible to use SQL Server's hiearchyid functions and features. You could use [hierarchyid::Parse()](https://learn.microsoft.com/en-us/sql/t-sql/data-types/parse-database-engine?view=sql-server-2017) to convert the existing data to a `hierarchyid` column — Panagiotis Kanavos, Mar 05 '19 at 10:31

score 0 · Answer 1 · answered Mar 05 '19 at 10:36

Data:

drop table if exists #test
create table #test (id varchar(20), text varchar(30))
insert #test values ('01', 'dasdasd')
                  , ('01.01', 'asaasa')
                  , ('01.02', 'dasdd')
                  , ('01.02.01', 'ddada')
                  , ('02', 'sdad')

Query:

select id, text from #test t1
cross apply (
    select count(*) cnt from #test t2 where t2.id like  t1.id + '%'
) calc
where calc.cnt = 1

Output:

id        text
01.01     asaasa
01.02.01  ddada
02        sdad

score 0 · Accepted Answer · answered Mar 05 '19 at 10:59

You can take advantage of Id's hierarchical representation to find rows that have no other row that starts with their ID :

create table #table1 (id varchar(40) primary key,Text varchar(200))

insert into #table1 (id,text)
values 
('01','dasdasd'),
('01.01','asaasa'),
('01.02','dasdd'),
('01.02.01','ddada'),
('02','sdad')


SELECT t1.*,t2.* 
FROM #table1 t1 LEFT OUTER JOIN #table1 t2 
    ON t2.id LIKE t1.id + '%' 
        AND t1.id!=t2.id
WHERE t2.id IS NULL

t2.id like t1.id + '%' will find descendants of t1 and t1.id!=t2.id ensures the same row won't match itself. Finally t2.id is null discards any matches, leaving behind only leaf rows. The result is :

id       Text
01.01    asaasa
01.02.01 ddada
02       sdad

It's probably better to replace the hierarchical string with an actual hierarchyid column though. First, that string can only work with up to 99 nodes per level. Adding more nodes would require rewriting all keys. Second, string operations are needed to find leaves, ancestors, levels etc.

The equivalent table would look like this :

create table #table2 (id hierarchyid primary key,Text varchar(200))

insert into #table2 (id,text)
values 
('/1/','dasdasd'),
('/1/1/','asaasa'),
('/1/2/','dasdd'),
('/1/2/1/','ddada'),
('/2/','sdad')

Getting leaf nodes would work in a similar way as before, using the GetAncestor() function instead of LIKE .. :

SELECT t1.Id.ToString(), t1.Text
FROM #table2 t1 LEFT OUTER JOIN #table2 t2
  ON t1.Id = t2.Id.GetAncestor(1)
WHERE t2.Id IS NULL;

Adding the node level is trivial and won't even affect the execution plan. Doing the same thing in the previous query though would require counting dots in the ID field :

SELECT t1.Id.ToString() As ID, t1.Id.GetLevel() As Level,t1.Text
FROM #table2 t1 LEFT OUTER JOIN #table2 t2
  ON t1.Id = t2.Id.GetAncestor(1)
WHERE t2.Id IS NULL;

The result this time is :

ID      Level   Text
/1/1/   2       asaasa
/1/2/1/ 3       ddada
/2/     1       sdad

thank you. this solve my problem. Unfortunately I can't change my Id column, because of it's use by application, but your solution is pretty good. — Matteo Fioroni, Mar 07 '19 at 06:27

Sqlserver how to find leaf on text column

2 Answers2