0

I have a sample table [DL_Members] with Distribution Lists and their members

DL_Name            Member_Name
*Test - DL - 1     Joe
*Test - DL - 1     Mike
*Test - DL - 1     Sarah
*Test - DL - 1     *Test - DL - 2
*Test - DL - 2     Brian
*Test - DL - 2     Mary
*Test - DL - 2     *Test - DL - 3
*Test - DL - 3     Steve
*Test - DL - 3     *Test - DL - 4
*Test - DL - 4     Jane
*Test - DL - 4     *Test - DL - 2

It is possible for a Distribution List to have people as members directly or other DL's. Those other DL's can have their own people members or more DL's nested within them.

My challenge is how to write a SQL query to get all members of a DL including any nested DLs (all levels)?

For example if I simply do

SELECT
Member_Name
FROM DL_Members
WHERE DL_Name = '*Test - DL - 1'

The result will be:

Joe
Mike
Sarah
*Test - DL - 2
*Test - DL - 3

So my problem is that a simple query won't parse any nested DL's recursively The result I want to get when running the query to get all members of *Test - DL - 1 is:

Joe
Mike
Sarah
Brian
Mary
Steve
Jane
*Test - DL - 2
*Test - DL - 3
*Test - DL - 4

I haven't been able to create a query that will get all members of a DL recursively.

JetRocket11
  • 302
  • 4
  • 17
  • But your "The result I want to get..." looks like all you need to do is this: select Member_Name from DL_Members ?? – VBoka Jan 03 '20 at 14:05
  • Yes I want to select Member_Name from specific DL used in a filter. So if I filter for DL_Name = '*Test - DL - 4' I should get just 1 member (Jane) whereas if I filter for *Test - DL - 1 then I should get 10 results mentioned in the original question. – JetRocket11 Jan 03 '20 at 14:09
  • You will need to write a [Recursive CTE](https://learn.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-ver15#d-using-a-recursive-common-table-expression-to-display-multiple-levels-of-recursion) to do this properly. They aren't hard to write, but they are kind of mind-melting the first couple of times. – JNevill Jan 03 '20 at 14:20
  • You should consider having a separate table for distribution lists in distribution lists. That way you can avoid having collisions between member names and distribution list names. Is it possible that you'll create a cycle, e.g. DL1 contains DL2 which contains DL3 which contains DL1? Terminating recursion in that case is a little trickier. – HABO Jan 03 '20 at 14:26
  • How do you plan on stopping the recursive part? There is no definition of head or tail. You would have to derive a level and stop at that for this to work. Your DL 2 and DL 4 reference each other via DL 3 this would result in endless recursion 0_O. – Ross Bush Jan 03 '20 at 14:46
  • @RossBush This is exactly the problem I am looking to resolve. I had some recursive queries already and it works OK for simple examples but with the one I posted as example we get this error - "The statement terminated. The maximum recursion 100 has been exhausted before statement completion." – JetRocket11 Jan 03 '20 at 15:08
  • This is a setup issue. Image that Paul-->Mary-->Peter-->Paul. You have a circular reference unless you create a rule that says Peter has no dependencies (exit condition), in which case display the DL_Name and stop. This is generally handled with a NULL child or parent ID. As stated earlier you may need to add another layer of abstraction to the bits above. – Ross Bush Jan 03 '20 at 15:26
  • There are ways of tracking where a recursive query has been and terminating recursion on cycling branches. [This](https://stackoverflow.com/questions/42119057/sql-recursive-cte-finding-objects-linked-by-property/42139978#42139978) answer demonstrates one approach. – HABO Jan 03 '20 at 16:57

2 Answers2

1

Expanding on my comment about using a Recursive CTE here. Yours would look something like:

WITH reccte AS
(
    /*Recursive Seed - This is the first selection that begins the iterations*/
    SELECT DL_Name, Member_Name
    FROM DL_Members
    WHERE DL_Name = '*Test - DL - 1'

    UNION ALL

    /*Recursive Term - This SQL is run over and over again until the Join fails*/
    SELECT DL_Members.DL_Name, DL_Members.Member_Name
    FROM reccte
        INNER JOIN DL_Members
            ON reccte.Member_Name = DL_Members.DL_Name /*join member to DL*/
)
/*Select from the CTE*/
SELECT Member_Name FROM reccte
/*You can use the following to insure that infinite cycling doesn't occur*/
OPTION (MAXRECURSION 20); 

I haven't tested this, but it should be in the ballpark.

The following version keeps track of depth of recursion and kills the recursive term when it hits 20. If you want to go higher than 100 then you will also need that OPTION (MAXRECURSION N) where N is a higher number then your depth.

WITH reccte AS
(
    /*Recursive Seed - This is the first selection that begins the iterations*/
    SELECT DL_Name, Member_Name, 0 as depth
    FROM DL_Members
    WHERE DL_Name = '*Test - DL - 1'

    UNION ALL

    /*Recursive Term - This SQL is run over and over again until the Join fails*/
    SELECT DL_Members.DL_Name, 
        DL_Members.Member_Name,
        reccte.depth + 1
    FROM reccte
        INNER JOIN DL_Members
            ON reccte.Member_Name = DL_Members.DL_Name /*join member to DL*/
    WHERE depth <= 20 --prevent cycling more than 20 iterations.
)
/*Select from the CTE*/
SELECT DISTINCT Member_Name FROM reccte;

Lastly, here is an example where we construct the member path for the searched DL. Every time we add a member, we also add that member to the path. We can then search the path for any member the current iteration is trying to add to see if it's already present. If so, it kills the iteration.

WITH reccte AS
(
    /*Recursive Seed - This is the first selection that begins the iterations*/
    SELECT DL_Name, Member_Name, 0 as depth, CAST(DL_NAME + '>' + Member_Name AS NVARCHAR(50000)) as path
    FROM DL_Members
    WHERE DL_Name = '*Test - DL - 1'

    UNION ALL

    /*Recursive Term - This SQL is run over and over again until the Join fails*/
    SELECT DL_Members.DL_Name, 
        DL_Members.Member_Name,
        reccte.depth + 1,
        reccte.path + '|' + DL_Members.Member_Name
    FROM reccte
        INNER JOIN DL_Members
            ON reccte.Member_Name = DL_Members.DL_Name /*join member to DL*/
    WHERE 
        /*Prevent this from iterating more than 20 times*/
        depth <= 20 
        /*Also we can check to see if this member already exists in the path we collected*/
        AND reccte.path NOT LIKE '%' + DL_Members + '%'

)
/*Select from the CTE*/
SELECT DISTINCT Member_Name FROM reccte;
JNevill
  • 46,980
  • 4
  • 38
  • 63
  • I had that query already and it works for some but with the example table I provided which has multiple DL's as members of each other it results in an infinite loop I believe and end results is SQL error "The statement terminated. The maximum recursion 100 has been exhausted before statement completion." – JetRocket11 Jan 03 '20 at 15:10
  • Yuck. You have DL's that refer to DL's that refer back to the original DLs? You can set how deep you want to iterate. How many iterations would be reasonable for your data? Like how deep would you need to go to insure you picked every unique member for a DL? – JNevill Jan 03 '20 at 15:17
  • I've updated my answer with the `OPTION (MAXRECURSION)` expression to show how to limit. You may want to toss a `SELECT DISTINCT Member_Name FROM reccte` in there as well since you will get duplicates if you have infinite cycles present in your hierarchy. – JNevill Jan 03 '20 at 15:20
  • It's still giving me the same recursion exhausted error and I have tried all the way up to 32767 which appears to be as high as you can go. Unfortunately it seems that Distribution Lists have no limits. They can have many DL's nested at first level and more DL's within those DL's and the same DL's can be members of other nested DL's. Its crazy. – JetRocket11 Jan 03 '20 at 15:30
  • I have tried 1 through 35 in MAXRECURSION statement and same error for all – JetRocket11 Jan 03 '20 at 15:37
  • I've just added a version that adds a `depth` field that is incremented on each cycle. The `WHERE` clause then cuts off the recursive term if the depth of `20` is reached. Feel free to set that where ever is appropriate to insure all members are captured. You can't go higher than 100 though without also monkeying with the `OPTION (MAXRECURSION)` – JNevill Jan 03 '20 at 15:38
  • There are other options here too to get more exact. For instance we can track every `member` we've come across in a varchar field (as a list) and then check that list for our current member and kill the recursive term if the member is already present (for instance). – JNevill Jan 03 '20 at 15:40
  • 1
    I've added that example as well. You'll see the `path` column added to the cte and then we check in the `WHERE` clause of the recursive term to see if that member has already been added. If it has, then we kill the iteration. This may still need some tweaking, but between `depth`, `path`, and `OPTION (MAXRECURSION N)` you should have plenty of control to avoid infinite cycles and get the right data out. – JNevill Jan 03 '20 at 15:48
  • Both examples worked great with my sample data. Then I tried the same against our real DL data and it has been running for nearly 30 minutes now. No errors but just executing with no end. The real DL table I used has 485529 total records but I filtered it for 1 specific DL which has roughly 30 unique members but many nested DL's within parent and child DL's and many have same members in each. – JetRocket11 Jan 03 '20 at 16:22
  • Update: It took 43 minutes but the query against the real production DL data has completed and provided what appears to be the correct result. Thanks! Now the issue is execution time. Ideally we need the results in seconds. Is there anything that can be adjusted for that? – JetRocket11 Jan 03 '20 at 16:48
  • The 20 was just a complete guess since I have no idea how deep your cycles go. My guess is that the path construction and testing in the WHERE clause is VERY expensive and is ballooning the run time. What I would do is select from the CTE like `SELECT max(depth) FROM reccte;` and see what pops up. If the number is lower than 20, then choose a number a bit higher (to allow for growth) and then remove the `path` stuff from the reccte and rerun to see if it helps with execution time. Recursive CTE's are, by their nature, pretty CPU intensive since every branch causes more lookups. – JNevill Jan 03 '20 at 16:59
  • I was able to get it to run in 2 seconds when filtered for one specific DL. Same one that took nearly an hour before. Main issue with slowness is that I was using a view as the source which was processing other things. Re-pointed to table and it was much faster. Set MAX recursion to 6 and depth to <=6. Also left the path filter in the query. Now I just need to figure out how to get all results so its like a view. Right now it can only be used as a procedure where you pass a DL you want to see results for as a parameter. Thanks. – JetRocket11 Jan 08 '20 at 17:38
0

I believe this is what you are looking for:

select  Member_Name 
from DL_Members
where RIGHT ( DL_Name , 1 ) >= RIGHT ( '*Test - DL - 1' , 1 ) ;

Here is a DEMO for you.

VBoka
  • 8,995
  • 3
  • 16
  • 24
  • 1
    Thanks but although this works for the example I listed, this is limited to the example only because among few other things it assumes that all DL's end with a number which in production is not the case. I only used that example for simplicity. We have thousands of DL's with varying name lengths, formats, etc. – JetRocket11 Jan 03 '20 at 15:34