0

I'm wanting to create a subset of data for testing purposes, where I start with a Users table, select X of type a, Y of type b, Z of type c (and so on), and create a new table containing those users. I then need to to create all related tables (with the necessary records), and then all related tables to those...

I'm fairly certain the best way to do this is a recursive common table expression, however I don't have much experience with those and was hoping someone could help me out. So far I have the below, but have noticed two things:

  1. Level 2 seems to contain everything that was in level 1, in addition to the new records I'd expect
  2. This doesn't actually create/insert the records yet (I still need help with that)

    WITH cte AS
    (
        SELECT DISTINCT fk.object_id, fk.schema_id, fk.parent_object_id, fc.parent_column_id, t.schema_id AS referenced_schema_id, fk.referenced_object_id, ic.column_id AS referenced_column_id, 1 AS Level
        FROM sys.foreign_keys fk
            INNER JOIN sys.tables t ON fk.referenced_object_id = t.object_id
            INNER JOIN sys.foreign_key_columns fc ON fk.object_id = fc.constraint_object_id
            INNER JOIN sys.indexes i ON fk.referenced_object_id = i.object_id AND i.is_primary_key = 1
            INNER JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
        WHERE fk.type = 'F' 
            AND fk.referenced_object_id = OBJECT_ID(N'dbo.Users', N'U')
    
        UNION ALL
    
        SELECT fk.object_id, fk.schema_id, fk.parent_object_id, fc.parent_column_id, t.schema_id, fk.referenced_object_id, ic.column_id AS referenced_column_id, cte.Level + 1
        FROM sys.foreign_keys fk     
            INNER JOIN sys.tables t ON fk.referenced_object_id = t.object_id
            INNER JOIN sys.foreign_key_columns fc ON fk.object_id = fc.constraint_object_id
            INNER JOIN sys.indexes i ON fk.referenced_object_id = i.object_id AND i.is_primary_key = 1
            INNER JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
            INNER JOIN cte ON fk.referenced_object_id = cte.parent_object_id
        WHERE fk.type = 'F' 
            AND fk.parent_object_id <> cte.referenced_object_id
    ),
    cteHierarchy AS (
        SELECT DISTINCT
            OBJECT_NAME(cte.object_id) AS ReferringKey,
            SCHEMA_NAME(cte.schema_id) AS ReferringSchema,
            OBJECT_NAME(cte.parent_object_id) as ReferringTable,
            COL_NAME(cte.parent_object_id,cte.parent_column_id) ReferringColumn,
            SCHEMA_NAME(cte.referenced_schema_id) AS ReferencedSchema,
            OBJECT_NAME(cte.referenced_object_id) as ReferencedTable,
            COL_NAME(cte.referenced_object_id,cte.referenced_column_id) ReferencedColumn,
            Level
        FROM cte
    )
    SELECT *
    FROM cteHierarchy
    ORDER BY Level, ReferencedSchema, ReferencedTable, ReferringTable;
    

Thank you for your assistance!

Jake
  • 893
  • 2
  • 9
  • 17
  • 1
    Please share some sample data and your desired results. It will go a long way to helping us understand what you are after instead of having to reverse engineer your non-working SQL to derive your intentions. – JNevill Oct 19 '18 at 13:21
  • While `fk.parent_object_id <> cte.referenced_object_id` in the recursive term isn't a problem, it does seem a little odd. This is essentially saying "Don't bring records through where the grandparent is the same as the child". Do you have cycling issues in this data that this is necessary? – JNevill Oct 19 '18 at 13:25
  • @JNevill The users table has a self join, and I'd read that absent the line you quoted I'd experience infinite recursion. My understanding is that it's saying "don't bring records through where the *parent* is the same as the child". – Jake Oct 19 '18 at 13:28
  • I think the join back on your cte is on the wrong field. – websch01ar Oct 19 '18 at 13:29
  • 1
    The way your cte is written the "Child" here is the CTE record in your Recursive Term (the second SELECT): `INNER JOIN cte ON fk.referenced_object_id = cte.parent_object_id`. "Go get the parent record (fk) for this child (cte)" is what that says; you are climbing up a tree, not down. So the WHERE predicate `AND fk.parent_object_id <> cte.referenced_object_id` essentially says "For this child (cte.referenced_object_id) insure it isn't the same as the grandparent `fk.parent_object_id`. I think you may need to rethink how you are joining here if that logic doesn't sound like what you want. – JNevill Oct 19 '18 at 13:37
  • @jnevill Thank you, spot on about the join. Changed it to `fk.referenced_object_id <> cte.referenced_object_id` and it's now behaving as expected. – Jake Oct 19 '18 at 13:45
  • 1
    That makes more sense "Don't grab the parent, if the parent is the same as the child". I'm glad that did the trick! – JNevill Oct 19 '18 at 13:47

0 Answers0