0

Someone migrated a DB2 database to SQL Server and during the migration all PK and FK were lost. there is no way to have them back.

But thanks to this query I'm now able to Reverse Engineer the database diagram based on the datatype and column name. (I know, is a Reverse Engineer based on assumptions).

I now would like to create a Power BI dashboard with the Chord visual and create a link between tables that have the same column name. (the image is just an example to give you an idea)

enter image description here

But I cannot use From and To because I'm not tracking a change.

Maybe I'm using the wrong visual?

How to track things that are in common?

EDIT FOR BOUNTY:

If I run the query against Adventure Works:

WITH ColumnCount AS(
    SELECT s.name AS SchemaName,
           t.name AS TableName,
           c.name AS ColumnName,
           ct.[name] AS DataType,
           c.max_length,
           c.precision,
           c.scale,
           COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id
         JOIN sys.types ct ON c.user_type_id = ct.user_type_id)
SELECT *
FROM ColumnCount CC
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
         CC.SchemaName,
         CC.TableName;

The query can group columns that have the same: ColumnName, DataType, max_length, precision, scale.

enter image description here

But how to render this in a Power BI Chord visual?

The goal is ti find link between tables.

Chord seems the best visual to archive it but if you have better visual to suggest I'm open to your tips.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Francesco Mantovani
  • 10,216
  • 13
  • 73
  • 113

1 Answers1

1

To relate the Power BI Chord visual, you need a minimum of two data points and a measure;

  • From (foreign key table)
  • To (primary key table)
  • Relationship Count

Depending on the number of tables, you may find the Chord chart a little overwhelmed by data, however if you bring those three values through form Adventure Works, you can produce

Power BI Chord Chart

The dataset you are starting with has a lot more info, and rows than you need, but doesn't naturally contain the relationship. By simplifying the dataset with the following, you can create your Chord data point

WITH ColumnCount AS(
    SELECT  t.name AS TableName,
            c.name AS ColumnName,
            COUNT(c.column_id) OVER (PARTITION BY c.[name],  c.max_length, c.precision, c.scale) AS Duplicates
    FROM  sys.tables t 
         JOIN  sys.columns c ON t.object_id = c.object_id
    WHERE   c.name  like '%Id'
    AND     c.name != 'rowguid'
    
)

SELECT      TableName ForeignTableName,
            LEFT(ColumnName, LEN(ColumnName)-2) PrimaryTableName,
            1 Relationship
FROM        ColumnCount CC
WHERE       CC.Duplicates > 1
AND         LEFT(ColumnName, LEN(ColumnName)-2) != TableName 
ORDER BY    PrimaryTableName, 
            CC.ColumnName,
            CC.TableName

I have made a few assumptions in the above for simplicity.

  • All foreign keys end in ID (to stop incorrect key matches like ActualCost) with

    WHERE c.name like '%Id'

  • Rowguid is not a relationship column and is therefore excluded

    AND c.name != 'rowguid'

  • We don't want a primary key table relationship to itself

    LEFT(ColumnName, LEN(ColumnName)-2) != TableName

  • Scheme is not important (and therefore removed)

The query will give you a result set like follows; Adventure Works table relationships

Then to create the chart, simply add the From, To and Values count with your PrimaryTableName, ForeignTableName and Relationship (count) values as follows

Chord chart configuration

Giving you the required Power BI Chord relationship chart

Power BI Chord by Field Id

As per your comments, if you want to do the same, but use the field names to match, (not assuming name[id]) to denote your fk / pk, then the following would work. I have included a section to force the primary table, based on the Primary Key Check Constraint, that you can comment out if you want to show all relationships, but be warned you will have thousands and they will be bi directional if you don't have a way to identify a primary key/table!

    WITH ColumnCount AS(
    SELECT s.name AS SchemaName,
           t.name AS TableName,
           c.name AS ColumnName,
           ct.[name] AS DataType,
           c.max_length,
           c.precision,
           c.scale,
           COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id
         JOIN sys.types ct ON c.user_type_id = ct.user_type_id
         
         )

SELECT cc.TableName + '.' + cc.ColumnName ForeignTableName, cd.TableName + '.' + cd.ColumnName PrimaryTableName, 1 Relationship --,cc.ColumnName, cc.DataType,  cc.Duplicates
FROM ColumnCount CC 
    /*pk only joins - take ths out if you want all joins between fields, not just pk to fk */
    INNER JOIN  INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE Col on col.COLUMN_NAME = cc.ColumnName
    AND         col.TABLE_NAME = cc.TableName
    INNER JOIN  INFORMATION_SCHEMA.TABLE_CONSTRAINTS Tab on     Col.Constraint_Name = Tab.Constraint_Name
    AND Col.Table_Name = Tab.Table_Name
    AND Constraint_Type = 'PRIMARY KEY'
    /*end of pk only joins */

LEFT JOIN ColumnCount Cd on cc.ColumnName = cd.ColumnName
and cc.DataType= cd.DataType
and cc.TableName != cd.TableName
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
         CC.SchemaName,
         CC.TableName;

Plugged into the same structure, this provides a Chord diagram of Chord relationship from Primary Key Id

Daryl Wenman-Bateson
  • 3,870
  • 1
  • 20
  • 37
  • Thank you for your time @Darly, but the problem is that I don't have PK and FK in the database I have in my hands. No relationship at all. The relation I want to create is based on the columns name and datatype. Your query is searching for PK and FK therefore is out of scope. – Francesco Mantovani Nov 08 '21 at 14:02
  • You dont have a PK / FK, but in the Adventure Works example referenced, you have columns that are named as reference fields (i.e. those with Id at the end). What I've done is taken SalesOrderId and assumed this is a relationship to the SalesOrder table. There is no expectation of an actual relationship in the database, merely named columns. Is the adventure works example representative of your actual data with fields named with [Id] at end? – Daryl Wenman-Bateson Nov 08 '21 at 15:01
  • Btw, the same logic could be used purely on field names to create your Chord chart, as you were suggesting in the op, however you will end up with joins on columns that happen to have the same name (e.g. ActualCost, AccountName, RowGuid etc). With your data, you need to work out a pattern to determine the pk table, from the fk like i've done (by removing Id), and then add that as a to column to give your relationship – Daryl Wenman-Bateson Nov 08 '21 at 15:22
  • ...mmm... I want to base it purely on the column name and datatype. But I see your point. How can I modify your query and use column name & datatype to create 1 relation? I used AdverntureWork as an example but of course the DB I'm on is not AdventureWork. Thank you for your help – Francesco Mantovani Nov 08 '21 at 15:55
  • In your data, is there any way to know from a field whether its a fk or pk field (maybe named Id, [field] + Id or Constraint_Type = 'PRIMARY KEY'? In Adventure Works, the second and third option are true - do you have a similar pattern in your data? – Daryl Wenman-Bateson Nov 08 '21 at 16:16
  • okay @FrancescoMantovani - I've added a substitute query example at the eof the answer performing the join on the name and data type fields only. I've forced the primary key based on the check constraint, but this can be commented out if that's not suitable for your data – Daryl Wenman-Bateson Nov 08 '21 at 17:58
  • your query is amazing and is still helping us so much. Is there a way to click on the Chord diagram and make the diagram change and select only that table.column? Because currently I click on the table.column on hte chord diagram but nothing changes. That's why I have added filters on the side. Thank you – Francesco Mantovani Dec 15 '21 at 10:47
  • Pleased that you are finding it useful. You can click on the coloured bars of the chord diagram to select an entity and to filter. This should then highlight the relationships related only to that entity. Unfortunately, the labels themselves are not selectable – Daryl Wenman-Bateson Dec 15 '21 at 15:45
  • Hello @Darly, here is a video that will explain better: https://imgur.com/SLrf0J9 if I click on the Chord the Chord visual overall is not changing and the 2 filters on the right are not changing neither. But when I click on one of the filters the other filter changes to and the Chord visual too. I would like everything to change as soon as I click on a link on the Chord visual – Francesco Mantovani Dec 15 '21 at 21:21
  • 1
    You can create a 2 way filter, but only by creating a table. Create a new visual (don't convert your filter visual, as it wont work) based on the relationship columns and set the visual to table. The chord chart and the table should now be filterable both ways. Unfortunately, this doesn't work so well with the filter visual, as its only one way. – Daryl Wenman-Bateson Dec 16 '21 at 15:17