0

I have 100-150 Azure databases with same table schema. There are 300-400 tables in each database. Separate reports are enabled on all these databases.

Now I want to merge these database into a centralized database and generate some different Power BI reports from this centralized database.

The approach I am thinking is -

  • There will be Master table on target database which will have DatabaseID and Name.

  • All the tables on target database will have the composite primary key created with the Source Primary key and Database ID.

  • There will be multiple (30-35) instances of Azure data factory pipeline and each instance will be responsible to merge data from 10-15 databases.

  • These ADF pipelines will be scheduled to run weekly.

Can anyone please guide me that the above approach will be feasible in this scenario? Or there could any other option we can go for.

Thanks in Advance.

Joshi
  • 51
  • 6
  • Why you have divided the Azure SQL Databases in the first place? Why are you trying to merge them? If you are merging them just for creating the Power BI report it's a bad idea. – Francesco Mantovani Nov 27 '22 at 20:14

1 Answers1

0

You trying to create a Data Warehouse.

I hope you will never archive to merge 150 Azure SQL Databases because is soon as you try to query that beefy archive what you will see is this:

enter image description here

This because Power BI, as any other tool, comes with limitations:

  • Limitation of Distinct values in a column: there is a 1,999,999,997 limit on the number of distinct values that can be stored in a column.
  • Row limitation: If the query sent to the data source returns more than one million rows, you see an error and the query fails.
  • Column limitation: The maximum number of columns allowed in a dataset, across all tables in the dataset, is 16,000 columns.

A data warehouse is not just the merge of ALL of your data. You need to clean them and import only the most useful ones.

So the approach you are proposing is overall OK, but just import what you need.

Francesco Mantovani
  • 10,216
  • 13
  • 73
  • 113
  • At the end of my actions I want to derive some Power BI reports from more than 200 databases which are having 350+ tables with same schema. What should be the optimal way to achieve this? – Joshi Nov 28 '22 at 08:06
  • just import what you need, not the whole database – Francesco Mantovani Nov 28 '22 at 09:29