None of the options that Microsoft recommends (Rename, Table Switch and Partition Switch) met my requirements.
This is mainly because I want the data warehouse to contain the important information as soon as possible, and I can't wait for every query running against the table to finish before finally doing the ingestion. In production, lots of long-running queries are expected.
Because DW supports only a very limited form of the DELETE operation, I decided to work around these limitations by reshaping the tables so that the DELETE operation's requirements could be satisfied.
So, to accomplish that, I did the following (a DDL sketch follows the list):
- Add a new column to every table holding a hash of the table's "PK" columns (the ones that together make each row unique)
- Create all tables with their distribution on that hash column
- On clustered columnstore tables, order the clustered columnstore index by the "PK" columns
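For illustration, a table defined this way might look like the sketch below. The table and column names ([FactSales], [RowHash], and so on) are hypothetical, and the ORDER clause on the clustered columnstore index assumes a service level that supports ordered columnstore indexes:

    -- Hypothetical table; [RowHash] is a hash of the "PK" columns and is
    -- used both for distribution and for deletes during ingestion.
    CREATE TABLE [dbo].[FactSales]
    (
        [OrderId]    BIGINT        NOT NULL,
        [LineNumber] INT           NOT NULL,
        [Amount]     DECIMAL(18,2) NULL,
        [RowHash]    BINARY(32)    NOT NULL  -- e.g. SHA2_256 over OrderId, LineNumber
    )
    WITH
    (
        DISTRIBUTION = HASH([RowHash]),
        CLUSTERED COLUMNSTORE INDEX ORDER ([OrderId], [LineNumber])
    );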
With that in place, every delete in the ingestion process looks like the one below:
DELETE FROM [{schemaName}].[{tableName}]
WHERE [{hashColumnName}] IN (
    SELECT [{hashColumnName}] FROM {stagingTableName}
)
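For this DELETE to match anything, the staging table must carry the same hash, computed the same way as in the target table. Below is a minimal sketch of computing it while loading staging; the names and the SHA2_256/CONCAT choice are assumptions on my part, not necessarily how you'd do it (the separator guards against collisions between concatenated values, e.g. 'ab'+'c' vs 'a'+'bc'):

    -- Hypothetical staging load: compute the same hash the target table uses.
    INSERT INTO [stg].[FactSales] ([OrderId], [LineNumber], [Amount], [RowHash])
    SELECT
        [OrderId],
        [LineNumber],
        [Amount],
        HASHBYTES('SHA2_256',
                  CONCAT(CAST([OrderId] AS VARCHAR(20)), '|',
                         CAST([LineNumber] AS VARCHAR(10))))
    FROM [stg].[FactSales_Raw];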
With these changes I managed to achieve a satisfactory ingestion speed for the daily incoming data. Of course, it is far slower than any of the Microsoft recommendations when ingesting large loads of data.
In my tests, I achieved about 450k rows per minute on a DW300, and there is still plenty of room for improvement in the ingestion code.
Hope this helps!