We find the performance issue while using SQLCLR user defined aggregate function(UDA),
Here are our scenario: We have 2 columns needed to calculate: key and value, whose value would like:
key | value | |
---|---|---|
Row_1 | a/b/c/d/e | 1/2/3/2/1 |
Row_2 | a/b/c/d/e | 2/0/1/2/3 |
Row_3 | a/b/c/d/e | 2/3/4/1/2 |
We need an aggregate function to get the sum of each metric, in the upper example, we want to get the result like this:
key | value | |
---|---|---|
Result | a/b/c/d/e | 5/5/8/5/6 |
there is no native aggregate function we can use to get this kind of result, so we use an SQLCLR UDA to achieve this result. We found that the performance of this UDA is bad when SQLCLR UDA used with a GROUP BY clause.
After some investigation, we find that the reason below:
- When we use a SQLCLR UDA, a StreamAggregate operation must be used, and an expensive sort operator will be introduced, decreasing the performance of the UDA.
- When we use a SQLCLR UDA, only row mode can be used to calculate the results in the sort and aggregate operator.
So, my question:
- Is there any chance for user to force SQL Server to use hash aggregator operator instead of stream operator while using user defined aggregate function?
- Is there any chance for user to get ride of the sort operator while using user defined aggregate function?
- Is there any chance for user to use batch mode while using user defined aggregate function?