2

I need someone who can tell me what I'm missing.

I have this scalar function in SQL Server 2008:

ALTER function [dbo].[SKU](@id1 int, @id2 int)
returns int
begin
return (
 SELECT  SUM(Value)
 FROM Table
 where id_1 = @id1
 and id_2 = @id2)
end

And the table is like this:

 id_1      id_2    Value
 1004       1       10
 1004       1       30
 1004       2       100
 1005       1       90
 1005       1       5
 1005       1       5

If I execute:

select [dbo].[SKU](1004,1)

it returns 40 - That's ok

select [dbo].[SKU](1004,2)

returns 100 - OK

select [dbo].[SKU](1005,1)

returns 100 - OK

At this point all seems ok, but my table has almost a millon rows... the result of SKU goes to the same table (update part).

But I ran it for two hours now, and is still running...

My question: I've never seen such as long time consuming query. It's ok? I'm missing something?

Thanks!, and happy new year ! D:

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Esselans
  • 1,540
  • 2
  • 24
  • 44
  • How is it used in an update? Do you have a covering index on `id_1, id_2` including `Value`? – Lieven Keersmaekers Dec 29 '11 at 02:42
  • Can you post your update statement, I wonder if you've created a deadlock condition... – Sparky Dec 29 '11 at 02:44
  • update table set result = dbo.SKU(id_1, id_2) – Esselans Dec 29 '11 at 02:44
  • @Harzlce - You are updating all columns from a table with a single value? Is the table that is being updated the same as the one used in the function? Did you try using a variable to hold the result of the function and use that variable in the update? – Lieven Keersmaekers Dec 29 '11 at 02:46
  • 1
    As to *No indexes that I know*, you should definitly create a covering index. Something like `CREATE INDEX IX_TALE_ID_1_ID_2_VALUE ON dbo.Table (id_1, id_2) INCLUDE (Value)` – Lieven Keersmaekers Dec 29 '11 at 02:47
  • Yes, it's the same table. I use id_1, id_2 to sum(value) and write it on "result" column. With singles queries, it's fine. But altogheter it dies. – Esselans Dec 29 '11 at 02:48
  • And with indexes, it should be faster? Thanks for your quicks replies. – Esselans Dec 29 '11 at 02:49
  • It's not clear to me why you would need a result column that contains *the same value* for each row or why an aggregated result should be maintained in the table that is being aggregated at all. It looks like you are not telling us the whole story. There *is no point* in updating a table continuously and in the end have *the same* value in every row. Are you sure you are not using a where clause to do the update? – Lieven Keersmaekers Dec 29 '11 at 02:51
  • Then the result is divide by id_2 in each row. (SUM(Value) / id_2), but the SUM is limited to id_1, id_2. If you have clear idea, be my guest. I'll appreciate that. I'm kind of list right now, dealing with this all afternoon. I'm testing with indexes. – Esselans Dec 29 '11 at 02:56
  • 1
    Hey, it seems the indexes do the trick. it took only 3 minutes. Thanks man! i owe you. If you want, move it to an answer, so i can vote it. – Esselans Dec 29 '11 at 03:02
  • Good to know it helped but I'm still not convinced this is the right table design for you. – Lieven Keersmaekers Dec 29 '11 at 03:05
  • Haha, so true, but the person who did this got fired, and everything around here it's like this. Happy new year to you!, thank you! – Esselans Dec 29 '11 at 03:07
  • 1
    I have added my comment as an answer with a kind of a disclaimer attached to it. You have my sympathy for maintaining a system like this. – Lieven Keersmaekers Dec 29 '11 at 03:12

3 Answers3

1

This is not to be interpreted as an answer but an attempt to drill down to the real problem Currently, this is as how I interpretate the actions that get executed

Starting from the initial table

 id_1      id_2    Value  Result
 1004       1       10    NULL
 1004       1       30    NULL
 1004       2       100   NULL
 1005       1       90    NULL
 1005       1       5     NULL
 1005       1       5     NULL

After update table set result = dbo.SKU(1004, 2) this would become

 id_1      id_2    Value  Result
 1004       1       10    40
 1004       1       30    40
 1004       2       100   40
 1005       1       90    40
 1005       1       5     40
 1005       1       5     40

After update table set result = dbo.SKU(1004, 1) this would become

 id_1      id_2    Value  Result
 1004       1       10    100
 1004       1       30    100
 1004       2       100   100
 1005       1       90    100
 1005       1       5     100
 1005       1       5     100

After update table set result = dbo.SKU(1005, 1) this would become (remain)

 id_1      id_2    Value  Result
 1004       1       10    100
 1004       1       30    100
 1004       2       100   100
 1005       1       90    100
 1005       1       5     100
 1005       1       5     100

and somehow after that, the result is divided by id_2

 id_1      id_2    Value  Result
 1004       1       10    100
 1004       1       30    100
 1004       2       100   50
 1005       1       90    100
 1005       1       5     100
 1005       1       5     100

Clearly, my interpretation and what really happens don't match (at least I hope so).

Lieven Keersmaekers
  • 57,207
  • 13
  • 112
  • 146
1

If changing the table design or programming to it is not an option, an easy solution would be to create a covering index on the fields you are using in your function.

Something like

CREATE INDEX IX_TABLE_ID_1_ID_2_VALUE ON dbo.Table (id_1, id_2) INCLUDE (Value)
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Lieven Keersmaekers
  • 57,207
  • 13
  • 112
  • 146
1

This might get you what you need a little quicker if you don't have to use a function.

;with sumVal
as
(
select t1.id_1, t1.id_2, SUM(t1.value) [result]
from [table] t1
group by t1.id_1, t1.id_2
)

select t2.*, s.result
from sumVal s
left join [table] t2 on s.id_1 = t2.id_1 and s.id_2 = t2.id_2

It ran in less than 5 seconds on over 800,000 rows on my test.

Rory
  • 59
  • 1