0

I have a table named MetricTbl with fields as User_Id, Entity, Metric, ModifiedDate, Value. Value, UserId are integer. Entity and Metric are nvarchar. Table has around 30 million rows.

I want to optimize the query - select SUM(Value) from MetricTbl where UserId = 1 and Entity = 'Room' and Metric = 'Temperature'

I have to design a strategy for this kind of query. I am using azure federation. And this is a federated table. I already created unique index on User_Id, Entity, Metric, ModifiedDate. Since it is azure federation so I cant have indexed views. Please suggest some strategy. If there is need I am allowed to create new table. Thanks Manish

manishkr1608
  • 313
  • 1
  • 3
  • 13
  • What exactly you want ???? – The Hungry Dictator Nov 28 '13 at 08:25
  • _whoa 30*10^6 rows_ whatz wrong with this then? _select SUM(Value) from MetricTbl where UserId = 1 and Entity = 'Room' and Metric = 'Temperature'_ – vhadalgi Nov 28 '13 at 08:30
  • 2
    Add index on userid, entity, metric with value as included column – Allan S. Hansen Nov 28 '13 at 08:33
  • As @AllanS.Hansen suggested, try adding the `value` column to your index as the last field. `(User_Id, Entity, Metric, ModifiedDate, Value)` or `(User_Id, Entity, Metric, ModifiedDate) INCLUDE (Value)`. This will mean the query can be answered by looking at just the index, and not need to do what is effectively a join from the index to the table (to find the values to sum). Then, if Azure allows it, investigate how clustered indexes might work and whether they seem appropriate in this case. – MatBailie Nov 28 '13 at 08:40

1 Answers1

0

I'm not really familiar with Azure Federation tech, but in a typical scenario like the one you described above, I would create a single index covering all of the columns used in the WHERE clause, with INCLUDE containing the column in your SELECT. If possible, I'd also create the index as filtered to save up on space and performance, depending on whether there are many values in each WHERE column you're never going to be querying for.

IE, if you only need the query for imaginary values like "Temperature", "Distance", and "Pressure", you could do something like this.

CREATE NONCLUSTERED INDEX Metric_SUM_IDX ON MetricTbl (UserId, Entity, Metric) 
INCLUDE (Value) 
WHERE Metric IN ('Temperature','Distance','Pressure')

But as said, the actual index depends entirely on your particular scenario. And I must remind you of the disclaimer that I don't know what special effects the sharding on Azure Federation might have on this. Probably none, but I can't be sure.

Kahn
  • 1,630
  • 1
  • 13
  • 23
  • Sharding is not having special effect...I have mentioned only one of the sharded database....but my problem is 20k users inserting the record and each entry is a key value pair and there are around 100 key value pair so I have around 2 million entries per day and then for every 15 minutes each record will get updated. and the next day I again have 2 million entries and there will be around 10k users accessing my website which displays the aggregated record /month/quarter and user can ask for custom reports. and I have to optimize this process. – manishkr1608 Nov 29 '13 at 12:12