1

I've a relatively large DB used for IoT data, approximately around 60 million records. Inserts are performed very frequently in BULK Inserts from Stream Analytics.

This is my table schema:

CREATE TABLE [dbo].[NVEControllerReadings](
[DeviceUniqueIdentifier] [nvarchar](100) NOT NULL,
[NVEControllerTimestamp] [datetimeoffset](7) NOT NULL,
[ProcessedInAzureUtc] [datetimeoffset](7) NOT NULL,
[ParameterTypeId] [int] NULL,
[InstanceId] [int] NULL,
[ParameterNumberId] [int] NOT NULL,
[ParameterValue] [float] NULL,
[ParameterText] [nvarchar](255) NULL)

When performing queries, we're always looking for the latest records for a device, so I've the following clustered index:

CREATE CLUSTERED INDEX [IX_NVEControllerReadings] ON [dbo].[NVEControllerReadings](
[DeviceUniqueIdentifier] ASC,
[NVEControllerTimestamp] DESC)WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

I also have a non clustered index for covering indexes targeted against ParameterTypeId, ParameterNumberId and InstanceId.

CREATE NONCLUSTERED INDEX [IX_ParameterTypeId_ParameterNumberId_InstanceId] ON [dbo].[NVEControllerReadings](
[ParameterTypeId] ASC,
[ParameterNumberId] ASC,
[InstanceId] ASC) INCLUDE (     [ParameterValue]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

The ParameterValue is included as a key column, because that's the final result of the query, I'm interested.

However, when executing queries targeted against my nonclustered index, it can take up the 3-5 minutes to return a single result, which I don't understand. According to my Exceution Plan, the nonclustered index is used as intended with a Index Seek.

Here's a link to the execution plan: https://www.brentozar.com/pastetheplan/?id=r1NAwrRUN (The query took 03:32 to complete).

I've tried rebuilding my indexes to achieve a lower fragmentation rate and updating the statistics, but without luck so far.

Can anyone point me in a direction to my problem?

Thanks in advance.

Mortenkp25
  • 171
  • 1
  • 3
  • 14
  • My first idea would be: you're using the wrong type of data store for the data you have to store. Have a look at [Data Lake Store](https://azure.microsoft.com/en-us/services/storage/data-lake-storage/) for instance. – rickvdbosch Mar 07 '19 at 07:53

2 Answers2

2

You are still selecting every row, and then sorting it, but it's only returning the top 1. Try taking it from another direction, use an aggregate function to limit the selection to a single row, something along the lines of:

SELECT [ParameterValue]
FROM [dbo].[NVEControllerReadings] n1
  join (select max(NVEControllerTimestamp) as Mostrecent, DeviceUniqueIdentifier
        from [dbo].[NVEControllerReadings] 
        WHERE DeviceUniqueIdentifier = '04EFB80706A7'
          AND ParameterTypeId = 19 AND ParameterNumberId = 24
          AND InstanceId = 1
        Group by DeviceUniqueIdentifier) n2 on n2.DeviceUniqueIdentifier = n1.DeviceUniqueIdentifier
     and n1.timestamp = n2.Mostrecent

In my experience Azure can be real hit or miss on performance, and you'll often have to try many different permutations on queries. This is because underneath the sql exterior azure is something very different from a local instance of SQL server. For instance your primary key solution probably isn't working because it doesn't store data in pages arranged in order of the clustered on a physical disk. Anyhow, hope this helps!

Randall
  • 1,441
  • 13
  • 19
0

For this query:

SELECT TOP (1) [ParameterValue]
FROM [dbo].[NVEControllerReadings]
WHERE DeviceUniqueIdentifier = '04EFB80706A7' AND
      ParameterTypeId = 19 AND
      ParameterNumberId = 24 AND
      InstanceId = 1
ORDER BY NVEControllerTimestamp desc;

The optimal index is on (DeviceUniqueIdentifier, ParameterTypeId, ParameterNumberId, InstanceId, NVEControllerTimestamp desc). I would try this first.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • What about my clustered index on (DeviceId, NVEControllerTimestamp)? Should this be altered when I now have a non-clustered index with these keys? – Mortenkp25 Mar 07 '19 at 12:27
  • @Mortenkp25 . . . You want an index with all four columns in the `WHERE` (at least) and then the ordering column. – Gordon Linoff Mar 07 '19 at 12:39
  • I found it that it's actually an fragmentation issue. After rebuilding the indexes, they quickly become fragmented again. I'll try introducing a Identity column and redesign my clustered index as a composite key of (DeviceUniqueIdentifer,Id). Thanks :) – Mortenkp25 Mar 07 '19 at 15:10