6

We moved our SQL Server from an Azure VM to an Azure SQL Database. The Azure VM was DS2_V2, 2 core, 7GB RAM, 6400 max IOPS The Azure SQL Database is Standard S3, 100 DTU. I chose this tier after running the Azure DTU Calculator tool on the Azure VM for 24 hours - it suggested this tier for me.

The problem is that queries (mostly SELECT and UPDATE) are painfully slow now, compared to how they were on the Azure VM. One thing I noticed is that while running a query, I went to the Resource Utilization graph under Monitoring in the Azure Portal, and it's pinging 100% throughout the time any query is being run. Does this mean my tier is in fact too low? I would hope not because the next tier up is a pretty big jump in cost.

Just for information, the Azure SQL Database is identical in schema and data to the Azure VM database, and I rebuilt all indexes (including Full-Text) after the migration.

In my research thus far I've read everything from making sure my Azure SQL DB is in the right region on Azure (it is) to network latency (non-existent on Azure VM) causing the issue.

David Makogon
  • 69,407
  • 21
  • 141
  • 189
Stpete111
  • 3,109
  • 4
  • 34
  • 74
  • 1
    I presume it's the same VNET, and same authentication as the VM (?) – Stefano d'Antonio Apr 25 '17 at 15:23
  • 1
    Maybe ensure that your indexes are being used by looking at the execution plan? – EMUEVIL Apr 25 '17 at 15:28
  • 1
    @Stefanod'Antonio Yes both are correct. Thanks for your reply. – Stpete111 Apr 25 '17 at 15:38
  • @EMUEVIL good point. I will check this. – Stpete111 Apr 25 '17 at 15:39
  • Hello, one area to check is sys.dm_db_resource_stats to see if you are indeed reaching your DTU limits for the equivalent workloads (you can use sys.resource_stats as well - which retains data up to 14 days). Look at this first to see if you are hitting DTU limits. If not hitting limits - then compare execution plans - for example serial vs. parallel in DB vs. VM. – Joe Sack Apr 25 '17 at 16:34
  • My Azure SQL db does not have a view called sys.resource_stats - it only has sys.resource_stats_raw, which brings back no results when I query on it. I did also try querying sys.dm_db_resource_stats but it only gave me history from today. Seems I need that 14 days history, but I don't have a sys.resource_stats? – Stpete111 Apr 26 '17 at 17:07
  • Use the master database to view sys.resource_stats. Has a longer history (at 5 minute intervals). – Joe Sack Apr 27 '17 at 15:26
  • It's best to use `sys.dm_db_resource_stats` over the base `sys.resource_stats`. – pim May 21 '18 at 12:04

1 Answers1

5

How long has this system been running now as an Azure SQL Server Database? Presumably if it's more than a few hours old (i.e. some "production" queries have hit it) and it's generated some useful statistics.

Analyzing this and determining the source of your problem will be a multi-pronged strategy.

Service Tier Check

Try the following queries, which determine whether you are at the correct service level:

-----------------------
---- SERVICE TIER CHECK
-----------------------
-- The following query outputs the fit percentage per resource dimension, based on a threshold of 20%.
-- IF the query below returns values greater than 99.9 for all three resource dimensions, your workload is very likely to fit into the lower performance level.
SELECT 
    (COUNT(end_time) - SUM(CASE WHEN avg_cpu_percent >= 20 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'CPU Fit Percent'
    ,(COUNT(end_time) - SUM(CASE WHEN avg_log_write_percent >= 20 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'Log Write Fit Percent'
    ,(COUNT(end_time) - SUM(CASE WHEN avg_data_io_percent >= 20 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'Physical Data Read Fit Percent'
FROM sys.dm_db_resource_stats

-- Look at how many times your workload reaches 100% and compare it to your database workload SLO.
-- IF the query below returns a value less than 99.9 for any of the three resource dimensions, you should consider either moving to the next higher performance level or use application tuning techniques to reduce the load on the Azure SQL Database.
SELECT 
(COUNT(end_time) - SUM(CASE WHEN avg_cpu_percent >= 100 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'CPU Fit Percent'
,(COUNT(end_time) - SUM(CASE WHEN avg_log_write_percent >= 100 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'Log Write Fit Percent'
,(COUNT(end_time) - SUM(CASE WHEN avg_data_io_percent >= 100 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'Physical Data Read Fit Percent'
FROM sys.dm_db_resource_stats

Resource Consumption Levels

It would also be useful to check the resource consumption, which you can do using the following query. This will report things like DTU consumption and IO.

-----------------
-- Resource Usage
-----------------
select *
from sys.dm_db_resource_stats 
order by end_time desc

Indexes

It's also worth a quick check whether you have missing indexes or whether some of your existing indexes are getting in the way.

The missing index query is a doozy, but should be taken with a grain of salt. I generally see it as an advisement on how the db is being used and I make my own judgement on which indexes to add, and how. For example, as a general rule of thumb, all foreign keys should have non-clustered indexes to facilitate the inevitable JOIN's they're involved in.

--------------------
-- Find poor indexes
--------------------
DECLARE @dbid int
SELECT @dbid = db_id()

SELECT 'Table Name' = object_name(s.object_id), 'Index Name' =i.name, i.index_id,
        'Total Writes' =  user_updates, 'Total Reads' = user_seeks + user_scans + user_lookups,
        'Difference' = user_updates - (user_seeks + user_scans + user_lookups)
FROM sys.dm_db_index_usage_stats AS s 
INNER JOIN sys.indexes AS i
ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = @dbid
AND user_updates > (user_seeks + user_scans + user_lookups)
ORDER BY 'Difference' DESC, 'Total Writes' DESC, 'Total Reads' ASC;

------------------
-- Missing Indexes
------------------
declare @improvementMeasure int = 100

SELECT
CONVERT (decimal (28,1), 
migs.avg_total_user_cost * 
migs.avg_user_impact * 
(migs.user_seeks + migs.user_scans)) 
AS improvement_measure, 
OBJECT_NAME(mid.object_id, mid.database_id) as table_name,
  mid.equality_columns as index_column,
  mid.inequality_columns,
  mid.included_columns as include_columns, 
'CREATE INDEX IX_' + 
OBJECT_NAME(mid.object_id, mid.database_id) + 
'_' + 
REPLACE(REPLACE(mid.equality_columns, '[', ''), ']', '') + 
' ON ' + 
mid.statement + 
' (' + ISNULL (mid.equality_columns,'') + 
CASE WHEN mid.equality_columns IS NOT NULL 
AND mid.inequality_columns IS NOT NULL 
THEN ',' 
ELSE '' 
END + ISNULL (mid.inequality_columns, '') + 
')' + 
ISNULL (' INCLUDE (' + mid.included_columns + ')',
'') AS create_index_statement, 
migs.user_seeks,
migs.unique_compiles,
migs.avg_user_impact,
migs.avg_total_user_cost

FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs 
ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid 
ON mig.index_handle = mid.index_handle
WHERE CONVERT (decimal (28,1), 
migs.avg_total_user_cost * 
migs.avg_user_impact * 
(migs.user_seeks + migs.user_scans)) > @improvementMeasure
ORDER BY migs.avg_total_user_cost * 
migs.avg_user_impact * 
(migs.user_seeks + migs.user_scans) DESC

Maintenance

A maintenance plan should also be setup, whereby you are rebuilding indexes and statistics on a somewhat regular basis. Unfortunately there is no SQL Agent in an Azure SQL environment. But Powershell and either an Azure function or Azure WebJob can help you schedule and execute this. For our on-prem and azure servers, we do this weekly.

Note that WebJob's would only help if you have a pre-existing App Service to run it within.

For scripts on helping you with index and statistics maintenance, checkout Ola Hallengren's script offering.

pim
  • 12,019
  • 6
  • 66
  • 69
  • 2
    Incredibly detailed answer - thank you for that. I have a few questions: Service Tier Check query: I'm returning 1.00000 for all 3 columns for both queries. Should I be reading this as 100% (greater than the 99.9 mentioned in the comments) or 1 (way less than the 99.9 percent mentioned in the comments). Resource Usage query - I receive the error "Invalid column name 'start_time'. Find Poor Indexes query - how do I interpret the results of this query? It lists my indexes with Total Writes, Total Reads and Difference. What am I looking for here? (final question in next comment...) – Stpete111 May 21 '18 at 21:16
  • And finally, Missing Indexes query - this is an incredible query. Once I run it, am I simply looking to follow the suggestions returned in the 'create_index_statement' column? – Stpete111 May 21 '18 at 21:19
  • In the comments for the Service Tier query, where it says "if the query above..." in two places, should those say "if the query below..."? I'm confused about what comment refers to what query. – Stpete111 May 21 '18 at 21:25
  • 1
    My pleasure @Stpete111. Apologies for the outdated resource check. I had forgotten they deprecated the dual time stamping. I have updated the query accordingly. Great question, a value of `1.0` indicates 100%. You are absolutely correct, it should be below. – pim May 22 '18 at 11:05
  • 1
    The missing index query is a doozy, but should be taken with a grain of salt. I generally see it as an advisement on how the db is being used and I make my own judgement on which indexes to add, and how. For example, as a general rule of thumb, all foreign keys should have non-clustered indexes to facilitate the inevitable `JOIN`'s they're involved in. – pim May 22 '18 at 11:09
  • Back to my question about the Find Poor Indexes query. How shall I use the information that is returned from this query? – Stpete111 May 22 '18 at 19:29
  • @Stpete111 see my comment above. – pim May 22 '18 at 19:39
  • I only see where you reference the Missing index query. I don't see how to use the info that is returned as part of the poor index query. – Stpete111 May 22 '18 at 20:14
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/171575/discussion-between-mathijspim-and-stpete111). – pim May 22 '18 at 20:22