3

Few weeks back I did run an in-place upgrade from SQL Server 2016 to SQL Server 2017 CU 5 in one of our main environments. I used Slipstream upgrade so it was one shot. This is a 2 nodes shared storage FCI with single instance and 1 replica on a remote datacenter (so total 2 instances, 1 AG , several databases in the AG, async). The upgrade went mostly OK

The next step was to enable Query Store and Automatic Query Tuning in the most critical database (2 TB Data File, 100 GB Tr Log file, heavily loaded). That was one of the main purpose of the upgrade

Few days later I noticed that the transaction log usage for that DB was about 80% and growing. Log_reuse_wait_desc was ACTIVE_TRANSACTION

Running DBCC OPENTRAN and got below

Transaction information for database 'MyDatabase'.

Oldest active transaction: SPID (server process ID): 62s UID (user ID) : -1 Name : QDS nested transaction LSN : (6017515:1461640:1) Start time : Apr 9 2018 3:01:06:073AM SID : 0x0 DBCC execution completed. If DBCC printed error messages, contact your system administrator.

And here is the session ID 62

spid kpid blocked waittype waittime
lastwaittype waitresource dbid uid cpu
physical_io memusage login_time
last_batch ecid open_tran status sid
hostname program_name hostprocess cmd nt_domain
nt_username net_address net_library loginame
context_info sql_handle stmt_start stmt_end
request_id 62 -12076 21 0x0005 454 LCK_M_X
MD: database_id = 17 QDS_STATEMENT_STABILITY(qds_statement_key = 0xd1c26b6), lockPartitionId = 0

17 1 3857719 179918 0
2018-04-09 03:01:05.947 2018-04-09 03:01:05.947
0 1 background
0x0100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 QUERY STORE BACK
sa
0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0x0000000000000000000000000000000000000000 0 0
0

even trying to kill spid 62 is not allowed since it is saying "Only user processes can be killed." even when this is over spid 50

I am not able to do much besides seen the transaction log usage going up so I decided to disable automatic query tuning and query store

Disabling automatic query tuning was quick but not for Query Store. I did it using T-SQL . the statement ran for 3 hours doing nothing more than blocking all the system objects for that db. Transaction log usage was over 95%

disabling or cleaning QDS get blocked by this spid 62

do I cancelled the statement . Finally I restarted the instance and when it came back online I was able to disable QDS right away.

Any ideas ? Have you seen this before ?

Thanks Regards

Javier Villegas

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 2
    seems like something you should report on the SQL Server uservoice board rather than here. – Martin Smith Jun 15 '18 at 22:48
  • I have seen this. I suspect it is QDS processing its async captures (as a worst case, perhaps days) after a large spike in queries that need to be persisted. Such a spike can happen when (as an example) a scalar function is called as a predicate against large table(s). I have more than one example of that poor practice being submitted to our SQL Servers. For us, a scalar function is being called in the low tens of million times. In addition to the QDS nested transaction blocking, for us the query store exceeds its max storage size. – Bill Dec 08 '19 at 20:38

0 Answers0