
I have a pretty strange problem with an MSSQL database: every now and then a query that is otherwise acceptably fast (<<0.5 sec) slows down to an absolutely unacceptable level (always around 7-8 minutes). Once the DB gets this slow, it stays slow until I reboot or restart the service. It is unpredictable when the problem occurs: sometimes directly after a restart, but normally only after 50-100 or even more queries. It seems to occur more often on two presentation laptops running Windows 8 (one using MSSQL Express 2008, one 2012) than on my development installation inside a Windows 7 virtual machine with 2008 R2 (with less memory and only one core), though it happens there, too.

Other facts regarding the configuration:

The client is a Tomcat application; the connection is configured as follows in context.xml:

<Resource name="jdbc/mydb" auth="Container" type="javax.sql.DataSource"
    maxActive="50" maxIdle="30" maxWait="10000" username="user"
    password="password" driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver"
    url="jdbc:sqlserver://localhost:1433;DatabaseName=mydb" />

The query itself selects the top 100 rows, 14 columns, by joining 3 tables (customers: 32,000 rows, 100 columns; customer objects: 47,000 rows, 140 columns; customer object properties: 230 rows, 9 columns). Both joins have two conditions; in addition I have 6 parameters, resulting altogether in 5 string comparisons with = and another 5 with LIKE and COALESCE. The result is a list of customers, each of which can have 0 or more customer objects, while each customer object has exactly one of those properties. Sadly there are no indexes and no explicit primary keys, and the columns that should be PKs are strings, as are most other columns. But that is nothing I have any influence on, and as far as I understand it, I see no reason why this should hurt performance only every now and then instead of always.

Out of respect for the source owner I have to abstract the schema and the statement; a rough sketch of the statement's shape follows below. The same goes for the execution plan, once I hit the bottleneck again.
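
To give an idea of the shape only, here is a heavily abstracted sketch; every table and column name is invented and the real statement differs:

    SELECT TOP 100
           c.name, c.city, o.object_no, p.property_name  -- 14 columns in total
    FROM customers c
    LEFT JOIN customer_objects o
           ON  o.customer_no = c.customer_no
           AND o.mandator    = c.mandator                 -- each join has two conditions
    LEFT JOIN customer_object_properties p
           ON  p.property_no = o.property_no
           AND p.mandator    = o.mandator
    WHERE c.mandator = ?                                  -- 5 string comparisons with =
      AND c.status   = ?
      AND c.name LIKE COALESCE(?, c.name)                 -- 5 with LIKE and COALESCE
      AND c.city LIKE COALESCE(?, c.city)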

There is nothing else running that could slow the system down that much.

What I've looked into so far:

  • in some Java version (I can't remember which) the MSSQL JDBC driver had a bug
  • memory shouldn't be an issue for Java or MSSQL (MSSQL stays below 300 MB; Java isn't hit much at all before/after execution of the query)
  • updated the query to use COALESCE instead of null comparisons (see the sketch after this list), didn't matter
  • added TOP 100 to the statement instead of PreparedStatement.setMaxRows(100) (also shown below), didn't matter
  • compared the executed statement, taken from the text column I got after executing this statement, with the original/intended statement, and (once bugged) tried the same statement from SQL Server Management Studio to see if it occurs there as well - it didn't
  • tried to interpret the execution plan while bugged and compared it with an execution plan from inside SQL Server Management Studio - there were differences, but I don't have the experience to determine what is at fault and why the execution plan could differ from time to time
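
For reference, the two rewrites mentioned in the list above, as minimal sketches (names invented, as before):

    -- before: explicit null handling per optional parameter
    SELECT TOP 100 c.name
    FROM customers c
    WHERE (? IS NULL OR c.name LIKE ?);

    -- after: same filter via COALESCE; TOP 100 in the statement itself
    -- replaces PreparedStatement.setMaxRows(100) on the Java side
    SELECT TOP 100 c.name
    FROM customers c
    WHERE c.name LIKE COALESCE(?, c.name);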

Lastly, I have to say sorry for the long post, and I know the question still leads to guesses, but so does the problem for me: I would understand it if it were somehow reproducible, but I have no idea in which direction I should look. My last resorts seem to be: test other JDBC drivers, or split the query into 3 by first determining which rows I want and then selecting the corresponding values.

Kind regards,

Christian

PS: I totally forgot to mention: there are no Java exceptions at all during execution; as long as I don't interrupt anything, I get the results after 8 minutes.

  • This does not fit your case 100%, but one thing you could try is explicitly setting ARITHABORT for your JDBC connection to ON. A quick way to test is to simply execute "SET ARITHABORT ON" before your query. – xpa1492 Dec 19 '14 at 05:51

1 Answer


why the execution plan could differ from time to time

This behavior may be caused by a SQL Server feature called parameter sniffing.

There is a very good article on this topic by Erland Sommarskog.

In essence, when you run your query for the first time, the SQL Server optimizer generates an execution plan that is optimized for the parameter values actually used in that first run. This execution plan is cached and reused for future calls of the query. If later calls use different parameter values, it may turn out that the cached execution plan is no longer optimal for them. In fact, it can be really bad, which leads to long execution times.
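
One way to verify this while the query is slow is to look up the cached plan for the statement. A diagnostic sketch (replace the LIKE pattern with a fragment of your real statement):

    -- find the cached plan and how often it has been reused
    SELECT qs.plan_handle, qs.execution_count, st.text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE st.text LIKE '%customers%';  -- fragment of the statement in question

    -- evicting that single plan forces a fresh compilation on the next run:
    -- DBCC FREEPROCCACHE (plan_handle);  -- paste a plan_handle from above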

One possible quick fix would be to add OPTION (RECOMPILE) to the query that sometimes runs fast and sometimes runs slow.
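
Schematically, using the abstracted names from the question (not the real statement):

    SELECT TOP 100
           c.name, c.city, o.object_no, p.property_name
    FROM customers c
    LEFT JOIN customer_objects o
           ON  o.customer_no = c.customer_no
           AND o.mandator    = c.mandator
    WHERE c.name LIKE COALESCE(?, c.name)
    OPTION (RECOMPILE);  -- compile a fresh plan for the current parameter values on every run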

For a detailed analysis and various approaches to this problem, read Erland's article. He also explains what SET ARITHABORT does in this case and why it is not a solution (if the problem is parameter sniffing).

Vladimir Baranov
  • Thanks for your answer, a very interesting read that at least showed how bad my measurement was (it made no sense to compare the execution plan of the long-running query with that of the very fast query inside SSMS with declared variables). I'll have to wait for the bug to reoccur before I can test it, thanks again – Christian R. Dec 19 '14 at 12:18
  • @ChristianR., did you eventually figure out what the problem was? – Vladimir Baranov Feb 24 '15 at 00:13
  • Hey, I totally forgot to accept your answer. I could not reproduce the issue as long as I added `OPTION (RECOMPILE)`; on the other hand, I could reproduce it multiple times in a week of excessive testing without it. Many thanks for this fix <3 – Christian R. May 04 '15 at 09:12