
I have a read query (not a stored procedure) within my Perl script for selecting the latest TestHistoryID value for a test against its TestID. A test (simply a piece of code) runs multiple times as part of the regression process. While the TestID remains the same for a test, the test receives a new TestHistoryID after each run. The pass/fail status, time taken to complete the run, etc. are stored against the TestHistoryID (TestHistoryID is an integer that increases monotonically).

Since there are multiple TestHistoryIDs against each TestID (the DB keeps a history of the last 50 runs and deletes the oldest entries once it goes beyond 50), I want to get the result set containing the latest TestHistoryID for each TestID.

I used the following partitioned query to obtain the row with the highest TestHistoryID per TestID (I followed this Stack Overflow question to create the query: How to query SQL Table and remove duplicate rows from a result set):

SELECT TestID, TestHistoryID, TestLabel
FROM 
(
    SELECT 
        TestID,
        TestHistoryID,
        TestLabel,
        row_number() over(partition by TestID order by TestHistoryID DESC) rn
    FROM TestHistoryView
    WHERE TestID IN (@test_ids)
) content where rn = 1

@test_ids contains a huge array of numbers, e.g. 504954, 504955, 504956, 504957, 504958, 504959, 504960, 504961, 504962 .... (40k numbers)

Here is a snapshot of the table columns that contain duplicates; I need to remove the duplicates and get the latest value of TestHistoryID for each TestID (https://i.stack.imgur.com/nJ59C.png).

The query works for up to approximately 10k TestIDs; if I try with 15k, I get the following error:

The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

I need help creating an efficient query that doesn't eat up all the resources when 40k parameters are passed and still returns the intended results. Thanks a lot in advance.


1 Answer

You can't actually pass a list of values to a single parameter. Some tools, like SQL Server Reporting Services, will rewrite your query in this scenario to add multiple literal values or parameters.

In any case, using an IN list embeds the data in the query itself, and will eventually cause this issue. So while it may appear that you're not embedding 15,000 separate values in the SQL query, you actually are.

To avoid it you need to pass the values separately from the query. You can do this using a Table-Valued Parameter, JSON, XML, or by loading a temp table ahead of time. Which to use depends on your client platform; I'm not sure what your options are in Perl.
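
For a Table-Valued Parameter, a rough sketch could look like the following (the type name dbo.TestIdList and the local INSERT are only illustrative; in practice you would create the type once and bind the TVP from the client instead of inserting values in the batch):

-- One-time setup: a table type to carry the IDs (name is illustrative)
CREATE TYPE dbo.TestIdList AS TABLE (TestID int PRIMARY KEY);
GO

-- @test_ids would normally be bound from the client;
-- the INSERT below just stands in for that binding.
DECLARE @test_ids dbo.TestIdList;
INSERT INTO @test_ids (TestID) VALUES (504954), (504955), (504956);

SELECT TestID, TestHistoryID, TestLabel
FROM 
(
    SELECT 
        TestID,
        TestHistoryID,
        TestLabel,
        row_number() over(partition by TestID order by TestHistoryID DESC) rn
    FROM TestHistoryView
    WHERE TestID IN (SELECT TestID FROM @test_ids)
) content 
where rn = 1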

For XML the batch would look something like this:

declare @ids varchar(max) = '<IDs><id>1</id><id>2</id><id>3</id><id>4</id><id>5</id><id>6</id><id>7</id><id>8</id></IDs>'

declare @doc xml = cast(@ids as xml)

SELECT TestID, TestHistoryID, TestLabel
FROM 
(
    SELECT 
        TestID,
        TestHistoryID,
        TestLabel,
        row_number() over(partition by TestID order by TestHistoryID DESC) rn
    FROM TestHistoryView
    WHERE TestID IN (select n.value('.','int') Id from @doc.nodes('/IDs/id') d(n))
) content 
where rn = 1

In practice you would leave out the first line and bind @ids as a varchar(max) parameter. Even pasting the XML into the T-SQL query as a literal should avoid your current issue, as parsing a long string literal into an XML document doesn't create the same query processing resource issue as embedding a very large number of string literals in the query.
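
If your server is SQL Server 2016 or later (an assumption), a JSON-based variant works the same way, with @ids bound as an nvarchar(max) parameter holding a JSON array:

declare @ids nvarchar(max) = N'[504954,504955,504956,504957]'

SELECT TestID, TestHistoryID, TestLabel
FROM 
(
    SELECT 
        TestID,
        TestHistoryID,
        TestLabel,
        row_number() over(partition by TestID order by TestHistoryID DESC) rn
    FROM TestHistoryView
    -- OPENJSON needs database compatibility level 130 or higher
    WHERE TestID IN (SELECT CAST(value AS int) FROM OPENJSON(@ids))
) content 
where rn = 1

As with the XML version, the first line is only there to make the snippet self-contained; from Perl you would bind @ids as an ordinary string parameter.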

David Browne - Microsoft
  • David, if I pass XML, I have to handle it on the server side with a stored procedure, which would generate the same error again. What are you suggesting exactly? Could you please give an example in this context? – Tejas Aug 26 '19 at 13:26