3

I've recently been learning something very new to me - FULLTEXT Indexes.

It seems that I can run off two separate queries (using CONTAINSTABLE) on the same parameters against two separate tables are gain an almost instantaneous answer (sub 10ms) however when I combined the two together, the query takes 1.3 seconds - or 130+ times slower!!

Below are the queries (simplified for the purpose of this question).

Query 1:

SELECT
    *
FROM
    dbo.FooBar FB
    INNER JOIN dbo.FooBalls FBS on FB.ID = FBS.ID
    LEFT JOIN CONTAINSTABLE(dbo.FooBar, (Col1, Col2, Col3), @query) FBCONT ON FB.ID = FBCONT.[KEY]
WHERE
    FBCONT.[KEY] IS NOT NULL

Query 2:

SELECT
    *
FROM
    dbo.FooBar FB
    INNER JOIN dbo.FooBalls FBS on FB.ID = FBS.ID
    LEFT JOIN CONTAINSTABLE(dbo.FooBalls, (Col1), @query) FBSCONT ON FBS.ID = FBSCONT.[KEY]
WHERE
    FBSCONT.[KEY] IS NOT NULL

Query Combined:

SELECT
    *
FROM
    dbo.FooBar FB
    INNER JOIN dbo.FooBalls FBS on FB.ID = FBS.ID
    LEFT JOIN CONTAINSTABLE(dbo.FooBar, (Col1, Col2, Col3), @query) FBCONT ON FB.ID = FBCONT.[KEY]
    LEFT JOIN CONTAINSTABLE(dbo.FooBalls, (Col1), @query) FBSCONT ON FBS.ID = FBSCONT.[KEY]
WHERE
    (FBCONT.[KEY] IS NOT NULL OR FBSCONT.[KEY] IS NOT NULL)

Perhaps my research has missed something but can someone give me an indicator as to why having both clauses together reduces performance by over 130 times?

NOTES:

  • I've checked the relevant indexes for joining exist - verified by the speed of the individual queries.
  • There are actually more joins involved in the process - however they are completely unlinked to the tables being queries and again response are under 10ms when searching for results in 100,000 plus records.
  • I tried replacing the CONTAINSTABLE with individual CONTAINS statements - performance was massively degraded as my research would lead me to expect.
  • A catalog has been set up that references ONLY the four columns from the two tables being queried
  • The @query parameter is set to NVARCHAR (50) at the present. I've read that using NVACHAR is faster as implicit conversions are not required.
  • I know I could do a dirty UNION ALL on both queries separately, but I'd prefer to writer better queries if possible rather than hack it together. Additionally UNION ALL would leave me with potential duplicates if @query value was in two columns from separate tables linked to one record.

Any further suggestions would be greatly received.

Jon Bellamy
  • 3,333
  • 20
  • 23
  • Post the actual query execution plans somewhere and link to them. Also, what version of Sql server are you using and how many rows are in each table you are doing containstable on? – StrayCatDBA Jun 04 '13 at 09:05
  • SET STATISTICS TIME ON. How much time is compile time vs execution time? – StrayCatDBA Jun 04 '13 at 09:08
  • @StrayCatDBA, its Sql Server 2012. Unfortunately I cannot post the query execution - non-disclosures etc. - not ideal, but necessary. However, something did suddenly come to me - the problem actually appeared in the other four joins left out of the question - used to limit results based on user priveledges, I discovered that by removing the joins and limiting results based on a sub-query under the where clause I achieved a 10 fold increase in speed. Will post my answer. Thank you for taking the time to respond. – Jon Bellamy Jun 04 '13 at 11:25
  • Can you post the answer? –  Jun 01 '15 at 20:55
  • I didn't understand one thing: do want to `JOIN` or concatenate both results? – Gerardo Lima Jul 28 '15 at 14:20
  • How you have resolved it ? I have same scenario. – Kiya Nov 14 '19 at 06:01
  • 1
    @Kinjal This is one of those moments where you realise how bad you've been at updating the question with the answer. Unfortunately, the database has changed so much over the years, I don't actually have the original problem to refer to. However, I do remember that it turned out a huge part of the problem was on other tables in the query. Essentially, I had two sub-queries that were efficient on their own, but when joining them into a bigger single query they became inefficient. The issue wasn't with the fulltext index! Sorry I can't help any more! – Jon Bellamy Nov 14 '19 at 10:27
  • I forgot that it was asked years ago. Thank you – Kiya Nov 15 '19 at 10:35

1 Answers1

0

Your question comments suggest you improved performance to a satisfactory level by rewriting an unrelated part of the query (not shown in the question).

This is fair enough if it works, but doesn't explain why the two separate queries and the combined query differ so significantly, when other unrelated parts of the query are kept constant.

It's difficult to say confidently without seeing a query plan and statistics results; however I can think of two possibilities based solely on reasoning about how the SQL queries are written:

  1. One or both of the ID columns (from FooBar and FooBalls) may be non-unique in the row set after these two tables have been inner joined. Doing two, rather than one, join to CONTAINSTABLE result sets may thus be "breeding" rather more records than a single join does; larger result sets take longer to be passed back to the client and displayed. To test this: compare the row counts returned by the two separate queries, and compare these to the row counts of each separate query if the WHERE clauses are omitted. Larger row counts will typically suggest a longer query elapsed time (all other things being equal).

  2. Each of the separate queries has been written with a left outer join, but the result set is then restricted to only include rows where the join has succeeded. This is effectively an inner join: SQL Server's query planner may well be identifying this fact and choosing an execution plan as if an inner join had been specified. Conversely, the combined query requires rows where either join (but not necessarily both) have succeeded, which is a true left join. The execution plan is likely to use different, slower, approaches for these joins. To test this: look at the execution plans, and compare to execution plans for the separate queries with inner joins requested instead of left joins.