0

I'm facing having to write a number of queries which will involve table scans on some fairly large tables (couple of million records).

(I know this is a terrible idea and I'm pushing for an alternative approach that won't involve this, but right now it's what I'm stuck with. And creating indexes on the relevant fields will probably not be an approved option either.)

For some tables, there are multiple fields which may match the target value. I would think, but I'd just like confirmation, that a single query with OR would be quicker than multiple queries. That is:

SELECT keyinfo FROM TABLE1 WHERE field1 = target OR field2 = target OR field3 = target;

would be better than

SELECT whatever FROM TABLE1 WHERE field1 = target; 
SELECT whatever FROM TABLE1 WHERE field2 = target; 
SELECT whatever FROM TABLE1 WHERE field3 = target; 

Intuitively this seems correct, since it only has to scan the table once. But it does have to test multiple fields, so not sure how much of the gain you lose on that. A quick test bears this out, but I'm not sure if there's any temporary indexes being built in the background which are affecting the results.

Further info: I would expect that for most cases no match will be found, but I do need all matches if any exist, so I can't just limit it to FETCH FIRST 1 ROWS ONLY. Also, it's z/OS DB2.

Given that I'll probably need to run this for multiple targets, I'm probably going to end up pulling all the relevant fields out into a temporary table, as I should be able to build an index on that, and overall that should be much faster. But I'm just curious now about the table scan question.

mustaccio
  • 18,234
  • 16
  • 48
  • 57
Mick O'Hea
  • 1,619
  • 2
  • 14
  • 20

2 Answers2

1

The or is not necessarily better. Databases can have a hard time using indexes for or conditions on multiple fields. DB2 might be good enough to optimize for this case. But, you can compare the performance (and execution plan) to:

SELECT whatever FROM TABLE1 WHERE field1 = target
UNION ALL
SELECT whatever FROM TABLE1 WHERE field2 = target AND field1 <> target
UNION ALL
SELECT whatever FROM TABLE1 WHERE field3 = target AND field1 <> targe AND field2 <> target; 
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

The database builds an execution plan based on the query without the values for the parameters. This query is used every time the query is run regardless of what the actual values are supplied.

I think, your best bet would be to look at the execution plan or explain plan of both kind of queries; using OR and using UNION clause.

fabfas
  • 2,200
  • 1
  • 21
  • 21