1

Our code is written in C and DB is Informix. We are doing some code optimization in ESQL program and found the following query:

UPDATE [TABLE] SET [PRIMARY KEY COLUMN] = [NEW KEY] WHERE COL = ? OR COL = ? ...

The number of columns in the where clause is dynamically prepared. We have an upper count check to stop preparing the where clause and do an update, then come back and start preparing the remaining. The UPPER_MARK is "30"

if ( where_parameter_count >= UPPER_MARK )
__ execute update__
__ clean_and_continue; /* start prepare again */__

We thought of replacing the WHERE .. OR with WHERE ... IN

UPDATE [TABLE] SET [PRIMARY KEY COLUMN] = [NEW KEY] WHERE COL IN (?, ? ...)

Will this be faster than "where or"?

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Liju Mathew
  • 871
  • 1
  • 18
  • 31
  • 1
    What did you find when you tried this? Either could be faster or it could turn out to make no difference. – Peter Lawrey Sep 12 '11 at 06:59
  • There is no difference when tested over a sample set of data (over 1 million records and where contains 30 OR clause). – Liju Mathew Sep 12 '11 at 09:29
  • 1
    That doesn't surprise me. If you make a major change to your database server, like a version change it could be different. It could also mean that if you optimised the code one way, a major change could mean the other way was faster. – Peter Lawrey Sep 12 '11 at 09:32
  • Thank you Peter, whether the "AND" and "OR" are bit operations on the indexes and expected to be faster? Do you have any thoughts on that? – Liju Mathew Sep 13 '11 at 05:50
  • either could be faster depending on how much work it has to do. AND can be made faster if you query matches your indexes. e.g. if you A = n AND B = m AND C = p, and you have an index on A,B,C in that order it can be very fast. – Peter Lawrey Sep 13 '11 at 06:41
  • Thank you Peter for the explanation. – Liju Mathew Sep 13 '11 at 10:59

1 Answers1

3

Either syntax resolves down to the same thing as far as the Query Optimizer is concerned. It won't make any difference.

The only advantage to the in-list approach is that adding additional predicates won't change the logic, i.e.

x IN (a, b) AND y = z

and

x = a OR x = b AND y = z

are NOT the same thing.

RET
  • 9,100
  • 1
  • 28
  • 33
  • But the difference is (trivially) resolved by enclosing the OR'd terms in parentheses. – Jonathan Leffler Sep 12 '11 at 19:06
  • Well yes, of course. But when we're talking about dynamically generated code, I think it's a useful distinction to make. – RET Sep 12 '11 at 20:47
  • Thank you for the suggestions. The actual code is just "OR" alone and yes, I understand the above example. The testing on the sample data also didn't provide any major time difference. One of the team member shared that "AND" and "OR" are bit operations on the indexes and expected to be faster. Do you guys got any thoughts on that? – Liju Mathew Sep 13 '11 at 05:51
  • Faster than what? I'm not sure I follow what your colleague is suggesting. – RET Sep 13 '11 at 05:56
  • Sorry for creating confusion, the question was "whether OR operation is faster than IN, since OR is a bit operation"? – Liju Mathew Sep 13 '11 at 08:13
  • No. As stated previously, either syntax resolves to the same thing in Query Optimization. I'd suggest that the evaluation of the expressions is where the expense is, rather than the logical OR of the resulting truth or otherwise. – RET Sep 14 '11 at 01:10
  • if you have few possible predicates, example: ("A","B","C","D","E"), it could be faster to say NOT IN ("A","B") versus saying IN ("C","D","E"). – Joe R. Oct 05 '11 at 07:54
  • ..although it might be easier to maintain code if you use the explicit IN ("C","D","E") because new predicates could be added in the future as requirements change. – Joe R. Oct 05 '11 at 22:16
  • @FrankComputer: I think the difference would be infinitesimal. And given the OP's original question pertained to dynamic SQL, pretty much impossible to implement. Furthermore, it is not good design philosophy to use complementary logic in this manner. What happens when there is an 'F' introduced to the column? You should state the business rule as it's defined: one of ("C","D","E") if that's the rule, or not one of ("A","B") if that's the rule. – RET Oct 05 '11 at 22:32
  • @FrankComputer: we're obviously on the same page, figuratively and literally :-) I didn't see your second comment until I'd posted mine. – RET Oct 05 '11 at 22:36
  • I thought the question pertained to "which is faster?..." having to evaluate more predicates vs. less predicates doesn't cost more? – Joe R. Oct 05 '11 at 22:38
  • Well, the original question pertained to the speed of sequential `OR` statements vs an `IN`-list. We're getting off on a bit of a tangent here... – RET Oct 05 '11 at 22:46
  • OK, same difference, I guess?.. just thought that evaluating extra OR's cost more than evaluating less OR's – Joe R. Oct 15 '11 at 02:36