1

I have dozens of tables, that I have to join constantly using three string fields.

f1, f2 and f3 are always not null and each of them have a fixed number of characters.

I know that that's not very optimal but I can only query the database, I am not in charge of the design.

The condition I am using for the queries is:

concat(table1.f1, table1.f2, table1.f3) = concat (table2.f1, table2.f2, table2.f3)

These queries are against a database with millions of registries so queries always take a few minutes.

But bow, I am thinking that perhaps the join is quicker if I write the following?

table1.f1 = table2.f1
and table1.f2 = table2.f2
and table1.f3 = table2.f3

I am thinking that maybe that way the database can make use of indexes to join only the desired rows?

Sometimes I am very restrictive about the rows that must be joined but I think that in a join query all the rows are always matched and the rows are discarded once the tables have been joined.

If I know that table2.f6='whatever' and table2.f7='whatever' etc. does it make sense that the query joins ALL the rows and then discards most of them?

Because I think that the following three queries are the same, in terms of how optimal they are. Am I right?

SELECT ...
FROM table1
INNER JOIN table2
ON table1.f1 = table2.f1
    and table1.f2 = table2.f2
    and table1.f3 = table2.f3
and table2.f6 = 'whatever here'



SELECT ...
FROM table1
INNER JOIN table2
ON table1.f1 = table2.f1
    and table1.f2 = table2.f2
    and table1.f3 = table2.f3
WHEN table2.f6 = 'whatever here'



SELECT ...
FROM table1, table2
WHEN table1.f1 = table2.f1
    and table1.f2 = table2.f2
    and table1.f3 = table2.f3
    and table2.f6 = 'whatever here'

So, should I use concat or not, and what is the best way to restrict the rows in a join query to make it faster?

Thanks!

  • 4
    yes, concatting like that will prevent the use of indexes, meaning the db has to create new strings from the fields, compare those strings, then throw the string away. – Marc B Oct 09 '14 at 17:09
  • could you give me some refereneces of this, to warn my workmates with a reference that cannot be doubt? –  Oct 09 '14 at 17:25
  • @user4126054 The term you are looking for is [sargable](http://en.wikipedia.org/wiki/Sargable). – Andrew Morton Oct 09 '14 at 17:45
  • You tagged your question with Teradata, but CONCAT is no valid syntax in Teradata. Regarding the three variations, all of them will return the same result and should have exactly the same plan (as long as you're not using Outer Joins) – dnoeth Oct 09 '14 at 18:41
  • Definitely, concat(string1, string2, ...) works in my Teradata SQL Assistant. –  Oct 10 '14 at 06:49

1 Answers1

3

Ingnoring any speedups/efficients, note that doing bulk concatentation comparions v.s. individual comparisons WILL give you false/invalid results:

Given four fields and their values:

w      x            y    z
-----------------------------
abcd   ef           ab   cdef

and you're comparing the w/x pair against y/z, then:

concat(w,x) == concat(y,z)                (w == y) && (x == z)
'abcd' + 'ef' == 'ab' + 'cdef'            ('abcd' == 'ab') && ('abcd' == 'ef')
'abcdef' == 'abcdef'                      false && false
TRUE                                      FALSE
Marc B
  • 356,200
  • 43
  • 426
  • 500
  • @Marc B: In my original question I said: "f1, f2 and f3 are always not null and each of them have a fixed number of characters". So that example will never occur in my database: w and y will have the same number of characters, and the same for the pair x and z. –  Oct 10 '14 at 06:50
  • @user4126054 they are not null and fixed-size _now_. data (and table) structure tend to change over time. and when they will (and they will, it's only a matter of time), your query will not break, invalidate, or whatever - and this is the biggest pitfall here - it will just provide you with invalid data, and you will never find it out. – Kirill Leontev Oct 13 '14 at 01:08