2

I have a Database table where I need to validate if a user has entered in the same or partly the same information.

Here is what I'm thinking

The db layout

rec_id (pk), user_id,
name, phone, address_1, address_2, zip, 
company, co_phone, co_address_1, co_address_2, co_zip, 
billing, bi_phone, bi_address_1, bi_address_2, bi_zip

The Query

SELECT rec_id 
FROM tbl_name
WHERE user_id = '123456789'
OR '1112223333' IN (phone, co_phone, bi_phone)
OR 'John Doe' IN (name, business, billing)
OR '12345' IN (zip, co_zip, bi_zip)
OR '123 main street' IN (address_1, co_address_1, bi_address_1)
OR 'po box 123' IN (address_2, co_address_2, bi_address_2)

If any of the data matches (and yes the will be false positives) I need the old rec_id.

Wanted to know if there is a better way to do this?

Thanks

Thorarin
  • 47,289
  • 11
  • 75
  • 111
Phill Pafford
  • 83,471
  • 91
  • 263
  • 383

1 Answers1

2

For this query to perform well you will need separate indexes for each of the columns you are testing. A combined index on all columns won't help at all for OR conditions (it would help if you had ANDs though).

However I would imagine that your query result in a full table scan regardless of which indexes you add. You might want to try using UNIONs instead of OR to see if that makes a difference:

SELECT rec_id FROM tbl_name WHERE tax_id = '123456789'
UNION
SELECT rec_id FROM tbl_name WHERE phone = '1112223333'
UNION
SELECT rec_id FROM tbl_name WHERE co_phone = '1112223333'
UNION
SELECT rec_id FROM tbl_name WHERE bi_phone = '1112223333'
UNION
SELECT rec_id FROM tbl_name WHERE name = 'John Doe'
UNION
SELECT rec_id FROM tbl_name WHERE business = 'John Doe'
UNION
SELECT rec_id FROM tbl_name WHERE billing = 'John Doe'
UNION
-- etc...

The idea of rewriting it like this is that now each subquery will be able to use an index (assuming of course that you have added the necessary indexes).

Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
  • hmm interesting, might have to try this one. – Phill Pafford Jan 19 '11 at 21:44
  • 1
    @Phill Pafford: UNION ALL is faster than UNION because it doesn't remove duplicates, but I imagine that you *do* want to remove duplicates here. – Mark Byers Jan 19 '11 at 22:03
  • Thanks but duplicates isn't a concern but speed it. If I find a record or two it doesn't matter, what matters is there are/is a duplicate account. Thanks for the tips – Phill Pafford Jan 20 '11 at 14:17