2

I have two tables tableA and tableB as described bellow

desc tableA
GUID
PROPERTY_NAME
PROPERY_VALUE
ANOTHER_COL_1
ANTHER_COL_2

desc tableB
GUID
PROPERY_NAME
PROPERTY_VALUE
RANDOM_COL_1
RANDOME_COL_2

I have the following query to fetch the list of entries which have matching values for property_name which here is "URL"

SELECT A.GUID as SOURCE_GUID, 
       B.GUID as DESTINATION_GUID 
FROM
       tableA A,
       tableB B
WHERE
       A.PROPERTY_NAME = "URL" AND
       A.PROPERY_NAME  = B.PROPERTY_NAME AND
       A.PROPERTY_VALUE = B.PROPERTY_VALUE

The issue here is as propery values are URLs, they may or may not have trailing slash. I want the join to happen regardless of trailing slash. Please suggest a better way of achieving the same with out using an PL/SQL procedures to remove the trailing slash.

Ram
  • 3,034
  • 11
  • 38
  • 46
  • "Please suggest a better way of achieving the same with out using an PL/SQL procedures to remove the trailing slash" Translation: *I have poor data quality processes but I want to keep them because I like creating work for myself.* – APC Nov 29 '11 at 10:48

3 Answers3

7

You have two options.

The first is clean your data before use.

Before any URL is stored in the system, add or remove (as prefered) a slash if necessary. In this way ensure that all URLs are stored in the same format for ease of use.

The second is clean your data at run time.

I do not advocate these options as they incur unnecessary overheads and can prevent optimal use of indexes.

RTRIM(A.PROPERTY_VALUE, '/') = RTRIM(B.PROPERTY_VALUE, '/')

Cleanest code, but likely prevent index use.

  • Values from both Table A and Table B are being modified
  • This likely requires a SCAN on both tables

Or...

  • (Ensure B.PROPERTY_VALUE ends with '/', then do the comparison); OR
  • (Ensure B.PROPERTY_VALUE does NOT end with '/', then do the comparison)

If either of these conditions are true, the URLs match.

A.PROPERTY_VALUE = (CASE WHEN RIGHT(B.PROPERTY_VALUE, 1) = '/' THEN B.PROPERTY_VALUE ELSE B.PROPERTY_VALUE + '/' END)
OR
A.PROPERTY_VALUE = (CASE WHEN RIGHT(B.PROPERTY_VALUE, 1) = '/' THEN RTRIM(B.PROPERTY_VALUE, '/') ELSE B.PROPERTY_VALUE END)

Much messier, but May be more index friendly

  • Only the B table values are being altered
  • a SCAN on B is necessary
  • an INDEX SEEK on A is now possible
Allan
  • 17,141
  • 4
  • 52
  • 69
MatBailie
  • 83,401
  • 18
  • 103
  • 137
  • +1 for data cleansing upfront. Otherwise, if we insist on loading shonky data we have to handle it *every time we use it*. – APC Nov 29 '11 at 10:46
  • Then you really, Really, want to start cleaning up your data when it's being loaded. That way you can encapsulate your "clean-up" code in one place, AND using that data then becomes ALOT more performant. – MatBailie Nov 29 '11 at 21:53
3

You can easily remove trailing slashes using the RTRIM function:

...
AND RTRIM(A.PROPERTY_VALUE,'/') = RTRIM(B.PROPERTY_VALUE,'/')
Erich Kitzmueller
  • 36,381
  • 5
  • 80
  • 102
3
SELECT A.GUID as SOURCE_GUID, 
       B.GUID as DESTINATION_GUID 
FROM   tableA A,
       tableB B
WHERE  A.PROPERTY_NAME = "URL"
AND    A.PROPERY_NAME = B.PROPERTY_NAME
AND    RTRIM(A.PROPERTY_VALUE, '/') = RTRIM(B.PROPERTY_VALUE, '/')
Xophmeister
  • 8,884
  • 4
  • 44
  • 87