Multi-dimensional search on a table

Question

I have the following table that stores the preferences of the users in the system

UserId | Product   | Brand     | City      |
-------------------------------------------
A      | Soap      | Tide      | NYC       |
A      | Cereal    | Dont-care | NYC       |
B      | Dont-Care | Tide      | Dont-care |
C      | Shampoo   | Dont-care | Dont-Care |

I would like to search this based on user provided search values. So if one searches for

City: NYC, Brand: Tide

the output should be:

    A    | Soap      | Tide      | NYC       |
    B    | Dont-Care | Tide      | Dont-care |

where as if they search for

Brand: Tide, Product: Soap

the result should be:

    A    | Soap      | Tide      | NYC       |

The current solution I have, is the following query (where null represents 'don't care') going against a MySQL table:

select *
from user_preferences
where (product is null or product = <user provided value>)
and (brand is null or brand = <user provided value>)
and (city is null or city = <user provided value>)

Though it works as expected, the [and + (or)] combination makes me think this is not the right way to do this. I am also fairly certain that once the dataset increases, the query will not perform well.

What would be most efficient way of storing and retrieving such data? Are there any no-sql type approaches that can be used to make this efficient?

Update

After some googling around I figured the approach I have may be the safest bet. One factor I still am ambivalent about with this approach is that adding another 'searchable' attribute would mean adding a new column.

This blog about the EAV anti-pattern provides some good reading material on such a scheme. Also see how friend-feed uses MySQL for another take on storing variable attributes in a table.

score 0 · Answer 1 · answered Jul 02 '11 at 02:42

0

You could use the IFNULL() function.

It would be something like this:

select *
from user_preferences
where ifnull(product, <user provided value>) = <user provided value>
and ifnull(brand, <user provided value>) = <user provided value>
and ifnull(city, <user provided value>) = <user provided value>

answered Jul 02 '11 at 02:42

Julien Ducro

854
1
9
26

Just an FYI... this would require that all of the user provided values are valued with something (i.e. can't be NULL) – Tom H Jul 02 '11 at 04:55
If the is empty it also works and get the lines with NULL, as long as you ask for an empty string and not NULL. – Julien Ducro Jul 02 '11 at 05:18
Yes. As I said, they can't be NULL. – Tom H Jul 03 '11 at 02:51

score 0 · Answer 2 · answered Jul 03 '11 at 10:39

based on @Pompom6784's answer -

but logically it has very different conditions-

select * from user_preferences 
where product = nvl(<user provided value> , product)
and brand = nvl(<user provided value> ,brand)
and city = nvl(<user provided value> ,city)

I am a oracle pl sql developer so please check above query having syntex of oracle sql.. if in your database ifnull is the function to check for null values please replace nvl() with ifnull()

@Pompom6784 you had to apply ifnull on < user provided value > not on column values ..

score 0 · Answer 3 · answered Jul 03 '11 at 13:08

Since you ask how to store such data, here's what I'd probably do:

            USERS
            userid  PK
            name
            etc


            PREFTYPES
            preftypeid  PK
            prefname



           PREFTYPEVALUES
           preftypeid  foreign key references PREFTYPES(preftypeid)
           value

           composite primary key (preftypeid, value)

PREFTYPEVALUES table above lets you constrain the possible value-entries in the USERPREFS table:

            USERPREFS
            userid      foreign key references USERS(userid)
            preftypeid  
            value

           => composite  foreign key(preftypeid, value) references PREFTYPEVALUES(preftypeid, value)

To combine the two sets of users, the first set being those who prefer Tide for their soap and the second set being those who prefer NYC as their city, you could UNION the two sets:

             select U.userid, U.name, UP.value
             from USERS U 
             inner join USERPREFS UP on U.userid = UP.userid
             where 
             (UP.preftypeid = 'soap' and UP.value = 'Tide' )


              UNION

             select U.userid, U.name, UP.value
             from USERS U 
             inner join USERPREFS UP on U.userid = UP.userid
             where                  
             (UP.preftypeid = 'city' and UP.value = 'NYC')

but this structure would also permit queries that do not use UNION, of course.

             select U.userid, U.name, UP.value
             from USERS U 
             inner join USERPREFS UP on U.userid = UP.userid
             where 
             (UP.preftypeid = 'soap' and UP.value = 'Tide' )
              OR
             (UP.preftypeid = 'city' and UP.value = 'NYC')

I bet your next question would be: how to get a single row for each user, with the user's preferences side-by-side?

              user-A, Tide, NYC

That denormalization is better left to the presentation layer, although it can be done in SQL using a variety of inelegant approaches. The SQL key here is composite (user,preftype).

Doesn't the USERPREFS table require another column to bind instances of preferences together? How would you ensure Soap+Tide+NYC constitues one match whereas Cereal+NYC constitues another? — calvinkrishy, Jul 05 '11 at 00:16

Multi-dimensional search on a table

3 Answers3