MySQL intersect on strings?

Question

I do have two tables:

Quest
- (int) id 
- (text) characters

User
- (int) id
- (text) characters

Entries look like this:

Quest

id | characters
1  | abcdefgh
2  | mkorti
3  | afoxi
4  | bac

User

id | characters
1  | abcd

Now I want to select the easiest Quest for User. The easiest quest is the one if the most intersections of quest.characters and user.characters. So in this example the list would look like this (for user.id = 1):

questid | easiness
4       | 100
1       | 50
3       | 40
2       | 0

The easiness simply show how many percent was matched. Is it possible with MySQL to make intersections of columns like this? What's the performance like? In fact I do have relations as well (quest -> character and user -> characters), however I guess it's not very performant. As there are a few thousand quests and also a few thousand characters.

Update #1

Okay, relational still seems the way to go, okay. Now my tables look like this:

CREATE TABLE IF NOT EXISTS `quest` (
  `questid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`questid`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 ;

CREATE TABLE IF NOT EXISTS `questcharacters` (
  `questid` int(10) unsigned NOT NULL,
  `characterid` int(10) unsigned NOT NULL,
  PRIMARY KEY (`questid`,`characterid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `single_character` (
  `characterid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `single_char` varchar(10) NOT NULL,
  PRIMARY KEY (`characterid`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `user` (
  `userid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `usercharacters` (
  `userid` int(10) unsigned NOT NULL,
  `characterid` int(10) unsigned NOT NULL,
  PRIMARY KEY (`userid`,`characterid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

PS.: Don't wonder why single_char has VARCHAR(10) as data type, but I will use multi-byte values and I don't know how MySQL handles them for char(1). So I just was generous there.

Update #2

My query for now is:

SELECT usercharacters.userid, questcharacters.questid
FROM `usercharacters`
LEFT OUTER JOIN questcharacters ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid, usercharacters.userid;

But how to calculate the easiness/overlapping characters? On which field do I have to apply COUNT()?

Update #3

Okay, seems like I got it working with this query (uses a subselect):

SELECT usercharacters.userid as uid, questcharacters.questid as qid, (SELECT COUNT(questcharacters.characterid) FROM questcharacters LEFT OUTER JOIN usercharacters ON questcharacters.characterid = usercharacters.characterid WHERE questcharacters.questid = qid) as questcount
FROM `usercharacters`
LEFT OUTER JOIN questcharacters ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid, usercharacters.userid;

Update #4

SELECT usercharacters.userid as uid, questcharacters.questid as qid, (SELECT COUNT(questcharacters.characterid) FROM questcharacters LEFT OUTER JOIN usercharacters ON questcharacters.characterid = usercharacters.characterid WHERE questcharacters.questid = qid) as user_knows, (SELECT COUNT(questcharacters.characterid) FROM questcharacters WHERE questcharacters.questid = qid) as total_characters
FROM `usercharacters`
LEFT OUTER JOIN questcharacters ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid, usercharacters.userid
ORDER BY total_characters / user_knows DESC;

Only thing missing now: Selecting the easyiness. (As in the ORDER BY clause). Anyone knows how to do this?

You are right, quest #3 was a mistake. I've updated the first post so you can see the relations. — Langdi, Jul 15 '13 at 18:03
A `CHAR(1)` column will accomodate a single character in its character set, irrespective of the number of bytes required to store the encoded value. — eggyal, Jul 15 '13 at 18:11

score 0 · Answer 1 · edited Jul 26 '13 at 05:50

If you actually have questcharacter and usercharacters tables, then that is the best way to go:

SELECT uc.id AS userid, 
       qc.id AS qcid, 
       COUNT(*) AS NumCharacters,
       COUNT(qc.char) AS Nummatches,
       COUNT(qc.char) / count(*) AS Easiness
FROM UserCharacters uc 
   LEFT OUTER JOIN QuestCharacters qc ON uc.char = qc.char
WHERE uc.id = 1
   GROUP BY uc.id, qc.id
   ORDER BY easiness DESC
LIMIT 1

If you have them only as strings -- the SQL is not pretty. You have to do a cross join and lots of string manipulation. The best approach is to have things more normalized in the form of a relational database (one row per list element), rather than having lists embedded in strings.

Thank you. However, my linking tables only contain a compound foreign key and no additional attributes (like the char field in your example). I've tried to create a select myself, but I had difficulties, especially in sorting. I will post the Query, once I got it again. — Langdi, Jul 15 '13 at 18:03

score 0 · Answer 2 · answered Jul 15 '13 at 19:01

So this is my final and working solution:

SELECT usercharacters.userid                  AS uid, 
       questcharacters.questid                AS qid, 
       (SELECT Count(questcharacters.characterid) 
        FROM   questcharacters 
               LEFT OUTER JOIN usercharacters 
                            ON questcharacters.characterid = 
                               usercharacters.characterid 
        WHERE  questcharacters.questid = qid) AS user_knows, 
       (SELECT Count(questcharacters.characterid) 
        FROM   questcharacters 
        WHERE  questcharacters.questid = qid) AS total_characters, 
       (SELECT ( Count(questcharacters.characterid) / (SELECT 
                         Count(questcharacters.characterid) 
                                                       FROM   questcharacters 
                                                       WHERE 
                 questcharacters.questid = qid) ) 
        FROM   questcharacters 
               LEFT OUTER JOIN usercharacters 
                            ON questcharacters.characterid = 
                               usercharacters.characterid 
        WHERE  questcharacters.questid = qid) AS ratio 
FROM   `usercharacters` 
       LEFT OUTER JOIN questcharacters 
                    ON usercharacters.characterid = usercharacters.characterid 
GROUP  BY questcharacters.questid, 
          usercharacters.userid 
ORDER  BY ratio DESC;

Do I really need that many sub-selects?

MySQL intersect on strings?

Update #1

Update #2

Update #3

Update #4

2 Answers2