BK's proposition that lots of self joins are harmful is misleading.
Consider an EAV data set comprising of 10,000 entities, each with 12 attributes as follows:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(entity INT NOT NULL
,attribute INT NOT NULL
,value INT NOT NULL
,PRIMARY KEY(entity,attribute)
);
INSERT INTO my_table VALUES
(1,101,RAND()*100),
(1,102,RAND()*100),
(1,103,RAND()*100),
(1,104,RAND()*100),
(1,105,RAND()*100),
(1,106,RAND()*100),
(1,107,RAND()*100),
(1,108,RAND()*100),
(1,109,RAND()*100),
(1,110,RAND()*100),
(1,111,RAND()*100),
(1,112,RAND()*100);
With this initial seed, I can use a table of integers (0-9) to rapidly populate the rest of the table...
INSERT IGNORE INTO my_table SELECT i4.i*1000+i3.i*100+i2.i*10+i1.i+1, attribute, RAND()*100 FROM my_table,ints i1, ints i2, ints i3, ints i4;
Bill's query...
SELECT SQL_NO_CACHE a.entity
, MAX(CASE WHEN attribute = 101 THEN value END) x101
, MAX(CASE WHEN attribute = 102 THEN value END) x102
, MAX(CASE WHEN attribute = 103 THEN value END) x103
, MAX(CASE WHEN attribute = 104 THEN value END) x104
, MAX(CASE WHEN attribute = 105 THEN value END) x105
, MAX(CASE WHEN attribute = 106 THEN value END) x106
, MAX(CASE WHEN attribute = 107 THEN value END) x107
, MAX(CASE WHEN attribute = 108 THEN value END) x108
, MAX(CASE WHEN attribute = 109 THEN value END) x109
, MAX(CASE WHEN attribute = 110 THEN value END) x110
, MAX(CASE WHEN attribute = 111 THEN value END) x111
, MAX(CASE WHEN attribute = 112 THEN value END) x112
FROM my_table a
GROUP
BY a.entity;
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| entity | x101 | x102 | x103 | x104 | x105 | x106 | x107 | x108 | x109 | x110 | x111 | x112 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| 1 | 78 | 8 | 4 | 95 | 66 | 43 | 16 | 51 | 9 | 89 | 20 | 33 |
...
| 9998 | 61 | 72 | 67 | 20 | 23 | 10 | 31 | 37 | 69 | 18 | 24 | 32 |
| 9999 | 67 | 91 | 32 | 58 | 77 | 81 | 61 | 22 | 75 | 65 | 91 | 42 |
| 10000 | 52 | 38 | 56 | 32 | 14 | 77 | 10 | 99 | 70 | 70 | 82 | 13 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
10000 rows in set (0.20 sec)
The alternative...
SELECT SQL_NO_CACHE a.entity
, a.value x101
, b.value x102
, c.value x103
, d.value x104
, e.value x105
, f.value x106
, g.value x107
, h.value x108
, i.value x109
, j.value x110
, k.value x111
, l.value x112
FROM my_table a
LEFT JOIN my_table b ON b.entity = a.entity AND b.attribute = 102
LEFT JOIN my_table c ON c.entity = a.entity AND c.attribute = 103
LEFT JOIN my_table d ON d.entity = a.entity AND d.attribute = 104
LEFT JOIN my_table e ON e.entity = a.entity AND e.attribute = 105
LEFT JOIN my_table f ON f.entity = a.entity AND f.attribute = 106
LEFT JOIN my_table g ON g.entity = a.entity AND g.attribute = 107
LEFT JOIN my_table h ON h.entity = a.entity AND h.attribute = 108
LEFT JOIN my_table i ON i.entity = a.entity AND i.attribute = 109
LEFT JOIN my_table j ON j.entity = a.entity AND j.attribute = 110
LEFT JOIN my_table k ON k.entity = a.entity AND k.attribute = 111
LEFT JOIN my_table l ON l.entity = a.entity AND l.attribute = 112
WHERE a.attribute = 101;
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| entity | x101 | x102 | x103 | x104 | x105 | x106 | x107 | x108 | x109 | x110 | x111 | x112 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| 1 | 78 | 8 | 4 | 95 | 66 | 43 | 16 | 51 | 9 | 89 | 20 | 33 |
...
| 9998 | 61 | 72 | 67 | 20 | 23 | 10 | 31 | 37 | 69 | 18 | 24 | 32 |
| 9999 | 67 | 91 | 32 | 58 | 77 | 81 | 61 | 22 | 75 | 65 | 91 | 42 |
| 10000 | 52 | 38 | 56 | 32 | 14 | 77 | 10 | 99 | 70 | 70 | 82 | 13 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
10000 rows in set (0.23 sec)
So, Bill's query IS fractionally faster. However, as soon as you reduce the number of entities sought (while maintaining the same number of attributes - so same number of joins), the alternative query can overtake Bill's by something approaching the same kind of margin...
Bill's query with WHERE a.entity <= 5000
added
| 4998 | 59 | 55 | 93 | 48 | 72 | 32 | 38 | 36 | 6 | 82 | 23 | 62 |
| 4999 | 23 | 10 | 11 | 29 | 69 | 67 | 92 | 72 | 25 | 49 | 79 | 48 |
| 5000 | 39 | 86 | 77 | 0 | 30 | 38 | 48 | 54 | 9 | 97 | 25 | 54 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
5000 rows in set (0.12 sec)
The alternative with WHERE a.entity <= 5000
added
| 4998 | 59 | 55 | 93 | 48 | 72 | 32 | 38 | 36 | 6 | 82 | 23 | 62 |
| 4999 | 23 | 10 | 11 | 29 | 69 | 67 | 92 | 72 | 25 | 49 | 79 | 48 |
| 5000 | 39 | 86 | 77 | 0 | 30 | 38 | 48 | 54 | 9 | 97 | 25 | 54 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
5000 rows in set (0.11 sec)
So it isn't really the number of joins, but rather the assiduous use of indexes that makes the difference between a slow and a fast query.