0

I've got a query that often shows up in mysqlslow.log like this...

# Query_time: 1  Lock_time: 0  Rows_sent: 12  Rows_examined: 18040
SELECT url, classname, cr_class, HDBclsslnk.cr_cl 
FROM HDBclsslnk 
JOIN wpthillsdatabase
  ON hn_cl = hillnumber AND area = '01d' 
RIGHT JOIN ShortClassList 
  ON HDBclsslnk.cr_cl = ShortClassList.cr_class 
GROUP BY ShortClassList.cr_class 
ORDER BY NULL

The EXPLAIN for the query looks like this...

id| select_type | table                 | type   | possible_keys                                         | key             | key_len | ref                              | rows | Extra
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 |  SIMPLE     |      ShortClassList   | ALL    | NULL                                                  | NULL            | NULL    | NULL                             | 12   | Using temporary
1 |  SIMPLE     |      HDBclsslnk       | ref    | crcl-hncl,hncl-crcl                                   | crcl-hncl       | 6       | haroldst.ShortClassList.cr_class | 332  | Using index
1 | SIMPLE      |  wpthillsdatabase     | eq_ref | PRIMARY,SortIndex1,hillnumber_only,hil_lat_lon,Are... | hillnumber_only | 8       | haroldst.HDBclsslnk.hn_cl        | 1    |      

I can't understand why it examines 18040 rows but the Explain suggests its only looking at 12 x 332 x 1 rows ?

FYI the 3 tables look like this...

CREATE TABLE `wpthillsdatabase` (
  `hillnumber` varchar(6) NOT NULL default '',
  `wikipedia` varchar(100) NOT NULL default '',
  `hillname` varchar(100) NOT NULL default '',
  `meters` char(6) NOT NULL default '',
  `GridZN` char(2) NOT NULL default '',
  `GridEast` char(5) NOT NULL default '',
  `GridNorth` char(5) NOT NULL default '',
  `numeast` char(7) NOT NULL default '',
  `numnorth` char(7) NOT NULL default '',
  `areadecimal` decimal(5,2) NOT NULL default '0.00',
  `area` varchar(3) NOT NULL default '',
  `area2` varchar(100) NOT NULL default '',
  `maps` varchar(100) NOT NULL default '',
  `waypointname` char(6) character set latin1 collate latin1_bin NOT NULL default '',
  `latitude` decimal(10,8) NOT NULL default '0.00000000',
  `longitude` decimal(10,7) NOT NULL default '0.0000000',
  `area3` smallint(2) NOT NULL default '0',
  `dip` decimal(5,1) NOT NULL default '0.0' COMMENT 'col height (m)',
  `climbedbytotal` smallint(6) NOT NULL default '0',
  `trigID` varchar(6) NOT NULL default '',
  `trigEast` varchar(7) NOT NULL default '',
  `trigNorth` varchar(7) NOT NULL default '',
  `trigLat` varchar(10) NOT NULL default '',
  `trigLon` varchar(10) NOT NULL default '',
  `SummitInfo` varchar(290) NOT NULL default '',
  PRIMARY KEY  (`hillnumber`,`area3`),
  UNIQUE KEY `meters` (`meters`,`hillnumber`),
  UNIQUE KEY `SortIndex1` (`hillnumber`,`hillname`,`meters`),
  UNIQUE KEY `NearByHills` (`numeast`,`numnorth`,`hillnumber`),
  UNIQUE KEY `hillnumber_only` (`hillnumber`),
  UNIQUE KEY `Area3_Sort` (`area3`,`hillnumber`,`hillname`,`meters`),
  UNIQUE KEY `GirdZN_sort` (`GridZN`,`hillnumber`,`hillname`,`meters`),
  UNIQUE KEY `hil_lat_lon` (`hillnumber`,`latitude`,`longitude`),
  KEY `trigID` (`trigID`,`hillname`,`meters`),
  KEY `climbedbytotal` (`climbedbytotal`),
  KEY `hillname` (`hillname`,`meters`),
  KEY `area3` (`area3`,`hillnumber`),
  KEY `hillname_only` (`hillname`),
  KEY `area3_trigID` (`area3`,`trigID`),
  KEY `Area_text` (`area`,`area3`),
  KEY `dip_area3` (`dip`,`area3`),
  KEY `lat_lon` (`latitude`,`longitude`,`meters`),
  KEY `trigID_area3` (`trigID`,`area3`),
  KEY `numeast_north` (`numeast`,`numnorth`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 ;

CREATE TABLE `ShortClassList` (
  `classID` mediumint(9) NOT NULL,
  `sortseq` mediumint(9) NOT NULL,
  `cr_class` varchar(6) NOT NULL COMMENT 'class ref - class',
  `classname` varchar(20) NOT NULL,
  `url` varchar(20) NOT NULL default '',
  `areaURL` varchar(10) NOT NULL default '',
  `areatext` varchar(10) NOT NULL default '',
  PRIMARY KEY  (`classname`),
  UNIQUE KEY `sortseq` (`sortseq`),
  UNIQUE KEY `cr-class` (`cr_class`),
  KEY `areaURL` (`areaURL`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

CREATE TABLE `HDBclsslnk` (
  `classlinkID` mediumint(9) NOT NULL auto_increment,
  `hn_cl` varchar(6) NOT NULL default '' COMMENT 'hill number - class link ',
  `cr_cl` varchar(4) NOT NULL default '' COMMENT 'class ref - class link',
  PRIMARY KEY  (`classlinkID`),
  UNIQUE KEY `crcl-hncl` (`cr_cl`,`hn_cl`),
  UNIQUE KEY `hncl-crcl` (`hn_cl`,`cr_cl`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 ;

HDBclsslnk has 22,229 rows, wpthillsdatabase has 9334 rows and ShortClassList has 12 rows.

Is this normal, or can it be optimised further?

Phil Newby
  • 131
  • 8
  • Judging by the EXPLAIN output... it looks like the query if very well optimized. The row numbers in the explain are estimates which is why the rows examined number is higher than expected. – bobwienholt May 17 '13 at 16:05

2 Answers2

1

I suggest you add an index on (area, hillnumber):

ALTER TABLE wpthillsdatabase
  ADD INDEX area_hillnumber
    (area, hillnumber) ;

You can also try this rewriting:

SELECT s.url, s.classname, s.cr_class, 
       ( SELECT h.cr_cl 
         FROM HDBclsslnk AS h
           JOIN wpthillsdatabase AS w
             ON  h.hn_cl = w.hillnumber 
             AND w.area = '01d' 
         WHERE h.cr_cl = s.cr_class 
         LIMIT 1
       ) AS cr_cl
FROM ShortClassList AS s ;

and this:

SELECT s.url, s.classname, s.cr_class, t.cr_cl
FROM ShortClassList AS s 
  LEFT JOIN
    ( SELECT DISTINCT h.cr_cl 
      FROM HDBclsslnk AS h
        JOIN wpthillsdatabase AS w
          ON  h.hn_cl = w.hillnumber 
          AND w.area = '01d' 
    ) AS t
      ON t.cr_cl = s.cr_class ;
Phil Newby
  • 131
  • 8
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
  • OP changed table names, btw – Ascherer May 20 '13 at 15:55
  • Hi i've added the INDEX area_hillnumber (area, hillnumber), but it just seems to ignore it (at least as far as the explain goes). – Phil Newby May 20 '13 at 15:58
  • The 1st rewrite gives the following EXPLAIN `id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY s ALL NULL NULL NULL NULL 12 2 DEPENDENT SUBQUERY w ref PRIMARY,SortIndex1,hil_lat_lon,hillnumber_area,hil... Area_text 5 const 1 Using where 2 DEPENDENT SUBQUERY h eq_ref crcl-hncl,hncl-crcl crcl-hncl 14 haroldstreet.s.cr_class,haroldstreet.w.hillnumber 1 Using where; Using index` which looks promising. – Phil Newby May 20 '13 at 16:05
  • The 2nd gives this EXPLAIN `id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY system NULL NULL NULL NULL 0 const row not found 1 PRIMARY s ALL NULL NULL NULL NULL 12 2 DERIVED w ref PRIMARY,SortIndex1,hil_lat_lon,hillnumber_area,hil... Area_text 5 1 Using where; Using temporary 2 DERIVED h ref hncl-crcl hncl-crcl 8 haroldstreet.w.hillnumber 2 Using index` – Phil Newby May 20 '13 at 16:08
  • I'll see what the mysqlslow.log makes of them and get back to you - ta – Phil Newby May 20 '13 at 16:08
  • Is it faster with the rewritings? – ypercubeᵀᴹ May 20 '13 at 16:12
  • Yes Much faster! - Thanks i've ticked this as the correct answer – Phil Newby May 20 '13 at 16:39
0

You can try this:

SELECT
    url, 
    classname, 
    cr_class
FROM 
    ShortClassList 
WHERE
    cr_class IN (
        SELECT
            cr_cl
        FROM 
            HDBclsslnk h
        WHERE
            hn_cl IN (
                SELECT
                    hillnumber
                FROM 
                    wpthillsdatabase
                WHERE
                    area = '01d'
            )
    )
GROUP BY
     cr_class
ORDER BY 
    NULL
Phil Newby
  • 131
  • 8
Stephan
  • 8,000
  • 3
  • 36
  • 42