1

Below the query that takes almost 7's and Im confused if its normal behaviour in mysqlserver(ndb storage engine), in explain out put it also shows it is using table indexes

    SELECT radgroupreply.groupname, 
    count(distinct(radusergroup.username)) AS users 
    FROM radgroupreply                 
    JOIN radusergroup ON radgroupreply.groupname=radusergroup.groupname                                 
    WHERE
    (radgroupreply.groupname NOT LIKE 'FB-%' 
    AND radgroupreply.groupname <> 'Dropped Corporate Users' 
    AND radgroupreply.groupname <> 'Dropped Broadband Users')
    GROUP BY radgroupreply.groupname 
    UNION 
    SELECT distinct(radgroupcheck.groupname), 
    count(distinct(radusergroup.username))  
    FROM radgroupcheck 
    JOIN radusergroup ON radgroupcheck.groupname=radusergroup.groupname                
    WHERE
    (radgroupcheck.groupname NOT LIKE 'FB-%' 
    AND radgroupcheck.groupname <> 'Dropped Corporate Users' 
    )
    GROUP BY radgroupcheck.groupname ORDER BY groupname asc;

The explain output for the query is :-

  +----+--------------+---------------+-------+---------------+-----------+---------+----------------------------------+------+---------------------------------+
    | id | select_type  | table         | type  | possible_keys | key       | key_len | ref                              | rows | Extra                           |
    +----+--------------+---------------+-------+---------------+-----------+---------+----------------------------------+------+---------------------------------+
    |  1 | PRIMARY      | radgroupreply | range | groupname     | groupname | 66      | NULL                             |   47 | Using where; Using MRR          |
    |  1 | PRIMARY      | radusergroup  | ref   | groupname     | groupname | 66      | ctradius.radgroupreply.groupname |   64 | NULL                            |
    |  2 | UNION        | radgroupcheck | range | groupname     | groupname | 66      | NULL                             |   20 | Using where; Using MRR          |
    |  2 | UNION        | radusergroup  | ref   | groupname     | groupname | 66      | ctradius.radgroupcheck.groupname |  120 | NULL                            |
    |NULL| UNION RESULT | <union1,2>    | ALL   | NULL          | NULL      | NULL    | NULL                             | NULL | Using temporary; Using filesort |
    +----+--------------+---------------+-------+---------------+-----------+---------+----------------------------------+------+---------------------------------+

Below is the table structure and indexed column information for tables involved in join

Table: radgroupreply; # total 192 rows

    +-----------+------------------+------+-----+---------+----------------+
    | Field     | Type             | Null | Key | Default | Extra          |
    +-----------+------------------+------+-----+---------+----------------+
    | id        | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
    | groupname | varchar(64)      | NO   | MUL |         |                |
    | attribute | varchar(32)      | NO   |     |         |                |
    | op        | char(2)          | NO   |     | =       |                |
    | value     | varchar(253)     | NO   |     |         |                |
    +-----------+------------------+------+-----+---------+----------------+

Table: radusergroup #total: ~13000 rows

    +-----------+-------------+------+-----+---------+----------------+
    | Field     | Type        | Null | Key | Default | Extra          |
    +-----------+-------------+------+-----+---------+----------------+
    | id        | int(11)     | NO   | PRI | NULL    | auto_increment |
    | username  | varchar(64) | NO   | MUL |         |                |
    | groupname | varchar(64) | NO   | MUL |         |                |
    | priority  | int(11)     | NO   |     | 1       |                |
    +-----------+-------------+------+-----+---------+----------------+

Table: radgroupcheck #totalrows: ~ 100

    +-----------+------------------+------+-----+---------+----------------+
    | Field     | Type             | Null | Key | Default | Extra          |
    +-----------+------------------+------+-----+---------+----------------+
    | id        | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
    | groupname | varchar(64)      | NO   | MUL |         |                |
    | attribute | varchar(32)      | NO   |     |         |                |
    | op        | char(2)          | NO   |     | ==      |                |
    | value     | varchar(253)     | NO   |     |         |                |
    +-----------+------------------+------+-----+---------+----------------+


    #radusergroup# CREATE TABLE `radusergroup` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `username` varchar(64) NOT NULL DEFAULT '',
      `groupname` varchar(64) NOT NULL DEFAULT '',
      `priority` int(11) NOT NULL DEFAULT '1',
      PRIMARY KEY (`id`),
      KEY `groupname` (`groupname`),
      KEY `username` (`username`)
    ) ENGINE=ndbcluster AUTO_INCREMENT=12380 DEFAULT CHARSET=latin1

   #show index from radusergroup
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | Table         | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | radgroupreply |          0 | PRIMARY   |            1 | id          | A         |         192 |     NULL | NULL   |      | BTREE      |         |               |
    | radgroupreply |          1 | groupname |            1 | groupname   | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

    #radgroupreply# CREATE TABLE `radgroupreply` (
    `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
    `groupname` varchar(64) NOT NULL DEFAULT '',
    `attribute` varchar(32) NOT NULL DEFAULT '',
    `op` char(2) NOT NULL DEFAULT '=',
    `value` varchar(253) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `groupname` (`groupname`)
    ) ENGINE=ndbcluster AUTO_INCREMENT=2410 DEFAULT CHARSET=latin1
    mysql> show index from radgroupreply;

    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | Table         | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | radgroupreply |          0 | PRIMARY   |            1 | id          | A         |         192 |     NULL | NULL   |      | BTREE      |         |               |
    | radgroupreply |          1 | groupname |            1 | groupname   | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

   #radgroupcheck#  CREATE TABLE `radgroupcheck` (
      `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
      `groupname` varchar(64) NOT NULL DEFAULT '',
      `attribute` varchar(32) NOT NULL DEFAULT '',
      `op` char(2) NOT NULL DEFAULT '==',
      `value` varchar(253) NOT NULL DEFAULT '',
      PRIMARY KEY (`id`),
      KEY `groupname` (`groupname`)
    ) ENGINE=ndbcluster AUTO_INCREMENT=588 DEFAULT CHARSET=latin1

     mysql> show index from radgroupcheck;
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | Table         | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | radgroupcheck |          0 | PRIMARY   |            1 | id          | A         |         103 |     NULL | NULL   |      | BTREE      |         |               |
    | radgroupcheck |          1 | groupname |            1 | groupname   | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
    +---------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+


mysql>  SELECT @@GLOBAL.tx_isolation, @@tx_isolation, @@session.tx_isolation; 
+-----------------------+----------------+------------------------+
| @@GLOBAL.tx_isolation | @@tx_isolation | @@session.tx_isolation |
 +-----------------------+----------------+------------------------+
| READ-COMMITTED        | READ-COMMITTED | READ-COMMITTED         |
+-----------------------+----------------+------------------------+

Updated: Query

(SELECT radgroupreply.groupname, 
    count(distinct(radusergroup.username)) AS users 
    FROM radgroupreply                 
    JOIN radusergroup ON radgroupreply.groupname=radusergroup.groupname                                 
    WHERE
    (radgroupreply.groupname NOT LIKE 'FB-%' AND radgroupreply.groupname <> 'Dropped Corporate Users' AND radgroupreply.groupname <> 'Dropped Broadband Users')
    GROUP BY radgroupreply.groupname )
    UNION 
    (SELECT radgroupcheck.groupname, 
    count(distinct(radusergroup.username))  
    FROM radgroupcheck 
    JOIN radusergroup ON radgroupcheck.groupname=radusergroup.groupname                
    WHERE
    (radgroupcheck.groupname NOT LIKE 'FB-%' AND radgroupcheck.groupname <> 'Dropped Corporate Users')
    GROUP BY radgroupcheck.groupname ORDER BY groupname asc);

Explain:-

+----+--------------+---------------+-------+---------------+-----------+---------+----------------------------------+------+------------------------+
    | id | select_type  | table         | type  | possible_keys | key       | key_len | ref                              | rows | Extra                  |
    +----+--------------+---------------+-------+---------------+-----------+---------+----------------------------------+------+------------------------+
    |  1 | PRIMARY      | radgroupreply | range | groupname     | groupname | 66      | NULL                             |   47 | Using where; Using MRR |
    |  1 | PRIMARY      | radusergroup  | ref   | groupname     | groupname | 66      | ctradius.radgroupreply.groupname |   64 | NULL                   |
    |  2 | UNION        | radgroupcheck | range | groupname     | groupname | 66      | NULL                             |   20 | Using where; Using MRR |
    |  2 | UNION        | radusergroup  | ref   | groupname     | groupname | 66      | ctradius.radgroupcheck.groupname |  121 | NULL                   |
    |NULL| UNION RESULT | <union1,2>    | ALL   | NULL          | NULL      | NULL    | NULL                             | NULL | Using temporary        |
    +----+--------------+---------------+-------+---------------+-----------+---------+----------------------------------+------+------------------------+
sherpaurgen
  • 3,028
  • 6
  • 32
  • 45
  • Don't focus too much on the phrase "sending data." Threads sometimes report themselves as being in that state before they are in fact "sending" anything. Is there a reason for using `UNION` (which in modern versions implicitly means `UNION DISTINCT`) rather than explicitly `UNION ALL`? In the 2nd query, `SELECT distinct(radgroupcheck.groupname),` should not be `DISTINCT` -- it's a group by so it is necessarily distinct -- and unnecessary use of `DISTINCT` can be a real performance killer and you appear to be doing that twice. Something of an antipattern red flag to the eye of a DBA. – Michael - sqlbot Nov 18 '16 at 11:30
  • `DISTINCT(a,b)` -- `DISTINCT` is not a function. – Rick James Nov 18 '16 at 20:25
  • 1
    Why check for 'Corporate' twice in the second part? – Rick James Nov 18 '16 at 20:26
  • Please provide `SHOW CREATE TABLE`; `DESCRIBE` is not descriptive enough to be clear on the index(es). – Rick James Nov 18 '16 at 20:28
  • Did you want the `ORDER BY ` to apply to the second `SELECT`? Or the `UNION`? – Rick James Nov 18 '16 at 20:31
  • @RickJames i have updated the query,index and explain output. – sherpaurgen Nov 20 '16 at 11:16
  • The `ORDER BY` is now explicitly tied to the second `SELECT`; it belongs _outside_ the parentheses. – Rick James Nov 21 '16 at 06:16
  • Can there be duplicates between the two `SELECTs`? If not change to `UNION ALL`, which is faster than `UNION DISTINCT`. – Rick James Nov 21 '16 at 06:19

2 Answers2

2

Since you are already GROUPing BY that column change SELECT distinct(radgroupcheck.groupname) to SELECT radgroupcheck.groupname

Add parentheses to clarify that the UNION is being sorted, not merely the last SELECT:

( SELECT ... ) UNION ( SELECT ... ) ORDER BY ...

Change the many:many table according to the hints here (except for the Engine).

group_reply and group_check seem to have identical schemas; some reason for them to be separate?

Is there a unique column (or pair of columns) in group_reply and group_check that could be used for the PRIMARY KEY?

Revamp

SELECT  groupname, 
        ( SELECT  count(distinct username)
            FROM  radusergroup  WHERE groupname = u.groupname 
        ) AS users
    FROM  ( 
            (
                SELECT  r.groupname
                    FROM  radgroupreply AS r
                    WHERE  r.groupname NOT LIKE 'FB-%'
                      AND  r.groupname <> 'Dropped Corporate Users'
                      AND  r.groupname <> 'Dropped Broadband Users' 
            )
            UNION  DISTINCT --  or  UNION  ALL ? 
            (
                SELECT  c.groupname
                    FROM  radgroupcheck AS c
                    WHERE  c.groupname NOT LIKE 'FB-%'
                      AND  c.groupname <> 'Dropped Corporate Users'
            ) 
          ) AS u
    ORDER BY  groupname asc 
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • There are no unique columns in groupreply and groupcheck, i dont know why groupreply and groupcheck are kept separate as those tables were already there (legacy system related to free radius). I have changed the query (updated in post with explain output) but the execution time is almost identical 4.4-5.5 seconds . Thank you for the link its very helpful – sherpaurgen Nov 21 '16 at 06:11
  • @satch_boogie - I reformulated the query; see if it works faster. – Rick James Nov 21 '16 at 06:31
  • wow! this query finshed in < 2 secs on several runs , this query returned more rows than the query im using currently but im checking the results if they match – sherpaurgen Nov 21 '16 at 06:56
  • The resulting output is same as old one :) but performance is faster, (there are some extra records) is it due to excluding the joins ? – sherpaurgen Nov 21 '16 at 07:19
  • In the "Revamp", `UNION DISTINCT` gets rid of dup groupnames. What "extra records" are you seeing? – Rick James Nov 21 '16 at 16:43
1

I would rewrite that query to this one:

SELECT radgroupreply.groupname, 
       count(distinct radusergroup.username  ) AS users,
       count(distinct CASE 
                      WHEN radgroupreply.groupname <> 'Dropped Broadband Users'
                      THEN radusergroup.username
             END )  As users_without_dropped_broadband
FROM radgroupreply                 
JOIN radusergroup ON radgroupreply.groupname=radusergroup.groupname                                 
WHERE
     (radgroupreply.groupname NOT LIKE 'FB-%' 
     AND radgroupreply.groupname <> 'Dropped Corporate Users' )
GROUP BY radgroupreply.groupname 
ORDER BY groupname asc
;

The above query displays results in a slightly different form:

| gropname | users | users_without_dropped_broadband |
| A        |    20 |                              13 |
| B        |    33 |                              11 |

than the layout of the oryginal query:

| gropname | users |
| A        |    20 |
| A        |    13 |
| B        |    33 |
| B        |    11 |

but this new query performs all operations (reads tables, joins, counts etc.) only once,
not twice as the oryginal query, and should be at least 50% faster (<3,5 sec).

krokodilko
  • 35,300
  • 7
  • 55
  • 79