11

I have simple but long query which count the content of the result it takes about 14 seconds. the count itself on the main table takes less than a second but after multiple join the delay is too high as follow

Select  Count(Distinct visits.id) As Count_id
    From  visits
    Left Join  clients_locations  ON visits.client_location_id = clients_locations.id
    Left Join  clients  ON clients_locations.client_id = clients.id
    Left Join  locations  ON clients_locations.location_id = locations.id
    Left Join  users  ON visits.user_id = users.id
    Left Join  potentialities  ON clients_locations.potentiality = potentialities.id
    Left Join  classes  ON clients_locations.class = classes.id
    Left Join  professions  ON clients.profession_id = professions.id
    Inner Join  specialties  ON clients.specialty_id = specialties.id
    Left Join  districts  ON locations.district_id = districts.id
    Left Join  provinces  ON districts.province_id = provinces.id
    Left Join  locations_types  ON locations.location_type_id = locations_types.id
    Left Join  areas  ON clients_locations.area_id = areas.id
    Left Join  calls  ON calls.visit_id = visits.id 

The output of explain is

+---+---+---+---+---+---+---+---+---+---+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+---+---+---+---+---+---+---+---+---+---+
| 1 | SIMPLE | specialties | index | PRIMARY | specialty_name | 52 | NULL | 53 | Using index |
| 1 | SIMPLE | clients | ref | PRIMARY,specialty | specialty | 4 | crm_db.specialties.id | 143 |  |
| 1 | SIMPLE | clients_locations | ref | PRIMARY,client_id | client_id | 4 | crm_db.clients.id | 1 |  |
| 1 | SIMPLE | locations | eq_ref | PRIMARY | PRIMARY | 4 | crm_db.clients_locations.location_id | 1 |  |
| 1 | SIMPLE | districts | eq_ref | PRIMARY | PRIMARY | 4 | crm_db.locations.district_id | 1 | Using where |
| 1 | SIMPLE | visits | ref | unique_visit,client_location_id | unique_visit | 4 | crm_db.clients_locations.id | 4 | Using index |
| 1 | SIMPLE | calls | ref | call_unique,visit_id | call_unique | 4 | crm_db.visits.id | 1 | Using index |
+---+---+---+---+---+---+---+---+---+---+

Update 1 The above query used with dynamic where statement $sql = $sql . "Where ". $whereFilter but the i submitted it in simple form . So do not consider the answer just eleminate the joins :)

Update 2 Here is example of dynamic filtering

$temp = $this->province_id;
if ($temp != null) {
        $whereFilter = $whereFilter . " and provinces.id In ($temp) ";
    }

But in startup case which is our case no where statement

Rick James
  • 135,179
  • 13
  • 127
  • 222
Eslam Sameh Ahmed
  • 3,792
  • 2
  • 24
  • 40
  • Why do you need to join if you're just counting unique ID in visits? – ebyrob Aug 12 '16 at 17:34
  • 1
    You seem to have no filtering, so why do you need the `count(distinct)`. I mean, there *is* an `inner join` or two sprinkled among the `join`s, but it seems that the joins aren't necessary. – Gordon Linoff Aug 12 '16 at 17:34
  • I use later this query with dynamic where statement 'Where $whereFilter' – Eslam Sameh Ahmed Aug 12 '16 at 17:35
  • It might be better to use sub-queries in the where clause so you can include only relevant tables to the filter. Also, more detail about how you use/generate the where clause might help us understand better. Also, use separate queries or in-memory tables to convert to ID's actually on the visit_id index before applying it will be much faster. – ebyrob Aug 12 '16 at 17:38
  • In simple case where user is not submitting any criteria the filter is where 1 =1 – Eslam Sameh Ahmed Aug 12 '16 at 17:42
  • yes in most cases there is no exact relation. i.e. Some visits has no calls but must be counted. – Eslam Sameh Ahmed Aug 12 '16 at 18:08
  • In Update 2, I stated i want to consider the case where no filter is needed or where filter is always true – Eslam Sameh Ahmed Aug 12 '16 at 19:13
  • 1
    When a "visit" has 0 "calls", does it count as 1? When "visit" has 3 "calls", does it count as 1? Or as 3? This is critical to understanding and possibly eliminating the `LEFT JOINs`. – Rick James Aug 13 '16 at 19:13
  • When visit has 0 calls or more it is one visit. Calls used in filtering for example Visits without calls yet (call.id is null) or done visits (calls.id != null) – Eslam Sameh Ahmed Aug 13 '16 at 22:03
  • 1
    If you are dynamically adding where clauses that use fields from these left joins you are changing them to inner joins. That will then give you incorrect results. – HLGEM Aug 17 '16 at 17:12
  • http://wiki.lessthandot.com/index.php/WHERE_conditions_on_a_LEFT_JOIN – HLGEM Aug 17 '16 at 17:14
  • Please show your table definitions including indexes. – philipxy Aug 19 '16 at 07:38
  • You can build a table to prevent the need to run this query all the time. – hxtree Aug 23 '16 at 18:27

6 Answers6

8

Left joins always return a row from the first table, but may return multiple rows if there are multiple matching rows. But because you are counting distinct visit rows, left joining to another table while counting distinct visits is the same as just counting the rows of visits. Thus the only joins that affect the result are inner joins, so you can remove all "completely" left joined tables without affecting the result.

What I mean by "completely" is that some left joined tables are effectively inner joined; the inner join to specialty requires the join to clients to succeed and thus also be an inner join, which in turn requires the join to clients_locations to succeed and thus also be an inner join.

Your query (as posted) can be reduced to:

Select Count(Distinct visits.id) As Count_id
From visits
Join clients_locations ON visits.client_location_id = clients_locations.id
Join clients ON clients_locations.client_id = clients.id
Join specialties ON clients.specialty_id = specialties.id

Removing all those unnecessary joins will however greatly improve the runtime of your query, not only because there are less joins to make but also because the resulting rowset size could be enormous when you consider that the size is the product of the matches in all the tables (not the sum.

For maximum performance, create a covering indexes on all id-and-fk columns:

create index visits_id_client_location_id on visits(id, client_location_id);
create index clients_locations_id_client_id on clients_locations(id, client_id);
create index clients_id_specialty_id on clients(id, specialty_id);

so index-only scans can be used where possible. I assume there are indexes on the PK columns.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • What about calls. Should i keep it left joined with visits? – Eslam Sameh Ahmed Aug 17 '16 at 08:14
  • @eslam no, don't join to `calls`. A left join to another table will always return every row from visits. It may return *multiple* rows if there are multiple matching rows in the left joined table, but because you are counting only *distinct* visit ids, joining to multiple rows won't give you any more distinct visit ids. – Bohemian Aug 17 '16 at 15:01
4

You don't seem to have any (or much) intentional filtering. If you want to know the number of visits referred to in calls, I would propose:

select count(distinct c.visit_id)
from calls c;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
3

in order to optimize the whole process you can dynamically construct the pre-where SQL according to the filters you are going to apply. Like:


    // base select and left join 
    $preSQL = "Select  Count(Distinct visits.id) As Count_id From  visits ";
    $preSQL .= "Left Join  clients_locations  ON visits.client_location_id = clients_locations.id ";

    // filtering by province_id
    $temp = $this->province_id;
    if ($temp != null) {
            $preSQL .= "Left Join  locations ON clients_locations.location_id = locations.id ";
            $preSQL .= "Left Join  districts ON locations.district_id = districts.id ";
            $preSQL .= "Left Join  provinces ON districts.province_id = provinces.id ";
            $whereFilter = "provinces.id In ($temp) ";
        }

    $sql = $preSQL . "Where ".   $whereFilter;
    // ...

If you are using multiple filters you can put all inner/left-join strings in an array and then after analysing the request, you can construct your $preSQL using the minimum of joins.

Geo Halkiadakis
  • 370
  • 1
  • 7
1

Use COUNT(CASE WHEN visit_id!="" THEN 1 END) as visit.

Hope this will help

premi
  • 93
  • 10
1

Isn't it just:

SELECT COUNT(id)
FROM visits

because all the left outer joins also return a visits.id when theres no matching clients, ..., calls and id's ought to be unique?

Different hint: The one inner join also is only effective when a client exists. Generally when needing inner joins they must be put as high/near as possible to the source table, so in your example it would have been best in the line after "left join clients".

Alim Özdemir
  • 2,396
  • 1
  • 24
  • 35
0

I didn't understand too much your idea, specially your INNER JOIN that will tranform some LEFT in INNER JOINs, it seems strange, but lets try a solution:

Usually the LEFT JOINs has a very bad performance, and I think you'll need them only if you'll use them in WHERE clause, then you can include them with INNER JOIN only if you'll use them. For example:

$query = "Select Count(Distinct visits.id) As Count_id  From  visits ";

if($temp != null){
    $query .= " INNER JOIN  clients_locations  ON visits.client_location_id = clients_locations.id ";
    $query .= " INNER JOIN  locations  ON clients_locations.location_id = locations.id  ";
    $query .= " INNER JOIN  locations  ON clients_locations.location_id = locations.id ";
    $query .= " INNER JOIN  districts  ON locations.district_id = districts.id "
    $query .= " INNER JOIN  provinces  ON districts.province_id = provinces.id ";
    $whereFilter .= " and provinces.id In ($temp) ";
}

I think it'll help your performance and it'll works as you need.