2

I have 4 tables: places, brands, categories, locations.

The relationship are Places belongTo Brands and Places ManytoMany with Categories and Locations.

I want to get search results Places with certain category and location, but only showing 1 place per brand.

Tables Info places table contains about 100k+ rows

place_category pivot table contains 650k+ rows with place_category.place_id and places.brand_id column are indexed

place_locations pivot table contains about 550k+ rows, with place_location.place_id and place_location.location_id column are indexed

The query I got so far

Place::join('place_location', function ($join) use ($city) {
    $join->on('place_location.place_id', '=', 'places.id')
         ->where('place_location.location_id', '=',  $city->id);
})
->join('place_category', function ($join) {
    $join->on('place_category.place_id', '=', 'places.id')
         ->where('place_category.category_id', '=',  $category->id);
})
->groupBy('places.brand_id')
->take(5)
->get();

The groupBy causing the slow, query time is about 2 sec.

The Explain result looks like this

id | select_type | table          | possible_key            | key            | key_len | ref                | rows | Extra

1  | SIMPLE      | places         | PRIMARY                 | brand_id       | 4       | NULL               | 50   | Using where

1  | SIMPLE      | place_location | place_id,place_location | place_location | 4       | const,db.places.id | 1    | Using index

1  | SIMPLE      | place_category | place_category          | place_category | 4       | db.places.id,const | 1    | Using where; Using index

The Raw Mysql query looks like this

select 
    `places`.`id`, 
    `places`.`name`, 
    `places`.`display`, 
    `places`.`status_a`, 
    `places`.`status_b`, 
    `places`.`brand_id`, 
    `places`.`address` 
from `places` 
inner join `place_location` 
    on `place_location`.`place_id` = `places`.`id` 
    and `place_location`.`location_id` = 4047 
inner join `place_category` 
    on `place_category`.`place_id` = `places`.`id` 
    and `place_category`.`category_id` = 102 
where 
    `places`.`status_a` != 1 
    and `status_b` = 2 
    and `display` >= 5 
group by `places`.`brand_id` 
limit 4

Show Create Table looks like this

CREATE TABLE `places` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `user_id` int(11) unsigned DEFAULT NULL,
 `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `desc` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
 `city_id` int(11) unsigned NOT NULL DEFAULT '102',
 `state_id` int(11) unsigned NOT NULL DEFAULT '34',
 `location_id` int(11) unsigned NOT NULL DEFAULT '15',
 `landmark_id` int(10) unsigned NOT NULL DEFAULT '1',
 `postcode` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
 `country_id` int(4) unsigned NOT NULL,
 `lat` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
 `long` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
 `phone` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
 `sec_phone` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
 `third_phone` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
 `fourth_phone` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
 `brand_id` int(10) NOT NULL DEFAULT '1',
 `display` int(11) NOT NULL DEFAULT '0',
 `view` int(10) unsigned NOT NULL DEFAULT '0',
 `status_b` tinyint(3) unsigned NOT NULL DEFAULT '2',
 `status_a` tinyint(4) NOT NULL DEFAULT '2',
 `company_name` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
 `slug` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `lock_status_id` tinyint(3) unsigned DEFAULT '1',
 `created_at` timestamp NULL DEFAULT NULL,
 `updated_at` timestamp NULL DEFAULT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `slug` (`slug`),
 KEY `city_id` (`city_id`),
 KEY `location_id` (`location_id`),
 KEY `user_id` (`user_id`),
 KEY `landmark_id` (`landmark_id`),
 KEY `name` (`name`),
 KEY `brand_id` (`brand_id`),
 KEY `groupby_brandid` (`status_b`, `display`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=116070 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci


CREATE TABLE `place_location` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `location_id` int(10) NOT NULL,
 `place_id` int(10) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `place_location` (`place_id`,`location_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=564259 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `place_category` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `category_id` int(11) unsigned NOT NULL,
 `place_id` int(11) unsigned NOT NULL,
 `branch_id` int(11) unsigned NOT NULL,
 `created_at` timestamp NULL DEFAULT NULL,
 `updated_at` timestamp NULL DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `place_id` (`place_id`),
 KEY `place_category` (`category_id`,`place_id`)
) ENGINE=InnoDB AUTO_INCREMENT=905384 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Any idea how to improve the query ? wrong indexes? or bad query?

IGP
  • 14,160
  • 4
  • 26
  • 43
Tom Kur
  • 2,248
  • 1
  • 18
  • 28
  • As an exercise, can you also include the raw MySQL query which you are currently running from your Laravel code? It is somewhat difficult to see what that query is, just by seeing your PHP code. – Tim Biegeleisen Nov 08 '19 at 07:17
  • just added the raw MySQL – Tom Kur Nov 08 '19 at 07:25
  • is the index on place_location is it an index with 2 keys or 2 indexes with 1 key? – mrhn Nov 16 '19 at 15:46
  • its a index with 2 column, the column order of the index is place_id and location_id columns. I think the problem cause by groupBY – Tom Kur Nov 16 '19 at 16:23
  • 4
    Note(not related to performance)Grouping by `brand_id` and all other columns are not wrapped with aggregate function will most likely return undeterministic results(I doubt that all coulmn in product table are functionaly depenent on brand_id) More: [Group by clause in mySQL and postgreSQL, why the error in postgreSQL?](https://stackoverflow.com/a/33629201/5070879) – Lukasz Szozda Nov 16 '19 at 19:13
  • Please provide `SHOW CREATE TABLE` for each table. – Rick James Nov 17 '19 at 02:16
  • The `EXPLAIN` mentions "place", but the raw sql does not have such. What's up?? – Rick James Nov 17 '19 at 02:22
  • I would claim that 2 seconds is not too bad for this kind of problem and data size. That doesn't mean, that it can't be improved. But one would need to know more about the data - especially the selectivity of the conditions. I would guess that `place_location.location_id = 4047` is the most selective single condition, so the engine should start from that table. But you don't have a supporting index, which has already been suggested in a deleted answer. – Paul Spiegel Nov 17 '19 at 15:37
  • "but only showing 1 place per brand". How do you decide which place to show? – Salman A Nov 21 '19 at 19:21

4 Answers4

0

First of all try executing below query

explain 
select  `products`.`id`, `products`.`name`, `products`.`display`,
        `products`.`status_a`, `products`.`status_b`, `products`.`brand_id`,
        `products`.`address`
    from  `products`
    inner join  `product_location`  ON `product_location`.`product_id` = `products`.`id`
      and  `product_location`.`location_id` = 4047
    inner join  `product_category`  ON `product_category`.`product_id` = `products`.`id`
      and  `product_category`.`category_id` = 102
    where  `products`.`status_a` != 1
      and  `status_b` = 2
      and  `display` >= 5
    group by  `products`.`brand_id`
    limit  4 

EXPLAIN SELECT statement shows how the MySQL query optimizer will execute the query

Indexes can improve performance, indexes can also negatively impact performance if there are too many of them. This is because the more indexes a table has, the more work MySQL must do to keep them updated. The trick is to find the right balance between enough indexes to improve performance, but not so many that they negatively impact performance.

Again try to add the index and execute the same explain statement

Hope it will help Thanks...

Rick James
  • 135,179
  • 13
  • 127
  • 222
Shrirang Kadale
  • 472
  • 1
  • 4
  • 16
0
`products`.`status_a` != 1 
and `status_b` = 2 
and `display` >= 5 

invites this

INDEX(status_b, display)

Is this a 0/1 flag?

`products`.`status_a` != 1 

If so, then change to

`products`.`status_a` = 0

Then you can make an even better

INDEX(status_b, status_a, display)

product_location and product_category sound like many-to-many mapping tables. They need composite indexes as discussed here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table Be sure the datatypes match.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • I am using composite index for both pivot table product_location and product_category now, (I just updated the EXPLAIN on the question) but its still slow, it takes about 10-12s with groupby, if I remove groupby, it takes below 1s for `products`.`status_a` != 1 , there are 3 values so its not 0/1 – Tom Kur Nov 17 '19 at 02:44
  • I tried your answer to index INDEX(status_b, status_a, display), the EXPLAIN shows the mysql doesnt use the index, it keeps using the single column brand_id index. – Tom Kur Nov 17 '19 at 02:49
  • @TomKur - `!=` usually cannot be indexed, hence the failure of that 3-column index. I really need to see `SHOW CREATE TABLE` for each table. – Rick James Nov 17 '19 at 03:03
  • I updated the questions with SHOW CREATE TABLE for the places table and both pivot table – Tom Kur Nov 17 '19 at 03:22
  • Please follow my advice for the many-to-many tables and `INDEX(status_b, display)`. – Rick James Nov 17 '19 at 03:28
  • And be aware that `INDEX(a,b)` is _not_ the same as `INDEX(b,a)`, nor `INDEX(a),INDEX(b)`. – Rick James Nov 17 '19 at 03:29
  • I tried your index, but the mysql keep using the brand_id index, it wont using the groupby_brandid index with INDEX(status_b, display), should I remove the brand_id index ? – Tom Kur Nov 17 '19 at 03:39
  • Yes, I am using INDEX(status_b, display), same columns order – Tom Kur Nov 17 '19 at 03:42
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/202502/discussion-between-tom-kur-and-rick-james). – Tom Kur Nov 17 '19 at 03:56
0

Why you are using join query, you can use laravel eloquent relations for that

Place::whereHas('placeLocation', function ($query) use ($city) {
    $query->where('location_id', '=',  $city->id);
})
->whereHas('placeCategory', function ($query) {
    $query->where('category_id', '=',  $category->id);
})
->groupBy('brand_id')
->take(5)
->get();

placeLocation (HasMany) and placeCategory(BelongsToMany) both relations are you have to write in Place model.

0

Usually the database will not be able to merge multiple indexes during query execution. This means, that it won't help to create a single column index for everything on your table.

You seem to be using single column indexes a lot. Try to combine them in such a way that it satisfies your query.

How do multi indexes work?

When creating database indexes I always try to explain them based on an index in a cook book: those are often nested indexes. First they are categorized in meal types, like soup, dish, salad and so on. Inside of those categories they are sorted alphabetically.

This index would look like that in SQL:

KEY `recipe` (`meal_type_id`, `name`)

As a human, you would now be able to find a cheesecake by first going to the dishes (first index) and then go to the letter "C" (second index).

So in this case a multi column index is very helpful.

Optimize table places:

You're using status_a, status_b and display in the WHERE clause, and also brand_id in GROUP BY. For your multi column index try to find a meaningful combination of those columns.

Order is important here!

If, for example, only 10 % of your data matches on status_b = 2 then you should use that field as first column in the index, as it would eliminate 90 % of the rows. The second index column has a lot less to do then. Think of the cheesecake example above: we already knew that the cheesecake is a dish, so we looked directly at the dishes and were able to eliminate the other 90 % of the recipes.

This is called "cardinality".

Optimize table place_location and place_category:

Don't forget to also look at the joined tables and ensure that they are properly indexed as well.

Just like mentioned before try to find a useful index on those tables also. Have a look on what your query is asking for and try to satisfy it with a helpful index.

How many indexes should a table have?

To answer this, you'll only need to understand that indexes have to be updated on every INSERT or UPDATE. So if it's a write intensive table, you should use as little indexes as possible.

If the table is read intensive but will not be written to very frequently, it should be okay to have a little more indexes. Just keep in mind that they will need memory. If your table has millions of rows, your index can get quite large if you're using many indexes.

wolfgang_rt
  • 154
  • 1
  • 7