select COUNT(p.id) AS `num`, cat.id, cat.name, cat.parent_id AS `parent_id`
from products p
INNER JOIN `products_categories` AS `pc` ON p.id=pc.products_id
INNER JOIN `categories` AS `cat` ON pc.categories_id=cat.id
WHERE p.status = 1 AND p.gender IN ('female','neutral')
group by cat.id
Explain query:
1 SIMPLE p ref PRIMARY,gender,status status 1 const 139107 Using where; Using temporary; Using filesort
1 SIMPLE pc ref products_id,categories products_id 4 mydb.p.id 1 Using index
1 SIMPLE cat eq_ref PRIMARY,categoryname PRIMARY 4 mydb.pc.categories_id 1 Using where
Relevant indexes:
products 0 PRIMARY 1 id A 299339 BTREE
products 1 title 1 title A 299339 BTREE
products 1 sku 1 sku A 299339 BTREE
products 1 body 1 body A 299339 200 BTREE
products 1 short_description 1 short_description A 299339 200 YES BTREE
products 1 keywords 1 keywords A 2 200 BTREE
products 1 gender 1 gender A 10 BTREE
products 1 status 1 status A 2 BTREE
products 1 brand_id 1 brand_id A 3741 YES BTREE
products 1 merchant 1 merchant_id A 52 BTREE
products 1 title_2 1 title,body,keywords 299339 FULLTEXT
products 1 title_3 1 title 299339 FULLTEXT
products 1 body_2 1 body 299339 FULLTEXT
products_categories 0 PRIMARY 1 id A 514054 BTREE
products_categories 1 products_id 1 products_id, categories_id A 514054 BTREE
products_categories 1 categories 1 categories_id A 266 BTREE
categories 0 PRIMARY 1 id A 154 BTREE
categories 1 categoryname 1 name A 154 BTREE
This is a DB with products, categories and an N:N relationship between them. Products can be in 1 or more category.
I basically need a query that tells me, for the current product filters I have (in this case the status and gender) if the category has any product or not (so I can hide categories without products). Currently I count the products inside each category to know this.
The query WHERE parameters will change according to the user selected filters so that part is not very important in this optimization.
I don't need the exact number of products a category, just if they have products or not. Products table has quite a few indices, has the products_categories and categories table. Products table has about 400k products, 150 categories and 500k products_categories.
MySQL 5.6.22 hosted on AWS RDS, all tables in InnoDB.
I understand my explain query shows why this is slow (going through a lot of products) but I don't have any ideas how to optimize this... maybe a different way to think about this?