-1

I have a product filter based on data from multiple tables.

I make query using IN 2 and more times in one query. This works great and only selects products that have

  • and needed attributes
  • and needed options
  • and needed categories
SELECT COUNT(*) FROM `test_oc_product` 
WHERE product_id IN ( 
                        SELECT product_id 
                        FROM test_oc_product_option 
                        WHERE option_id = '21' AND value = 'Red'  
                    )
AND product_id IN ( 
                        SELECT product_id 
                        FROM test_oc_product_attribute 
                        WHERE attribute_id = '10' 
                )

Also I found INTERSECT operator. But there are some nuances

-- works ok

(SELECT product_id FROM `test_oc_product`)
INTERSECT
    (SELECT product_id 
    FROM test_oc_product_attribute 
    WHERE attribute_id = '10')
INTERSECT
    (SELECT product_id 
    FROM test_oc_product_option 
    WHERE option_id = '21' 
    AND value = 'Red')

-- doesn't work

(SELECT * FROM `test_oc_product`)
INTERSECT
    (SELECT product_id 
    FROM test_oc_product_attribute 
    WHERE attribute_id = '10'
    )
INTERSECT
    (SELECT product_id 
    FROM test_oc_product_option 
    WHERE option_id = '21' AND value = 'Red'
    )

-- doesn't work

(SELECT COUNT(*) FROM `test_oc_product`)
INTERSECT
    (SELECT product_id 
    FROM test_oc_product_attribute 
    WHERE attribute_id = '10'
    )
    
INTERSECT
    (SELECT product_id 
    FROM test_oc_product_option 
    WHERE option_id = '21' AND value = 'Red'
    )

I not found such examples with IN operator on the Internet. Is it OK to use IN 2 or more times in one sql-query?

I need to select only those products that match the given parameters. And these parameters are stored in different tables

There is a dump with some tables in an abbreviated form

CREATE TABLE test_oc_product (
  product_id int(11) NOT NULL AUTO_INCREMENT,
  model varchar(64) NOT NULL,
  sku varchar(64) NOT NULL,
   PRIMARY KEY (product_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO test_oc_product (product_id, model, sku) VALUES
(1, 'A1', 'A1'),
(2, 'A2', 'A2'),
(3, 'A3', 'A3'),
(4, 'AA', 'AA'),
(5, 'A5', 'A5'),
(6, 'A6', 'A6'),
(7, 'A7', 'A7'),
(8, 'A8', 'A8');

CREATE TABLE test_oc_product_attribute (
  product_id int(11) NOT NULL,
  attribute_id int(11) NOT NULL,
  text text NOT NULL,
  PRIMARY KEY (product_id, attribute_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO test_oc_product_attribute (product_id, attribute_id, text) VALUES 
('1', '10', 'Attribute Value 1'), 
('2', '11', 'Attribute Value 2');


CREATE TABLE test_oc_product_option (
  product_option_id int(11) NOT NULL AUTO_INCREMENT,
  product_id int(11) NOT NULL,
  option_id int(11) NOT NULL,
  value text NOT NULL,
   PRIMARY KEY (product_option_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


INSERT INTO test_oc_product_option (product_option_id, product_id, option_id, value) VALUES 
(NULL, '1', '21', 'Red'), 
(NULL, '2', '21', 'Red'),
(NULL, '3', '21', 'Blue'),
(NULL, '4', '10', 'S');
SergeTkach
  • 35
  • 3
  • 2
    You can use the operator IN as many times as you need. – forpas Apr 14 '23 at 12:22
  • If the above comment does not answer your question, what other doubts do you have here? – Tim Biegeleisen Apr 14 '23 at 12:23
  • SQL does not restrict how much `IN`'s you can use, so you _can_ use. _Should_ you use it - depends on database and query performance – Justinas Apr 14 '23 at 12:23
  • ok. I can use. I understand. But is it the right way? Or how to do it in other way if INTERSECT not works – SergeTkach Apr 14 '23 at 12:25
  • The INTERSECT operator didn't work in the last two examples because the number and types of columns in the queries must be the same. In the second example, you tried to combine a query with one column (product_id) and a query with all columns from test_oc_product. In the third example, you tried to combine a query with one count value and queries with product_id values. – Horaciux Apr 14 '23 at 12:29

2 Answers2

2

As I mentioned in the comments, you need the same number of columns to use intersect.

Why not try using JOIN instead like this:

SELECT COUNT(DISTINCT p.product_id) FROM test_oc_product AS p
JOIN test_oc_product_option AS po ON p.product_id = po.product_id
JOIN test_oc_product_attribute AS pa ON p.product_id = pa.product_id
WHERE po.option_id = '21' AND po.value = 'Red'
AND pa.attribute_id = '10';

NOTE

Kudos to @TimBiegeleisen for noticing and provided a fix

Horaciux
  • 6,322
  • 2
  • 22
  • 41
  • Thank you so much for your answer! Found in other discussions that IN will be slow on a large amount of data. So I will use JOIN – SergeTkach Apr 18 '23 at 05:58
  • @SergeTkach I'm glad it helps. Could you mind select the answer as accepted, thanks – Horaciux Apr 18 '23 at 12:25
2

There is nothing wrong per se in using IN more than once in the same query. For reference, I would probably use your first version, refactored to use EXISTS instead of IN:

SELECT COUNT(*)
FROM test_oc_product op
WHERE EXISTS (
    SELECT 1
    FROM test_oc_product_option opp
    WHERE opp.product_id = op.product_id AND opp.option_id = '21' AND opp.value = 'Red'  
) AND EXISTS (
    SELECT 1
    FROM test_oc_product_attribute opa
    WHERE opa.product_id = op.product_id AND opa.attribute_id = '10' 
);

For performance, the following two indices should speed up the two exists subqueries:

CREATE INDEX idx_opp ON test_oc_product_option (product_id, option_id, value);
CREATE INDEX idx_opa ON test_oc_product_attribute (product_id, attribute_id);
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • After several experiments, I nevertheless settled on the option with EXISTS. I didn't see any difference in speed compared to JOIN. And EXISTS is more convenient to use when it comes to a query that is assembled in parts - especially when it comes to combining several attributes. Thank you for the answer! – SergeTkach Apr 28 '23 at 10:06