0

I'm curious in how is MySQL implemented it's range columns partitioning, especially in how MySQL extracts the values from SQL, to compare with partition values range.

For example, create a table like this:

CREATE TABLE t (a INT, b INT, c INT)
PARTITION BY RANGE COLUMNS(a, b) (
  PARTITION p0 VALUES LESS THAN (10, 10),
  PARTITION p1 VALUES LESS THAN (10, 15),
  PARTITION p1 VALUES LESS THAN (20, 20),
  PARTITION p2 VALUES LESS THAN (MAXVALUE, MAXVALUE),
)

When provided with a query with complex filters like SELECT * FROM t WHERE a < 10 AND b > 10, MySQL should scan only partition p1; when provided with SELECT * FROM t WHERE a < 10 AND b > 100, partitions p0 and p1 are scanned.

It's clearly we can't naively break the filter into sub expressions and run the partition selection each separately. For example, we can't break a < 10 AND b > 10 into a < 10 and b > 10, compare them with ranges (10, 10), (10, 15), (20, 20), (MAXVALUE, MAXVALUE), find partitions each, then joins the result.

PS: This often cause misunderstanding of partition selection.

Instead, we have to use both a and b as a tuple (a, b) to compare with partition ranges. However, it seems very hard to extract their values to form the tuple from SQL filters mixed with sub-queries, joins and complex logical operations like a < 10 AND (a < 20 OR b < 20) OR b > 100. I can't help wondering how MySQL implemented it. Thanks in advance!

bayinamy
  • 487
  • 4
  • 14

1 Answers1

0

I believe your interpretation of the ranges is mistaken.

Let's switch to lastname and firstname instead of numbers. Then, thinking of a list of people sorted by last and first:

('James', 'Rick') comes after  ('James', 'Adam')
('James', 'Rick') comes before ('James', 'Zack')

and also

('Smith', '...') comes after ('James', '...')

regardless of firstname.

In other words, x comes after y if and only if

x.lastname >= y.lastname
    AND ( x.lastname > y.lastname OR
          x.firstname > y.firstname )

or, equivalently,

x.lastname > y.lastname
    OR ( x.lastname = y.lastname AND
         x.firstname > y.firstname )

It does not simply test each column separately.

This is also the way that

ORDER BY lastname, firstname

works.

What is the goal of the table?? There is likely to be no performance advantage in Partitioning with tuples.

To find out whether the tuples are handled efficiently, simply provide a EXPLAIN SELECT ... and see what the "Partitions" column says.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • yes, we can't simply test each column separately, we have to use both lastname and firstname as a tuple in comparison. I'm not interpretated it wrong in the question, I've said that separately testing is "**can't**" – bayinamy Oct 19 '22 at 05:12
  • @bayinamy -My apologies for reading your question too fast. I don't have an answer about the implementation, but I added what might be a way to get the answer for specific `WHERE` clauses. If you discover the complete answer, write an Answer and 'accept' it. – Rick James Oct 19 '22 at 05:21