-1

I have an innoDB table named "transaction" with ~1.5 million rows. I would like to partition this table (probably on column "gas_station_id" since it is used a lot in join queries) but I've read in MySQL 5.7 Reference Manual that

All columns used in the table's partitioning expression must be part of every unique key that the table may have, including any primary key.

I have two questions:

  1. The column "gas_station_id" is not part of unique key or primary key. How could I partition this table then?
  2. even if I could partition this table, I am not sure which partitioning type would be better in this case? (I was thinking about LIST partitioning (we have about 40 different(distinct) gas stations) but I am not sure since there will be only one value in each list partition like the following : ALTER TABLE transaction PARTITION BY LIST(gas_station_id) ( PARTITION p1 VALUES IN (9001), PARTITION p2 VALUES IN (9002),.....)
  3. I tried partitioning by KEY, but I receive the following error (I think because id is not part of all unique keys..):

#1053 - a UNIQUE INDEX must include all columns in the table's partitioning function

This is the structure of the "transaction" table:

enter image description here

EDIT and this is what SHOW CREATE TABLE shows:

CREATE TABLE `transaction` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `terminal_transaction_id` int(11) NOT NULL,
 `fuel_terminal_id` int(11) NOT NULL,
 `fuel_terminal_serial` int(11) NOT NULL,
 `xboard_id` int(11) NOT NULL,
 `gas_station_id` int(11) NOT NULL,
 `operator_id` varchar(16) NOT NULL,
 `shift_id` int(11) NOT NULL,
 `xboard_total_counter` int(11) NOT NULL,
 `fuel_type` tinyint(2) NOT NULL,
 `start_fuel_time` int(11) NOT NULL,
 `end_fuel_time` int(11) DEFAULT NULL,
 `preset_amount` int(11) NOT NULL,
 `actual_amount` int(11) DEFAULT NULL,
 `fuel_cost` int(11) DEFAULT NULL,
 `payment_cost` int(11) DEFAULT NULL,
 `purchase_type` int(11) NOT NULL,
 `payment_ref_id` text,
 `unit_fuel_price` int(11) NOT NULL,
 `fuel_status_id` int(11) DEFAULT NULL,
 `fuel_mode_id` int(11) NOT NULL,
 `payment_result` int(11) NOT NULL,
 `card_pan` varchar(20) DEFAULT NULL,
 `state` int(11) DEFAULT NULL,
 `totalizer` int(11) NOT NULL DEFAULT '0',
 `shift_start_time` int(11) DEFAULT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `terminal_transaction_id` (`terminal_transaction_id`,`fuel_terminal_id`,`start_fuel_time`) USING BTREE,
 KEY `start_fuel_time_idx` (`start_fuel_time`),
 KEY `fuel_terminal_idx` (`fuel_terminal_id`),
 KEY `xboard_idx` (`xboard_id`),
 KEY `gas_station_id` (`gas_station_id`) USING BTREE,
 KEY `purchase_type` (`purchase_type`) USING BTREE,
 KEY `shift_start_time` (`shift_start_time`) USING BTREE,
 KEY `fuel_type` (`fuel_type`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1665335 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
mOna
  • 2,341
  • 9
  • 36
  • 60
  • 1
    If that table has a billion rows, beware. `id INT SIGNED` is limited to about 2 billion. – Rick James Sep 18 '16 at 05:28
  • 1
    Many of those `INTs` don't need to be a full 4 bytes; shrink them to space space, thereby improve cacheability, hence better speed. – Rick James Sep 18 '16 at 05:29

1 Answers1

1

Short answer: Don't use PARTITION. Let's see the query to help speed it up.

Long answer:

  • 1.5M rows is only marginally big enough to consider partitioning.
  • PARTITION BY LIST is probably useless for performance.
  • You have not given enough info to give you answers other that vague hints. Please provide at least SHOW CREATE TABLE and the slow SELECT.
  • It is possible to add the partition key onto the end of the PRIMARY or UNIQUE key; you will lose the uniqueness test.
  • Don't index a low-cardinality column; it won't be used.

More on PARTITION

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thanks Rick for your help. I edited my question and added `SHOW CREATE TABLE`. Actually at the moment the transaction table has 1.5M rows. We will have similar table with much more records (around 1 billion rows). I increased the `innodb_buffer_pool_size` and now queries are much faster, but I am not sure whether changing this variable alone would be enough for the bigger table that we will have in the near future. that is why I was thinking about partitioning this table. – mOna Sep 18 '16 at 05:05
  • The buffer_pool is an important factor in performance; it should be about 70% of available RAM. – Rick James Sep 18 '16 at 05:28
  • Can't judge whether a query will run faster without seeing the query. – Rick James Sep 18 '16 at 05:30
  • Thanks for your answers. Right now all queries are fast (just by increasing the buffer_pool). Do you think even if this table had 1 billion rows, we may not need partitioning only with increasing the buffer_pool (I increased it to ~70% of available RAM, i.e., 16G) – mOna Sep 18 '16 at 07:17
  • Partitioning does not intrinsically provide any speedup. See my link for the only 4 use cases where partitioning helps. – Rick James Sep 18 '16 at 14:13