I have installed postgres 12 and made a test with partitions (shown below). My question is, why there is no significant performance gain inserting data, even with ¼ billion rows in the table, when comparing 5 partitions vs 5 partitions with 5 subpartitions.
My goal is to import data fast in usually big tables with + ¼ billion rows, and I was of the impression, that having more partitions would reduce the size of the indexes and increase the speed of inserting data.
This was the setup for the test:
Machine: Local pc, 16 GB ram
Postgres version: 12
Partition test: Table a) 5 hash partitions with 5 hash subpartitions. Table b) 5 hash partitions
Tablesetup (example with only 5 partitions)
CREATE TABLE public.only_5_partitions
(
id integer NOT NULL,
title character varying COLLATE pg_catalog."default",
project_id integer
) PARTITION BY HASH (id) ;
--INDEXES ON ALL 3 COLUMNS NOT ADDED IN THE CODE EXAMPLE, BUT THERE ARE INDEXES ON ALL 3.
--THE PARTITIONS
CREATE TABLE public.only_5_partitions_0 PARTITION OF public.only_5_partitions
FOR VALUES WITH (modulus 5, remainder 0)
PARTITION BY HASH (id);
CREATE TABLE public.only_5_partitions_1 PARTITION OF public.only_5_partitions
FOR VALUES WITH (modulus 5, remainder 1)
PARTITION BY HASH (id);
CREATE TABLE public.only_5_partitions_2 PARTITION OF public.only_5_partitions
FOR VALUES WITH (modulus 5, remainder 2)
PARTITION BY HASH (id);
CREATE TABLE public.only_5_partitions_3 PARTITION OF public.only_5_partitions
FOR VALUES WITH (modulus 5, remainder 3)
PARTITION BY HASH (id);
CREATE TABLE public.only_5_partitions_4 PARTITION OF public.only_5_partitions
FOR VALUES WITH (modulus 5, remainder 4)
PARTITION BY HASH (id);
Rows inserted: Generated rows with this sample code:
INSERT INTO tableb
SELECT generate_series(1,10000000), 'someting new', generate_series(1,10000000);
As you can see from the test, inserting data in table A (only 5 partitions) more or less are the same as for table B. In some runs the small number of partitions even performed better.
In the last insert, I increased the insert to 50 mio rows, to detect a performance change.