2

I created an index in Postgres like this:

CREATE UNIQUE INDEX my_index_name 
       ON my_table USING btree (custid, 
                                date_trunc('day'::text, timezone('UTC'::text, somedate)),
                                firstname,
                                middlename,
                                lastname);

I monitored the free disk space to get an estimate on the progress of the index creation, I expected to see the available space going down indicating the process was doing its job. The problem is that after 40 minutes of going down, it got stuck for 25 minutes, and then it started consuming disk space again:

enter image description here

When it seemed stuck, I checked the long running processes to see if something was blocking it (unlikely, this is a DB copy nobody else is using), and I saw there were 3 different identical "CREATE INDEX" processes.

That is what I want to ask about:

  • Why does Postgres show 3 different processes?
  • What was it doing during this period where it seemed stuck?

This is the command I issued to see the long running processes, after the process got unstuck, only the process 18511 continued running:

my_user => SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
    FROM pg_stat_activity
    WHERE (now() - query_start) > interval '5 minutes' AND state != 'idle'
    ORDER by 2 DESC;

-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------
pid      | 18511                                                                                                                                                          
duration | 01:04:37.969599                                                                                                                                                
query    | CREATE UNIQUE INDEX my_index_name ON my_table USING btree (custid, date_trunc('day'::text, timezone('UTC'::text, somedate)), firstname, middlename, lastname); 
state    | active                                                                                                                                                         
-[ RECORD 2 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------
pid      | 12712
duration | 01:04:37.969599
query    | CREATE UNIQUE INDEX my_index_name ON my_table USING btree (custid, date_trunc('day'::text, timezone('UTC'::text, somedate)), firstname, middlename, lastname);
state    | active
-[ RECORD 3 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------
pid      | 12713
duration | 01:04:37.969599
query    | CREATE UNIQUE INDEX my_index_name ON my_table USING btree (custid, date_trunc('day'::text, timezone('UTC'::text, somedate)), firstname, middlename, lastname);
state    | active

Some other information

  • I'm running Postgres 11.5 in Amazon's RDS
  • This table's size is 255 GiB
  • In that graph, the Y axis is given in MiB, so at the top we have 800GiB
Daniel
  • 21,933
  • 14
  • 72
  • 101
  • I guess you changed column names and missed on some ```org_id```s in the query result? Anyways, I have no idea why there are multiple backends running.. Perhaps somoething is done in parallel? Did you see the processes ```12712``` and ```12713``` only while it was stuck or also before? – Islingre Dec 06 '19 at 02:45
  • Is it a partitioned table? – JGH Dec 06 '19 at 03:13
  • @Islingre Yes, I mangled the column names and forgot that one, fixed. I only saw those 2 processes during the issue. At the very beginning they were not there, and after the issue, they were not there either. – Daniel Dec 06 '19 at 05:12
  • @JGH This `my_table` is the default partition of a partitioned table. – Daniel Dec 06 '19 at 05:13

1 Answers1

5

If you see more processes creating an index, you have PostgreSQL v11 or later, and there are parallel worker processes building the index. This is nothing to worry about; it will consume more resources, but build the index faster.

There are several steps in building an index: scanning the table, sorting the entries and so on. Not all of these steps will consume disk space. For example, sorting should not consume increasing amounts of storage.

In short, everything looks as it should.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263