2

I had a table of 200GB with an index of 49GB. Only insert and update operation happens to that table. I dropped the existing index and created a new one on the same columns. New index size is only 6GB. I am using postgres database

Can someone explain how index size got reduced from 50GB to 6GB?

Shailendra
  • 528
  • 2
  • 10
  • 21
  • It's not "fragmentation", that's known as "table bloat": See e.g. here https://www.keithf4.com/checking-for-postgresql-bloat/ and here: https://wiki.postgresql.org/wiki/Index_Maintenance –  Jul 04 '18 at 18:43

1 Answers1

4

The newly created index is essentially optimally packed sorted data. To put some more data somewhere in the middle, while still maintaining the optimal packed sorted data you'd have to rewrite half of the index with every insert on average.

This is not acceptable, so the database uses some complicated and clever format for indexes (based on a b-tree data structure) that allow for changing the order of index blocks without moving them on disk. But the consequence of this is that after inserting some data in the middle some of the index data blocks are not 100% packed. The space left can be used in the future but only if the values inserted match to the block with regards of ordering.

So, depending on your usage pattern, you can easily have index blocks only 10% packed on average.

This is compounded by the fact that when you update a row both old and new version have to be present in the index at the same time. And if you do a bulk update of the whole table then the index will have to expand to contain twice the number of rows, although briefly. But it will not shrink back as easily, as this requires basically a rewrite of it all.

The index size tend to grow first and then stabilize after some usage. But the stable size is often nowhere near the size of a newly created one.

You might want to tune the autovacuum to be more aggressive - so the not needed anymore space in table and indexes is recovered faster and therefore can be reused faster. This can make your index stabilize faster and smaller. Also try to avoid too big bulk updates or do a vacuum full tablename after a huge update.

Tometzky
  • 22,573
  • 5
  • 59
  • 73