
I have a table with columns and constraints:

height smallint,
length smallint,
diameter smallint,
volume integer,
idsensorfragments integer,
CONSTRAINT sensorstats_idsensorfragments_fkey FOREIGN KEY (idsensorfragments)
  REFERENCES sensorfragments (idsensorfragments) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE CASCADE

(no primary key). There are currently 28,978,112 records in it, but in my opinion the table is far too large.

Result of the query:

select pg_size_pretty(pg_total_relation_size('sensorstats')), pg_size_pretty(pg_relation_size('sensorstats'))

is:

"1849 MB";"1226 MB"

There is just one index, on the idsensorfragments column. Using simple math you can see that one record takes ~66.7 B (?!). Can anyone explain to me where this number comes from?

5 columns = 2 + 2 + 2 + 4 + 4 = 14 bytes. I have one index and no primary key. Where do the additional ~50 B per record come from?
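For reference, the per-row figure can be computed directly in SQL (a sketch against the question's table):

```sql
-- Bytes per row: relation size divided by row count.
-- pg_total_relation_size includes indexes and TOAST;
-- pg_relation_size is the heap (main fork) only.
SELECT pg_total_relation_size('sensorstats')::numeric / count(*) AS bytes_per_row_total,
       pg_relation_size('sensorstats')::numeric / count(*)       AS bytes_per_row_heap
FROM sensorstats;
```

With the numbers above: 1849 MB / 28,978,112 rows ≈ 66.9 B in total, of which the heap alone (1226 MB) is ≈ 44.4 B per row; the remaining ~22 B per row is the index.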

P.S. Table was vacuumed, analyzed and reindexed.

Brian Tompsett - 汤莱恩

2 Answers


You should take a look at how Database Physical Storage is organized, especially the Page Layout.

PostgreSQL keeps a number of extra fields for each tuple (row) and also for each page. Tuples are stored in pages, and the page is the unit the database operates on, typically 8192 bytes in size. So the extra space usage comes from:

  • PageHeader, 24 bytes;
  • Tupleheader, 27 bytes;
  • “invisible” Tuple versions;
  • reserved free space, according to the Storage Parameters of the table;
  • NULL indicator array;
  • (I might have missed something.)
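A rough back-of-the-envelope check (a sketch; exact header sizes vary between PostgreSQL versions): `pg_column_size` on a composite value built from the question's column types shows the row header plus alignment padding, which already accounts for much of the "missing" bytes:

```sql
-- Approximate on-disk data size of one row with the question's shape:
-- a tuple header plus 3 smallints and 2 ints, with alignment padding.
-- (pg_column_size on a ROM/composite is an approximation of the heap
-- tuple size, not an exact match.)
SELECT pg_column_size(ROW(
  1::smallint,   -- height
  1::smallint,   -- length
  1::smallint,   -- diameter
  1::integer,    -- volume
  1::integer     -- idsensorfragments
)) AS approx_tuple_bytes;
```

On top of that, every tuple needs a 4-byte line pointer in the page's ItemId array, and every index entry carries its own per-entry header in addition to the 4-byte integer key.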

The physical storage layout changes between major releases, which is why you have to perform a full dump/restore when upgrading. In recent versions, pg_upgrade is of great help in this process.

vyegorov
  • You missed the null-indicator array (4 of the OP's columns are nullable). I don't know about alignment; personally, I would opt to align and pad the shorts to 4-byte boundaries. – wildplasser Jul 03 '12 at 12:05
  • Thanks a lot! I'm just thinking about how to solve it, because I expect my app will insert approx. 2–5 million records per day(!) :D I will of course aggregate this, but for some time I need to store the raw data. Any thoughts? :) Thanks! – user1414355 Jul 03 '12 at 13:16
  • @user1414355, You need to look into partitioning your table, as quite soon you'll hit performance issues. [This answer](http://stackoverflow.com/a/1011329/1154462) shows an example and more details are [in docs](http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html). – vyegorov Jul 03 '12 at 13:20
  • Thanks! Now I am thinking of moving the whole server to the Amazon cloud, so I will seriously consider this solution there. Thanks mate, one more time :) – user1414355 Jul 04 '12 at 09:46
  • @user1414355, you're welcome! it is a good practice to accept answers in order to close them. – vyegorov Jul 04 '12 at 11:10

Did you do a VACUUM FULL or CLUSTER? If not, the unused space is still reserved for this table and its index. Those statements rewrite the table; a plain VACUUM (without FULL) does not.
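For example (a sketch; note that VACUUM FULL takes an exclusive lock and rewrites the whole table, so run it in a maintenance window):

```sql
-- Check the size, rewrite the table compactly, then compare.
SELECT pg_size_pretty(pg_total_relation_size('sensorstats'));

VACUUM FULL sensorstats;
-- or, to also physically order the rows by an index:
-- CLUSTER sensorstats USING <name of the idsensorfragments index>;

SELECT pg_size_pretty(pg_total_relation_size('sensorstats'));
```

On PostgreSQL 9.0 and later, VACUUM FULL rewrites the indexes as well; on older versions it could leave them bloated, so a separate REINDEX TABLE sensorstats; afterwards was advisable there.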

Frank Heikens