6

I have a table that holds two 32bit integers. These two integers are used to create the clustered index. The SQL for creating the table is as follows,

CREATE TABLE `a` (
  `var1` int(10) unsigned NOT NULL,
  `var2` int(10) unsigned NOT NULL,
  PRIMARY KEY (`var2`,`var1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
  • There are 6921338 rows in this table (retrieved by using SELECT COUNT(*) FROM a)
  • The table uses 358580224 bytes of space (retrieved by using SHOW TABLE STATUS)

According to my calculations, each row uses 51.8 bytes. I am aware that InnoDB has overheads while storing rows. However, using the MySQL web site, I have calculated that the row size should be 26 bytes (5 byte header, 8 bytes for integers, 6 bytes transaction id and 7 bytes roll pointer fields).

I am trying to reduce the row size because the table may get populated up to 160 billion records.

What am I missing with my calculations? How can I optimize my table to use less space? Or should I switch to a different database engine?

Update

The queries I make for this table;

INSERT INTO a(var1,var2) VALUES(INTEGER,INTEGER),(INTEGER,INTEGER),...,(INTEGER,INTEGER);
SELECT var1 FROM a WHERE var2=INTEGER;
DELETE FROM a WHERE var2=INTEGER;
Utku Zihnioglu
  • 4,714
  • 3
  • 38
  • 50
  • Remember that InnoDB is stored as a b-tree; this means there are additional non-leaf parts of the tree that take up space that aren't directly related to the data being stored. It is not a linear calculation based on the number of rows. – Joe Jul 29 '11 at 00:53
  • 1
    https://stackoverflow.com/questions/34211377/average-row-length-higher-than-possible seems to be a better answer – dakrpssngr Mar 29 '18 at 21:00

1 Answers1

5

This is because the index you created takes up space as well. In the case of a clustered index in InnoDB, the row data is actually stored inside the index leaf nodes, so when the index is used the index does not point to the row but instead points to the index leaf node that contains the row data along with the PKEY.

This caused the storage to be much larger.

Take a look here

bash-
  • 6,144
  • 10
  • 43
  • 51
  • So from what I understand, rows are stored in the index leaf nodes of the b-tree. So is the space efficiency of b-trees are the real problem here? If so, I guess I might have to pass to another database storage engine. – Utku Zihnioglu Jul 29 '11 at 17:58
  • yes pretty much. May I ask why storage space is such a problem? Harddrives are cheap these days and the extra storage space doesn't decrease performance, but rather increases it. – bash- Jul 30 '11 at 01:48
  • Well storage is not a big problem. I was more curious about why storing 8 bytes row would take 52 bytes. Now I know the answer. Thanks – Utku Zihnioglu Jul 31 '11 at 22:07