
Note: I've searched for other similar questions here, and none of the answered ones are even remotely similar. With that, I have a question about MySQL tables (more precisely, about specific fields in a table, e.g. tweets or updates).

So the question: what is the maximum number of rows in an InnoDB table? If there is a significant difference between what MyISAM, InnoDB, and other engines can hold, please address that; if not, answer in general. Secondly, if the table gets really large, what are the best practices for storing the data (one table, or split across multiple tables/databases)?

I've read that Twitter gets something like 100 million tweets a day. In that context, how would my second question apply to something like Twitter?

Fabio
avon_verma

3 Answers


There isn't a row limit, but a size limit on an InnoDB database:

The minimum tablespace size is 10MB. The maximum tablespace size is four billion database pages (64TB). This is also the maximum size for a table.

You could always horizontally partition your tables by storing rows in multiple partitions of the same table, in multiple files.
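As a sketch of what that looks like in MySQL (the table and column names here are hypothetical, not from the question), a hash-partitioned table spreads rows across several physical files while still behaving as one logical table. Note that the partitioning column must be part of every unique key, which is why `user_id` appears in the primary key:

```sql
-- Hypothetical example: spread tweets across 8 partitions by user id.
CREATE TABLE tweets (
    id BIGINT UNSIGNED NOT NULL,
    user_id BIGINT UNSIGNED NOT NULL,
    body VARCHAR(280) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id, user_id)
) ENGINE=InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 8;
```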

atp
  • Is horizontal partitioning referred to as sharding? @ash – edam May 26 '15 at 09:27
  • No, "horizontal partitioning" is "Partitioning" if in a single table on a single server. It is Sharding if it is spread across multiple MySQL instances across multiple servers. – Rick James Sep 05 '17 at 23:45

http://dev.mysql.com/doc/refman/5.5/en/innodb-restrictions.html will allow you to calculate restrictions from your key size.

But I humbly suggest that you don't want to store info like tweets in a transactional engine with referential integrity and the other features of InnoDB, and you definitely don't want to store them all in one table in a single DB instance. You put them into lots and lots of independent non-SQL databases that allow fast appends, and then extract meta-information (hashtags, RTs, etc.) into a more complex database. Presentations on Twitter's DB architecture are easy to google (e.g. http://www.slideshare.net/nkallen/q-con-3770885).

If you must store large amounts of data in one table, partitioning is your friend, and Postgres possibly has better support for it. A partitioned table is physically several tables that logically look like one table. You can put these sub-tables (partitions) on different disks, run maintenance on them independently, etc. A "star schema" with one very long table that contains only the essential columns, plus a number of tables that store the bigger but optional columns, may also help.
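A common shape for this in MySQL (names here are hypothetical) is range partitioning by date, so each partition can be backed up, rebuilt, or dropped independently of the others:

```sql
-- Hypothetical sketch: range-partition by month; old months can be
-- dropped or archived without touching the rest of the table.
CREATE TABLE tweets (
    id BIGINT UNSIGNED NOT NULL,
    created_at DATETIME NOT NULL,
    body VARCHAR(280) NOT NULL,
    PRIMARY KEY (id, created_at)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2011_01 VALUES LESS THAN (TO_DAYS('2011-02-01')),
    PARTITION p2011_02 VALUES LESS THAN (TO_DAYS('2011-03-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);
```

Dropping an expired month is then `ALTER TABLE tweets DROP PARTITION p2011_01;`, which is far cheaper than a bulk `DELETE`.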

9000
  • As of now (end of 2015), MySQL seemingly has better support for partitioning than Postgres, and something like RethinkDB should be considered as a nosql solution. – 9000 Dec 15 '15 at 03:52

The 64TB limit for a table has the following exceptions:

  • That assumes innodb_page_size = 16K (the default). That can be set to powers of 2 between 4K and 64K, changing the tablespace limit proportionately.
  • A PARTITIONed table is essentially a bunch of 'sub-tables' organized together and acting as one big table. The pre-5.6.7 limit on number of partitions was 1024. After that, it has been 8192. So multiply the 64TB by that.

OK, that only gives you a byte limit, and it includes overhead and indexes. You then need to divide by the size of an average record to get the maximum number of rows.
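The arithmetic above can be sketched in a few lines. The 100-byte average record size is a made-up assumption for illustration; real rows vary, and overhead and indexes eat into the budget:

```python
# Back-of-envelope estimate of the row limit from the numbers above.
page_size = 16 * 1024                  # default innodb_page_size (16 KiB)
max_pages = 2 ** 32                    # "four billion database pages"
table_limit = page_size * max_pages    # 64 TiB per tablespace

avg_record = 100                       # hypothetical average bytes per row
max_rows = table_limit // avg_record   # roughly 0.7 trillion rows

print(table_limit)   # 70368744177664 bytes (64 TiB)
print(max_rows)      # 703687441776
```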

(It is not easy to compute the average record size.)

Simple answer:

You can probably easily get 1 trillion "small" records in an InnoDB table. With some effort, you might get to 1000 trillion. But I suspect your budget for disk drives would be exhausted before that. Furthermore, it would take years of INSERTs to fill it up!

So, realistic answer: MySQL can handle an 'unlimited' number of rows.

Real life? I have heard of a few tables with more than a billion rows, even as many as 15 billion.

See my Limits, but it does not say more on the question of rows.

Rick James
  • At my last job, one table had 5+ billion rows, and was still growing rapidly. It was a DW recording every phone call our service had placed since 2007. We used RANGE partitioning by day for recent data, and by month for older data. – Bill Karwin Sep 06 '17 at 00:12