
An article in The Independent covering recent developments in big data contains the following suspiciously precise claim:

By 2020, it's thought that the number of bytes will be 57 times greater than all the grains of sand on the world's beaches.

Is this claim remotely plausible? Is there even any remotely plausible way to validate it?

Note: There is a related claim in this question: Has 90% of the world's data been created in the last two years?. And, since part of the question is about sand I've used the tag. ;-) Any better suggestions?

Clarification: I don't want the focus of this to be on our inability to predict the future or on extrapolation (unless there is strong evidence that argues against projecting data trends 6 years into the future). The focus should be on whether our knowledge of the current trends is good enough to give a plausible estimate on the data side, and whether our knowledge of the world is good enough to give a plausible estimate on the "grains of sand" side.

matt_black
  • @Sklivvz My intention was to focus on whether the numbers add up, not to have a debate about forecasting as such. I've added a clarification. – matt_black Jan 19 '14 at 15:38
  • There are some [answers here](http://skeptics.stackexchange.com/questions/4508/can-every-grain-of-sand-be-addressed-in-ipv6?rq=1) that give a decent estimate of the amount of sand. – ratchet freak Jan 19 '14 at 21:27
  • Is the storage long-term? What's it stored on? If it's stored on silicon (sand) or on disk drives (rust), if there are more bytes than grains of sand, I'm trying to picture a world where a large fraction of the sand and rust has been put to use for data storage. – Mike Dunlavey Jan 22 '14 at 21:20

1 Answer


The claim is at least possible, although there is great uncertainty in both the estimated number of grains of sand on the world's beaches and the worldwide number of bytes stored. What is immediately obvious is that the text of the news article you link to seems to imply a correlation between the number of bytes created daily and the number of bytes actually stored. I won't even guess at a ratio, but I would assume that the larger part of the "created bytes" is for immediate use and is not permanently stored anywhere.

The answer ratchet freak mentioned in his comment already links to estimates of the number of sand grains on Earth. The numbers (ranging from 7.5*10^18 for the world's beaches to 10^20-10^24 for the total number of sand grains) are obviously just rough estimates. I have not been able to find any more reliable numbers, and I doubt that any exist.

The "International Data Corporation" regularly publishes reports on estimated global data storage and its predicted development. Their most recent report, "The Digital Universe in 2020", estimates the global number of bytes stored in 2012 at 2.9*10^21 and expects it to increase to 4*10^22 by 2020. These numbers are of course not exact values either, but the report at least gives some details on how they were produced.

Since the article only states the expected ratio between the number of stored bytes and the number of grains of sand (57), and not an absolute number for either of the compared values, it is of course difficult to tell whether they have done their math correctly. The estimates for the number of grains of sand vary by a factor larger than 100,000, and the estimates for the number of stored bytes (both for 2012 and 2020) fall somewhere in between.
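To make the comparison concrete, here is a rough back-of-the-envelope sketch using only the figures quoted above. The constant and function names are my own, and every input is a coarse estimate rather than a measurement, so the output should be read as an order-of-magnitude check and nothing more.

```python
# Figures quoted in this answer; all are rough estimates, not measurements.
BEACH_GRAINS = 7.5e18      # grains of sand on the world's beaches
ALL_GRAINS_LOW = 1e20      # low estimate, all sand grains on Earth
ALL_GRAINS_HIGH = 1e24     # high estimate, all sand grains on Earth
BYTES_2012 = 2.9e21        # IDC estimate of bytes stored in 2012
BYTES_2020 = 4e22          # IDC projection of bytes stored in 2020
CLAIMED_RATIO = 57         # bytes per grain, as claimed in the article


def ratio(byte_count, grains):
    """Bytes per grain of sand for a given pair of estimates."""
    return byte_count / grains


def implied_grains(byte_count, claimed_ratio=CLAIMED_RATIO):
    """Grain count that would make the claimed ratio hold exactly."""
    return byte_count / claimed_ratio


if __name__ == "__main__":
    for label, grains in [
        ("beaches", BEACH_GRAINS),
        ("all sand (low)", ALL_GRAINS_LOW),
        ("all sand (high)", ALL_GRAINS_HIGH),
    ]:
        print(f"2020 bytes per grain, {label}: {ratio(BYTES_2020, grains):.3g}")
    print(f"grains implied by a ratio of 57: {implied_grains(BYTES_2020):.3g}")
    print(f"spread of sand estimates: {ALL_GRAINS_HIGH / BEACH_GRAINS:.3g}")
```

With these inputs, a ratio of 57 would correspond to roughly 7*10^20 grains. That sits inside the 10^20-10^24 range quoted for all sand, well above the 7.5*10^18 figure for beaches alone, which illustrates how much the result depends on which sand estimate is chosen.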

Tor-Einar Jarnbjo