
When I upload a file to Google Drive, does Google split that file into smaller pieces and store them on different servers? For example, if I upload fileA to my Drive, does Google divide fileA into smaller chunks and store them on different servers (serverA, serverB, etc.)?

Or is a given file stored on one particular server without being broken into smaller chunks? For example, if I upload fileA, it is stored whole on serverA or some other server, without segmentation.

Or are all the files I upload to my Drive stored on one particular server?

Cheryl Simon
Rahul Patel
  • This is more of a sysadmin question so this will likely get moved to serverfault.stackoverflow.com. I'm sure someone over there will be more than happy to answer! – Tom Studee Aug 06 '15 at 03:03

1 Answer


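The service Google hosts is backed by GFS, the Google File System. GFS splits each file into fixed-size chunks and replicates every chunk on several servers, which corresponds to the first scenario in the question.

I was not sure until I spotted this post from someone who seems to be a Google employee.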

Edit:

That Google uses GFS here is actually an inference, based on how well the filesystem fits the Google Drive model. An article published on Ars Technica describes this in more detail. Of course, no one except former and current Google employees knows what is behind it, but the fact that someone who claims to be one of them specifically pointed the OP of the question I linked to GFS is another strong reason to believe that GFS (and maybe BigTable) is backing Google Drive.

Edit 2:

Since I was seeing downvotes, and since the topic actually interests me a lot, I decided to enrich this answer with the following arguments:

1) Google's infrastructure strategy is to build clusters of inexpensive commodity hardware, on the assumption that it will fail. This was one of the motivations behind the GFS infrastructure.

2) Because GFS fits Google's infrastructure, services, and internal software so well (MapReduce being an important example), other people have concluded that it backs pretty much all of their services. See: http://www.slideshare.net/hasanveldstra/the-anatomy-of-the-google-architecture-fina-lv11. An interview with Jeff Dean also strongly supports this intuition, and explains 1) better.

3) This doesn't prove anything, but I found it fun: a user actually ended up with the extension .gfs in Drive. (It would be unexpected, and somewhat unfortunate, if this .gfs actually referred to MS Groove files.)

I don't think we'll ever see an actual former or current employee validate this statement, and of course many of the details of what backs Drive will stay hidden, but the intuition has strong roots.
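To make the chunking model concrete, here is a minimal sketch of how a GFS-style system, as described in the published GFS paper, could split a file into fixed-size chunks and place replicas on several servers. The 64 MB chunk size and the three replicas come from that paper; the `Chunk` class, the `place_chunks` function, the server names, and the round-robin placement are all hypothetical and only for illustration. This is not Drive's actual implementation, which is not public.

```python
from dataclasses import dataclass

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size given in the GFS paper
REPLICAS = 3                   # default replica count given in the GFS paper


@dataclass
class Chunk:
    index: int     # position of the chunk within the file
    offset: int    # byte offset where the chunk starts
    length: int    # number of bytes in the chunk
    servers: list  # hypothetical servers holding a replica of this chunk


def place_chunks(file_size, servers, chunk_size=CHUNK_SIZE, replicas=REPLICAS):
    """Split a file of file_size bytes into chunks and assign replicas."""
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    chunks = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        # Round-robin placement is an illustrative assumption; a real system
        # would also weigh disk usage, load, and rack layout.
        holders = [servers[(index + r) % len(servers)] for r in range(replicas)]
        chunks.append(Chunk(index, offset, length, holders))
        offset += length
        index += 1
    return chunks


if __name__ == "__main__":
    servers = ["serverA", "serverB", "serverC", "serverD"]
    for chunk in place_chunks(150 * 1024 * 1024, servers):
        print(chunk.index, chunk.length, chunk.servers)
```

Under these assumptions, a 150 MB fileA becomes three chunks (64 MB, 64 MB, and 22 MB), each held by three different servers, so no single server needs to store the whole file and losing one server does not lose any chunk.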

Community
Bacon
  • Hmm, I read that link and it doesn't say anything particular to Google Drive. It seems instead to refer to data-intensive usage like storing BigQuery or search databases. – Zig Mandel Aug 06 '15 at 06:45
  • They use it for a wide range of services. It doesn't specifically mention Drive because GFS was not made for it, but rather as a general-purpose distributed datastore for (general-purpose and thus failure-prone) hardware. – Bacon Aug 06 '15 at 11:41
  • Btw, GFS was initially conceived for MapReduce (the paper was published one year later) – Bacon Aug 06 '15 at 11:42
  • Then how do you know that Drive uses it? Evidence? – Zig Mandel Aug 06 '15 at 13:36
  • OK, makes sense based on the other S.O. post. Ars Technica doesn't mention Drive either. – Zig Mandel Aug 06 '15 at 14:07
  • That's right. I linked the Ars Technica article to highlight the relationship between the GFS model and Drive. It is not obvious that GFS backs Drive, but the fact that the model fits so well, plus the fact that a Google dev actually mentions it, are reasons to believe so. – Bacon Aug 06 '15 at 14:11