I'm doing a project for my "distributed system development" class: building a minimal version of a cloud storage service (something like Google Drive).

My approach is to use 2 backend servers written in Rails, 1 proxy server in front that routes requests to them, and 2 Postgres servers in a master-slave replication setup.

The problem is how to store the actual assets (video, PDF, MP3, ...). I have no experience with this.

Example: if a user opens 2 browser tabs, and in each tab uploads a video with the same name to the same directory, what will happen?

user3448806
  • What do you expect the result to be when you upload the two videos at the same time? One to succeed and one to fail? Both to succeed, as video_file.mp4 and video_file(2).mp4, ...? – zyglobe Sep 17 '16 at 03:05
  • Rename, or lock uploading to just 1 request at a time; I'm still deciding. So which way is better (or easier to achieve)? – user3448806 Sep 17 '16 at 03:26

1 Answer

Since you probably want to upload asynchronously, this is pretty easy to handle: generate some sort of token before uploading (e.g. filename + content hash), then hand the upload off to a delayed job. If the user tries uploading the second file, it will generate the same token and be rejected.
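For example, the token could be built like this (a rough sketch; the helper name and token format are arbitrary, and the upload is assumed to arrive as a tempfile):

require "digest"

# Build a dedup token from the filename plus a digest of the file's
# contents, so two uploads of the same file under the same name collide.
def upload_token(filename, file)
  "#{filename}:#{Digest::SHA256.file(file.path).hexdigest}"
end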

Example for keeping track of the uploads in the DB: generate a record before the upload starts and save the filename and the hash.

# Create the tracking record before the upload starts. ("checksum" instead
# of "hash": a column literally named hash clashes with Ruby's Object#hash.)
Asset.create(filename: ..., checksum: ...)
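Handing off to delayed_job could then look roughly like this (a sketch: UploadJob and StorageClient are placeholder names, but enqueueing an object that responds to perform via Delayed::Job.enqueue is standard delayed_job usage):

# delayed_job calls #perform on the enqueued object in the background.
UploadJob = Struct.new(:asset_id, :path) do
  def perform
    asset = Asset.find(asset_id)
    url   = StorageClient.upload(path) # placeholder for your storage call
    asset.update!(url: url)
  end
end

asset = Asset.create!(filename: filename, checksum: checksum)
Delayed::Job.enqueue(UploadJob.new(asset.id, uploaded_file.path))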

Once the upload finishes you can update the record with the S3 URL or whatever you use for storage (pass the asset id to the delayed job). The validation then is easy:

validates :checksum, uniqueness: { scope: :filename }
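
One caveat: an application-level uniqueness validation can still race when both Rails servers insert at nearly the same time, so a unique index on the master database is the usual backstop. A sketch, assuming an assets table with the columns above:

class AddUploadDedupIndexToAssets < ActiveRecord::Migration[5.0]
  def change
    # Postgres itself then rejects the second insert, even if both app
    # servers passed the model validation simultaneously.
    add_index :assets, [:filename, :checksum], unique: true
  end
end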
Michael Kohl
  • Thanks :D But what about this situation: 2 browser tabs (or 2 browsers) make requests along 2 paths (proxy -> rails server 1 -> postgres 1; proxy -> rails server 2 -> postgres 2), and due to a network issue there'll be 2 tokens? Sorry for this question, but I'm really new to this :( – user3448806 Sep 17 '16 at 02:53
  • The way you described your database, I assumed that the slave will be read-only and writes will only happen to the master. Whether that's the case or not, your upload tracking needs a canonical source, or you let your file storage handle it (i.e. if you upload to S3 and generate the same key for each upload attempt, your file will not be stored twice). – Michael Kohl Sep 17 '16 at 03:05
  • Oh yes, with master-slave only the master is writable, sorry, I forgot that :) This is just a class project, so I'm only doing all this on a local network with some virtual machines to demonstrate. (No external services like S3.) – user3448806 Sep 17 '16 at 03:24