
After over a month, I have managed to piece together how to set up an AWS EC2 server. It has been very hard to upload files, as there are very conservative size limits when using the upload button in RStudio Server; attempting a large upload gives the error message "Unexpected empty response from server".
I am not unique in this respect, e.g. Trouble Uploading Large Files to RStudio using Louis Aslett's AMI on EC2.

I have managed to run the following commands through PuTTY, and this has allowed me to upload files via either FileZilla or WinSCP:

sudo chown -R ubuntu /home/rstudio

sudo chmod -R 755 /home/rstudio

Once I use these commands and log out, I can no longer access RStudio on the instance in future logins. I can log back in to my instance via my browser, but I get the error message: Error Occurred During Transmission.

Everything is fine except that once I use PuTTY I lose browser access to my instances.
I think this is because the command changes ownership, or something similar. Should I be using a different command? If I don't use one, I cannot connect FileZilla/WinSCP to the instance. If anyone is thinking of posting a comment that this should be closed as a hardware issue: I don't have a problem with hardware; I am interested in the correct commands. Thank you :)
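The recursive chown/chmod above is the likely culprit: it takes effective ownership of the home directory away from the account RStudio Server itself uses. A minimal sketch of the repair, assuming the AMI's RStudio login user is called rstudio (an assumption worth verifying with `ls -ld /home/rstudio` on a fresh instance); it is demonstrated on a scratch directory so it is safe to run anywhere — on the instance you would substitute TARGET=/home/rstudio, OWNER=rstudio and prefix the chown/chmod with sudo:

```shell
# Scratch stand-ins; on the instance: TARGET=/home/rstudio, OWNER=rstudio,
# and run the chown/chmod with sudo.
TARGET=$(mktemp -d)
OWNER=$(id -un)

chown -R "$OWNER" "$TARGET"      # hand the tree back to its proper owner
chmod -R u+rwX,go+rX "$TARGET"   # 755 on directories, files readable, not world-writable
ls -ld "$TARGET"                 # verify the owner and mode look right
```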


OK, so eventually I realised what was going on here. The default home (root) volume on AWS is only around 8-10 GB regardless of the size of your instance. As the upload was going to the home directory, there was not enough room. An experienced Linux user would not have fallen into this trap, but hopefully any other Windows users new to this who come across the problem will see this. Uploading to a different volume on the instance solves it. As the Louis Aslett RStudio AMI is based in this 8-10 GB space, you will have to set your working directory outside it, i.e. outside the home directory; this is not intuitively apparent from the RStudio Server interface. Whilst this is an advanced forum and this is a rookie error, I am hoping no one deletes this question, as I spent months on this and I think someone else will too.
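The disk-full diagnosis is quick to confirm before committing to a long upload; nothing here is specific to the AMI, just standard Linux tools run over SSH:

```shell
df -h /   # how full the root volume (which holds /home/rstudio) is
lsblk     # whether any larger EBS volumes are attached to the instance
```

If `df` shows the root filesystem near 100%, upload to a mounted larger volume instead and point the R session's working directory there with `setwd()`.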

Joey
  • Sigh, at the moment there are a lot of questions about this on this site and unfortunately the answers are not complete, but piece together a partial picture of what to do, with holes. I really don't see how this is different to e.g. http://stackoverflow.com/questions/24891861/trouble-uploading-large-files-to-rstudio-using-louis-asletts-ami-on-ec2 which, for example, has a line of code that does not appear to work (I have tried, which does not of course mean it does not work) and suggests Cygwin, which is no longer maintained properly and shuts down Win 8.1. ..tbc – Joey Jan 23 '16 at 05:36
  • In searching SO, I often find many people asking exactly the question I am looking for an answer to, only to see it was closed by a moderator. You may say that it was because it was off topic, but it was because it was a trivial thing for the moderator. What is helpful is when someone suggests another site to move a question to, as most of you have done, or a change to the format of the question. I want this to work, as I use R/RStudio Server as a programming tool and this is an important step in being able to access it. tbc2 – Joey Jan 23 '16 at 05:49
  • I think a question on Rstudio Server in EC2 (specifically for computing, not hosting) is not a general hardware and software question and it is not professional server- or networking-related infrastructure administration. However that is my personal opinion. Either way I am thankful to Tom for his advice, even though I am not all the way there yet with being able to just get on with my ML – Joey Jan 23 '16 at 05:49
  • Also I should note that searching rstudio aws ec2 on Super User gives 0 matches, on SO it gives 17 matches. I am happy to have my question moved to the best place, but I am not sure it is there. – Joey Jan 23 '16 at 09:44
  • This question was closed by someone, but I received a popular question boost as it was viewed so many times. Perhaps you should be more cautious closing questions as sometimes they might actually help people, maybe people that don't have the 'points' to comment or like posts, but they are still people and they possibly might benefit from the help. – Joey Mar 31 '18 at 11:57
  • Also, I posted an answer to the other question on this site asking something similar, and my answer is now the most upvoted. I felt too shy to post on my own question, but if you are looking for an answer you can look at that one (mentioned in the question). I am happy if I have helped anyone out. Good luck – Joey Mar 31 '18 at 12:01
  • This question is closed, but it is still getting a lot of traffic, so maybe instead of closing it we should consider how to make it suitable – Joey Apr 12 '20 at 05:10

1 Answer


Don't change the rights of /home/rstudio unless you know what you are doing; it may cause unexpected issues (and it actually does cause issues in your case). Instead, copy the files with FileZilla or WinSCP to a temporary directory (say /tmp), then SSH to your instance with PuTTY and move the file to the rstudio directory with sudo (e.g. sudo mv /tmp/myfile /home/rstudio).
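The copy-then-move round trip can be scripted end to end. A sketch, in which the key name, file name, and host are all placeholders for your own values; the commands are echoed rather than executed by default, so it is safe to run as-is (clear RUN to execute for real):

```shell
KEY=mykey.pem                                         # placeholder: your .pem key file
HOST=ubuntu@ec2-203-0-113-25.compute-1.amazonaws.com  # placeholder: your instance address
RUN=echo                                              # set RUN="" to actually execute

# 1. Copy the file into /tmp on the instance (WinSCP/FileZilla over SFTP
#    achieve the same thing from Windows):
$RUN scp -i "$KEY" mydata.csv "$HOST":/tmp/

# 2. Move it into the RStudio home with root privileges:
$RUN ssh -i "$KEY" "$HOST" "sudo mv /tmp/mydata.csv /home/rstudio/"
```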

Tom
  • Thank you Tom - very exciting for me after so much time on this problem. I have created a fresh instance, zipped the files, and successfully managed to upload a 1 GB file this way; a 4 GB file is now slowly getting to the /tmp folder (should take about 4 hrs) via WinSCP. – Joey Jan 22 '16 at 10:07
  • It was instant to move the 1 GB file to the rstudio directory. There may be a direct PuTTY command to grant upload privileges without corrupting the instance, but this solution you gave worked for me. After a little time to double-check everything, in case anyone else has this problem and reads this, I will mark it as answered :) I should now be able to download files too, via sudo mv /home/rstudio/myfile /tmp/ :) :) – Joey Jan 22 '16 at 10:08
  • This is a common problem when uploading files, and it can be automated if needed. You could certainly modify the "o" (other) rights of your directory, or even create a dedicated user on your EC2 instance with the correct rights, but this is a bit more advanced, and if you only need to upload files occasionally it would be overkill – Tom Jan 22 '16 at 10:12
  • One piece of advice if you uploaded very big files: make an AMI of your machine when you are done, so if you need to recreate the machine you don't have to re-upload all your files. Also please note that there are advanced techniques for uploading very big files, e.g. splitting the big file into several smaller parts and uploading each part separately ("multipart upload"). No idea how to do so on Windows, but I think this could be a way to look at it. This can be done easily when uploading to S3 for example, so you could do: local computer > S3 bucket > EC2 instance (if you are ready to pay a few $ for it :) ) – Tom Jan 22 '16 at 10:17
  • Thank you Tom, that makes perfect sense. I did also try ssh -i "thenameofmykey.pem" ubuntu@InsertthepublicDNSaddress but did not get any luck with that. I think I had to put something in front of the key name, but the main thing is we have a fairly simple solution - very exciting - I've been banging my head against a wall for days :) – Joey Jan 22 '16 at 10:21
  • "ssh -i keyname.pem user@machine" does work to SSH into your machine, but the key file has to be chmod 600 at most (400 is better). The same goes for scp on Linux to copy files over SSH. If your upload times out because the file is too big, here is a good example of splitting a tar file on Linux; I'm sure you could find a similar way on Windows: http://superuser.com/questions/198857/how-can-i-create-multipart-tar-file-in-linux – Tom Jan 22 '16 at 10:29
  • "AMI of your machine when you are done" is fascinating. I have the big 244 GB RAM EC2 instance, cheaper with spot prices, but I have to use fixed pricing or I would spend ages re-initiating it each time. Maybe creating an AMI on shutdown would be a solution. I will research "multipart upload", as I think that is also a good idea. I am happy to pay, though I am not sure what S3 is yet - so my homework: what is an AMI, what is "multipart upload", and what is S3. It may be some time before I can incorporate them into what I do, but it is great to be put on the right track with these ideas. Thank you very much :) – Joey Jan 22 '16 at 10:30
  • p.s. I voted up your answers, but the votes won't appear until I get 15 reputation or something; just to say I was not being rude by not doing so, I just don't have the privileges :) – Joey Jan 22 '16 at 10:32
  • yep sure no worries :) – Tom Jan 22 '16 at 10:41
  • Hi Tom, I have accepted your answer. For anyone reading this, please note: for some reason I have managed to load the files into the rstudio folder this way, but it takes a long time (more than 20 hrs) and I then cannot read them into RStudio to actually use them. Perhaps they got corrupted during upload. I can post the code I used to try to do this, but it was just standard, and I am worried this question will be closed anyway, as moderators have put it on hold. – Joey Jan 23 '16 at 05:53
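On the multipart idea in Tom's comments, and on the corruption worry in the last one: both can be handled with standard tools. A sketch demonstrated on a small scratch file so it runs anywhere; for a real 4 GB archive you would split locally, upload the parts one at a time, then reassemble on the instance and compare the checksums on both ends:

```shell
f=$(mktemp)
head -c 1048576 /dev/urandom > "$f"   # 1 MB stand-in for the big archive

md5sum "$f"                           # note this hash before uploading
split -b 256k "$f" "$f.part."         # -> four 256 KB parts: .aa .ab .ac .ad
cat "$f".part.* > "$f.rebuilt"        # run this step on the instance after upload
md5sum "$f.rebuilt"                   # the hash must match the one above

cmp -s "$f" "$f.rebuilt" && echo "parts reassemble cleanly"
```

If the two hashes disagree, the transfer (not RStudio) corrupted the file, and re-uploading the mismatched parts is cheaper than re-uploading the whole archive.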