
What is the best way to work from a Mac laptop: send an .Rda file with the input data to an Ubuntu desktop, run the processing there, and then later get a new .Rda file with the results back to the Mac laptop?

My R workflow is mostly about tweaking plots and modifying text in reports (knitr), which I do on a relatively weak Mac laptop. But a few steps in, I sometimes need to run RJAGS or similar heavy jobs that can take many hours (longer than the laptop can stay connected). The input .Rda can be hundreds of MB. I also have a powerful Ubuntu desktop in another location. It would also be great if I could submit the function to be run.
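For concreteness, the kind of remote job I have in mind looks roughly like the sketch below; the file names, the objects inside the .Rda, and the model details are only placeholders.

```r
## run_model.R -- to be executed on the Ubuntu desktop
## (file names, variable names and model details are placeholders)
library(rjags)

load("input.Rda")        # assumed to provide a list `jags_data`

model <- jags.model("model.bug", data = jags_data,
                    n.chains = 4, n.adapt = 1000)
update(model, n.iter = 5000)                        # burn-in
samples <- coda.samples(model, variable.names = c("beta", "sigma"),
                        n.iter = 50000)

save(samples, file = "results.Rda")   # to be fetched back to the laptop
```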

I thought OpenCPU could be a way, but it seems the laptop must stay connected. rredis might also be a way forward, but it seems limited in data volume. I already have an SSH connection between the computers, so perhaps it is best to have some sort of script to send the data, send the R script, start the R script, wait, and retrieve the data. I have already installed RStudio Server on the Ubuntu machine, and that works well, but it requires a constant connection to the Ubuntu machine. There are also several multi-computer systems, but as I understand it they require computation on the starting machine as well.
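Concretely, the kind of driver script I am imagining would run on the laptop and could look something like this (hostname and remote paths are placeholders; it assumes passwordless ssh keys and that the remote directory exists):

```r
## send_and_run.R -- run on the Mac laptop
## (hostname and paths are placeholders; assumes passwordless ssh keys)
host <- "me@ubuntu-desktop"

## 1. send the input data and the processing script
system2("scp", c("input.Rda", "run_model.R", paste0(host, ":~/job/")))

## 2. start the job on the desktop so it survives this ssh call ending
system2("ssh", c(host, shQuote(
  "cd ~/job && nohup Rscript run_model.R > run_model.log 2>&1 < /dev/null &")))

## ... hours later, possibly from a fresh R session ...

## 3. fetch and load the results
system2("scp", c(paste0(host, ":~/job/results.Rda"), "."))
load("results.Rda")
```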

I need to do this on an almost daily basis, which is why a robust automatic process would be good.

Chris

1 Answer


The simplest trick is to make sure your sessions persist, which you get "for free" with byobu.

It was originally written for Ubuntu and is of course available there, but it is now on most other Linux distros as well as OS X. (It wraps around tmux and provides a nicer interface; tmux itself is a retake on screen. Google 'byobu tmux screen' and you will find countless tutorials.)

To use it, just ssh to the machine in question and launch byobu (and optionally have multiple screens and panes; see the video at the site linked above). When it is time to leave, just 'detach'. Once you reconnect later from the same or another machine, just 're-attach'. Presto.
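To tie this to the batch use case in the question: a detached session can also be started non-interactively, so the submit step itself can be scripted from the laptop. A rough sketch in R, assuming passwordless ssh and placeholder host, paths and session name (byobu wraps tmux, so a session started this way also shows up when you later ssh in and run byobu):

```r
## submit.R -- run on the laptop: start the long job in a detached tmux
## session on the desktop (host, paths and session name are placeholders)
host <- "me@ubuntu-desktop"

remote_cmd <- "tmux new-session -d -s rjob 'cd ~/job && Rscript run_model.R > run_model.log 2>&1'"
system2("ssh", c(host, shQuote(remote_cmd)))

## later: ssh in and run `byobu` (or `tmux attach -t rjob`) to watch it,
## or just scp results.Rda back once run_model.log shows it has finished
```

The tmux session ends when the script finishes, but the results file stays on disk, so it can be collected whenever convenient.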

Edit: Here are a few other answers about R and byobu which will give a general flavour. The tool is absolutely worth it and a key part of the workflow for many advanced users.

Dirk Eddelbuettel
  • Exactly what I was thinking. @Chris, RPC-like mechanisms such as OpenCPU, RServe, and even RStudio Server won't work (well), since they require persistent connections. (Additionally, I don't know how well OpenCPU deals with very long-running calculations.) The only other method I can think of would be a batch-processing facility which, AFAIK, (a) does not exist, and (b) has the potential for security problems. This (@DirkEddelbuettel's) answer is by far the simplest and is usable *now*. – r2evans Nov 30 '15 at 18:07
  • @r2evans Since when does RStudio Server require a persistent connection? – Roland Nov 30 '15 at 18:49
  • It was an assumption, I admit, that since it uses a browser for interaction, it would not continue (significant) processing without interaction with the browser. I had added that late since @Chris mentioned having it, but you are right, I should verify that before putting it in there. Do you know that it will continue processing scripts after the browser has been closed? – r2evans Nov 30 '15 at 18:51
  • @r2evans Yes, it will continue until the task is completed, save your session, and restore it when you reconnect. – Roland Nov 30 '15 at 21:04
  • Ah, thanks, learned something new. (Should have done this research previously.) With RStudio server, you may need to tweak `session-timeout-minutes` ([reference](https://support.rstudio.com/hc/en-us/articles/200552316-Configuring-the-Server)) if your computation will take longer than the default of 2 hours. Thanks for the nudge, @Roland. – r2evans Nov 30 '15 at 22:31