I want to do something that seems dead simple, but none of the options I've found are quite right (e.g. Dropbox).

The question is: what cloud sync service can I use to sync a folder on my workstation with the filesystem in an EC2 instance? Note these requirements:

  • It must have an unattended, scriptable installation and configuration that runs when the EC2 instance initializes (since EC2 instances are ephemeral).
  • It may therefore depend only on EC2 environment variables for any service installation credentials.
  • The service on EC2 needs read-only, recursive synchronization (not plain downloading; there are too many files to simply download a directory archive and expand it periodically).
  • Both the workstation and the EC2 instance sync with a shared cloud repository such as Dropbox, since the workstation is not always on or publicly accessible.
  • The app on my EC2 instance is Node.js, for what it's worth!
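
To make the environment-variable requirement concrete, here is a minimal sketch of what an init-time configuration step could look like. The variable names `SYNC_ACCESS_TOKEN` and `SYNC_REMOTE_PATH` are hypothetical placeholders, not settings of any particular sync service.

```python
import os

# Hypothetical variable names; substitute whatever the chosen sync service expects.
REQUIRED_VARS = ("SYNC_ACCESS_TOKEN", "SYNC_REMOTE_PATH")


def load_sync_config(env=None):
    """Return the sync settings found in the environment, or raise if any is missing."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: both block an unattended install.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing early with a clear message matters here because nobody is watching the instance when the init script runs.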

The Dropbox Linux client, for example (and the Node.js libraries I've found), requires attended installation: someone has to visit a Dropbox URL every time the instance needs to log its Dropbox client in. The same is true for BitTorrent Sync, which requires visiting a localhost URL to link with devices.

Even if another tiny EC2 instance were to sync Dropbox with, for example, Elastic File System, it might be longer-lived, but it is still ephemeral and needs an unattended init-script installation.

Thanks in advance.

  • If you would consider investing some of your time in learning a little bit of another language, I would suggest writing '[expect](https://en.m.wikipedia.org/wiki/Expect)' scripts in case the readily available solutions can't fulfil what you need. –  Aug 15 '16 at 13:37

3 Answers

You can use a Dropbox Uploader script to send data in either direction. It doesn't need you to log in manually each time you want to do this. My Amazon Linux EC2 instance does a backup and uses this script to send it to Dropbox every night.

You can also run BitTorrent Sync on Unix (second link). This constantly mirrors data from one machine to another, and of course you can turn it on and off as required. I've found this works well on Amazon Linux on EC2, but I stopped using it as Dropbox was more convenient for my use case.

Update - you can also use the official Dropbox client for Linux, which does something like rsync for Dropbox. There's more information about that approach here.

Tim
  • Thanks for the comments. Neither is quite enough. The Dropbox Uploader script requires a lot of effort on my part to get full "synchronization" functionality, and BitTorrent Sync's installation cannot be unattended: you have to be able to see the link it creates in the headless installer and open it in a browser. This doesn't work for ephemeral EC2 instances, which need the install done on init at any time. – Brandon Arnold Aug 15 '16 at 05:24
  • The dropbox uploader script is actually really easy, if all you want to do is upload from EC2 to Dropbox or vice versa. Quite suitable for a headless install from a pre-made AMI image or something like a CloudFormation setup. – Tim Aug 15 '16 at 05:37
  • Would the Dropbox-Uploader automatically update files locally when they are changed remotely? If not, I'd have to write my script to poll/list the remote Dropbox directory tree, and compare it to the local file tree, and prune/update files or whatever whenever the files are deleted/changed, which is too much. – Brandon Arnold Aug 15 '16 at 05:45
  • If you have a Windows PC you'd use the Dropbox client on the source PC. If you have a Linux machine then yes the dropbox script will upload what's changed only. If you need two way sync you set it up twice. – Tim Aug 15 '16 at 08:14
  • Thanks Tim. I have just reviewed the Dropbox-Uploader script, and I don't see why you're saying it only syncs changes. In dropbox-uploader.sh on the GitHub repo, the download routine is on line 661. It acquires an archive of the remote directory and expands it indiscriminately into the destination folder, performing no file-by-file date comparison between the remote and local file trees. Note my use case on EC2 is read-only; there will never be any upload on the EC2, only download. The file count is prohibitively high, and archive extraction would take prohibitively long, for this approach to be workable. – Brandon Arnold Aug 15 '16 at 18:09
  • The "-s" flag is "Skip already existing files when download/upload. Default: Overwrite", as per the link. You're right that that's not a proper sync: if files change it won't upload them, it will only upload new files. For my use case that was adequate. I've updated my answer with another option - using the official Dropbox client. I included another link which suggests how to use rsync with Dropbox. – Tim Aug 15 '16 at 19:35
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/44018/discussion-between-brandon-arnold-and-tim). – Brandon Arnold Aug 15 '16 at 20:23
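
The comparison step discussed in the comments above (list the remote tree, compare it with the local tree, then update or prune) can be sketched roughly as follows. This is only an illustration: the remote listing is assumed to be a plain dict of relative path to `(size, mtime)` already obtained by some other means, and no Dropbox API is called.

```python
import os


def scan_local(root):
    """Map each file's path relative to root (with '/' separators) to (size, mtime)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            stat = os.stat(full)
            entries[rel] = (stat.st_size, int(stat.st_mtime))
    return entries


def plan_sync(remote, local):
    """Compare two listings and return (to_download, to_delete), both sorted.

    A file is downloaded when it is missing locally, differs in size,
    or has an older local modification time than the remote copy.
    """
    to_download = sorted(
        path
        for path, (size, mtime) in remote.items()
        if path not in local or local[path][0] != size or local[path][1] < mtime
    )
    # Read-only mirror: anything that no longer exists remotely is pruned locally.
    to_delete = sorted(path for path in local if path not in remote)
    return to_download, to_delete
```

Only the files named in `to_download` would need to be fetched, which avoids re-expanding a full archive when the file count is high.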

Look into rsync. It can sync data across any platform and can use ssh for login and to encrypt the transfer. And you're not relying on any third-party service.
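
As a rough sketch of what a scripted rsync-over-ssh mirror could look like, the snippet below builds and runs the command from Python. The host and paths in the usage note are hypothetical, and it assumes rsync and ssh keys are already set up on both ends.

```python
import subprocess


def build_rsync_command(source, destination, delete=True):
    """Build an rsync invocation that mirrors source into destination over ssh."""
    # -a preserves permissions and timestamps, -z compresses data in transit.
    command = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # Remove files from the destination that no longer exist in the source.
        command.append("--delete")
    command += [source, destination]
    return command


def run_sync(source, destination):
    """Run rsync and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_rsync_command(source, destination), check=True)
```

For example, `run_sync("user@workstation.example.com:/srv/shared/", "/var/data/shared/")` would pull the remote directory's contents into the local one; the trailing slash on the source tells rsync to copy the directory's contents rather than the directory itself.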

Ondra Sniper Flidr
  • Thanks, Ondra. I like rsync, only I'd rather not have to roll my own cloud destination in between the EC2 and my workstation. Just updated my question to reflect this requirement. Something like Dropbox seems best, if only it does not have the attended installation requirement. – Brandon Arnold Aug 15 '16 at 18:24

For my simple use case (managing EC2 Linux-based WordPress instances from my local Windows box), rsync and ssh look like a good solution. I can have the WordPress files open in VS Code and sync on demand. ssh will execute commands remotely and integrate with my local clipboard.

This looks a hell of a lot better than Virtual Box!

EC2 seems to be headless, though. IDK if that's an issue.

Correction: the above was wrong. PowerShell doesn't support rsync.

Anna Naden