I have a process that works in the following way:

  • downloading data from the internet
  • executing a program → creating output A
  • output A → executing a program → creating output B
  • output B → executing another program → creating output C
  • output C → executing yet another program → creating output D

All of this is automated via a bash script, and I know how to use crontab to automate its execution.

I now want it to run every 6 hours and upload output D to an FTP server that is accessible via the internet. I do not need a nice-looking HTML website, just FTP. I already have a domain.
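
For illustration, a crontab entry along these lines would do the scheduling (the script and log paths are just placeholders, not my real setup):

```
# run the pipeline at minute 0 of every 6th hour (00:00, 06:00, 12:00, 18:00)
0 */6 * * * /home/me/pipeline.sh >> /home/me/pipeline.log 2>&1
```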

My questions are: What is the least costly way to do this? I basically need to rent a machine that runs 24/7. How do I get output D onto the FTP server? Does the FTP server have to run on the same machine, or can it be a second one?

As you can see, I do not know a lot about web stuff. I know a bit about Amazon EC2.

  • If the machine on which the script runs is running 24/7 and is available from the Internet, you can just set up an FTP server on the same machine. In that case you only need to put the output D files into an appropriate directory available via FTP. Otherwise, you need to have some external FTP server and upload the output D files to it. You can do this using the `-n` parameter to the `ftp` command together with input redirection (a sketch of this appears after the comments). – raj Jan 04 '21 at 13:33
  • Thanks! What do I need to look for now? Cloud computing? Server? Hosting? I do not quite understand the difference between all of these. – Max H. Balsmeier Jan 04 '21 at 13:55
  • And where do you plan to run the script? – raj Jan 04 '21 at 13:59
  • Not on my personal hardware, if that is the question. – Max H. Balsmeier Jan 04 '21 at 14:01
  • So look for any virtual server. An Amazon EC2 instance will do, I think. Make sure an FTP daemon is installed (for example `vsftpd`; it's probably the most commonly used). If you need non-authenticated ("anonymous") access to FTP, make sure it's enabled in the FTP daemon configuration file (`/etc/vsftpd.conf`). Your script should copy the output D files to the home directory of the user who will access them via FTP (for anonymous access, it's the user `ftp`; see the vsftpd sketch after these comments). – raj Jan 04 '21 at 14:06
  • I have worked with EC2 in another context and I think it is too small. The computation itself is quite costly (a simulation that takes hours, plus plotting, which also takes hours), and quite a lot of data is created. – Max H. Balsmeier Jan 04 '21 at 14:10
  • If this is a brand new solution, why FTP? Use SFTP. It's automatically built into any Linux (cloud) machine (see the SFTP sketch after these comments). – Martin Prikryl Jan 04 '21 at 14:10
  • Thanks, I'll check SFTP. And what is the actual difference between a cloud machine, a server and a host? The companies have different options on their websites for hosting, servers and cloud computing. Isn't it all just a computer connected to the internet? – Max H. Balsmeier Jan 04 '21 at 14:15
  • There are various instance types available on EC2. For example, the 'm5.8xlarge' instance type has 32 virtual CPUs and 128 GB of RAM, and you can add as much disk storage as you wish. There are also "compute-optimized" instance types if you do a lot of computation. I'm sure you can find something that is sufficient for your needs. – raj Jan 04 '21 at 14:17
  • Thanks, didn't know EC2 is so flexible. – Max H. Balsmeier Jan 04 '21 at 14:19
  • As for the differences between cloud computing, a server and hosting: yes, it's all basically a computer connected to the Internet; the difference is how you can use it. With hosting, you can usually only set up a website on a shared server; you don't get e.g. SSH access. With a virtual server, you have full access to the operating system and can do whatever you wish, but there's usually no simple way to e.g. scale up your machine when the load on it grows. Cloud computing allows for the latter. That's of course a very simplified explanation. – raj Jan 04 '21 at 14:21
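
A minimal sketch of the `ftp -n` upload raj describes in the comments; the server name, account and file name are placeholders:

```
#!/bin/bash
# Upload output D with the stock ftp client.
# -n suppresses auto-login, so the credentials are supplied by the "user"
# command below; the here-document is the input redirection raj mentions.
ftp -n ftp.example.com <<'EOF'
user myuser mypassword
binary
put output_D.dat
bye
EOF
```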
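
For the same-machine setup with `vsftpd`, the idea is to enable `anonymous_enable=YES` in `/etc/vsftpd.conf` and have the script publish the file where the anonymous user lands. The target path below is an assumption (it is typically `/srv/ftp` on Debian/Ubuntu):

```
# last step of the pipeline script: publish output D to the anonymous FTP root
cp output_D.dat /srv/ftp/
```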
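
For the SFTP route Martin Prikryl suggests, a comparable sketch, assuming key-based SSH login is already set up and that the host and file names are again placeholders:

```
#!/bin/bash
# Upload output D over SFTP; "-b -" reads the batch commands from stdin,
# so the transfer runs non-interactively (an SSH key avoids a password prompt).
sftp -b - myuser@sftp.example.com <<'EOF'
put output_D.dat
bye
EOF
```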

0 Answers