I have a data science application that I need to run once every 2-3 hours: an embarrassingly parallel job that uses 64 cores for about 6 minutes. Each core needs to load its own 3GB of data from disk, 192GB in total.
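For concreteness, the per-core structure looks roughly like this (a sketch, not my actual code — `run_job`, the shard file naming, and the `/data` mount point are placeholders):

```python
from multiprocessing import Pool

DATA_DIR = "/data"  # hypothetical mount point for the 192GB of shards

def run_job(shard_id: int) -> int:
    """Load one ~3GB shard and process it (real load/compute elided here)."""
    path = f"{DATA_DIR}/shard_{shard_id}.bin"  # hypothetical naming scheme
    # data = open(path, "rb").read()  # each worker reads its own 3GB file
    return shard_id  # stand-in for the real per-shard result

def run_all(n_workers: int = 64) -> list:
    """Fan the 64 shards out, one process per core."""
    with Pool(n_workers) as pool:
        return pool.map(run_job, range(n_workers))

if __name__ == "__main__":
    run_all(64)
```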
To do this cost-effectively, my plan is to spin up a 64-core EC2 spot instance from a script whenever a job is due, using a ~200GB AMI that has the required data baked in. When the instance starts, I run my 64 jobs and each loads its 3GB of data from the instance's SSD-backed volume.
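The launch step I have in mind is a one-time spot request, something like the following (a sketch: the AMI ID is a placeholder, and `c5.18xlarge` is just one example of a 64+ vCPU instance type — it actually has 72):

```python
def spot_launch_params(ami_id: str, instance_type: str = "c5.18xlarge") -> dict:
    """Build the kwargs for a one-shot spot launch of a single big instance."""
    return {
        "ImageId": ami_id,            # the ~200GB AMI with the data baked in
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                # one-time request: no persistent bid, die on interruption
                "SpotInstanceType": "one-time",
                "InstanceInterruptionBehavior": "terminate",
            },
        },
    }

# These kwargs would then be passed to boto3, e.g.:
# boto3.client("ec2").run_instances(**spot_launch_params("ami-XXXX"))
```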
Will this work, and how long will it take to launch the spot instance from that large AMI? If the instance takes multiple minutes to boot, that's a problem, since the jobs themselves only run for 6 minutes and I want them to start quickly. Is there a better way to achieve this workflow?