
So, I've been wrestling with this issue for days...I need to get a file from S3 and write it to a directory in my Rails app on Heroku. I must have a misunderstanding of the ephemeral file system on Heroku because I can't figure out why it's not working.

I am running `s3.bucket('bucket').object('file.csv').get(response_target: 'file.csv')` to get a file from S3 and write it to my app. Initially I just wrote a `.rb` script to do this and ran it using the Heroku Scheduler, but to no avail. I then turned the script into a rake task and ran that on the scheduler, again to no avail. I am able to run both the `.rb` script and the rake task flawlessly in my dev environment.

After reading this and this on how the ephemeral file system works, I am thinking that the task actually is working, but the file gets destroyed (or is actually there but I can't see it?) by the time I run `ls` in `heroku run bash`.

Can someone please explain to me what is going on? Are my efforts to get a file from S3 written to my app on Heroku futile? And if so, are there any alternatives?

If I can't figure it out after this then I am going to set up my own env in EC2.

jdesilvio
  • It is definitely possible. You can take a look at the famous Hartl tutorial [here](https://www.railstutorial.org/) to see it in action with his micropost example. If I recall, I wrestled a bit with this myself when I went through the tutorial and the issue I had was with policies (make sure you have one in place and it is sufficiently broad to allow the action you desire). – steve klein May 23 '15 at 02:21
  • Heroku does not support what you want to do. You can write tmp files to the local file system, but such files will be deleted automatically. – spickermann May 23 '15 at 02:23
  • @steve can you post a simple example of this working? – jdesilvio May 23 '15 at 02:40
  • @spickermann pulling this file from S3 is critical, and the contents will be used every time a user uses the app. As such, would you recommend pulling the file from S3, writing it as a temp file, serving the result to the user, and then destroying it every time, or is that not good practice? The file is a rather small `.csv` table. – jdesilvio May 23 '15 at 02:43
  • My bad - I misread the question and now understand it after reading @spickermann's comment. I thought you just wanted to use S3 for file storage with a Heroku app. Missed the part about writing the file to a directory in your Rails app... sorry. – steve klein May 23 '15 at 02:44
  • I am new to building apps, but I am a bit perplexed by this. I feel like this is a fairly common thing: run a computation-intensive task on one machine, then pull the results into an app and serve them quickly. – jdesilvio May 23 '15 at 02:48
  • Why are you not able to serve the file directly from Amazon S3, just like you can serve Bootstrap or JavaScript from a content delivery network? You will find this approach much more scalable. – practicalli-john May 24 '15 at 00:11

1 Answer


With Heroku you don’t have a single app running; rather, you have several dynos, each with a copy of your code and running some aspect of your app, and each independent of the others. In particular, each dyno’s file system is separate from the others.

In your case you push your app and this creates one (or maybe more) web dynos, which run your Rails app and handle web requests.

You also have a scheduled task using the Heroku Scheduler that downloads the file. When this runs, a new one-off dyno is created and the file is downloaded into that dyno’s file system. When the task is completed, that dyno is discarded along with the downloaded file.

When you run `heroku run bash` you create yet another one-off dyno, and obviously the file isn’t in that dyno’s filesystem.

The solution will depend on exactly what you are trying to do, but one suggestion would be to put the data from the file into your database where the other dynos can access it easily.
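As a rough sketch of that suggestion, the scheduled task could read the S3 object into memory rather than onto disk, parse it, and hand the rows to whatever persistence layer the app uses. The helper names `fetch_csv_body`, `parse_rows` and `import_rows` are made up for illustration, and the `store` argument is only a stand-in for a database table. The sketch assumes `client` is an `Aws::S3::Client` from the aws-sdk-s3 gem and that the CSV has a header row.

```ruby
require "csv"

# Hypothetical helper: reads an S3 object's contents into memory instead of
# writing them to the dyno's disk. `client` is assumed to be an
# Aws::S3::Client; any object that answers get_object the same way works too.
def fetch_csv_body(client, bucket:, key:)
  client.get_object(bucket: bucket, key: key).body.read
end

# Parses CSV text (assumed to have a header row) into an array of hashes.
def parse_rows(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

# Hypothetical persistence step: `store` stands in for a database table.
# In a Rails app this would more likely be a model call per row.
def import_rows(rows, store)
  rows.each { |row| store << row }
  store.size
end

# Possible use inside the scheduled rake task (not executed here):
#   client = Aws::S3::Client.new
#   rows = parse_rows(fetch_csv_body(client, bucket: "bucket", key: "file.csv"))
#   import_rows(rows, some_store)
```

Because nothing is written to the local file system, it no longer matters which dyno runs the task. The web dynos read the rows back from the shared database.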

matt
  • Thanks. I've heard this before, but for some reason it just clicked. So basically, Heroku is the wrong platform for this app. – jdesilvio May 23 '15 at 02:53
  • @Dixon11111: I would argue your design is wrong. A web server should be stateless. Your app is not able to scale horizontally if you depend on files downloaded to the local file system. Heroku just makes this very obvious. – spickermann May 23 '15 at 05:19
  • I had a suspicion that was the case. – jdesilvio May 23 '15 at 15:31