app.crash index: 0, reason: CRASHED, exit_description: Instance became unhealthy: hijack: Backend error: Exit status: 500, message: {"Type":"","Message":"no space left on device","Handle":"","ProcessID":""} ; connection: process error: could not find the exitcode file for the process: stat /var/vcap/data/garden/depot/XXX/processes/XXX/exitcode: no such file or directory; connection:

Lately, the above error has been occurring persistently on several apps running in the AppCloud. The affected apps are unrelated to each other and quite different, and they use different buildpacks.

Can someone help, or is anyone else seeing the same problem?

Setup example: a PHP web app deployed with the Cloud Foundry PHP buildpack in the Swisscom Application Cloud, pushed twice as two separate apps. One app crashes with the error above and the other does not, which is very strange because both have exactly the same config and environment. HTTPD and PHP with all dependencies are installed in the Linux container; the stack is cflinuxfs2 (https://github.com/cloudfoundry/cflinuxfs2).

seinol
  • you should probably give a bit more context, especially with regard to the setup and what you have tried to remedy the situation – rmalchow Nov 13 '17 at 09:56
  • @rmalchow did it above... I have tried nothing special so far to prevent this, because the error occurs only sporadically and the app starts again immediately after crashing. – seinol Nov 13 '17 at 10:11
  • It says there is no space left. Have you checked that? – M. Prokhorov Nov 13 '17 at 10:26
  • @M.Prokhorov yes I checked that, the app has more than enough space left ( size:1008M used:178M free:764M 19% mounted-on: / ) – seinol Nov 13 '17 at 10:38
  • anything else mounted on /var maybe? – rmalchow Nov 13 '17 at 11:57
  • well ... i missed that message ... but for sure someone somewhere is running out of disk space? the host maybe? some other component? – rmalchow Nov 13 '17 at 11:59
  • I can't imagine that anything is running out of space... the app only stores data in a DB, and nothing stores data in the container. – seinol Nov 13 '17 at 12:51
  • I found some apps which only crash with the second part of the error message: "index: 0, reason: CRASHED, exit_description: Instance became unhealthy:; connection: process error: could not find the exitcode file for the process: stat /var/vcap/data/garden/depot/XXX/processes/XXX/exitcode: no such file or directory; cancelled  " – seinol Nov 13 '17 at 12:53

1 Answer

We have investigated these crashes and discovered that they are due to an issue in our configuration of Cloud Foundry, which causes the VMs that host the app containers to run out of inodes. The OS then reports "no space left on device", which is arguably a bit misleading and had us fooled for a while.
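To illustrate why the error is misleading: a filesystem can have plenty of free blocks and still refuse to create files once its inodes are exhausted. The following is only a minimal sketch, not the tooling we used; it assumes a Unix-like system, and the helper names `percent_used` and `fs_usage` are made up for this example. It compares block usage with inode usage via Python's `os.statvfs`. Note that running it inside an app container shows the container's filesystem, not necessarily that of the host VM where the inodes ran out.

```python
import os


def percent_used(total, free):
    """Return the used share in percent, or None if the total is unknown (0)."""
    if total <= 0:
        return None  # some filesystems report 0 inodes in total
    return round(100.0 * (total - free) / total, 1)


def fs_usage(path="/"):
    """Collect block and inode usage for the filesystem containing `path`."""
    st = os.statvfs(path)
    return {
        # f_blocks / f_bfree: total and free data blocks
        "blocks_used_pct": percent_used(st.f_blocks, st.f_bfree),
        # f_files / f_ffree: total and free inodes
        "inodes_used_pct": percent_used(st.f_files, st.f_ffree),
        "inodes_free": st.f_ffree,
    }


if __name__ == "__main__":
    usage = fs_usage("/")
    print("blocks used: {blocks_used_pct}%".format(**usage))
    print("inodes used: {inodes_used_pct}%".format(**usage))
    print("inodes free: {inodes_free}".format(**usage))
```

If the inode figure is close to 100% while the block figure is low, creating new files or directories fails with "no space left on device" even though `df -h` looks healthy; `df -i` shows the same inode numbers on the command line.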

We are currently working on a new release which will fix this issue. We plan to deploy this release to production as soon as possible. We will keep you updated via this post.

Update: we have rolled out a new platform release which has fixed the issue. Please check your event logs to confirm this.

  • Thank you Mathis. I tried to explain this issue to the Developer Support but they weren't willing to track this issue. – seinol Nov 14 '17 at 08:59
  • @seinol Developer Support are the same guys who troubleshot this issue and will roll out a fix for it. It's the same App Cloud team :-) – Sybil Nov 17 '17 at 12:17
  • @FyodorGlebov Okay :-) Unfortunately, it was not the first time that I received incomplete answers from Developer Support. – seinol Nov 28 '17 at 08:10
  • @Mathis At first I thought everything was working without crashes, until I found 3 apps which crashed on November 22 and 23. When did the release rollout take place? Since the 23rd I haven't noticed another crash. exit_description: Instance became unhealthy: hijack: Backend error: Exit status: 500, message: {"Type":"","Message":"mkdir /var/vcap/data/garden/depot/XXX/processes/XXX: no space left on device","Handle":"","ProcessID":""} – seinol Nov 28 '17 at 08:23
  • @seinol are you working on the public Application Cloud instance or on the Swisscom internal one? The internal instance was fixed on the evening of November 23. The public instance was fixed on the evening of November 21. – Mathis Kretz Nov 29 '17 at 16:18
  • @Mathis on the internal one. It seems to have been running without crashes since Nov 23. Thank you. – seinol Dec 01 '17 at 15:50