
I am not a programmer, but I would like to know how the following task can be performed. I am working with an experienced developer and would like help from the programming community. If you can guide us or provide a script, that would be awesome (and thank you ahead of time!).

Here is the scenario:

I have a Java application running on Tomcat 7 on an Amazon EC2 instance (Linux).

I do not want any downtime for the web application (or very little).

If the web application crashes for any reason, I want it to restart automatically and immediately while I look for the cause of the crash.

Please note the scenario we are trying to deal with: Tomcat is still running, but the web application is down. So checking whether Tomcat is down cannot be used as the trigger for restarting the application.

Jason
  • Can you edit your question to include more details about your EC2 instance? Are you using a Windows or Linux AMI? I suspect you will get lots of answers using cron and ps, but those may not apply to you if you use a Windows environment. – Justin Garrick Aug 16 '12 at 13:23
  • Linux, Tomcat 7. We tried a script, but it checked for Tomcat being down. We discovered that Tomcat is not actually down; our app dies because there is a memory leak and we are getting sudden spikes in connections to the database (RDS). We are working to find the cause of that, with no luck yet. In general, though, we would like to have an auto-restart script in place so that the site at least comes back up while we look for the reason the app went down. – Jason Aug 16 '12 at 13:55

1 Answer


I don't know enough about EC2 to fully answer this (and I only have a few minutes before work), but if it's running in a scriptable environment and you can set up a cron job, it should be pretty easy.

  1. Have your app serve a page with some parseable XML or JSON. This is commonly called a "canary" page, like a canary in a coal mine (it detects the gas and dies first, so you get an early warning).
  2. Create a script that requests the canary page using curl, wget, or something similar. If the request fails (i.e. your app is down), restart Tomcat (or kill Tomcat and start it again if it's still running, which it sounds like it will be).
  3. Set the job to run every minute or so via cron
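
The three steps above can be sketched as a small watchdog script. This is only an illustration under stated assumptions, not a tested production setup: the restart command (`sudo service tomcat7 restart`), the script name, and the example URL are all hypothetical and depend on how Tomcat is installed on your instance. It treats anything other than an HTTP 200 within the timeout as "down".

```python
#!/usr/bin/env python3
"""Watchdog sketch: restart Tomcat when the canary page stops answering."""
import http.client
import subprocess
import sys
import urllib.error
import urllib.request

# Assumption: Tomcat is installed as a service named "tomcat7".
# Replace with whatever command restarts Tomcat on your server.
DEFAULT_RESTART_CMD = ["sudo", "service", "tomcat7", "restart"]


def app_is_up(url, timeout=10):
    """Return True only if the page answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused connection, timeout, HTTP 4xx/5xx, bad URL: all count as "down".
        return False


def check_and_restart(url, restart_cmd=DEFAULT_RESTART_CMD):
    """Run the restart command if the canary check fails; return an exit code."""
    if app_is_up(url):
        return 0
    print("canary check failed for %s, restarting Tomcat" % url, file=sys.stderr)
    return subprocess.call(restart_cmd)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        sys.exit(check_and_restart(sys.argv[1]))
    print("usage: tomcat_watchdog.py CANARY_URL", file=sys.stderr)
```

A crontab entry along these lines (paths and URL are placeholders) would run the check once a minute: `* * * * * /usr/bin/python3 /opt/scripts/tomcat_watchdog.py http://localhost:8080/myapp/canary`. Keep in mind that a restart can take longer than the cron interval, so it may be worth choosing an interval longer than your app's startup time to avoid restarting it while it is still coming up.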

Obviously, your best option is to write an app that doesn't crash :) But in the meantime, this could help, and a canary page that shows some of your app's configuration info is generally very useful anyway.

Sorry I don't have in-depth details, but maybe others can add some in the comments (or I will when I get back from work).

Also, Tomcat / Java EE offers a ton of bulletproof error handling, so it should be rare for your app to go down. Are you sure your developer knows what he's doing?

Arjan Tijms
David Welch
  • David - thank you for your response. As I mentioned, the developer I am working with is inexperienced. I appreciate your help in this matter. Any additional comments will be helpful. Thank you. – Jason Aug 16 '12 at 13:33
  • My pleasure. Just spoke with a colleague and he had another suggestion: if you have the Tomcat manager app running you could remotely issue a start / stop + start / redeploy. I'm not a fan of running that app in production, but if it's secured it could be a good option. How you do so depends on your version of Tomcat, but see [link](http://tomcat.apache.org/tomcat-7.0-doc/manager-howto.html#Start_an_Existing_Application) for an example. – David Welch Aug 16 '12 at 14:53
  • Thanks! Tomcat 7. Also, do you know a way we can run a memory profiler on our live server, to see which process may be spiking the DB connections that seem to bring down our app? – Jason Aug 16 '12 at 15:18
  • Remote debugging a production server is not a great option, but expanding your logging and examining the logs around the time of a crash is probably your best bet. Each of those is broad enough to warrant its own SO post. – David Welch Aug 16 '12 at 15:31
  • We are trying to figure out what is causing the spikes in DB connections that seem to halt the app. We have not isolated that yet. We are a small remote team and are sometimes unable to reach our server for a few minutes when the app crashes. So basically I wanted to see if we can put in a script that brings the application back up automatically, to minimize downtime while we try to isolate why our app suddenly started crashing last Sunday. – Jason Aug 16 '12 at 15:36
  • David - your solution works. We implemented it, and when our app did go down, it recovered automatically. Thanks! We did not need to use an XML or JSON response page; using curl, we simply check whether a page (the home page) responds, and if there is no response we restart Tomcat. – Jason Aug 16 '12 at 18:28
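
For the Tomcat manager suggestion in the comments above, a rough sketch of calling the manager's text interface from a script might look like the following. It assumes the manager app is deployed and that the user has the `manager-script` role in `tomcat-users.xml`; the host, context path, and credentials shown are placeholders, not values from this thread. In Tomcat 7 the text interface accepts commands such as `reload`, `stop`, and `start`, and replies with a line beginning with `OK` or `FAIL`.

```python
"""Sketch: ask the Tomcat 7 manager (text interface) to reload one web app."""
import base64
import urllib.parse
import urllib.request


def build_manager_url(base_url, command, context_path):
    """Build e.g. http://host:8080/manager/text/reload?path=%2Fmyapp."""
    query = urllib.parse.urlencode({"path": context_path})
    return "%s/manager/text/%s?%s" % (base_url.rstrip("/"), command, query)


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def manager_command(base_url, command, context_path, user, password, timeout=30):
    """Send one manager command and return Tomcat's plain-text reply."""
    request = urllib.request.Request(build_manager_url(base_url, command, context_path))
    request.add_header("Authorization", basic_auth_header(user, password))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Example call (placeholder values; requires a running, secured manager app):
# reply = manager_command("http://localhost:8080", "reload", "/myapp",
#                         "deployer", "change-me")
# if not reply.startswith("OK"):
#     print("reload failed:", reply)
```

Reloading a single app this way is gentler than restarting all of Tomcat, but if the underlying problem is a memory leak in the JVM, a full restart as in the accepted approach may still be needed.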