I am running a Python script as a service on Ubuntu 11.10 (using Upstart). When started, the script runs fine and responds as expected. But after it has been running in the background for a long time, say 12 hours or so, without any activity (user requests), it stops responding. When I check the list of background processes, it is still running.

I understand that it will go to sleep due to inactivity, but it should still respond as expected when a user request arrives, shouldn't it?

After killing the service and starting it again, it works normally again.

bugs99
  • What does the script do? – Antonis Christofides Jun 15 '12 at 13:17
  • @Antonis - it displays a login page for the user to log in and accepts the credentials entered by the user (for now it doesn't matter whether they are right or wrong :D) – bugs99 Jun 18 '12 at 06:26
  • Where does it show this login page? On the web? Is the script a web server itself? Or does apache/nginx/whatever somehow connect to the script, and how? – Antonis Christofides Jun 18 '12 at 16:05
  • Do you see anything if you attach strace to it? Use strace like this: strace -p pid, where pid is the process ID of your Python process. – gm3dmo Jun 15 '12 at 07:22
  • strace -p pid gives the following:
      Process 9596 attached - interrupt to quit
      clock_gettime(CLOCK_MONOTONIC, {180417, 477709279}) = 0
      gettimeofday({1339747541, 292936}, NULL) = 0
      clock_gettime(CLOCK_MONOTONIC, {180417, 477775204}) = 0
      epoll_wait(4, {}, 64, 59743) = 0
      clock_gettime(CLOCK_MONOTONIC, {180477, 240185635}) = 0
      gettimeofday({1339747601, 55516}, NULL) = 0
      clock_gettime(CLOCK_MONOTONIC, {180477, 240459588}) = 0
      epoll_wait(4,
    The script is responding for now (I started it just a short time ago). I will attach the output of the command again when it stops responding. – bugs99 Jun 15 '12 at 08:06
  • Here's the output of strace -p pid when the script was not responding:
      Process 9596 attached - interrupt to quit
      clock_gettime(CLOCK_MONOTONIC, {192396, 290429698}) = 0
      gettimeofday({1339759520, 105694}, NULL) = 0
      clock_gettime(CLOCK_MONOTONIC, {192396, 290592939}) = 0
      epoll_wait(4,
    – bugs99 Jun 15 '12 at 11:26
  • OK, here it is: I made the request from my browser and it just shows "connecting". There is no "connection timed out" error or anything like that; the browser just keeps connecting. The lines in my second comment repeat when I run strace -p pid. – bugs99 Jun 18 '12 at 06:30

1 Answer

Have you tried running this Python script fully detached from a controlling terminal, as the user the job runs as, for longer than 12 hours?

Try this, leave it running for 12+ hours, and see what happens:

http://upstart.ubuntu.com/cookbook/#checking-how-a-service-might-react-when-run-as-a-job

Presumably you could also have your daemon periodically log status information to a file to help you diagnose the issue.
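As a rough illustration of the periodic status logging suggested above, here is a minimal sketch in Python. The function name `start_heartbeat`, the `get_status` callback, and any log path you pass in are hypothetical and not part of the original script; it also assumes that a background thread is acceptable in your service, which may not be the right fit if the script is built around a single event loop.

```python
import logging
import threading
import time


def start_heartbeat(log_path, interval_seconds=60.0, get_status=None):
    """Write a status line to log_path every interval_seconds.

    Hypothetical helper: get_status is an optional callable returning a
    short string (e.g. a request counter) to include in each line.
    Returns (thread, stop_event); call stop_event.set() to stop logging.
    """
    # One logger per file so repeated calls do not share handlers.
    logger = logging.getLogger("heartbeat." + log_path)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)

    stop_event = threading.Event()
    started = time.time()

    def run():
        while True:
            status = get_status() if get_status else ""
            logger.info("alive uptime=%.0fs %s", time.time() - started, status)
            # wait() returns True once stop_event is set, which ends the loop.
            if stop_event.wait(interval_seconds):
                break
        logger.removeHandler(handler)
        handler.close()

    thread = threading.Thread(target=run, name="heartbeat")
    thread.daemon = True  # do not keep the process alive on shutdown
    thread.start()
    return thread, stop_event
```

Calling something like `start_heartbeat("/tmp/myservice-heartbeat.log")` early in the script (the path is only an example) would leave a timestamped trail. If the lines keep appearing while the browser hangs on "connecting", the process itself is alive and the problem is more likely in how connections reach or are accepted by it.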

jamesodhunt