I'm running Django on Linux using fcgi and Lighttpd. Every now and again (about once a day) the server just dies. I'm using the latest stable release of Django, Python and Lighttpd.

The only thing I can think of is that my program is opening a lot of files and executing a lot of external processes, but I'm fairly sure that side of things is watertight.

Looking at the error and access logs, there's nothing exceptional happening (i.e. load isn't above normal). On those occasions where I have had exceptions from Python, these have shown up in the error.log, but when this crash happens I get nothing.

Is there any way of finding out why the process died, short of putting logging statements on every single line? I can't reproduce the crash, so I don't know exactly where to look.

Edit

It's the Django process that's dying. I'm running the server with: manage.py runfcgi daemonize=true method=threaded host=127.0.0.1 port=12345

Joe
  • Have you looked for core files? Have you set your rlimits to permit core files? – jemfinch Apr 08 '10 at 13:40
  • Can you just run the server from the command line, in a non-daemonizing debug mode? – Mike DeSimone Apr 08 '10 at 13:41
  • Reading the question again, one thing is not entirely clear: is it the lighttpd daemon dying, or your own FastCGI process? – Thomas Apr 08 '10 at 13:45
  • @Thomas - it's the Django process. I've clarified the question. – Joe Apr 08 '10 at 13:49
  • @Mike - I could, but the problem is that the site is dying in production, and I want the production server to be running as a daemon (don't I?). The test site runs fine. I will try stress-testing a site with non-daemonizing mode and see what happens. – Joe Apr 08 '10 at 13:51
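
The core-file suggestion in the comments can be tried from inside the process itself. The following is only a sketch, assuming a Linux box and that the hard limit set by the administrator is non-zero; `enable_core_dumps` is a hypothetical helper name, not part of Django. It would have to run early in manage.py, before the server daemonizes.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit only up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    # Prints the (soft, hard) pair now in effect.
    print(enable_core_dumps())
```

A core file is only written when the process is killed by a fatal signal such as SIGSEGV or SIGABRT, not on an ordinary Python exception. Where it ends up depends on the kernel's core pattern and on the working directory of the daemonized process, which must be writable.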

3 Answers

You could edit manage.py to redirect stderr to a file, assuming runfcgi doesn't do that itself:

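import sys

# Redirect only when started as the FastCGI server; the length check avoids an IndexError when no command is given.
if len(sys.argv) > 1 and sys.argv[1] == "runfcgi":
    sys.stderr = open("/path/to/my/django-error.log", "a")  # append, so earlier output survives restarts

Mike DeSimone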
  • Thanks for the suggestion. I think that as I was getting various exceptions in lighttpd's error.log (for unrelated reasons), stderr is already logged. Suffice it to say, the log is empty when the process dies. – Joe Apr 09 '10 at 15:04

Is this on your own server (do you own the box)? I've had that problem on shared hosting, where the host was simply killing long-running processes. Do you know if your fcgi process is receiving a SIGTERM?
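
One way to find out is to log the signal before the process exits. This is a rough sketch rather than anything Django provides: the logger name and both function names are made up for illustration, and it assumes the handlers are installed from the main thread (Python only allows that) before the server starts.

```python
import logging
import os
import signal

log = logging.getLogger("fcgi.signals")  # hypothetical logger name


def log_and_exit(signum, frame):
    """Record which signal arrived, then exit with the conventional 128+N status."""
    log.warning("pid %d received signal %d, exiting", os.getpid(), signum)
    raise SystemExit(128 + signum)


def install_signal_logging(signals=(signal.SIGTERM, signal.SIGHUP)):
    """Install the logging handler for each signal; must run in the main thread."""
    for sig in signals:
        signal.signal(sig, log_and_exit)
```

Two caveats. SIGKILL cannot be caught, so a kill from the kernel's out-of-memory killer or from `kill -9` will still leave the log empty; in that case the system log is the place to look. Also, the FastCGI server library may install its own signal handlers later and replace these.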

Jared Forsyth
  • Do you know what process would be sending those messages? It's my [virtual] box. I have a couple of Django processes. This is the only one dying. – Joe Apr 16 '10 at 16:13

We've had the same problems. Not only do the processes die without warning or reason, they also leak like crazy, with threads left stuck without a master process. We worked around this with a cron job that runs every 5 minutes, checks whether the port is accepting connections and, if not, restarts the server.
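
A check like that could look roughly like the script below. It is a sketch and not our actual script: the host and port are taken from the question, while `RESTART_CMD` and the function names are assumptions that need adjusting (in particular the path to manage.py) for a real deployment.

```python
import socket
import subprocess
import sys

HOST, PORT = "127.0.0.1", 12345  # values from the runfcgi command in the question
# Assumed restart command; the path to manage.py is deployment-specific.
RESTART_CMD = ["python", "manage.py", "runfcgi", "daemonize=true",
               "method=threaded", "host=127.0.0.1", "port=12345"]


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True


def check_and_restart(host=HOST, port=PORT, restart_cmd=RESTART_CMD):
    """Restart the server if nothing is listening; return True if a restart was attempted."""
    if port_is_open(host, port):
        return False
    try:
        subprocess.call(restart_cmd)
    except OSError as exc:
        sys.stderr.write("restart failed: %s\n" % exc)
    return True


if __name__ == "__main__":
    check_and_restart()
```

Run from cron every 5 minutes, this only proves that something is listening on the port. A server whose threads are stuck may still accept connections, so a stricter check would send a real request and wait for a response.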

By the way, we have since given up on fcgi and are slowly migrating to uWSGI.

Peter Bengtsson