
My plan is to provide a script that does just what the title states. I've got an idea, which I'll describe below. If you think something sounds bad or stupid, I'd be grateful for any constructive comments, improvements, etc.

There are 2 services I want to start as daemons. One is required (a caching service), one is optional (http access to the caching service). I use the argparse module to read the caching service port (a positional port argument) and an optional --http-port for http access. I already have this and it works. Now I'd like to start the daemons. The services are based on twisted, so they have to start the reactor loop. For now I would like to have two different processes: one for the service and a second one for http access (though I know it might be done in a single async process).

Since starting twisted service is done via reactor loop (which is python code, not a shell script, since I don't use twistd yet), I think that using os.fork is better than subprocess (which would need a command line command to start the process). I can use os.fork to start daemons and touch service.pid and http.pid files, but I don't know how to access the child pid, since os.fork returns 0 for the child.

So the child PID is what I'm missing. Moreover, if anything seems illogical or overcomplicated, please comment on that.

My current code looks like this:

#!/usr/bin/python
import argparse
import os

from twisted.internet import reactor

parser = argparse.ArgumentParser(description='Run PyCached server.')
parser.add_argument('port', metavar='port', type=int,
    help='PyCached service port')
parser.add_argument('--http-port', metavar='http-port', type=int, default=None,
    help='PyCached http access port')
args = parser.parse_args()

def dumpPid(name):
    # write this process's PID to <name>.pid; the with block closes the file
    with open(name + '.pid', 'w') as f:
        f.write(str(os.getpid()))

def erasePid(name):
    os.remove(name + '.pid')

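def run(name, port, factory):
    dumpPid(name)
    # print(...) with a single formatted string works on Python 2 and 3
    print("Starting PyCached %s on port %d" % (name, port))
    reactor.listenTCP(port, factory)
    reactor.run()  # blocks until the reactor is stopped
    erasePid(name)
    print("Successfully stopped PyCached %s" % (name,))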

# start service (required)
fork_pid = os.fork()
if fork_pid == 0:
    from server.service import PyCachedFactory
    run('service', args.port, PyCachedFactory())
else:
    # start http access (optional)
    if args.http_port:
        fork_pid = os.fork()
        if fork_pid == 0:
            from server.http import PyCachedSite
            addr = ('localhost', args.port)
            run('http', args.http_port, PyCachedSite(addr))
        else:
            pass

I run it with:

./run.py 8001 # with main service only

or:

./run.py 8001 --http-port 8002 # with additional http

System shutdown is done via a single shell script:

#!/bin/bash

function close {
    f="$1.pid"
    if [ -f "$f" ]
    then
        kill -s SIGTERM "$(cat "$f")"
    fi
}

close http
close service
ducin

1 Answer


Since starting twisted service is done via reactor loop (which is python code, not a shell script, since I don't use twistd yet), I think that using os.fork is better than subprocess (which would need a command line command to start the process).

You should use twistd. If not, then you should write a Python script for launching the daemon. Then you should use the subprocess module (or reactor.spawnProcess) to launch the child process.
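As a rough illustration of the subprocess route, here is a minimal sketch. The helper name `launch` is hypothetical, and the `<name>.pid` naming is borrowed from the question. It assumes the daemon lives in its own Python script that can be started from a command line. The point is that `Popen.pid` gives the parent the child's PID directly, so the parent can write the pid file itself.

```python
import subprocess


def launch(name, argv):
    """Start argv as a separate process and record its PID in <name>.pid.

    `launch` is a hypothetical helper, not part of Twisted or the stdlib.
    """
    proc = subprocess.Popen(argv)
    # Popen.pid is the PID of the child process, known to the parent
    # as soon as Popen returns.
    with open(name + '.pid', 'w') as f:
        f.write(str(proc.pid))
    return proc
```

A call might look like `launch('service', [sys.executable, 'service_main.py', '8001'])`, where `service_main.py` is a placeholder for whatever script starts the reactor. Note that in this sketch nothing removes the pid file when the child exits; that cleanup would still have to happen somewhere.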

Using os.fork without immediately proceeding to one of the os.exec* functions is broken. A large amount of state is shared between the parent and child created by os.fork. You can't be sure that this sharing won't break something (and I can tell you it will break some things in Twisted).
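For comparison, this is roughly what "fork, then immediately exec" looks like when written by hand. It is only a sketch for POSIX systems, and `fork_exec` is a hypothetical name. It also shows where the child PID comes from: `os.fork()` returns 0 in the child but returns the child's PID in the parent.

```python
import os


def fork_exec(argv):
    """Fork, then immediately replace the child with a fresh program.

    argv[0] must be the full path of the executable. Returns the
    child's PID to the parent. (Hypothetical helper, POSIX only.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: os.fork() returned 0. Exec right away so that none of
        # the parent's interpreter state is reused.
        try:
            os.execv(argv[0], argv)
        finally:
            # Only reached if execv raised; leave without running the
            # parent's cleanup handlers in the child.
            os._exit(127)
    # Parent: os.fork() returned the child's PID.
    return pid
```

In practice the subprocess module takes care of these details, which is one reason to prefer it over calling os.fork directly.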

Here are some links to discussions of fork-without-exec issues that might help you get more of an idea of what a troublesome area this is.

Jean-Paul Calderone
    I like your answer, I can learn something from it. Can you please point me to a web/book resource where I can read about `os.fork`/`os.exec*`/large amount of state? I'm pretty new to OS-alike stuff and I don't really understand what you mean here. PS everything seems to work with above code. Does it mean it's just a matter of scale until something breaks? – ducin Nov 03 '13 at 19:14
    added some links that you might find interesting. breakage most likely isn't a matter of scale but instead a matter of which library code you try to use - things will appear to work just fine until you start invoking some library functionality that interacts poorly with this usage (which may happen when your application stays the same but the *library* changes). in fact I don't think the code you posted in your question does work reliably: on linux it will use epollreactor and epollreactor does really confusing, broken things when used with fork without exec. – Jean-Paul Calderone Nov 03 '13 at 23:54