
I have a PHP script running on an Ubuntu Linux machine. The script spawns multiple processes using pcntl_fork(), and uses pcntl_waitpid() to log when they are killed. It spawns these VERY often (I estimate about 40-50/second), but each process is killed immediately (I have tried both exit() and posix_kill(${pid}, SIGKILL), to no avail). The script works fine for several seconds (typically 10-30, it varies), but inevitably halts and stops creating child processes. The script's memory usage on the machine does not increase, but once it halts, the CPU usage slowly spikes until I force-kill the script with Ctrl-C.

Each process is meant to parse a line of text and finally save it to a file. For testing purposes, I am simply exiting the child processes as soon as they are created. In one test, around 1400 processes were successfully started and killed before the script froze, though as I said, this varies.

I understand that a machine has a ulimit, but I believe I read that it limits the number of concurrent processes. Since this script kills the child processes as soon as they are created, I am confused as to what is happening. Here is my current ulimit configuration (ulimit -a):

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 29470
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 29470
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
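
For reference, the same limits can also be read from inside PHP with posix_getrlimit(). A quick sketch, assuming the posix extension is loaded (the exact key names vary slightly by platform):

    // Quick sketch: dump the resource limits as the PHP process sees them.
    // Requires the posix extension; key names vary slightly by platform.
    $limits = posix_getrlimit();
    var_dump($limits);
    // On Linux, 'soft maxproc' / 'hard maxproc' correspond to ulimit -u
    if (isset($limits['soft maxproc'])) {
        echo "max user processes: {$limits['soft maxproc']}\n";
    }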

Is there a global limit in PHP on the total number of processes created during a script's execution? **Keep in mind that these child processes are killed immediately, so I don't believe this is a case of an unbounded number of processes consuming system resources.**

Here is some source code:

I initialize the fork with this:

    $this->process_control->fork(array($this, "${FUNCTION_NAME}"), array(STRING_LINE), false);

The process forking function. In this instance, $callback identifies the method to call on the appropriate class, and $params is an array of parameters to pass to the function specified in $callback.


    public function fork($callback, $params = null, $hang = true)
    {
        $this->logger->write_log('log', "Entered fork function!");

        // Evaluate the return value of the fork
        switch ($pid = pcntl_fork()) {
            case -1: // fork failed
                $this->logger->write_log('error', "Could not fork!");
                exit(1);

            case 0: // child created successfully
                $this->logger->write_log('log', "Entered child function!");
                $this->logger->write_log('log', 'child ' . posix_getpid() . ' started');

                if (empty($callback)) {
                    $this->logger->write_log('warn', "Callback empty, nothing to do!");
                    exit(1);
                }

                if (is_array($callback) && is_array($params)) {
                    if (!call_user_func_array($callback, $params)) {
                        $this->logger->write_log('error', "Daemonized process returned false!");
                        exit(1);
                    }
                } else {
                    if (!call_user_func($callback, $params)) {
                        $this->logger->write_log('error', "Daemonized process returned false!");
                        exit(1);
                    }
                }
                exit(0); // the child must never fall through to the parent's code

            default: // parent
                $this->logger->write_log('log', "Entered parent function!");
                if ($hang != true) {
                    $this->wait($pid, false);
                } else {
                    $this->wait($pid);
                }
                break;
        }
    }

    public function wait($p_id, $hang = true)
    {
        if ($hang) {
            $pid = pcntl_waitpid($p_id, $status);
        } else {
            $pid = pcntl_waitpid($p_id, $status, WNOHANG);
        }

        switch ($pid) {
            case -1: // waitpid failed
                $this->logger->write_log('error', "waitpid failed for child $p_id");
                break;
            case 0: // WNOHANG only: child has not exited yet, so it was not reaped
                $this->logger->write_log('log', "child $p_id still running");
                break;
            default: // child exited and was reaped
                $this->logger->write_log('log', "child $pid exited");
                break;
        }
    }
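
Note that with WNOHANG, a return value of 0 means the child has not exited yet and therefore has not been reaped; if the parent never waits on it again, it remains a zombie after it dies, and zombies still occupy process-table slots that count against ulimit -u. A minimal sketch of a non-blocking loop that reaps every already-finished child (a hypothetical helper, not part of the class above):

    // Hypothetical helper (not part of the class above): reap every child
    // that has already exited, without blocking. -1 means "any child";
    // with WNOHANG, pcntl_waitpid() returns 0 as soon as no more
    // exited children are waiting to be reaped.
    public function reap_all()
    {
        while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
            $this->logger->write_log('log', "child $pid exited with status " . pcntl_wexitstatus($status));
        }
    }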

The function that actually does the processing of the lines of text; each line is a JSON object:

    public function FUNCTION_NAME($line)
    {
        $this->logger->write_log('info', 'entered FUNCTION_NAME function');

        $start_time = microtime(true);

        try {
            # check that the JSON line is not malformed
            $line_array = json_decode($line, true);
            if ($line_array === null) {
                throw new Exception('Could not successfully process line');
            }
            # save the contents to disk (=== false so a zero-byte write is not treated as failure)
            if (file_put_contents(FILE_NAME, $line, LOCK_EX) === false) {
                throw new Exception('file could not be saved');
            }
            $this->logger->write_log('info', 'saved line');
            return true;
        } catch (Exception $e) {
            $this->logger->write_log('error', $e->getMessage());
            $this->logger->write_log('error', '----------------------------------------------------');
            $this->logger->write_log('error', var_export($line, true));
            $this->logger->write_log('error', '----------------------------------------------------');
            file_put_contents(ERROR_SRC_FILE, $line, FILE_APPEND);
            return false;
        }
    }

Sorry if this is too much code; let me know if you have any questions.

  • `Each process is meant to parse a line of text and finally save it to a file` - that doesn't sound like something that needs a whole new process to me, why do you feel this is the way to do it? – DaveRandom Apr 16 '12 at 14:58
  • It may be relevant to post some of your code, parent and children. – netcoder Apr 16 '12 at 15:01
  • @DaveRandom: The script as a whole parses a text stream which I initialize with cURL. I was having difficulty parsing the stream fast enough, as I had combined both the stream retrieval and the parsing/saving processes in an asynchronous manner. In other words, as I processed the stream as it was sent over the network, the processing took too long, and the text stream would get backed up into a queue. The source I am receiving the stream from has a limit on this backup. I decided to take the parsing functionality out of the streaming process in the hope that it would take less time. – iralls Apr 16 '12 at 15:23
  • If you need to parse in parallel, then consider NOT using php. It's never been intended for multithreading, and probably never will be. – Marc B Apr 16 '12 at 15:30
  • @MarcB: Yes I have come to realize this. Unfortunately I do not have the option or the time to start from another language. – iralls Apr 16 '12 at 15:36
  • @rallsi23 If you are a web developer and know anything about Javascript, Node.js might be the answer. There are many things for which Node is not suitable, but this is something where it would probably work well. Alternatively, you could pass the parsing off to another single process which you start with `popen()` or `proc_open()` and pass the data to the other process' STDIN. You can pass this pipe straight to cURL, and then the buffering will be done on your side and not the remote server. Is this a very long running process/stream? – DaveRandom Apr 16 '12 at 15:41
  • @DaveRandom Thanks for the info. I will look into this. Yes, this will be a continuously running process, 24/7. I know basic Javascript, but one reason I am using PHP is for uniformity. Similar projects at my company have been written in PHP, and they would prefer PHP as more people can troubleshoot it. – iralls Apr 16 '12 at 15:55
  • Reading through your code it looks like the bottleneck here is with `json_decode()`, and it also looks as though you aren't actually doing anything with the data in the PHP script, just checking it for syntactically correct JSON. In which case I imagine Node would be a better tool for the job, in that it is based on Javascript and therefore (I would have thought) should be better/more efficient at parsing JSON. It is also designed to be asynchronous, so it would be much better suited to the task than PHP. – DaveRandom Apr 16 '12 at 16:00
  • Since you call `pcntl_waitpid` with `WNOHANG` without any sort of loop to check after the first call, are you sure the processes aren't in zombie state? Even if you `exit` from the child, the parent still has to clean up the zombie processes once they are finished, which doesn't seem to be the case unless `$hang` is `true`. – netcoder Apr 16 '12 at 16:30
  • @all Thanks for the help. I believe the problem was not that a process 'limit' was reached, but that the information I was receiving from the stream was cut off, essentially killing the parent process and halting further execution. I took out some of the processing, which seemed to help. I am still having some issues, but I know it is not a PHP process creation problem, but rather a stream ingestion one. – iralls Apr 18 '12 at 15:35
  • @netcoder Thanks for the tip. I actually don't pass a parameter to use WNOHANG, it was just an option I had left in to test. – iralls Apr 18 '12 at 15:37

1 Answer


To answer the original question: I have not found any limit in PHP on the total number of processes a script can create, as long as they are reaped soon after creation. There is, however, a per-user limit on the number of processes in Linux, which can be set with ulimit.
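
For anyone who lands here with the same problem: following DaveRandom's suggestion in the comments, one alternative is to feed lines to a single long-running worker over a pipe instead of forking per line. A rough sketch, where worker.php is a hypothetical script that reads lines from STDIN and saves them:

    // Rough sketch: one long-running worker instead of a fork per line.
    // 'worker.php' is hypothetical; it would read lines from STDIN.
    $descriptors = array(
        0 => array('pipe', 'r'),              // worker's STDIN; the parent writes to it
        1 => array('file', '/dev/null', 'w'), // discard the worker's STDOUT
        2 => array('file', '/dev/null', 'w'), // discard the worker's STDERR
    );
    $worker = proc_open('php worker.php', $descriptors, $pipes);
    if (is_resource($worker)) {
        foreach ($lines as $line) { // $lines stands in for whatever the stream yields
            fwrite($pipes[0], $line . "\n");
        }
        fclose($pipes[0]);
        proc_close($worker);
    }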
