I'm building a Firebird DB transaction manager in Python on Linux with JS+PHP clients. JavaScript sends all the necessary information to PHP; PHP encodes it and sends it over a socket to Python, which keeps a socket bound to a port, constantly listening, and starts a new thread (using threading) to handle each request asynchronously.

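The accept loop in Python:

while True:
    conn, addr = s.accept()

    # The request is read here, in the main thread, before the worker starts
    req = conn.recv(1024)
    ret = read_headers(req)

    # One thread per connection; smphr is a semaphore shared by the workers
    threading.Thread(target=client_thread, args=(conn, addr, ret, smphr)).start()
s.close()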
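
For reference, here is a minimal, self-contained sketch of the same thread-per-connection pattern. It is not the real server: `start_server`, `serve`, and the echo behaviour of `client_thread` are hypothetical stand-ins, and the Firebird query is left out. The one deliberate difference is that the request is read inside the worker, so the main thread goes straight back to `accept()`.

```python
import socket
import threading


def client_thread(conn, addr):
    # Hypothetical worker: it reads the request itself, so the accept
    # loop never waits on recv(). The real one would run the query here.
    with conn:
        req = conn.recv(1024)
        conn.sendall(b"echo:" + req)
    # Leaving the with-block closes the socket; a client that reads
    # until end-of-stream only stops once this close happens.


def serve(s):
    # Accept connections forever and hand each one to its own thread.
    while True:
        conn, addr = s.accept()
        threading.Thread(
            target=client_thread, args=(conn, addr), daemon=True
        ).start()


def start_server(host="127.0.0.1", port=0, backlog=100):
    # port=0 lets the OS pick a free port; backlog=100 matches the
    # value mentioned in the comments.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(backlog)
    threading.Thread(target=serve, args=(s,), daemon=True).start()
    return s
```

Calling `start_server()` returns the listening socket, and `getsockname()[1]` on it gives the port a test client can connect to.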

The send/read block in PHP:

$sock = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
$sockconnect = socket_connect($sock, $host, $port);
$msg = urldecode(http_build_query($params));
socket_write($sock, $msg, strlen($msg));
$received = '';
// Keep reading until the server closes the connection (socket_recv then returns 0)
while (socket_recv($sock, $buf, 1024, 0) >= 1) {
    $received .= $buf;
}
echo $received;
socket_close($sock);

It all seemed to work properly until we started testing with a larger number of connections. I have a loop in the JS client that sends several query requests (25-100 are the numbers I've used so far), each one a select first random-number-of-lines from a large table.

The first few requests that the server receives are processed simultaneously but then it seems to become synchronous.

After much logging, I found out that only 7 or 8 threads are active at any given time. New requests are only accepted and processed after one of the 7 current ones finishes.

If I comment out the socket_recv while loop in PHP, Python then runs everything simultaneously and returns each result as soon as it is available, which is exactly what I want. But since I've commented out the block that actually gets the result, nothing is shown (obviously).

Every request is logged as a separate script call (according to Chrome's network dev tool), so I don't know why they're blocking each other.

I'm fairly new to PHP/Python and I can't for the life of me figure out what's going on.

Any suggestions?

Edit: I've also tried different bits of code to read the response in PHP (none worked as intended):

$buf = 'buffer';
socket_recv($sock, $buf, 1024, MSG_WAITALL);
echo $buf;

Same as the previous implementation: the 7/8 thread 'limit'.

$buf = 'buffer';
socket_recv($sock, $buf, 1024, MSG_DONTWAIT);
echo $buf;

As the flag implies, this doesn't wait for a response and therefore gets none.

while ($out = socket_read($sock, 1024, PHP_NORMAL_READ)) {
    echo $out;
}

Same 7/8 thread limit.

Second edit:

Added Python prints, in case it helps.

With the read in PHP:

starting select first 3000 * from receb_quotas on tr1
starting select first 1 * from receb_quotas on tr0
starting select first 1 * from receb_quotas on tr2
starting select first 1 * from receb_quotas on tr4
starting select first 3000 * from receb_quotas on tr3
starting select first 3000 * from receb_quotas on tr5
finishing tr4 (count: 1) | remaining threads: 7
finishing tr0 (count: 1) | remaining threads: 7
starting select first 150 * from receb_quotas on tr8
starting select first 3000 * from receb_quotas on tr6
finishing tr2 (count: 1) | remaining threads: 7
starting select first 1 * from receb_quotas on tr7
finishing tr7 (count: 1) | remaining threads: 7
starting select first 3000 * from receb_quotas on tr9
finishing tr8 (count: 150) | remaining threads: 7
finishing tr1 (count: 3000) | remaining threads: 6
finishing tr3 (count: 3000) | remaining threads: 5
finishing tr6 (count: 3000) | remaining threads: 4
finishing tr5 (count: 3000) | remaining threads: 3
finishing tr9 (count: 3000) | remaining threads: 2

Without the read in PHP:

starting select first 3000 * from receb_quotas on tr1
starting select first 15 * from receb_quotas on tr0
starting select first 15 * from receb_quotas on tr3
starting select first 3000 * from receb_quotas on tr4
starting select first 1500 * from receb_quotas on tr2
starting select first 150 * from receb_quotas on tr5
starting select first 1 * from receb_quotas on tr6
starting select first 1500 * from receb_quotas on tr7
starting select first 150 * from receb_quotas on tr8
starting select first 15 * from receb_quotas on tr9
finishing tr0 (count: 15) | remaining threads: 11
finishing tr3 (count: 15) | remaining threads: 10
finishing tr6 (count: 1) | remaining threads: 9
finishing tr9 (count: 15) | remaining threads: 8
finishing tr8 (count: 150) | remaining threads: 7
finishing tr5 (count: 150) | remaining threads: 6
finishing tr7 (count: 1500) | remaining threads: 5
finishing tr2 (count: 1500) | remaining threads: 4
finishing tr1 (count: 3000) | remaining threads: 3
finishing tr4 (count: 3000) | remaining threads: 2

It really does seem that, without the read in PHP, the queries are all started at the same time and returned as soon as they are ready.

Igor Sousa
  • Can't be sure this is relevant but: What is the `backlog` argument being given to `s.listen`? – Gil Hamilton Apr 07 '16 at 15:31
  • @GilHamilton s.listen is set to 100 – Igor Sousa Apr 07 '16 at 15:58
  • I would put a big fat "sleep forever" in the python sub-thread (`signal.pause()` or `time.sleep(a_large_number)`). Then send, say, 25 or 50 requests. Then try to figure out what state those threads/connections are in. Since you *should* have 25-50 different threads at that point, it may be possible to figure out what's going on. You don't say what platform, but on linux at least, you could launch python with `strace -o /tmp/xxx -ff python ...` and let strace show all the system calls. – Gil Hamilton Apr 07 '16 at 16:01
  • @GilHamilton The python script runs on linux – Igor Sousa Apr 07 '16 at 16:11
  • K. So I would use `signal.pause`. With the `strace` command, after all requests have been launched (assuming 25 requests), you should have 26 threads (hence 26 files named `/tmp/xxx.NNN`). Look at the last lines in each file. One (the main server thread) should be pending in the `accept` system call, the others should all be pending in `pause`. If not, you may be able to see where they're stuck. (If all 25 *are* now in `pause`, there may be something more subtle going on.) – Gil Hamilton Apr 07 '16 at 16:21
  • @GilHamilton I don't think that's working. I tried and it doesn't log anything. It logs forks and I don't seem to use forks. – Igor Sousa Apr 08 '16 at 08:24
  • On any recent linux, both process and thread creation invoke the `clone` system call. `strace` should work as I described. Here's a one liner that should prove it: `strace -o /tmp/thr -ff python -c 'import threading, signal; threading.Thread(target=signal.pause).start(); signal.pause()' &` (creates a second thread, then both threads call `pause`). Works with both python2 and python3. Look for trace files named `/tmp/thr.*` – Gil Hamilton Apr 08 '16 at 15:18
  • @GilHamilton Yeah, my bad. Found the trace files but I'm not knowledgeable enough in this to make heads or tails of what I'm seeing. I'll try something else. Thanks anyway. – Igor Sousa Apr 08 '16 at 16:00
  • Last line of the file is really all you'd need to look at. It should look like `pause(` [without close parenthesis or newline]. If not, then the thread isn't waiting in the `pause` system call. – Gil Hamilton Apr 08 '16 at 16:05
  • @GilHamilton Yeah, the trace files end in pause( – Igor Sousa Apr 11 '16 at 08:17
  • K. So you are getting multiple threads getting created. Possibly you just have all the incoming connections/thread creations getting (in effect) serialized by competition for processor resources? Also, you might want to read about the https://wiki.python.org/moin/GlobalInterpreterLock – Gil Hamilton Apr 11 '16 at 10:33
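
The "park every worker, then count them" check suggested in the comments can also be done from inside Python, without `strace`. The sketch below is only an illustration: `park_workers`, `parked_worker`, `live_worker_names`, and the `worker-` naming scheme are hypothetical, and a `threading.Event` stands in for `signal.pause()` so that the threads can be released again afterwards.

```python
import threading

# Hypothetical stand-in for signal.pause() / time.sleep(a_large_number):
# workers block on this event until it is set.
release = threading.Event()


def parked_worker():
    # Block here so the thread stays alive and can be counted.
    release.wait()


def park_workers(n):
    # Start n workers that do nothing but wait.
    workers = [
        threading.Thread(target=parked_worker, name=f"worker-{i}")
        for i in range(n)
    ]
    for t in workers:
        t.start()
    return workers


def live_worker_names():
    # threading.enumerate() lists every thread that is currently alive.
    return sorted(
        t.name for t in threading.enumerate() if t.name.startswith("worker-")
    )
```

If the server's workers were parked like this and 25 requests arrived at once, `live_worker_names()` would be expected to list 25 entries; a smaller number would suggest the requests are not all reaching the server at the same time.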

1 Answer

It turns out that this was a browser issue. Chrome (and maybe other modern browsers) only supports up to 6 simultaneous AJAX calls to the same host.

That lines up perfectly with the 7-thread 'cap' I was seeing: 6 worker threads plus the main thread.
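
A toy model of that effect, assuming a client that allows at most 6 requests in flight: the semaphore stands in for the browser's limit, `time.sleep` stands in for the query, and `simulate` and its parameters are hypothetical names. To keep it short, the "server" work runs directly in the client thread rather than over a socket.

```python
import threading
import time


def simulate(total_requests=25, client_cap=6, work_seconds=0.05):
    """Return the peak number of requests being handled at once."""
    cap = threading.Semaphore(client_cap)  # stands in for the browser limit
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle():
        # Stands in for one server worker running a query.
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(work_seconds)
        with lock:
            active -= 1

    def request():
        # The client waits for a free slot before "sending" the request.
        with cap:
            handle()

    threads = [threading.Thread(target=request) for _ in range(total_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

However many requests are queued, `simulate(25, 6)` never reports more than 6 in flight, so a server that starts one thread per connection would show at most 6 workers plus its main thread.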

Igor Sousa