3

We're running PHP 5.3.8 with APC 3.1.9 and use both the opcode cache and the user cache. We are currently seeing regular crashes as the cache size grows. It looks like some kind of memory leak in APC, because the sizes reported under Cached Files and Cached Variables don't add up to the total cache size: the total is around 1GB, while the two values together come to roughly 400MB.
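A minimal sketch of how that gap could be measured from PHP instead of being read off the status page. It assumes the array keys read by the apc.php script bundled with APC 3.1.x (num_seg, seg_size, avail_mem, mem_size). apc_memory_gap() is a hypothetical helper name, not part of APC.

```php
<?php
// Hypothetical helper (not part of APC): compares the shared memory APC
// reports as used with the memory attributed to cached files and variables.
// Assumption: key names match those read by the apc.php bundled with APC 3.1.x.
function apc_memory_gap()
{
    $sma   = apc_sma_info(true);             // limited: skip per-block details
    $files = apc_cache_info('opcode', true); // opcode cache ("Cached Files") summary
    $user  = apc_cache_info('user', true);   // user cache ("Cached Variables") summary

    $total     = $sma['num_seg'] * $sma['seg_size'];
    $used      = $total - $sma['avail_mem'];
    $accounted = $files['mem_size'] + $user['mem_size'];

    return array(
        'used'        => $used,
        'accounted'   => $accounted,
        'unaccounted' => $used - $accounted,
    );
}
```

Logging the 'unaccounted' value from a request handled by the web server would show whether the gap grows over time. Some difference is presumably normal bookkeeping overhead, so the trend is likely more telling than any single reading.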

This is what the message log states:
Dec 19 10:17:54 quarto kernel: pid 97940 (httpd), uid 1004: exited on signal 11 (core dumped)

So I inspected the coredump with gdb:

(gdb) backtrace
#0  0x000000080202cc3c in zend_hash_index_find (ht=0x805251ef0, h=34490315800, pData=0x7fffffffc378) at /usr/local/directadmin/custombuild/php-5.3.8/Zend/zend_hash.c:983
#1  0x0000000805132637 in my_copy_zval () from /usr/local/lib/php/extensions/no-debug-non-zts-20090626/apc.so
#2  0x00000008051322fb in my_copy_zval_ptr () from /usr/local/lib/php/extensions/no-debug-non-zts-20090626/apc.so
#3  0x0000000805133aea in my_copy_hashtable_ex () from /usr/local/lib/php/extensions/no-debug-non-zts-20090626/apc.so

Line 983 in zend_hash.c is the statement p = ht->arBuckets[nIndex];, which looks up a bucket in a hashtable that apparently no longer exists. This more or less supports my theory of a memory leak somewhere, with the APC cache filling up with invalid data...

Anyone got a clue?

Bart De Vos
Dylan

2 Answers

4

After replacing every apc_store call with apc_add, the problem with 'zombie' memory disappeared. It probably has something to do with a race condition between apc_fetch and apc_store, as discussed at http://notmysock.org/blog/php/user-cache-timebomb.html.

It's advised to use apc_add instead, especially when these calls are user generated.
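A minimal sketch of what that change could look like in application code, assuming a typical fetch-or-compute pattern. apc_fetch and apc_add are the real APC functions. cached_or_compute(), its parameters and the default TTL are hypothetical names and values chosen for illustration.

```php
<?php
// Hypothetical helper: return a cached value, computing and caching it on a miss.
// It uses apc_add() rather than apc_store(), so a request that loses the race
// to populate the key leaves the existing entry alone instead of overwriting it.
function cached_or_compute($key, $compute, $ttl = 300)
{
    $hit   = false;
    $value = apc_fetch($key, $hit);
    if ($hit) {
        return $value;
    }

    $value = call_user_func($compute);

    // apc_add() only writes when the key is not already present and
    // returns false otherwise; the computed value is returned either way.
    apc_add($key, $value, $ttl);

    return $value;
}
```

A call might look like cached_or_compute('report_42', function () { return build_report(42); }), where build_report is a placeholder for whatever expensive work the application does.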

Dylan
  • 1
    Added +1. After deploying the same fix to our applications, average fragmentation went down from 30-35% to ~0.7% – Oneiroi Jan 12 '12 at 09:45
1

We are seeing the same issue here with random memory leaks. With the information you have given, I would raise a bug; from there you can wait for a fix, fix the code yourself, or work around it.

Also note that I have only seen this occur with the user cache, not the opcode cache. I have worked around it here by using memcache instead (if you are using Zend Framework, the change to the app is fairly easy).

Oneiroi
  • 1
    I already did: https://bugs.php.net/bug.php?id=60561. I'll just have to wait I guess. Maybe I should start looking into memcache... – Dylan Dec 19 '11 at 13:24
  • @Dylan thanks for the link; I have subscribed to and voted on that bug. – Oneiroi Dec 19 '11 at 16:52
  • Might have found a possible cause of the issue: see http://notmysock.org/blog/php/user-cache-timebomb.html. It discusses a possible race condition between apc_fetch and apc_store, and advises using apc_add instead and never using apc_store for user generated calls. I have changed my code and will report back if anything changes. – Dylan Dec 20 '11 at 10:37
  • @Dylan thanks, please let us know if that is a valid workaround. – Oneiroi Dec 20 '11 at 10:43
  • It seems like the problems are gone! No more zombie cache or signal 11 exits on the httpd process. If things are still good next week I'll answer my own question. – Dylan Dec 22 '11 at 09:14