Causes of crash doing matrix multiply in Python/mod_wsgi/apache app

Question

I am building a web app using Python 2.7, the Bottle micro-framework, and Apache (via mod_wsgi). This app has some RESTish endpoints, one of which results in a connection error in the browser (Firefox shows "The connection was reset" while Opera shows "Connection closed by remote server"). I have been pulling my hair out trying to debug this, as the service worked recently, and I am not able to get at the errors that appear to be in Python. So, I am hoping that if I walk through some specifics someone will be able to suggest next steps, as I am stuck...

I have tracked the offending line of code down to a matrix multiplication between two numpy.matrixlib.defmatrix.matrix objects.
This code works just fine locally, and works on the server when calling the functionality via a Python shell. The problem is only exposed when the code is called through mod_wsgi.

The problem appears to be memory-related. In debugging, I tested with fake data to remove any dependencies on the underlying database being used. Here is what works and what does not:

Works
-----
a = np.asmatrix(np.arange(140*30).reshape((140,30)))
b = np.asmatrix(np.arange(30).reshape((30,1)))
c = a * b

a = np.asmatrix(np.ones(140*30, dtype=np.float16).reshape((140,30)))
b = np.asmatrix(np.ones(30, dtype=np.float16).reshape((30,1)))
c = a * b

Fails
-----
a = np.asmatrix(np.ones(140*30, dtype=my_type).reshape((140,30)))
b = np.asmatrix(np.ones(30, dtype=my_type).reshape((30,1)))
c = a * b

where my_type is float32 or float64

When I say "fail", I mean that all I see is the connection error in the browser.
There are no errors in the apache log file. Note that the default type for the data in np.arange() is int32, and that works but float32 does not.
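
A possible reading of the dtype split, not something established in the question itself: NumPy hands matrix products to the linked BLAS library (MKL in the Anaconda build) only for single- and double-precision real and complex types, while integer and float16 products run through NumPy's own loops. That would mean only the float32 and float64 cases ever touch MKL. The lookup below is an illustrative table, not a NumPy API; `BLAS_BACKED` and `goes_through_blas` are hypothetical names.

```python
# dtypes whose matrix products NumPy delegates to the linked BLAS (MKL here).
# Illustrative table only; these names are not part of NumPy.
BLAS_BACKED = frozenset(["float32", "float64", "complex64", "complex128"])


def goes_through_blas(dtype_name):
    """Return True if a matrix product of this dtype is expected to call BLAS."""
    return dtype_name in BLAS_BACKED


# The four dtypes that appear in the working and failing cases above.
for name in ("int32", "float16", "float32", "float64"):
    route = "BLAS (MKL)" if goes_through_blas(name) else "NumPy internal loop"
    print("%-8s -> %s" % (name, route))
```

Read this way, the working cases are the ones that never call into MKL, which points at the library rather than at memory.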

As for debugging, I have tried following the advice in the excellent docs for mod_wsgi, namely Debugging and Application Issues. Specifically,

I have set LogLevel to debug and in my Python application's wsgi file set
```
sys.stdout=sys.stderr
```
and in the application conf file I set
```
WSGIRestrictStdout Off
WSGIRestrictStdin Off
```
Still, I am not seeing any Python-related errors in the log file. To be clear, I see errors in the log if I have a syntax error in my Python code, so I know Python-related errors are making it into the log file. But, I am not seeing any errors for this particular behavior.
In the Debugging docs there is a section on Python Interactive Debugger. The Debugger class code works as described when I wrap my application with it and call it from a Python shell. But, when going through mod_wsgi I have not been able to get at the pdb prompt to step through the code.
One big difference between this code working recently and not working is moving servers. We moved from one Linode-hosted system owned by my colleague to an identical system owned by me. The exception is that his Python installation was installed ad hoc, whereas I am using the AnacondaPro distribution, as it provides some nice extras for doing numerical work, namely, numpy and scipy linked with Intel's MKL libraries for parallelism. I have tried to make sure that the parallelized numerics are not the issue by setting
```
WSGIApplicationGroup %{GLOBAL}
```
in application's conf file (see the WSGIApplicationGroup section here) as well as setting
```
export MKL_SERIAL=yes
```
in ~/.bashrc to force the numerics to be single-threaded.
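
One caveat about the `~/.bashrc` approach: that file is read only by interactive bash shells for that user, so an Apache process started by the init system would most likely never see the variable. A minimal sketch of an alternative, assuming the variables are set at the top of the wsgi file before numpy is imported: `force_single_threaded_mkl` is a hypothetical helper, and `MKL_NUM_THREADS` is MKL's general thread-count variable, added here alongside the `MKL_SERIAL` setting from the question. Whether this alone would have avoided the crash is untested.

```python
import os


def force_single_threaded_mkl(environ=os.environ):
    """Ask MKL to run single-threaded, without overriding existing settings."""
    environ.setdefault("MKL_SERIAL", "yes")      # setting used in the question
    environ.setdefault("MKL_NUM_THREADS", "1")   # general MKL thread-count control
    return environ


# Call this before the first `import numpy`, because MKL reads its
# environment when the library is loaded.
force_single_threaded_mkl()
```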

None of this has made a difference or yielded any error messages I can act on. Again, the code works as expected from a Python shell, but going through mod_wsgi results in some buried error that I have not figured out how to surface. So, I am desperate for any guidance on how to interactively debug what is going on in the Python layer, or any ideas behind the odd matrix-multiply-and-data-types behavior.
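
For surfacing errors in the Python layer, one option is a small WSGI middleware that writes the traceback of any uncaught exception to `wsgi.errors` before re-raising it. This is a sketch under stated assumptions: `LogExceptions` is a hypothetical name, it only covers exceptions raised while the application callable runs, and it cannot report a crash in which a native library terminates the process without raising a Python exception.

```python
import sys
import traceback


class LogExceptions(object):
    """WSGI middleware that logs uncaught exceptions to wsgi.errors."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except Exception:
            # mod_wsgi routes wsgi.errors to the Apache error log.
            errors = environ.get("wsgi.errors", sys.stderr)
            errors.write(traceback.format_exc())
            errors.flush()
            raise
```

In a wsgi file this would be used as `application = LogExceptions(application)`.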

EDIT 1: I just tested one more setup variant that works perfectly fine: I use bottle's WSGIRefServer to run as localhost on the server. I then set up an SSH tunnel so that I could use my laptop's browser to test the API, and all the endpoints work as expected. So, one more piece of evidence that this is a mod_wsgi-related issue. I followed up on John Siu's comment and set the per-thread stack size to be smaller than the default 8MB:

      WSGIDaemonProcess my_app processes=4 threads=16 stack-size=524288

It was good to find old threads on the stack issue, but unfortunately the change did not resolve the problem.

EDIT 2: Regarding @John Siu's answer... The only big difference between our configurations is Apache. Here is what I have:

# dpkg -l | grep apache  
ii  apache2                 2.2.22-1ubuntu1.2    Apache HTTP Server metapackage
ii  apache2-mpm-worker      2.2.22-1ubuntu1.2    Apache HTTP Server - high speed threaded model
ii  apache2-utils           2.2.22-1ubuntu1.2    utility programs for webservers
ii  apache2.2-bin           2.2.22-1ubuntu1.2    Apache HTTP Server common binary files
ii  apache2.2-common        2.2.22-1ubuntu1.2    Apache HTTP Server common files
ii  libapache2-mod-wsgi     3.3-4build1          Python WSGI adapter module for Apache

EDIT 3 - LESSONS LEARNED: Much thanks to @John Siu for providing suggestions and helping me debug this. We may have discovered, or at least brought some light to, a tricky issue that I have to imagine others will encounter as they use Python to develop analytic web apps. That the issue took as long as it did to debug is certainly a function of me being fairly green with apache configuration, and fairly rusty in working in Linux. Here are some things I learned...

I thought I was capturing all of the relevant messages in my error.log and access.log files. As soon as I looked in /var/log/apache2/error.log, as @John Siu did, I saw the same MKL error message that had been there for many days. I had no idea this log file existed. Now I know :)
I suspected an MKL issue from the start. I thought by setting MKL_SERIAL=yes I would be turning off any issue related to a multi-threaded server dealing with multi-threaded BLAS. Obviously this was still not sufficient and using the prefork version of apache was required.
The actual command I needed to remove worker and instead use prefork was

apt-get install apache2-mpm-prefork

I also came across this command as a handy way to see which MPM you are using
(and thanks to @JohnSiu for the example of using dpkg): apache2 -V | grep 'MPM', which shows output like

Server MPM: Prefork
-D APACHE_MPM_DIR="server/mpm/prefork"

Sometimes a bounty is required.
I am amazed at the labor of love that is mod_wsgi. That being said, for my needs I am starting to think gunicorn might be a better fit.
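
The `apache2 -V | grep 'MPM'` check from the lessons above can also be scripted, for example in a deployment check. A minimal sketch: `parse_mpm` is a hypothetical helper, and the sample text is modeled on the output quoted above rather than captured from a live server.

```python
import re


def parse_mpm(apache_v_output):
    """Extract the MPM name (lowercased) from `apache2 -V` output, or None."""
    match = re.search(r"^Server MPM:\s*(\S+)", apache_v_output, re.MULTILINE)
    return match.group(1).lower() if match else None


# Sample modeled on the output quoted in the question.
sample = 'Server MPM: Prefork\n-D APACHE_MPM_DIR="server/mpm/prefork"\n'
mpm = parse_mpm(sample)
if mpm != "prefork":
    print("warning: MPM is %r, the MKL-linked numpy crashed under worker" % mpm)
```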

(1) Do a memory usage profile. [links here](http://stackoverflow.com/a/552810/1810391). (2) Is this a VPS or a physical machine? — John Siu, Jan 24 '13 at 18:09
@JohnSiu (1) Thanks for the link on memory profiling. (2) VPS — Josh Hemann, Jan 24 '13 at 21:27
(3) Check out [mod_wsgi Configuration Directives](http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives) and search for `stack-size=nnn (in byte)`. It actually mentions that a VPS environment may need to lower the number; the default is 8M. But it is a puzzle that you can complete a run in the py shell, unless it was really low on memory. — John Siu, Jan 24 '13 at 22:15
Try turning on [numpy debug](http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html). — John Siu, Jan 26 '13 at 19:37
Put the specific piece of code with static data in a separate py file and execute it through apache. Add debug information and output it to files. — John Siu, Jan 26 '13 at 19:48
@JohnSiu Thanks for the continued advice. I added `np.seterr(all='print')` to several spots in my code, and then moved those matrix multiplication lines to a separate module. I can replicate the success-vs-failure I saw before by changing the data types, but I still don't see anything in the log file. — Josh Hemann, Jan 26 '13 at 22:54
(1) Is it possible to post the testing code with data? I want to do some testing. (2) Are you using Anaconda CE or Anaconda? — John Siu, Jan 26 '13 at 23:01
@JohnSiu (1) I'll see if I can create a boiled down code sample that reproduces the problem and can be run by someone else. (2) AnacondaPro 1.2.1 — Josh Hemann, Jan 26 '13 at 23:38
@JohnSiu Yes, which comes with a few extras. It is not clear to me if the CE version also includes MKL-linked numpy and scipy libraries — Josh Hemann, Jan 27 '13 at 00:56
Is it from NumbaPro? That comes with the paid version only, or the trial version :p — John Siu, Jan 27 '13 at 01:06
(1) I added an explanation in the answer of what I did to get around mod_wsgi module loading. (2) The major difference between your Apache2 and mine is `apache2-mpm-worker` (yours) vs `apache2-mpm-prefork` (mine). I will try to test later tonight and see if it makes a difference. — John Siu, Jan 28 '13 at 00:25
Saw your update. I thought you were referring to the apache error.log earlier in the question, else I would have asked about it. Good thing is we still fixed it :D — John Siu, Jan 28 '13 at 15:26

John Siu · Accepted Answer · 2013-01-28T01:57:59.413

MKL Loader failed to load with apache-mpm-worker

Switch Apache to use mpm-worker (to reproduce the failure)

# dpkg -l|grep apache
ii  apache2                  2.2.22-1ubuntu1.2    Apache HTTP Server metapackage
ii  apache2-mpm-worker       2.2.22-1ubuntu1.2    Apache HTTP Server - high speed threaded model
ii  apache2-utils            2.2.22-1ubuntu1.2    utility programs for webservers
ii  apache2.2-bin            2.2.22-1ubuntu1.2    Apache HTTP Server common binary files
ii  apache2.2-common         2.2.22-1ubuntu1.2    Apache HTTP Server common files
ii  libapache2-mod-passenger 2.2.11debian-2       Rails and Rack support for Apache2
ii  libapache2-mod-perl2     2.0.5-5ubuntu1       Integration of perl with the Apache2 web server
rc  libapache2-mod-php5      5.3.10-1ubuntu3.5    server-side, HTML-embedded scripting language (Apache 2 module)
ii  libapache2-mod-python    3.3.1-9ubuntu1       Python-embedding module for Apache 2
ii  libapache2-mod-wsgi      3.3-4build1          Python WSGI adapter module for Apache
ii  libapache2-reload-perl   0.11-2               module for reloading Perl modules when changed on disk

/var/log/apache2/error.log

Restarting apache2

[Sun Jan 27 20:47:26 2013] [notice] Apache/2.2.22 (Ubuntu) mod_wsgi/3.3 Python/2.7.3 configured -- resuming normal operations

Accessing mymatrix app (Using Anaconda NumPY)

MKL FATAL ERROR: Cannot load in MKL Loader.

With the Anaconda module path commented out, so that the default NumPy module is used, the mymatrix app loads correctly.

Anaconda's MKL build seems to be incompatible with the apache2-mpm-worker threading model.
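
Messages written directly by a native library, such as the MKL line above, arrive in the main error.log without Apache's `[timestamp] [level]` prefix, which makes them easy to miss. A minimal sketch of a filter that surfaces such lines, assuming the Apache 2.2 prefix format shown above; `raw_lines` is a hypothetical helper, and the sample lines are the two quoted in this answer.

```python
import re

# Matches the "[timestamp] [level] " prefix Apache puts on its own messages.
APACHE_PREFIX = re.compile(r"^\[[^\]]+\] \[[\w:]+\] ")


def raw_lines(log_lines):
    """Return non-empty log lines that lack Apache's standard prefix."""
    return [line.rstrip("\n") for line in log_lines
            if line.strip() and not APACHE_PREFIX.match(line)]


sample = [
    "[Sun Jan 27 20:47:26 2013] [notice] Apache/2.2.22 (Ubuntu) mod_wsgi/3.3 "
    "Python/2.7.3 configured -- resuming normal operations\n",
    "MKL FATAL ERROR: Cannot load in MKL Loader.\n",
]
for line in raw_lines(sample):
    print(line)
```

On a live system the input would be the open file `/var/log/apache2/error.log` instead of `sample`.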

Solution

Switch to apache2-mpm-prefork

apt-get install apache2-mpm-prefork

mod_wsgi

mod_wsgi is compiled to load Python from the system path, that is, the default official version, which in turn uses the default module path to load libraries.

To ensure the Python application uses the Anaconda modules instead of the default ones, the Anaconda module path has to be put in front of the default module path.

There are multiple ways to achieve that, including recompiling mod_wsgi, modifying the system Python configuration file, replacing the system Python with the Anaconda version, etc. But they can all become very messy if mistakes are made.

mod_wsgi.conf does allow one to add additional module paths, but those will be searched after the default path. We want the Anaconda modules to be used (take precedence) if they exist.

The easiest and cleanest way to do it is to update sys.path within the application. This has the least impact on the host environment and is also more portable across machines.

Obtain Anaconda module path

Run Anaconda python shell and use sys.path

# /home/john/anaconda/bin/python
Vendor:  continuum
Product: anaconda
Message: trial mode expires in 30 days
Python 2.7.3 |Anaconda 1.3.0 (64-bit)| (default, Jan 22 2013, 14:14:25) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
sys.path=['', '/home/john/anaconda/lib/python27.zip', '/home/john/anaconda/lib/python2.7', '/home/john/anaconda/lib/python2.7/plat-linux2', '/home/john/anaconda/lib/python2.7/lib-tk', '/home/john/anaconda/lib/python2.7/lib-old', '/home/john/anaconda/lib/python2.7/lib-dynload', '/home/john/anaconda/lib/python2.7/site-packages', '/home/john/anaconda/lib/python2.7/site-packages/PIL', '/home/john/anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']

Put above path in front of default module path in application

import sys

# Anaconda Module Path
PathAnaconda=['', '/home/john/anaconda/lib/python27.zip', '/home/john/anaconda/lib/python2.7', '/home/john/anaconda/lib/python2.7/plat-linux2', '/home/john/anaconda/lib/python2.7/lib-tk', '/home/john/anaconda/lib/python2.7/lib-old', '/home/john/anaconda/lib/python2.7/lib-dynload', '/home/john/anaconda/lib/python2.7/site-packages', '/home/john/anaconda/lib/python2.7/site-packages/PIL', '/home/john/anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']

# Put Anaconda module Path before default module path
sys.path[:0]=PathAnaconda
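
A variation on the slice assignment above, sketched under the assumption that the wsgi script may be executed more than once in the same process (for example after a reload), in which case a plain prepend would stack duplicate entries. `prepend_paths` is a hypothetical helper, not part of mod_wsgi or the standard library.

```python
import sys


def prepend_paths(new_paths, path=None):
    """Put new_paths at the front of path (sys.path by default), without duplicates."""
    path = sys.path if path is None else path
    # Drop any existing copies so repeated calls do not grow the list.
    remaining = [entry for entry in path if entry not in new_paths]
    path[:] = list(new_paths) + remaining
    return path


# Usage with a shortened, illustrative list in place of the full PathAnaconda.
demo = ["/usr/lib/python2.7", "/usr/lib/python2.7/dist-packages"]
prepend_paths(["/home/john/anaconda/lib/python2.7/site-packages"], demo)
print(demo)
```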

The following setup and code run successfully

System

# /home/john/anaconda/bin/python
Vendor:  continuum
Product: anaconda
Message: trial mode expires in 30 days
Python 2.7.3 |Anaconda 1.3.0 (64-bit)| (default, Jan 22 2013, 14:14:25) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

# uname -a
Linux U64D211.example.com 3.2.0-36-generic #57-Ubuntu SMP Tue Jan 8 21:44:52 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

# dpkg -l|grep apache
ii  apache2                  2.2.22-1ubuntu1.2    Apache HTTP Server metapackage
ii  apache2-mpm-prefork      2.2.22-1ubuntu1.2    Apache HTTP Server - traditional non-threaded model
ii  apache2-utils            2.2.22-1ubuntu1.2    utility programs for webservers
ii  apache2.2-bin            2.2.22-1ubuntu1.2    Apache HTTP Server common binary files
ii  apache2.2-common         2.2.22-1ubuntu1.2    Apache HTTP Server common files
ii  libapache2-mod-passenger 2.2.11debian-2       Rails and Rack support for Apache2
ii  libapache2-mod-perl2     2.0.5-5ubuntu1       Integration of perl with the Apache2 web server
ii  libapache2-mod-php5      5.3.10-1ubuntu3.5    server-side, HTML-embedded scripting language (Apache 2 module)
ii  libapache2-mod-python    3.3.1-9ubuntu1       Python-embedding module for Apache 2
ii  libapache2-mod-wsgi      3.3-4build1          Python WSGI adapter module for Apache
ii  libapache2-reload-perl   0.11-2               module for reloading Perl modules when changed on disk

# dpkg -l|grep python2.7
ii  python2.7                2.7.3-0ubuntu3.1     Interactive high-level object-oriented language (version 2.7)

Apache Config

/etc/apache2/mods-enabled/wsgi.conf is empty (it contains only comments, no customization)

/etc/apache2/sites-enabled/default

<VirtualHost *:80>

    DocumentRoot /var/www
    <Directory />
        Options FollowSymLinks
        AllowOverride all
    </Directory>

    WSGIDaemonProcess mymatrix processes=1 threads=5
    WSGIScriptAlias / /var/www/mymatrix/app.wsgi

    <Directory /var/www/mymatrix>
        Order deny,allow
        Allow from all
    </Directory>

    <Directory /var/www/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride all
        Order allow,deny
        allow from all
    </Directory>

    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
        AllowOverride None
        Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
        Order allow,deny
        Allow from all
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined

</VirtualHost>

/var/www/mymatrix/app.wsgi

import sys

Output =  "<pre>" + "\n"
Output += "Default Module Path : " + str(sys.path) + "\n\n"

# Anaconda Module Path
PathAnaconda=['', '/home/john/anaconda/lib/python27.zip', '/home/john/anaconda/lib/python2.7', '/home/john/anaconda/lib/python2.7/plat-linux2', '/home/john/anaconda/lib/python2.7/lib-tk', '/home/john/anaconda/lib/python2.7/lib-old', '/home/john/anaconda/lib/python2.7/lib-dynload', '/home/john/anaconda/lib/python2.7/site-packages', '/home/john/anaconda/lib/python2.7/site-packages/PIL', '/home/john/anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']

Output += "Anaconda Module Path: " + str(PathAnaconda) + "\n\n"

# Put Anaconda module Path before default module path
sys.path[:0]=PathAnaconda

# Check Effective Module Path
Output += "New sys.path: " + str(sys.path) + "\n\n"

import bottle
bt=bottle
application = bt.default_app()

import numpy
np=numpy
np.set_printoptions(threshold=sys.maxsize)  # print arrays in full; newer NumPy rejects threshold=nan

# Check we are using Anaconda NumPY
Output += "NumPY Path: " + str(np.__file__) + "\n\n"

def mymatrix(my_type):
    a = np.asmatrix(np.ones(140*30, dtype=my_type).reshape((140,30)))
    b = np.asmatrix(np.ones(30, dtype=my_type).reshape((30,1)))
    c = a * b

    Output = str(my_type)[1:-1] + "\n"
    Output += "a\n" + str(a) + "\n"
    Output += "b\n" + str(b) + "\n"
    Output += "c\n" + str(c) + "\n"

    return Output

Output += mymatrix(np.float16) + "\n"
Output += mymatrix(np.float32) + "\n"
Output += mymatrix(np.float64) + "\n"

Output += "</pre>"

@bt.route('/mymatrix')
def PrintOutput():
    return Output
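
The "NumPY Path" line in the script above relies on reading the output by eye. A minimal sketch of turning that check into a boolean test that could guard startup: `loaded_from` is a hypothetical helper, and the Anaconda prefix in the usage note is the path from this answer, not a general default.

```python
import importlib
import os


def loaded_from(module_name, expected_prefix):
    """Return True if the module's file lives under expected_prefix."""
    module = importlib.import_module(module_name)
    module_file = getattr(module, "__file__", None) or ""
    path = os.path.realpath(module_file)
    prefix = os.path.realpath(expected_prefix)
    return path.startswith(prefix + os.sep)
```

In app.wsgi this could be called as `loaded_from("numpy", "/home/john/anaconda")` right after the sys.path change, raising an error if it returns False.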

Output(pastebin)

HTTP Output Link

LOL, thanks for the bounty. glad that your problem is solved. Please consider accepting and up vote the answer too. :D — John Siu, Jan 28 '13 at 03:07
The irony is that because of the bounty, I don't have enough street cred left over to upvote your answer — Josh Hemann, Jan 28 '13 at 19:11