Large select query hangs using Python and unixODBC connecting to SQL Server

Question

We have to run a number of large select queries against SQL Server from CentOS 6 boxes. On two servers, when using a pretty bland select statement, the rate of data returned from SQL Server starts fast before slowing down and receiving bursts of data every ~30 seconds. On two other servers, the query runs consistently and finishes in less than 2 minutes. The configuration on all of these boxes for unixODBC and msodbcsql is exactly the same.

Example code that is causing the problem:

import datetime
import pyodbc

db_connection_string = '<connection string info>'
print(datetime.datetime.now(), 'Connecting to db...')
db_connection = pyodbc.connect(db_connection_string, autocommit=True)
print(datetime.datetime.now(), '...connected')

cursor = db_connection.cursor()
try:
    sql_statement = 'SELECT data FROM table;'
    num = 0
    print(datetime.datetime.now(), 'Iterating over cursor...')
    for row in cursor.execute(sql_statement):
        num += 1
        if num % 100000 == 0:
            print(datetime.datetime.now(), num)
    print(datetime.datetime.now(), num)
    print(datetime.datetime.now(), '...iteration completed')
finally:
    cursor.close()
    db_connection.close()

This uses unixODBC 2.3.0 and msodbc 11.0.2270.0. The servers are CentOS 6.5/6.6 with Python 3.4.

We have tried:

Monitoring system resources, no spikes in memory or CPU usage except when data is being processed
strace on the process also only shows changes when data is being processed
both SQL Server and the Python server seem to be hanging and waiting for each other to do something
Monitoring network traffic also shows no spikes or packages being dropped or any errors
scp and sftp'ing files to the servers and between the servers has no problem
Connecting to a different database type with the same query has no issues
Rewriting the code in Java and running it had the same issues on the problem servers, but ran fine on the good servers

Any other ideas to help track down this issue would be appreciated.

I'm experiencing issues - odd ones - with msodbc 11.0.2270.0 on CentOS 6, especially with Django 1.7.5 (yes, I know there was a security release for Django today, but my testing has been on 1.7.5). I've eliminated some threading issues introduced by Django 1.7, but run into issues with SQLRowCount and SQLParam compatibility. I wonder if they might be related to the msodbc driver getting stuck on Django 1.7's new multithreading features, but have no definitive proof. Either way, continue to provide what you find, and perhaps we can triage the problem. — FlipperPA, Mar 10 '15 at 00:12
BTW, if anyone can find a link to the available options for the msodbc Red Hat driver for odbcinst.ini, a link would be appreciated. I can't find the options such as "Threading" and "UsageCount" in any documentation via Google, and those results I do find don't include any details, just configuration examples. — FlipperPA, Mar 10 '15 at 00:14
This is the closest I could find, but it appears to be more of the same with example configuration parameters (http://stackoverflow.com/questions/18531010/how-to-configure-microsoft%C2%AE-odbc-driver-11-for-sql-server%C2%AE-on-redhat-linux-with). They probably wrote the driver to auto-detect those options. May not be very helpful, I don't use MS stuff much myself. — Jjensen.mike, Mar 10 '15 at 20:23

score 0 · Accepted Answer · answered Mar 10 '15 at 20:14

The issue appears to have been driver interactions on SQL Server not playing well with CentOS. Changing driver settings preventing the queries from hanging, although results are still returned up to 4x faster on the best performing server when compared to the others. The lesson here seems to be that weird driver interactions will probably come up with trying to use a combination of MS and Linux systems.

Large select query hangs using Python and unixODBC connecting to SQL Server

1 Answers1