8

I'm trying to pull out a list of table names from a MySQL database. The pertinent parts of the code are as follows:

conn = pyodbc.connect('...')
cursor = conn.cursor()
for table in cursor.tables():
    print table.table_name

For every table it prints a bunch of gibberish characters (boxes and question marks in diamonds). Using repr(table.table_name) it prints:

u'\U00500041\U004c0050\U00430049\U00540041\U004f0049'

for the table named "APPLICATION".

If you treat each 32 bit character as two 16 bit characters you get the string "PALPCITAOI". Swapping pairs of characters (1 and 2, 3 and 4, etc.) it produces "APPLICATIO" (missing the "N"). Presumably, it swapped N and the null character, resulting in the missing N (assuming null-terminated strings).

How can I get the table names out properly?

This is running under Python 2.4, PyODBC 2.1, and MySQL 5.0.22.

Jonathan
  • 1,050
  • 1
  • 12
  • 36
  • 4
    Looks like [this bug](http://code.google.com/p/pyodbc/issues/detail?id=78&can=1&q=UTF-16). Try using a newer version of PyODBC. – Thomas K Mar 28 '12 at 16:28
  • I have not been able to find a newer 32 bit version of PyODBC for Linux (CENTOS 5.1), but I found the source code here [http://code.google.com/p/pyodbc/downloads/list](http://code.google.com/p/pyodbc/downloads/list). Does anyone know where I can find a later version of a 32 bit RPM for PyODBC? – Jonathan Mar 28 '12 at 21:31
  • @Thomas K, I tried adding CHARSET=UTF8 to my connection string like: `cnxn = pyodbc.connect('DSN=localhost;CHARSET=UTF8')` and that worked on one of the problem areas. Thank you for your assistance! – Jonathan Mar 28 '12 at 21:36

2 Answers2

5

Adding CHARSET=UTF8 to the connection string worked and I now have the application up and running. For example:

change

cnxn = pyodbc.connect('DSN=localhost')

to

cnxn = pyodbc.connect('DSN=localhost;CHARSET=UTF8')

Thank you Thomas for pointing me in the right direction. Upgrading to a later version of PyODBC may also work.

Jonathan
  • 1,050
  • 1
  • 12
  • 36
2

This is a known issue that was fixed in pyodbc 2.1.8:

https://code.google.com/p/pyodbc/issues/detail?id=78&can=1&q=UTF-16

Unfortunately for Ubuntu LTS users the version available for Precise Pangolin is 2.1.7 where this is issue still shows itself. If you want to stick with LTS I've backported pyodbc 3.0.6 - you can find it here:

https://launchpad.net/~gazoakley/+archive/precise-backports

Gareth Oakley
  • 1,020
  • 1
  • 12
  • 34