I'm facing an encoding issue in my Python code when querying data stored in SQL Server 2005.
Because I was unable to compile PyMSSQL-2.0.0b1, I'm using this piece of code instead. I am able to run some SELECT statements, but now I'm stuck because I do not know what encoding SQLCMD is outputting to me :(
(The tables contain text in European languages, so I have to deal with accented characters and the encodings that go with them.)
For example:
- When I read it (SELECT) from MS SQL Server Management Studio, I get this country name: 'Ceská republika' (note that the first a has an acute accent).
- When I query it with SQLCMD from the command line (PowerShell in Windows 7), it is still OK: I can see "Cesk{a with acute}".
Now, when using Python with the os.popen trick from the recipe, that is, with this command line:
sqlcmd -U adminname -P password -S servername -d dbname /w 8192 -u
I get this string: 'Cesk\xa0 republika'
Notice the \xa0: I do not know what encoding it comes from, or how to get from this \xa0 to {a with acute}.
If I test in Python, the Unicode value I should get is '\xe1':
>>> unicode('Cesk\xa0 republika')
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    unicode('Cesk\xa0 republika')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 4: ordinal not in range(128)
>>> unicode_a_with_acute = u'\N{LATIN SMALL LETTER A WITH ACUTE}'
>>> unicode_a_with_acute
u'\xe1'
>>> print unicode_a_with_acute
á
>>> print unicode_a_with_acute.encode('cp1252')
á
>>> unicode_a_with_acute.encode('cp1252')
'\xe1'
>>> print 'Cesk\xa0 republika'.decode('cp1252')
Cesk republika
>>> print 'Cesk\xa0 republika'.decode('utf8')
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    print 'Cesk\xa0 republika'.decode('utf8')
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 4: invalid start byte
So what is SQLCMD giving me? How can I force it and/or os.popen and the rest to make sure I get UTF-8 that Python can understand?
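One way to narrow this down is to decode the raw bytes with a few candidate code pages and see which ones turn \xa0 into {a with acute}. A minimal sketch, assuming the byte comes from one of the usual Windows ANSI or console (OEM) code pages; the candidate list and the function name matching_codecs are my own choices for illustration, not something SQLCMD reports:

```python
from __future__ import print_function

RAW = b'Cesk\xa0 republika'   # bytes as they come back through os.popen
EXPECTED = u'\xe1'            # LATIN SMALL LETTER A WITH ACUTE

# Assumption: a hand-picked list of common Windows ANSI and OEM code pages.
CANDIDATES = ['cp1252', 'cp1250', 'latin-1', 'cp437', 'cp850', 'cp852']


def matching_codecs(raw, expected, candidates):
    """Return the codecs under which `raw` decodes to text containing `expected`."""
    matches = []
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # this codec cannot decode the bytes at all
        if expected in text:
            matches.append(name)
    return matches


print(matching_codecs(RAW, EXPECTED, CANDIDATES))
```

For this particular string, only the OEM code pages in the list (cp437, cp850, cp852) map 0xa0 to {a with acute}, so the single byte points at a console code page but does not say which one; more accented characters from the table would be needed to tell them apart.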
(Note: I have tried both with and without the trailing -u on the SQLCMD command passed to os.popen, which is supposed to ask SQLCMD to answer in Unicode, with no effect. I have also tried feeding it the SELECT statement as a Python unicode string, with no more success:
sqlstr = unicode('select * from table_pays where country_code="CZ"')
cu = c.cursor
lst = cu.execute(sqlstr)
rows = cu.fetchall()
for x in rows:
    print x
( 'CZ ', 'Cesk\xa0 republika ')
)
Another point: from what I found by googling "sqlcmd.exe", there are also these parameters that might help:
[ -f <codepage> | i:<codepage>[,o:<codepage>] ]
but I was unable to specify the right one, as I do not know what the possible values are. By the way, using (or not using):
[ -u unicode output]
did not help me either...
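A sketch of how the -f option could be tried from Python, under assumptions I have not verified on this setup: that the code page value is a numeric Windows code page identifier (for example 850, 1252 or 65001), and that this SQLCMD version honours it for output read through a pipe. The helper names are hypothetical, and the server, database and credentials are the placeholders from the command line above:

```python
from __future__ import print_function
import subprocess


def codec_for(codepage):
    """Map a numeric Windows code page to a Python codec name."""
    # 65001 is the Windows identifier for UTF-8; other pages use Python's cpNNN names.
    return 'utf-8' if codepage == 65001 else 'cp%d' % codepage


def build_sqlcmd_args(query, codepage):
    """Build the sqlcmd argument list with an explicit code page."""
    return [
        'sqlcmd',
        '-U', 'adminname', '-P', 'password',
        '-S', 'servername', '-d', 'dbname',
        '-w', '8192',
        '-f', str(codepage),   # assumption: numeric code page, e.g. 1252
        '-Q', query,           # run the query, then exit
    ]


def run_query(query, codepage=1252):
    """Run the query and decode the raw output with the matching codec."""
    raw = subprocess.check_output(build_sqlcmd_args(query, codepage))
    return raw.decode(codec_for(codepage))
```

The point of the design is to read raw bytes and decode them explicitly with the same code page that was requested, instead of relying on whatever default the console happens to use.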