
I'm trying to read data from a Teradata table into Python 3.6 and then insert that data into Oracle. It is basically an extract-and-load process, as I'm not transforming the data. The Teradata table has columns containing characters from many different languages. I'm using pyodbc to connect to Teradata and cx_Oracle to connect to Oracle. When I unpack the data in Python, it looks like

[('\x1a\x1a',)] and then, upon inserting the data into Oracle, the values come out garbled (the original post included a screenshot of the mangled characters).

I have set my environment variable as follows:

os.environ["NLS_LANG"] = '.AL32UTF8' and tried different decodings, but nothing worked. Any idea how to solve this?
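(Aside on the symptom: `\x1a` is the ASCII SUB, "substitute", control character, which the Teradata ODBC driver emits in place of any character it cannot represent in the session character set. The sketch below is an illustration of that substitution effect, not the driver itself; the `ascii_session_fetch` helper and the sample string are made up for demonstration.)

```python
# Illustrative sketch (an assumption about the driver's behavior, not its
# code): with an ASCII/LATIN session character set, each character the
# driver cannot translate is replaced by 0x1A (the SUB control character).
def ascii_session_fetch(value: str) -> str:
    """Mimic an ASCII session character set mangling Unicode data."""
    return ''.join(ch if ord(ch) < 128 else '\x1a' for ch in value)

rows = [(ascii_session_fetch('日本'),)]
print(rows)  # [('\x1a\x1a',)] -- the same shape as the output in the question
```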

Jit
  • I can't help you with the how, but you need to break this down into separate steps to figure out where the issue is occurring. Is Python reading the data correctly, is it trying to write it correctly, is the table in Oracle defined correctly? – Andrew Nov 25 '19 at 17:12
  • What session character set are you using for the connection to extract the data from Teradata? It would appear to be ODBC driver's default ASCII (i.e. server LATIN) and probably needs to be UTF8. You might also need "Unicode Pass Through" feature if there are characters stored as surrogate pairs (e.g. emoji). – Fred Nov 25 '19 at 17:52
  • 1
    You might also consider using *teradatasql* instead of pyodbc + Teradata ODBC driver (in which case the Teradata connection would necessarily be UTF8). – Fred Nov 25 '19 at 17:58
  • @Fred for some reason using UTF8 in pyodbc connection session did not work on NLS characters. However, using teradatasql instead of pyodbc solved the problem. Thanks a lot! – Jit Nov 25 '19 at 19:14
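For reference, a minimal sketch of the extract-and-load using teradatasql (whose session character set is always UTF8, as Fred notes) together with cx_Oracle. Host names, credentials, and the table/column names are placeholders, not values from the question; this needs live databases to actually run:

```python
import os
import teradatasql  # Teradata's native Python driver; session is always UTF8
import cx_Oracle

# NLS_LANG must be set before the Oracle client libraries are loaded.
os.environ["NLS_LANG"] = ".AL32UTF8"

# Placeholder connection details -- replace with real values.
with teradatasql.connect(host="td_host", user="td_user", password="td_pass") as td_con:
    with td_con.cursor() as td_cur:
        td_cur.execute("SELECT id, name FROM src_table")
        rows = td_cur.fetchall()

ora_con = cx_Oracle.connect("ora_user", "ora_pass", "ora_host/service")
ora_cur = ora_con.cursor()
ora_cur.executemany("INSERT INTO dst_table (id, name) VALUES (:1, :2)", rows)
ora_con.commit()
ora_con.close()
```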

0 Answers