0

I want to read a table from a query script in my computer.
I found the command to be simple:

with open(sql_file, 'r') as opened:
  query = alq.text(opened.read())

The first issue, has to do with encoding:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 
in position 7963: character maps to <undefined>

So we do what we do, search for an answer to add argument into command open(sql_file, encoding='utf8') and execute the query.

engine.execute(query)

And now the error is more tricky:

ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] 
[Microsoft][ODBC Driver 11 for SQL Server][SQL Server]
Incorrect syntax near '\ufeff'. (102) (SQLExecDirectW)") 
[SQL: "\ufeffWITH\n-- Some comment about the query\nQC_SCORE AS 
(\n  SELECT some_table.id\n  , other_table.column\n 

That is to not be reading the file correctly. Upon inspection I found that there is \ufeff at the beginning of the file and that the line breaks \n are not being translated.

Does anyone know how to go about it?

Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
Diego-MX
  • 2,279
  • 2
  • 20
  • 35
  • Can you execute the entire raw query with engine.execute? You have to manually handle the unicode chars like `ufeffWITH` here – mad_ Mar 07 '19 at 19:37
  • 1
    FEFF at the beginning of a file looks like BOM, or byte order mark. UTF-8 with BOM is a Microsoft bastardisation, since UTF-8 is an 8-bit codec and does not need BOM. Try using the utf-8-sig codec: https://stackoverflow.com/questions/13590749/reading-unicode-file-data-with-bom-chars-in-python/13591421, https://docs.python.org/3/library/codecs.html#encodings-and-unicode – Ilja Everilä Mar 07 '19 at 19:50
  • ¡¡¡Ahhh!!! Yes. Thanks @IljaEverilä. I spent hours trying to figure it out. – Diego-MX Mar 07 '19 at 22:38

1 Answers1

0

I had a similar issue while trying to parse an XML file that had special unicode characters. What I found as a viable workaround was just reading the file as a string, then using the string.replace() method. Something along this line:

text_query = r"\ufeff + "actual SQL code" text_query = text_query.replace(r'\ufeff', '').replace(r'\n', '')

RockAndRoleCoder
  • 320
  • 2
  • 10