SCENARIO:
I'm using Pandas.to_sql() with parameter dtype={'COLUMN': NVARCHAR} to upload COLUMN, which contains text with emojis, to a MSSQL DB via FreeTDS. NVARCHAR is imported with from sqlalchemy.types import NVARCHAR. The COLUMN is fed as a DataFrame from an Excel file.
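For reference, here is a minimal sketch of the kind of call I mean. It is not my real setup: an in-memory SQLite engine stands in for the MSSQL/FreeTDS connection, and the table name and sample row are made up for illustration (the real data comes from an Excel file).

```python
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import NVARCHAR

# Assumption: in-memory SQLite stands in for the real MSSQL/FreeTDS engine.
engine = create_engine("sqlite://")

# Assumption: hypothetical sample row with two emojis (U+1F600).
df = pd.DataFrame({"COLUMN": ["\U0001F600\U0001F600 DUMMY TEXT abcd"]})

# Same call shape as in the scenario: force NVARCHAR for COLUMN.
df.to_sql(
    "demo_table",  # hypothetical table name
    engine,
    index=False,
    if_exists="replace",
    dtype={"COLUMN": NVARCHAR},
)

# Read the table back so the stored value can be compared with the input.
round_trip = pd.read_sql_table("demo_table", engine)
```

Comparing round_trip["COLUMN"][0] with df["COLUMN"][0] against the real MSSQL connection is how I see the truncation; the SQLite stand-in is only there to make the sketch self-contained.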
PROBLEM:
The strange thing is that for each emoji I include, one character at the end of the NVARCHAR column disappears.
I know that NVARCHAR has a max length of 4000, but how could it be reaching that limit with a text as short as:
" DUMMY TEXT The following four letters will be cut: abcd"
After upload:
" DUMMY TEXT The following four letters will be cut: "
I noticed that there is also some extra spacing between emojis after upload.
Is this problem caused by the emojis, or should we be using another dtype?
Thanks,
Doyuno
PS: The length of DUMMY TEXT doesn't seem to affect how many characters are truncated at the end of the sentence. I've tried varying lengths of DUMMY TEXT, and it always truncates exactly as many letters as there are emojis.
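
To check whether the emojis themselves could account for the one-character-per-emoji pattern, here is a minimal sketch comparing the number of code points Python reports with the number of UTF-16 code units for the same string. The helper name and the sample string are my own, and whether this mismatch is what actually causes the truncation in the driver is only a guess.

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte-order mark.
    return len(text.encode("utf-16-le")) // 2


# Assumption: hypothetical sample with two emojis (U+1F600), not the real data.
sample = "\U0001F600\U0001F600 abcd"

code_points = len(sample)               # Python counts code points
code_units = utf16_code_units(sample)   # UTF-16 counts 16-bit units

# Emojis outside the Basic Multilingual Plane need two code units each,
# so the difference equals the number of such emojis in the string.
extra_units = code_units - code_points
print(code_points, code_units, extra_units)
```

If something along the way sizes the value by code points while the column stores UTF-16 code units, the value would come up short by one unit per emoji, which would match what I'm seeing.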