1

I'm having problems trying to insert strings containing UTF-8 encoded Chinese characters and punctuations into a SQL Server 2008 table (default installation) from my Delphi 7 application using Zeosdb native SQL Server library.

I remembered in the past I had problems inserting UTF8 string into SQL Server even using PHP and other methods so I believe that this problem is not unique to Zeosdb.

It doesn't happen all the time, some UTF8 encoded strings can get inserted successfully but some not. I can't figure out what is it in the string that caused the failure.

Table schema:

CREATE TABLE [dbo].[incominglog](
    [number] [varchar](50) NULL,
    [keyword] [varchar](1000) NULL,
    [message] [varchar](1000) NULL,
    [messagepart1] [varchar](1000) NULL,
    [datetime] [varchar](50) NULL,
    [recipient] [varchar](50) NULL
) ON [PRIMARY]

SQL statement template:

INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient]) 
VALUES('{N}', '{KEYWORD}', '{M}', '{M1}', '{TIMESTAMP}', '{NAME}')

The parameter {KEYWORD}, {M} and {M1} can contain UTF8 string.

For example, the following statement will return an error:

Incorrect syntax near 'é¢'. Unclosed quotation mark after the character string '全力克æœå››ç§å±é™©','2013-06-19 17:07:28','')'.

INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient]) 
VALUES('+6590621005', '题', '题 [全力克æœå››ç§å±é™© åšå†³æ‰«é™¤ä½œé£Žä¹‹å¼Š]', '[全力克æœå››ç§å±é™©','2013-06-19 17:07:28', '')

Note: Please ignore the actual characters as the utf8 encoding is lost after copy and paste.

I've also tried using NVARCHAR instead of VARCHAR:

CREATE TABLE [dbo].[incominglog](
    [number] [varchar](50) NULL,
    [keyword] [nvarchar](max) NULL,
    [message] [nvarchar](max) NULL,
    [messagepart1] [nvarchar](max) NULL,
    [datetime] [varchar](50) NULL,
    [recipient] [varchar](50) NULL
) ON [PRIMARY]

And also tried amending the SQL statement into:

INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('{N}',N'{KEYWORD}',N'{M}',N'{M1}','{TIMESTAMP}','{NAME}')

They don't work either. I would appreciate any pointer. Thanks.

EDITED: As indicated by marc_s below, the N prefix must be outside the single quotes. It is correct in my actual test, the initial statement is a typo, which I've corrected.

The test with the N prefix also returned an error:

Incorrect syntax near '原标é¢'. Unclosed quotation mark after the character string '全力克æœ?å››ç§?å?±é™©','2013-06-19 21:22:08','')'.

The SQL statement:

INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('+6590621005',N'原标题',N'原标题 [全力克�四��险 �决扫除作风之弊]',N'[全力克�四��险','2013-06-19','')

. .

REPLY TO gbn's Answer: I've tried using parameterized SQL but still encountering "Unclosed quotation mark after the character string" error.

For the new test, I used a simplified SQL statement:

INSERT INTO INCOMINGLOG ([Keyword],[Message]) VALUES(:KEYWORD,:M)

The error returned for the above statement:

Incorrect syntax near '原标é¢'. Unclosed quotation mark after the character string '')'.

For info, the values of KEYWORD and M are:

KEYWORD:原标题

M:原标题 [

. . .

Further tests on 20th June Parametarized SQL query don't work so I tried a different approach by trying to isolate the character that caused the error. After trial and error, I managed to identify the problematic character.

The following character produces an error: 题

SQL Statement: INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题')

Interestingly, note that the string in the return error tax contains a "?" character which didn't exist in the original statement.

Error: Unclosed quotation mark after the character string '�)'. Incorrect syntax near '�)'.

If I were to place some latin characters immediately after the culprit character, there will be no error. For example, INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题Ok') works ok. Note: It doesn't work with all characters.

Joshua
  • 1,709
  • 2
  • 24
  • 38
  • Does the UTF-8 string which you use as replacement for `{KEYWORD}` contain a quotation mark? The error message seems to indicate this. I recommend to use a parametrized query where such errors can not happen. – mjn Jun 19 '13 at 10:53
  • 2
    Parametrized queries will also prevent SQL injection. – Jens Mühlenhoff Jun 19 '13 at 11:24
  • hi mjn, I've doubled up all quotes. I suspect that quotes might not be the only illegal characters in UTF8 string. – Joshua Jun 19 '13 at 11:44
  • Your SQL `INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题')` works for me in SSMS – gbn Jun 20 '13 at 06:39
  • @gbn I'm not sure but I think it's not the same when you copy to SSMS, perhaps the encoding changes. The only way to test is using code - I've created PHP code that you can download from [here](http://www.2shared.com/file/ICYYU57A/sqlinserttest.html). There are 2 PHP files, one with the statement that works and another which fails. By the way, this article says that MSSQL doesn't support UTF-8 - [article link](http://codex.galleryproject.org/Gallery2:MSSQL). Previously, before using UTF-8, I converted the UTF-8 to widestring, but they appear as question marks when viewed from SSMS. – Joshua Jun 20 '13 at 15:06

1 Answers1

3

There are ' characters in the UTF-8 which abnormally terminate the SQL.

Classic SQL injection.

Use proper parametrisation, not string concatenation basically.

Edit, after Question updates...

Without the Delphi code, I don't think we can help you
All SQL side code works. For example, this works in SSMS

DECLARE @t TABLE ([Keyword] nvarchar(100) COLLATE Chinese_PRC_CI_AS);
INSERT INTO @t ([Keyword]) VALUES('题');
INSERT INTO @t ([Keyword]) VALUES(N'题');
SELECT * FROM @t T;

Something is missing to help us fic this

Also see

Community
  • 1
  • 1
gbn
  • 422,506
  • 82
  • 585
  • 676
  • Thanks gbn, I've tried using parameterized SQL, but still the same error, please check my comment in the original question. – Joshua Jun 19 '13 at 17:30
  • I like point out that my SQL Server db is using Chinese_PRC_CI_AS collation. I've no issues inserting the same data in GBK encoding manually using SQL management studio - but my software inserts using UTF8. – Joshua Jun 20 '13 at 02:58
  • @Joshua: Can you add the Delphi code please. This could be a client error, not a SQL error as such – gbn Jun 20 '13 at 06:35
  • I've created another question for this, from where there's a link which you can download a test program I made - click [here](http://stackoverflow.com/questions/17230419/unable-to-insert-unicode-into-mssql-2008-using-delphi-zeoslib-and-delphi-7). To reproduce the problem, uncomment "//inserted as utf8" and remove the "ok" in the editbox. – Joshua Jun 21 '13 at 08:11