When searching for a string in our database where the column is of type nvarchar, specifying the 'N' prefix in the query returns results. Leaving it out does not. I am trying to search for a Simplified Chinese string in a database that has not stored any Chinese strings before.

The Entity Framework application that uses the database retrieves the strings correctly, and the LINQ queries also work in the application. However, in SQL Server 2014 Management Studio, when I run an SQL query for the string, it does not show up unless I specify the 'N' prefix for Unicode (even though the column is of type nvarchar).

Works:

var text = from asd in Translations.TranslationStrings
           where asd.Text == "嗄法吖无上几"
           select asd;

MessageBox.Show(text.FirstOrDefault().Text);

Does not work:

SELECT *
  FROM TranslationStrings
  where Text = '嗄法吖无上几'

If I prefix the Chinese characters with 'N' it works.

Works:

SELECT *
  FROM TranslationStrings
  where Text = N'嗄法吖无上几'

Please excuse the Chinese characters, I just typed something random. My question is, is there something I can do to not have to include the 'N' prefix when doing a query?

Thank you very much!

bluiska
  • Hi please share your SQL instance collation configuration and SQL server version – Sanpas Apr 02 '19 at 09:39
  • What's the problem with the prefix, why don't you want to use it? It's just a minor syntax element, like the single quotes around a literal; I don't see any harm or much effort in using it. – sticky bit Apr 02 '19 at 09:39
  • (N)varchar literals use the default code page of the database if the N prefix is not used. You can change the collation of the database with the ALTER DATABASE [...] COLLATE command. – DevilSuichiro Apr 02 '19 at 09:46
  • 1) Why do you need to get rid of the `N` prefix? 2) Are you passing the string as a parameter or do you just concatenate the string? 3) If you use parameters, what's the datatype of that parameter? – Ilyes Apr 02 '19 at 09:51
  • Don't think of it as some kind of prefix. Think of `N'` as a *different sort of opening quote*. You *presumably* wouldn't ask a question about how you could omit opening quotes from strings would you? – Damien_The_Unbeliever Apr 02 '19 at 10:24
  • @stickybit The problem is we have never had to use it before. From now on, everyone will have to remember this "new procedure" that if you are working with Chinese strings, you need to use N prefix otherwise your query will fail. – bluiska Apr 02 '19 at 10:37

2 Answers

Strings in .NET are Unicode, which is why you don't need to specify it there. This is not the case for SQL Server Management Studio.

Without the N prefix, SQL Server treats the literal as a non-Unicode (varchar) string, encoded in the code page of the collation specified for your database.

Hence, when writing queries directly against SQL Server you need to use N'

https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/
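
As a rough T-SQL illustration of the difference: a client that sends the search value as an NVARCHAR parameter behaves much like the statement below. The exact SQL that Entity Framework generates is not shown in the question, so the parameter name @text and its NVARCHAR(4000) type are assumptions made for this sketch.

```sql
-- Sketch only: @text and NVARCHAR(4000) are assumed, not taken from the application.
-- The value travels as NVARCHAR, so it is never converted to the database code page.
EXEC sp_executesql
    N'SELECT * FROM TranslationStrings WHERE Text = @text',
    N'@text NVARCHAR(4000)',
    @text = N'嗄法吖无上几';
```

Note that when this is typed by hand in Management Studio, the value assigned to @text is still a literal and still needs the N prefix.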

As @sworkalot has mentioned in the other answer:

Strings in .NET are Unicode, which is why you don't need to specify it there. This is not the case for SQL Server Management Studio.

Without the N prefix, SQL Server treats the literal as a non-Unicode (varchar) string, encoded in the code page of the collation specified for your database.

Hence, when writing queries directly against SQL Server you need to use N'

https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/

Check out these examples, pay close attention to the data types and the values being assigned:

DECLARE @Varchar VARCHAR(100) = '嗄'
DECLARE @VarcharWithN VARCHAR(100) = N'嗄' -- Has N prefix

DECLARE @NVarchar NVARCHAR(100) = '嗄'
DECLARE @NVarcharWithN NVARCHAR(100) = N'嗄' -- Has N prefix


SELECT
    Varchar = @Varchar,
    VarcharWithN = @VarcharWithN,
    NVarchar = @NVarchar,
    NVarcharWithN = @NVarcharWithN

SELECT
    Varchar = CONVERT(VARBINARY, @Varchar),
    VarcharWithN = CONVERT(VARBINARY, @VarcharWithN),
    NVarchar = CONVERT(VARBINARY, @NVarchar),
    NVarcharWithN = CONVERT(VARBINARY, @NVarcharWithN)

Results:

Varchar VarcharWithN    NVarchar    NVarcharWithN
?       ?               ?           嗄

Varchar VarcharWithN    NVarchar    NVarcharWithN
0x3F    0x3F            0x3F00      0xC455

NVARCHAR stores characters in 2-byte units (UTF-16), while VARCHAR stores them in 1-byte units using the code page of the collation (you can see this in the VARBINARY conversion in the 2nd SELECT). Chinese characters cannot be represented in a code page that does not include them (as in this database, where they come out as ?), so you have to use NVARCHAR to store them. If you try to put them in a VARCHAR, they are stored as ? and the original character information is lost. The same happens in the 3rd example: the literal has no N prefix, so it is treated as VARCHAR, and already reduced to ?, before the value is assigned to the NVARCHAR variable.
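
One way to see where the information is lost is to inspect the literals directly, before any column or variable is involved. This is a minimal sketch; the values in the comments assume a database whose default collation uses a code page without Chinese characters (for example 1252).

```sql
-- Code point of the first character of each literal.
SELECT
    WithoutN = UNICODE('嗄'),   -- 63, the code point of '?', under the assumption above
    WithN    = UNICODE(N'嗄');  -- 21956 (U+55C4)

-- The comparison from the question, reduced to two literals.
-- The literal without N has already become '??????' when the comparison runs.
SELECT CASE WHEN N'嗄法吖无上几' = '嗄法吖无上几'
            THEN 'match' ELSE 'no match' END AS Result;  -- 'no match' under the assumption above
```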

This is why you need to add the N prefix when typing these characters as literals: it tells the SQL engine that the literal is Unicode and must not be converted to the database code page. So if you are comparing against an NVARCHAR column, always add the N prefix. You can change the database collation, but it's recommended to always use the proper data type regardless of the collation, so you don't run into problems when the same code runs against different databases.
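
To see which code page applies to literals without the N prefix, the collation of the database can be inspected. A minimal sketch using the built-in DATABASEPROPERTYEX and COLLATIONPROPERTY functions against the current database (the variable name @collation is just for illustration):

```sql
-- Collation of the current database, converted from sql_variant to a plain string.
DECLARE @collation NVARCHAR(128) =
    CONVERT(NVARCHAR(128), DATABASEPROPERTYEX(DB_NAME(), 'Collation'));

-- Code page used for VARCHAR data and for literals without the N prefix.
SELECT
    DatabaseCollation = @collation,
    CodePage = COLLATIONPROPERTY(@collation, 'CodePage');
```

A code page such as 1252 (Western European) cannot represent Chinese characters, whereas 936 is the code page used by the Chinese_PRC collations.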

If you could explain why you want to omit the N prefix, we might be able to address that, although I believe there is no workaround in this particular case.

bluiska
EzLo
  • Thank you for the answer! I was looking for a way to get rid of it to simplify the work for others who will need to work with the database. We have not had to do this in the past so from now on people will need to remember to do it. Thankfully, most of the interaction with the database happens through .NET which will work. (As mentioned in an answer from srworksalot) – bluiska Apr 02 '19 at 10:35
  • @bluiska and EzLo: Hello. The difference between `VARCHAR` and `NVARCHAR` is the size of the "units" that are used in combinations to create characters in a particular encoding. `VARCHAR`, an 8-bit type, creates characters in combinations of 1 - 4 bytes, but always 1 byte units. `NVARCHAR`, a 16-bit type, creates characters in either 2 or 4 bytes, but always 2 byte units. For details please read my post: [How Many Bytes Per Character in SQL Server: a Completely Complete Guide](https://sqlquantumleap.com/2019/11/22/how-many-bytes-per-character-in-sql-server-a-completely-complete-guide/) – Solomon Rutzky Apr 14 '21 at 05:05