
The company I work for is bidding on a project that will require our eCommerce solution to accept Simplified Chinese input. After a bit of research, it seems that ASP.NET makes globalization configuration easy:

<configuration>
  <system.web>
    <globalization
      fileEncoding="utf-8"
      requestEncoding="utf-8"
      responseEncoding="utf-8"
      culture="zh-Hans"
      uiCulture="en-us" />
  </system.web>
</configuration>

Questions:

  1. Is this really all there is to it in ASP.NET? It seems too good to be true.
  2. Are there any DB considerations with SQL Server 2005? Will the DB accept the simplified Chinese without additional configuration?
Makoto
James Hill
  • I believe you will have to change field types in your DB schema from char, varchar, and text to nchar, nvarchar, and ntext to support Unicode characters. I'm not sure about the ASP.NET part. – RMorrisey Mar 23 '12 at 18:54
  • Why not use an open-source eCommerce platform like this one instead of building from scratch: http://demo.aspxcommerce.com (CodePlex: http://aspxcommerce.codeplex.com) –  May 22 '13 at 09:15

2 Answers


Ad 1. The real question is how far you want to go with internationalization, because i18n is more than accepting Unicode input. At a minimum you need to support local date, time, and number formats and local collation (mostly relevant to sorting), and ensure that your application runs correctly on localized operating systems (unless you are developing a cloud, i.e. hosted, solution). You might want to read more on the topic here.

As far as support for Chinese character input goes, if you are going to offer software in China, you need to support at least GB18030-2000. To do that, you need a .NET Framework version that supports Unicode 3.0. I believe that has been the case since .NET Framework 2.0.
However, if you want to go one step further (which might be required to gain a competitive edge), you might want to support GB18030-2005. The only problem is that full support for these characters (CJK Unified Ideographs Extension B) arrived later in the process (I am not sure whether with Unicode 6.0 or Unicode 6.1). You might therefore be forced to use the latest .NET Framework and still not be sure that it covers everything.
You might want to read the Unicode FAQ on Han characters.

Ad 2. I strongly advise you not to use SQL Server 2005 with Chinese characters. The reason is that the old SQL Server engine supports only UCS-2 rather than UTF-16. This might seem like a slight difference, but it poses a real problem with 4-byte Han ideographs: you won't be able to use them in queries (e.g. in LIKE or WHERE clauses), because you will receive all records. That's how it works. To support them, you would need to set a very specific Chinese collation, which will simply break support for other languages.
Basically, using SQL Server 2005 with Chinese Ideographs is a bad idea.

Community
Paweł Dyda

First off, are you sure that you picked the right culture identifier with zh-Hans, which is a neutral culture? It might be more appropriate to target a specific culture, such as zh-CN (Chinese as used in China), if that is the market you are aiming to support.
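
For illustration only, the question's configuration with a specific culture swapped in might look like the sketch below. It assumes the encoding attributes and the English UI culture from the question stay as they are; only the culture value changes.

```xml
<configuration>
  <system.web>
    <!-- Sketch: zh-CN (a specific culture) replaces the neutral zh-Hans from the question -->
    <globalization
      fileEncoding="utf-8"
      requestEncoding="utf-8"
      responseEncoding="utf-8"
      culture="zh-CN"
      uiCulture="en-US" />
  </system.web>
</configuration>
```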

Secondly, using the web.config file to set the culture is fine if you are planning a deployment that exclusively targets this culture. Often you'll want a single deployment to adapt dynamically to the end user's culture, in which case you would programmatically set Thread.CurrentThread.CurrentCulture (and also Thread.CurrentThread.CurrentUICulture if you are providing localized resources) based, for example, on a URL scheme (e.g. www.myapp.com would use en-US and www.myapp.com/china would use zh-CN), the Accept-Language header, or an in-app language selector.
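
When the Accept-Language header is the only signal needed, the globalization element also accepts an auto prefix, which is a declarative alternative to setting the thread culture in code. The sketch below is a minimal example; the fallback values after the colon (zh-CN and en-US) are assumptions chosen for illustration, not something the question specifies.

```xml
<configuration>
  <system.web>
    <!-- "auto" takes the culture from the browser's Accept-Language header;
         the value after the colon is the fallback (assumed here) -->
    <globalization
      culture="auto:zh-CN"
      uiCulture="auto:en-US" />
  </system.web>
</configuration>
```

The URL-scheme and language-selector variants still need code, since they depend on application logic rather than on the request headers.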

Other than the Unicode limitations that Paweł refers to (which mean that you may really need to use the latest .NET Framework/SQL Server), there isn't anything specific you should need to do for Simplified Chinese. If you follow standard internationalization guidelines, you should be all set. By the way, you might also consider localizing (translating) your app into Chinese as part of this.

About SQL Server, Paweł's points seem pretty clear. That said, as long as you use nvarchar data types (Unicode) and you don't query or sort on these columns on the DB side, I'd be surprised if you had any issues with SQL Server 2005. So it really depends on what you do with this data.

user930067
Clafou
  • Surely, you meant zh-CN, not Chinese-Switzerland :) The reason zh-Hans (Chinese, Simplified Han) and zh-Hant (Chinese, Traditional Han) were created is that these writing systems are in use in more than one country or territory (China and Singapore use zh-Hans, while Taiwan, Hong Kong, and Macao use zh-Hant). With your suggestion, one would need to use one of the specific cultures (zh-CN, zh-SG, zh-TW, zh-MO or zh-HK) as CurrentUICulture, which frankly does not make sense (the very purpose of creating zh-Hans and zh-Hant was to avoid this situation). You should use specific cultures only for CurrentCulture. – Paweł Dyda Apr 01 '12 at 20:37
  • :) +1 for Chinese-Switzerland! That said, in the original post zh-Hans is specified for the culture attribute (Thread.CurrentCulture), not UICulture, and using a neutral culture in this case is not a good idea (you get an exception if you try to access RegionInfo, and you get defaults that might not be appropriate). There are differences between cultures across regions, and trying to hide behind a neutral culture to pretend that those differences don't exist is worse than actually picking one (or more) and knowing what you're targeting. – Clafou Apr 02 '12 at 09:15