2

Can somebody provide some best practices when storing special characters such as the trademark (tm or r) or copyright (c)? I am storing them in a varchar field with other text in SQL Server, and displaying on an ASP.NET webpage. Right now we are storing the special character itself and displaying that.

Thanks for any help!

Mike Cole
  • 14,474
  • 28
  • 114
  • 194

4 Answers4

4

I am storing them in a varchar field ...

There's one problem, at least. For text that could have "special" characters, you need nvarchar.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • +1 Also it should be noted that it's unless you're expecting to allow "special characters" or "foreign language characters", you'll probably want to stick with a varchar especially if you're going to index by that column. For one thing varchars take up half as much space as nvarchars, and for another thing nvarchar query plans use a scan, whereas varchars use a seek. Seek is much faster than scan. If that makes any sense. – Ben Lesh Dec 02 '09 at 22:04
  • varchar only allows a single byte per character. "special" characters often require more than that. – Joel Coehoorn Dec 02 '09 at 22:24
  • But if I use the codes like © I will be fine with a varchar, right? – Mike Cole Dec 02 '09 at 23:01
  • Maybe for a while, but eventually someone will try to put unicode characters into your app. best to be prepared. – Joel Coehoorn Dec 03 '09 at 02:55
  • Depends on your RDBMS. Some of them support UTF-8-encoded varchars. I don't think MS SQL Server does, though. – dan04 Mar 09 '10 at 04:42
  • @dan04 Sql Server definitely does support UTF-8 (it just doesn't call it that) – Joel Coehoorn Mar 09 '10 at 14:39
2

Well, with those characters, generally you will render them as

©

So you don't need to do anything special in the DB, but you should be using the "N"-prefixed fields for DB strings, NChar, NVarChar, and so on.

Noon Silk
  • 54,084
  • 6
  • 88
  • 105
1

This in an old question but it deserves a new answer: use UTF-8. Browsers, databases and web servers now all support UTF-8 encoding which means that you don't have to encode special characters (except as discussed here, and then, only on the way out).

There are some excellent reasons not to encode data in your database in this answer.

Community
  • 1
  • 1
hawkip
  • 144
  • 1
  • 3
-2

Just use the HTML entities, such as ™ and © if you can.

There's no need to switch to nvarchar unless you have an explicit requirement to support Unicode.

RickNZ
  • 18,448
  • 3
  • 51
  • 66
  • Great, thanks for the advice. I suppose this can cause difficulties if the same text field that contains these HTML entities is used in a windows applications instead of a web application. – Mike Cole Dec 03 '09 at 14:21
  • Sure, depending on how the text is displayed. You could always use a browser window embedded in your Windows app, if you need to. – RickNZ Dec 03 '09 at 15:02
  • Why not just store the data as-is and convert it to HTML or JavaScript escapes as needed? – dan04 Mar 09 '10 at 04:40
  • @dan04: if varchar can be used instead of nvarchar, the associated data will only occupy half the amount of space in the database. Also, the question was for an ASP.NET web page, in which case it's generally easier and more reliable to fixup the data once on write than to make sure that it's properly converted to escapes every place it's read. – RickNZ Mar 09 '10 at 20:42