40

What data type should I use to store HTML content in SQL Server 2008?

It's for dynamic content for a CMS.

Petrus Theron

3 Answers

53

VARCHAR(MAX) if it's all going to be ASCII-based, say for basic HTML templates.

NVARCHAR(MAX) if the HTML could contain any content

NVARCHAR stores each character in two bytes, so it roughly doubles your storage use compared with VARCHAR for the same text. HTML markup itself does not require NVARCHAR; only the content between the HTML tags might, depending on the language, etc.
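To make the storage difference concrete, here is a rough sketch in Python rather than T-SQL. It assumes the VARCHAR column uses a single-byte code page (cp1252 is used here as a stand-in for a Latin1 collation) and that NVARCHAR stores UTF-16; the sample markup is made up for illustration.

```python
# Illustration only: approximate the bytes SQL Server would need for the
# character data, ignoring any per-row or per-column overhead.
html = "<div><span>Hello</span></div>"

# VARCHAR with a single-byte code page (assumed): one byte per character.
varchar_bytes = len(html.encode("cp1252"))

# NVARCHAR: UTF-16, two bytes per character for this text (no byte-order mark).
nvarchar_bytes = len(html.encode("utf-16-le"))

print(varchar_bytes, nvarchar_bytes)
```

For ASCII-only markup like this, the second number is exactly twice the first.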

Edit:

Many years on from giving this answer, I now almost always use NVARCHAR if there is any content between the tags. Unicode is popular...

I only use VARCHAR when storing simple HTML templates, i.e. tags and placeholders,
e.g.: <div><span>[PLACEHOLDER]</span></div>

Make the call based on your use case.
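One way to make that call is to inspect the content itself. The following is a minimal sketch in Python, not a definitive rule: the function name `suggested_column_type` is hypothetical, and the ASCII check is deliberately conservative, since a VARCHAR column can also hold whatever non-ASCII characters its collation's code page supports.

```python
def suggested_column_type(html: str) -> str:
    """Suggest a SQL Server column type for the given HTML (hypothetical helper).

    Pure-ASCII content is safe in VARCHAR under any collation; anything
    else is routed to NVARCHAR to avoid lossy code-page conversion.
    """
    return "VARCHAR(MAX)" if html.isascii() else "NVARCHAR(MAX)"


# A bare template with a placeholder is ASCII-only.
print(suggested_column_type("<div><span>[PLACEHOLDER]</span></div>"))

# Content in another script is not.
print(suggested_column_type("<p>日本語</p>"))
```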

Dave Sumter
  • 1
    Ah - no. That is a good question, but standard HTML would require that content "based on language" to encode special chars ;) No Unicode in HTML, sorry. Normal ASCII set only. – TomTom Apr 24 '11 at 20:31
  • Damn, and I've been storing HTML in VARCHAR all these years.. ;-) – Dave Sumter Apr 24 '11 at 20:32
  • Because HTML has only ASCII characters, as I say. All special language characters must be encoded ;) So, the db only sees ASCII.... OR it is not HTML ;) – TomTom Apr 24 '11 at 20:35
  • Chosen as answer because valid HTML should not contain (unencoded) Unicode and my HTML content is guaranteed to be valid. Hence, `VARCHAR(MAX)`. – Petrus Theron Apr 25 '11 at 11:59
  • 4
    According to the Wikipedia article [Unicode and HTML](http://en.wikipedia.org/wiki/Unicode_and_HTML) the HTML standard extended the document character set from ISO-8859-1 to ISO 10646 and asserts (parenthetically) that that character set "... is basically equivalent to Unicode". – Kenny Evitt Apr 30 '12 at 20:17
  • 2
    Upvote @KennyEvitt, so the type needs to be `NVarChar` unless you will only store HTML documents with an "external character encoding" or "charset" which is a subset of ASCII. In the latter case, all graphemes represented in the document will be ASCII characters so `VarChar` will be safe. – Jodrell Feb 05 '14 at 12:27
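
The comments above turn on whether non-ASCII characters are encoded before storage. As a hedged sketch of that approach, Python's built-in `xmlcharrefreplace` error handler can rewrite every non-ASCII character as a numeric character reference, leaving a string that is safe for `VarChar`. The helper name `to_ascii_html` is an assumption for illustration, and this says nothing about how any particular CMS actually encodes its content.

```python
def to_ascii_html(html: str) -> str:
    """Replace each non-ASCII character with a numeric character reference.

    Hypothetical helper: ASCII characters, including the markup itself,
    pass through unchanged.
    """
    return html.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


escaped = to_ascii_html("<p>café</p>")
print(escaped)            # the "é" becomes a numeric reference
print(escaped.isascii())  # the result contains only ASCII characters
```

If content is stored without this kind of escaping, the `NVarChar` advice applies.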
15

Put it in an NVARCHAR(MAX) (or smaller).
HTML is no different from other text.

SLaks
0

What data type should I use to store HTML content in SQL Server 2008?

You must use NVARCHAR(MAX) if you want to store text in multiple languages.

Otherwise, use VARCHAR(MAX) if you only want to store English-language text.

NVARCHAR(MAX) takes more space than VARCHAR(MAX) for the same text.

Sangeet Shah