-1

I need the complete list of characters in the UTF-8 character set, from an authoritative source.

I've tried looking on Unicode.org but could not find it.

Cœur
Muhammad Hewedy

2 Answers

2

The set of characters representable in UTF-8 is exactly the same as the set of Unicode characters.
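As a quick illustration (a sketch of my own, with a made-up helper name `utf8_length`), Python can show that code points across the whole Unicode range encode to UTF-8 in one to four bytes. The only exceptions are the surrogate code points U+D800 to U+DFFF, which are reserved for UTF-16 and are not characters:

```python
def utf8_length(code_point):
    """Return the number of UTF-8 bytes for a code point, or None for surrogates."""
    try:
        return len(chr(code_point).encode("utf-8"))
    except UnicodeEncodeError:
        # Surrogate code points (U+D800..U+DFFF) are reserved for UTF-16
        # and cannot be encoded as UTF-8.
        return None


# One sample from each encoded length, plus a surrogate.
for cp in (0x41, 0xB0, 0xAC00, 0x1F600, 0xD800):
    print(f"U+{cp:04X}: {utf8_length(cp)}")
```

The loop should report lengths 1, 2, 3 and 4 for the first four code points and `None` for the surrogate.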

The authority is the Unicode Consortium. If you specifically want a list, then you can visit http://www.unicode.org/ucd/ which links to the Unicode Character Database, containing the large list http://www.unicode.org/Public/UNIDATA/UnicodeData.txt . It is not a completely explicit list, though, since some wide ranges are indicated just as ranges, e.g.

AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;

which means that characters from AC00 to D7A3 are all defined to be Hangul syllable characters (with names to be constructed algorithmically).
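If you want an explicit list, the First/Last pairs can be expanded yourself. The following is only a minimal sketch: the function name `iter_assigned` is hypothetical, the sample input is just the two lines quoted above, and each expanded entry gets the range label rather than its real algorithmic name. To process the whole database you would pass the lines of a downloaded UnicodeData.txt instead.

```python
def iter_assigned(lines):
    """Yield (code_point, name, general_category) for each listed character,
    expanding <..., First> / <..., Last> pairs into every code point between."""
    range_start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code_point = int(fields[0], 16)
        name, category = fields[1], fields[2]
        if name.endswith(", First>"):
            # Remember where the range begins; the next line closes it.
            range_start = code_point
        elif name.endswith(", Last>"):
            # Strip the leading "<" and trailing ", Last>" to get the range label.
            label = name[1:-len(", Last>")]
            for member in range(range_start, code_point + 1):
                yield member, label, category
            range_start = None
        else:
            yield code_point, name, category


# Sample input: the two lines quoted from UnicodeData.txt above.
SAMPLE = """\
AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;
"""

entries = list(iter_assigned(SAMPLE.splitlines()))
print(len(entries))  # 0xD7A3 - 0xAC00 + 1 = 11172
print(entries[0])
```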

Jukka K. Korpela
  • Please see http://stackoverflow.com/questions/8732025/why-degree-symbol-differs-from-utf-8-from-unicode because I am very confused! – Muhammad Hewedy Jan 04 '12 at 18:34
1

This could help you: http://www.utf8-chartable.de/

Or perhaps this one: http://www.unicode.org/charts/charindex.html

fkerber