GB 18030
GB 18030 is a Chinese government standard, titled Information Technology — Chinese coded character set, that defines the language and character support required of software in China. GB18030 is also the registered Internet name for the official character set of the People's Republic of China (PRC), which supersedes GB2312. As a Unicode Transformation Format (i.e. an encoding of all Unicode code points), GB18030 supports both simplified and traditional Chinese characters. It is also compatible with the legacy encodings GB2312, CP936, and GBK 1.0.
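The properties described above (full Unicode coverage, variable width, and compatibility with GB2312 and GBK) can be illustrated with a short sketch. It assumes Python's built-in `gb18030`, `gbk`, and `gb2312` codecs; the codec tables shipped with a given Python version may predate GB 18030-2022, so the sample characters were chosen from code points whose mappings are believed not to differ between editions. The names `samples` and `lengths` are illustrative only.

```python
# Sketch: observe GB 18030 as a variable-width encoding of Unicode.
# Assumption: Python's built-in "gb18030" codec, which may not reflect
# the 2022 edition of the standard.
samples = ["A", "中", "\u0080", "\U00010000"]

# Number of bytes each sample character occupies in GB 18030.
lengths = {ch: len(ch.encode("gb18030")) for ch in samples}

for ch in samples:
    raw = ch.encode("gb18030")
    print(f"U+{ord(ch):04X} -> {raw.hex(' ')} ({len(raw)} byte(s))")

# ASCII is unchanged, and a character already present in GB2312 and GBK
# keeps the same two-byte code in GB 18030.
assert "A".encode("gb18030") == b"A"
assert "中".encode("gb18030") == "中".encode("gbk") == "中".encode("gb2312")

# Simplified, traditional, and supplementary-plane characters all
# survive a round trip through the encoding.
text = "GB 18030: 汉字 漢字 😀"
assert text.encode("gb18030").decode("gb18030") == text
```

Characters outside the legacy two-byte repertoire, such as U+0080 and everything beyond the Basic Multilingual Plane, are the ones that take four bytes in this sketch.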
Figure: GB 18030 encoding layout. "Half codes" are codes used in pairs to form four-byte codes.

MIME / IANA | GB18030 |
---|---|
Alias(es) | Code page 54936 |
Language(s) | International, but primarily meant for Chinese |
Standard | GB 18030-2022, GB 18030-2005, GB 18030-2000 |
Classification | Unicode Transformation Format, extended ASCII, variable-width encoding, CJK encoding |
Extends | EUC-CN, GBK |
Transforms / Encodes | ISO 10646 (Unicode) |
Preceded by | GBK, GB2312 |
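The "variable-width" and "extended ASCII" classifications, and the four-byte codes built from paired half codes, can be made concrete with a minimal sketch that splits a byte string into one-, two-, and four-byte sequences. It assumes the commonly documented byte ranges: single bytes 0x00 to 0x7F; two-byte codes with a lead byte of 0x81 to 0xFE and a trail byte of 0x40 to 0xFE, excluding 0x7F; and four-byte codes in which the first and third bytes are 0x81 to 0xFE and the second and fourth are 0x30 to 0x39. The function name `split_gb18030` is hypothetical, and the sketch checks structure only, not whether a sequence is actually assigned to a character.

```python
from typing import List


def split_gb18030(data: bytes) -> List[bytes]:
    """Split GB 18030 bytes into per-character sequences (structure only)."""
    sequences = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # Single byte: identical to ASCII.
            length = 1
        elif 0x81 <= lead <= 0xFE and i + 1 < len(data):
            second = data[i + 1]
            if 0x30 <= second <= 0x39:
                # Lead byte plus a digit-range byte: first half of a four-byte code.
                length = 4
            elif 0x40 <= second <= 0xFE and second != 0x7F:
                # GBK-style two-byte code.
                length = 2
            else:
                raise ValueError(f"invalid second byte at offset {i + 1}")
        else:
            raise ValueError(f"invalid or truncated sequence at offset {i}")

        seq = data[i:i + length]
        if len(seq) != length:
            raise ValueError(f"truncated sequence at offset {i}")
        if length == 4 and not (0x81 <= seq[2] <= 0xFE and 0x30 <= seq[3] <= 0x39):
            raise ValueError(f"invalid second half of four-byte code at offset {i}")

        sequences.append(seq)
        i += length
    return sequences


encoded = "A中\U00010000".encode("gb18030")
print([s.hex() for s in split_gb18030(encoded)])
```

Because the second byte alone distinguishes two-byte from four-byte codes, a decoder written this way never needs to look more than one byte ahead to determine a sequence's length.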
In addition to the GB18030 character encoding itself, the standard sets out further requirements, such as which scripts must be supported and what font support is needed.
The updated standard, GB 18030-2022, is incompatible with the earlier editions and had an enforcement date of 1 August 2023. It has been implemented in ICU 73.2 and in Java 21, and has been backported to Java 8, 11, and 17 (the older LTS releases) and to Java 20.0.2.
As of 2022, in terms of font implementations, "only the Simplified Chinese fonts of the Noto Sans CJK (Google), Source Han Mono (Adobe), and Source Han Sans (Adobe) typeface families are already compliant with GB 18030-2022 Implementation Level 2 [..] Microsoft YaHei (Microsoft), Noto Serif CJK (Google), PingFang (Apple), and Source Han Serif (Adobe)—at least the versions as of November 2022—require a small number of URO additions that are associated with Implementation Level 1 in order to become compliant with GB 18030-2022 Implementation Level 2."