
My locale encoding is 'gbk' in other programming tools, as I am a Simplified Chinese user. But in Python, it is 'cp936'. It seems that 'cp936' may be the same as 'gbk', because whatever 'gbk' can or cannot decode, 'cp936' can or cannot decode as well. So, what is the difference between 'gbk' and 'cp936'? And why does Python use 'cp936' instead of 'gbk'?

Joachim Sauer
罗泽轩

1 Answer


You may find this helpful: https://stackoverflow.com/a/3888653/4323. The question is complicated by the fact that there seem to be some bugs in Python's implementations of some of the code pages in the GBK, CP936, and GB 18030 family, possibly related to a late change by Microsoft to support the Euro symbol.

Overall the differences appear to be minor: Microsoft added the Euro sign to CP936, and it is not in GBK. It is possibly not in Python's CP936 either, which would make the two even more similar. You didn't mention your platform, so it is not clear exactly which GBK you have, but it's not surprising that your code works the same way under GBK and CP936, and you're probably good to go.
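If you want to check this on your own interpreter, here is a minimal sketch. The helper names `same_codec` and `can_encode` are made up for illustration; only `codecs.lookup`, `locale.getpreferredencoding`, and `str.encode` come from the standard library. In the CPython builds I have seen, 'cp936' is registered as an alias of the 'gbk' codec, but treat that as something to verify rather than a guarantee, and the Euro sign result may depend on your Python version.

```python
import codecs
import locale


def same_codec(name_a, name_b):
    """Return True if both encoding names resolve to the same Python codec."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` without errors."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # What Python reports as the locale's preferred encoding on this machine.
    print("preferred encoding:", locale.getpreferredencoding())

    # Which codec the name 'cp936' actually resolves to.
    print("cp936 resolves to:", codecs.lookup("cp936").name)
    print("same codec as gbk:", same_codec("cp936", "gbk"))

    # Check the Euro sign (U+20AC) against each encoding in the family.
    for enc in ("gbk", "cp936", "gb18030"):
        print(enc, "can encode the Euro sign:", can_encode("\u20ac", enc))
```

If `same_codec("cp936", "gbk")` prints `True` for you, the two names are interchangeable in your Python, whatever the differences are between the Microsoft code page and the GBK specification.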

Community
John Zwinck
  • Thank you! I have just learned that 'cp936' means code page 936, an encoding for Simplified Chinese. [A short introduction to cp936 on MSDN](http://msdn.microsoft.com/en-US/goglobal/cc305153.aspx) – 罗泽轩 Jun 08 '13 at 14:50