There is a method called charCodeAt(position). But according to its documentation (and the console), it returns the UTF-8 code of the given character.

I'd like to build a project based on a single-byte encoding (Windows-1251 in my case, for Russian).

How can I calculate the code of a character in an encoding other than UTF-8?

Roman Matveev
  • No, it will return the UTF-16 code unit, effectively. That's not the same as UTF-8. If you want to use a different encoding, you'll quite possibly need to include details of that encoding within your code - you shouldn't expect a Linux client to have Windows-1251 available, for example. – Jon Skeet Mar 27 '14 at 11:40
  • @Jon, you're probably right! The docs only say 'Unicode', which is unclear (as we know, there are several Unicodes). Can I conclude that it's not a good idea to build a non-Latin site in an encoding other than a Unicode one? – Roman Matveev Mar 27 '14 at 11:44
  • Well, there are several different versions of Unicode, but I don't think that's really what you were referring to. There are lots of different *encodings* of Unicode, of which UTF-8 is just one. But yes, I would *strongly* advise you to use UTF-8 as an encoding for any site. It's probably the most widely-supported full-Unicode encoding. – Jon Skeet Mar 27 '14 at 11:50
  • Using Unicode has several disadvantages: multibyte characters and weak support in PHP. As far as I know, many sites for the Russian region are built in the Windows-1251 encoding. – Roman Matveev Mar 27 '14 at 12:01
  • What *exactly* do you mean by "multibyte chars"? Bear in mind that for ASCII characters, UTF-8 still uses a single byte per character. And I'd be *very* surprised if PHP still didn't have decent UTF-8 support. What *exactly* do you expect to cause problems? Using any Windows-specific encoding in this day and age sounds like a very bad idea to me. – Jon Skeet Mar 27 '14 at 12:03
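
To illustrate the first comment's point about carrying the encoding details in your own code, here is a minimal sketch of a Windows-1251 lookup built on top of charCodeAt. The name cp1251CodeAt is hypothetical (it is not a built-in), and the function is deliberately partial: it assumes only ASCII and the Russian alphabet are needed, and returns -1 for anything else. A complete version would need the full Windows-1251 table.

```javascript
// Hypothetical helper, not part of any standard API.
// Returns the Windows-1251 byte for the character at `position`.
// Assumption: only ASCII and the Russian alphabet are covered; other
// Windows-1251 characters would need a fuller lookup table.
function cp1251CodeAt(str, position) {
  const unit = str.charCodeAt(position); // UTF-16 code unit, not a UTF-8 byte
  if (Number.isNaN(unit)) return NaN;    // position out of range, like charCodeAt
  if (unit < 0x80) return unit;          // ASCII is identical in Windows-1251
  if (unit >= 0x0410 && unit <= 0x044f) {
    return unit - 0x0350;                // U+0410..U+044F map to 0xC0..0xFF
  }
  if (unit === 0x0401) return 0xa8;      // Cyrillic capital letter IO
  if (unit === 0x0451) return 0xb8;      // Cyrillic small letter IO
  return -1;                             // not covered by this sketch
}

console.log("Я".charCodeAt(0));     // 1071 (UTF-16 code unit)
console.log(cp1251CodeAt("Я", 0));  // 223 (0xDF in Windows-1251)
```

The arithmetic shortcut works because the basic Russian letters occupy one contiguous run in both Unicode and Windows-1251; the two forms of the letter IO sit outside that run and are handled separately.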
