Questions tagged [cp1252]

CP-1252 or Windows-1252 is a character encoding of the Latin alphabet.

The Windows-1252 code page is used by the Windows operating system to display a number of Latin-based languages. This character set matches ISO 8859-1 (Latin-1) except in positions 128-159 (0x80-0x9F), where Windows-1252 assigns printable characters such as the euro sign and curly quotes instead of control codes.
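
As a rough illustration of that difference, here is a minimal Python sketch that decodes each byte value with both codecs and lists where they disagree. The helper name `decode_both` is made up for this example, and the result reflects Python's built-in `cp1252` codec, which treats a few positions in the range as undefined.

```python
def decode_both(value):
    """Decode one byte value as Latin-1 and as cp1252.

    Returns (latin1_char, cp1252_char); cp1252_char is None when
    Python's cp1252 codec has no mapping for that byte.
    """
    raw = bytes([value])
    latin1 = raw.decode("latin-1")  # Latin-1 maps every byte to a code point
    try:
        cp1252 = raw.decode("cp1252")
    except UnicodeDecodeError:
        cp1252 = None  # position left undefined by the codec
    return latin1, cp1252


# Collect every byte value on which the two codecs disagree.
differing = [b for b in range(256) if decode_both(b)[0] != decode_both(b)[1]]

print(f"first: 0x{differing[0]:02X}, last: 0x{differing[-1]:02X}")
print(repr(decode_both(0x80)))  # euro sign in cp1252, a control code in Latin-1
```

Outside that range the two codecs agree, so text that only uses accented letters such as é decodes the same way with either one.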

The set of languages represented by CP1252 includes English, Spanish, and various Germanic/Scandinavian languages.

125 questions
2
votes
1 answer

Encode cp1252 string to utf-8 string in c#

How can I convert a cp1252 string to a UTF-8 string in C#? I tried this code, but it doesn't work: Encoding wind1252 = Encoding.GetEncoding(1252); Encoding utf8 = Encoding.GetEncoding(1251); byte[] wind1252Bytes = ReadFile(myString1252); byte[]…
2
votes
0 answers

Filenames in CVS repository garbled due to encoding issues

The CVS repo is on a Linux box, but the files checked in are from Windows clients with names using æ/ø/å. I'm probably the first non-Windows user to interact with that repo, and I didn't know about the pitfalls regarding encoding. What happened was…
anders
  • 772
  • 1
  • 10
  • 17
2
votes
1 answer

MySQL Convert latin1 to utf8, cp1252 0x80-0x9F wrong

Situation: The latin1 database has been dumped as latin1, converted via iconv to utf8 and restored as utf8_unicode_ci. It seems every conversion went fine, except those 0x80-0x9F from cp1252. I did not fully understand what MySQL means by translating…
gantners
  • 471
  • 4
  • 16
2
votes
0 answers

How to encode after using HTML::Strip

I am trying to encode an HTML page as cp1252, since it has a lot of special characters like € and £, but when I save the contents after using HTML::Strip, they are displayed as junk values. I tried to encode using cp1252 but it's not…
Jeya Kumar
  • 1,002
  • 1
  • 13
  • 36
2
votes
2 answers

"Raw" conversion from double-UTF-8 to UTF-8 (or from UTF-8 to ANSI)

I am dealing with a legacy file that has been encoded twice using UTF-8. For example, the codepoint ε (U+03B5) should have been encoded as CE B5 but has instead been encoded as C3 8E C2 B5 (C3 8E is the UTF-8 encoding of U+00CE, C2 B5 is the UTF-8…
gioele
  • 9,748
  • 5
  • 55
  • 80
2
votes
1 answer

Windows C API for UTF8 to 1252

I'm familiar with WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like: UTF8 -> UTF16 -> 1252. I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single…
Paul
  • 21
  • 3
2
votes
1 answer

python unicode woes - convert cp1252 string to unicode

I think I'm just fundamentally confused about char sets that are not ascii. I have a python file that I have declared at the top to be # -*- coding: cp1252 -*-. In the file I have question = "what is your borther’s name", for…
stewart99
  • 14,024
  • 7
  • 27
  • 42
2
votes
1 answer

PHP Regex delimiter

For a long time, any time I've needed to use a regular expression, I've standardized on using the copyright symbol © as the delimiter because it was a symbol that wasn't on the keyboard that I was sure to not use in a regular expression, unlike ! @…
Force Flow
  • 714
  • 2
  • 14
  • 34
2
votes
1 answer

How to deal with Non-ASCII Warning when performing Save on Python code edited with IDLE?

I frequently edit Python code using IDLE and occasionally when I perform a Save I receive an I/O Warning. I am assuming that I have inadvertently added a Non-ASCII character, and I do not really want to declare the cp1252 encoding. Is there an easy…
PolyGeo
  • 1,340
  • 3
  • 26
  • 59
2
votes
1 answer

Why does backwards navigation in IE cause html attribute values to be encased in smart double quotes?

My page loads fine each time in all browsers, except in IE: when I use the browser back button, it changes the double quotes used for the value attribute of the option element to smart double quotes instead of straight ones. Loads correct…
johntrepreneur
  • 4,514
  • 6
  • 39
  • 52
2
votes
5 answers

How do I recognize a character such as "ç" as a letter?

I have an array of bytes that contains a sentence. I need to convert the lowercase letters in this sentence into uppercase letters. Here is the function that I wrote: public void CharUpperBuffAJava(byte[] word) { for (int i = 0; i < word.length;…
Daniel Pereira
  • 2,720
  • 2
  • 28
  • 40
2
votes
2 answers

Encoding cp-1252 as utf-8?

I am trying to write a Java app that will run on a Linux server but that will process files generated on legacy Windows machines using cp-1252 as the character set. Is there any way to encode these files as utf-8 instead of the cp-1252 it is…
IAmYourFaja
  • 55,468
  • 181
  • 466
  • 756
2
votes
1 answer

Adobe Font Metrics for Standard PDF Fonts in CP1252

I need the metrics for the 14 standard PDF fonts. I've downloaded the following from Adobe, but it appears to use ISO-8859-1 encoding, rather than CP1252: https://partners.adobe.com/public/developer/en/pdf/Core14_AFMs.zip So it's missing code points…
xpsd300
  • 175
  • 2
  • 11
1
vote
1 answer

converting String from Windows charset to UTF 8 in Java

So I have to pass some arguments to my Java app, which is called from a .bat file. Doing this gives the arguments the system's charset encoding, which makes some characters display wrongly. I tried this: String titulo; titulo = new…
rMaero
  • 195
  • 4
  • 13
1
vote
0 answers

Node-Fetch API and Blob to convert Windows-1252 characters to utf-8

Node.js parsing an HTML page with Windows-1252 characters. Using node-fetch and response.text() loses all accented characters (gives a "�" for any kind of diacritic). Instead, response.blob() keeps everything, then…
allez l'OM
  • 547
  • 4
  • 13