Questions tagged [encoding]

Encoding is a set of predefined rules for reversibly transforming a piece of information from one representation into a completely different one. The reverse transformation is called decoding. This tag is rather generic, but it is mainly used for binary encoding schemes such as Base64 and hexadecimal.

There are a lot of different applications:

  • Character encoding, which is how the computer turns characters like a, which humans can recognize, into bytes, which computers can recognize.
  • Video encoding, which is used to transform between videos and bytes.
  • URL encoding, which is used to transform between plain text and valid URIs. Also known as percent-encoding.
  • XML encoding, which is used to transform between plain text and valid XML.
  • Compression, which is used to compress/decompress bytes.
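
As a rough illustration of the applications listed above (video aside), here is a minimal Python sketch that runs one string through several encodings and checks that each can be reversed. The function name `encode_examples` and the sample string are made up for this example, and UTF-8 is just one possible character encoding; only standard-library calls are used.

```python
import base64
import zlib
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape, unescape


def encode_examples(text: str) -> dict:
    """Encode `text` several ways and check that each encoding is reversible."""
    # Character encoding: str -> bytes (UTF-8 chosen for illustration).
    raw = text.encode("utf-8")
    assert raw.decode("utf-8") == text

    # Binary-to-text encodings: bytes -> ASCII-safe representations.
    b64 = base64.b64encode(raw).decode("ascii")
    hexed = raw.hex()
    assert base64.b64decode(b64) == raw
    assert bytes.fromhex(hexed) == raw

    # URL (percent) encoding: text -> URI-safe text.
    url = quote(text, safe="")
    assert unquote(url) == text

    # XML escaping: text -> text that is safe inside XML character data.
    xml = escape(text)
    assert unescape(xml) == text

    # Compression: bytes -> a different byte sequence that decompresses back.
    packed = zlib.compress(raw)
    assert zlib.decompress(packed) == raw

    return {"utf8": raw, "base64": b64, "hex": hexed, "url": url, "xml": xml}


results = encode_examples("café & <tea>")
for name, value in results.items():
    print(f"{name}: {value!r}")
```

Every step pairs an encode call with its decode counterpart, which is the "reversible" part of the definition. Decoding with a different scheme than the one used for encoding (for example, reading UTF-8 bytes as Latin-1) is what tends to produce the garbled text seen in many questions under this tag.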
24,174 questions
68 votes, 7 answers

How to handle response encoding from urllib.request.urlopen(), to avoid TypeError: can't use a string pattern on a bytes-like object

I'm trying to open a webpage using urllib.request.urlopen() and then search it with regular expressions, but that gives the following error: TypeError: can't use a string pattern on a bytes-like object. I understand why, urllib.request.urlopen()…
kryptobs2000 • 3,289 • 3 • 27 • 30
67 votes, 3 answers

Ruby read CSV file as UTF-8 and/or convert ASCII-8Bit encoding to UTF-8

I'm using Ruby 1.9.2. I'm trying to parse a CSV file that contains some French words (e.g. spécifié) and place the contents in a MySQL database. When I read the lines from the CSV file, file_contents = CSV.read("csvfile.csv", col_sep: "$") The…
user141146 • 3,285 • 7 • 38 • 54
67 votes, 9 answers

PHP json encode - Malformed UTF-8 characters, possibly incorrectly encoded

I'm using json_encode($data) on a data array, and one field contains Russian characters. I used mb_detect_encoding() to display the encoding of that field, and it displays UTF-8. I think the json encode failed due to some bad…
sparkmix • 2,157 • 3 • 25 • 33
66 votes, 5 answers

Why does MySQL use latin1_swedish_ci as the default?

Does anyone know why latin1_swedish is the default for MySQL? It would seem to me that UTF-8 would be more compatible, right? Defaults are usually chosen because they are the best universal choice, but in this case it does not seem that's what they…
Metropolis • 6,542 • 19 • 56 • 86
66 votes, 3 answers

Best output type and encoding practices for __repr__() functions?

Lately, I've had lots of trouble with __repr__(), format(), and encodings. Should the output of __repr__() be encoded or be a unicode string? Is there a best encoding for the result of __repr__() in Python? What I want to output does have…
Eric O. Lebigot • 91,433 • 48 • 218 • 260
65 votes, 11 answers

C# Help reading foreign characters using StreamReader

I'm using the code below to read a text file that contains foreign characters. The file is encoded as ANSI and looks fine in Notepad. The code below doesn't work: when the file values are read and shown in the datagrid, the characters appear as squares,…
Scott
65 votes, 5 answers

Scikit-learn's LabelBinarizer vs. OneHotEncoder

What is the difference between the two? It seems that both create new columns, whose number equals the number of unique categories in the feature. Then they assign 0 and 1 to data points depending on what category they are in.
65 votes, 5 answers

Base64 Encoding Image

I am building an open search add-on for Firefox/IE, and the image needs to be Base64 encoded. How can I Base64 encode the favicon I have? I am only familiar with PHP.
UnkwnTech • 88,102 • 65 • 184 • 229
65 votes, 6 answers

C# Encoding a text string with line breaks

I have a string I am writing to the outputstream of the response. After I save this document and open it in Notepad++ or WordPad I get nicely formatted line breaks where they are intended, but when I open this document with the regular old Windows…
jim • 26,598 • 13 • 51 • 66
64 votes, 3 answers

Encoding parameters for a URL

I have a Silverlight application that is building a URL. This URL is a call to a REST-based service. This service expects a single parameter that represents a location. The location is in the form of "city, state". To build this URL, I'm calling the…
user70192 • 13,786 • 51 • 160 • 240
64 votes, 4 answers

How to GetBytes() in C# with UTF8 encoding with BOM?

I'm having a problem with UTF8 encoding in my ASP.NET MVC 2 application in C#. I'm trying to let the user download a simple text file from a string. I am trying to get a byte array with the following line: var x = Encoding.UTF8.GetBytes(csvString); but when…
Nebojsa Veron • 1,545 • 3 • 19 • 36
64 votes, 9 answers

How to spamproof a mailto link?

I want visitors to be able to click on (or copy) an email address directly on my webpage. However, if I could make it (a little bit) harder for bots and other crawlers to get said email address and register it in a spam list, it would be awesome. I…
Wookai • 20,883 • 16 • 73 • 86
64 votes, 3 answers

What is "=C2=A0" in MIME encoded, quoted-printable text?

This is an example raw email I am trying to parse: MIME-version: 1.0 Content-type: text/html; charset=UTF-8 Content-transfer-encoding: quoted-printable X-Mailer: Verizon Webmail X-Originating-IP: [x.x.x.x] =C2=A0test testing testing 123 What is…
TheSoftwareJedi • 34,421 • 21 • 109 • 151
64 votes, 7 answers

Mysql2::Error: Incorrect string value

I have a Rails application running in production mode, but all of a sudden this error came up today when a user tried to save a record. Mysql2::Error: Incorrect string value More details (from production log): Parameters: {"utf8"=>"â<9c><93>" ...…
Trt Trt • 5,330 • 13 • 53 • 86
64 votes, 3 answers

How to convert a string to UTF8 in Ruby

I'm writing a crawler which uses Hpricot. It downloads a list of strings from some webpage, then I try to write it to the file. Something is wrong with the encoding: "\xC3" from ASCII-8BIT to UTF-8 I have items which are rendered on a webpage and…
ciembor • 7,189 • 13 • 59 • 100