Questions tagged [utf-8]

UTF-8 is a character encoding that describes each Unicode code point using a byte sequence of one to four bytes. It is backwards-compatible with ASCII while still supporting representation of all Unicode code points.

UTF-8 is a character encoding that can describe the set of Unicode code points in byte sequences of one to four bytes.

UTF-8 is the most widely used character encoding, and is recommended for use on the Internet. It is the standard character encoding on Linux and other recent Unix-like operating systems. It was designed to be backwards-compatible with ASCII while still supporting representation of all Unicode code points.

The algorithm for encoding code points in UTF-8 is described in RFC 3629.
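
As a rough illustration of the bit layout that RFC 3629 specifies, the following Python sketch encodes a single code point by hand. The function name `encode_code_point` is hypothetical, chosen only for this example; in everyday code Python's built-in `str.encode("utf-8")` does the same job.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative sketch only)."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points are not valid in UTF-8")
    if cp < 0x80:
        # 1 byte: 0xxxxxxx (identical to ASCII)
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

For example, `encode_code_point(0x20AC)` (the euro sign) yields the three bytes `E2 82 AC`, and any code point below `0x80` comes back as the same single byte it has in ASCII, which is where the backwards compatibility comes from.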

22178 questions
9
votes
6 answers

An error occurred while installing rake (10.1.0), and Bundler cannot continue

Today I've reinstalled my Mac and I had to reinstall rails etc too. Now I've set up everything correctly ( at least I hoped ), but I keep running into a very annoying error. $ bundle install Fetching gem metadata from…
Stefan
  • 434
  • 1
  • 5
  • 10
9
votes
3 answers

How to set UTF-8 in PDO class constructor for PHP PgSQL database

I want to set UTF8 for my PDO object. This class works correctly with MySQL. I can't find an analog of array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES UTF8") for PgSQL and I can't work with cyrillic symbols. class oop{ private $host="localhost"; …
vili
  • 317
  • 3
  • 6
  • 14
9
votes
2 answers

What string should be used to specify encoding in Perl POD, "utf8", "UTF-8" or "utf-8"?

It is possible to write Perl documentation in UTF-8. To do so, you write in your POD: =encoding NNN But what should you write instead of NNN? Different sources give different answers. perlpod says that it should be =encoding utf8 this…
bessarabov
  • 11,151
  • 10
  • 34
  • 59
9
votes
8 answers

Detect UTF-16 file content

Is it possible to know if a file has Unicode (16 bits per char) or 8-bit ASCII content?
Franck Freiburger
  • 26,310
  • 20
  • 70
  • 95
9
votes
4 answers

StrRev() Doesn't Support UTF-8

I'm trying to write code that rewrites Arabic text so that it displays in programs without Arabic support. To do that I need to reverse the text after the replacement, but it shows garbage instead of the wanted result. Here is the code…
Ali Almoullim
  • 1,028
  • 9
  • 30
9
votes
3 answers

Php email body decoding to plain

I'm trying to extract some content from some identical emails with PHP, but I can't. With this: $body = imap_body($imap_o, $email_n); I get: Pour = le r=E9cup=E9rer, il suffit de le t=E9l=E9charger, de le r=E9ceptionner puis de l=92ouvrir.=Une f= ois votre…
ganlub
  • 149
  • 1
  • 2
  • 12
9
votes
1 answer

Python codecs line ending

It seems Python's UTF-8 encoding (codecs package) interprets Unicode characters 28, 29, and 30 as line endings. Why? And how can I prevent it from doing so? Example code: with open('unicodetest.txt', 'w') as f: …
Paul
  • 766
  • 9
  • 28
9
votes
2 answers

UTF-8 encoding for subject in contact form email

On this site's contact form (Website Link) I need to send the email subject in UTF-8. Where in the code do we need to declare the UTF-8 encoding? kontakt.php:
Daniel Hernandez
  • 577
  • 2
  • 8
  • 18
9
votes
1 answer

UTF-8 character is not proper in JOptionPane

Please find the sample code below. The UTF-8 character displays properly on a Windows machine, but not on a Linux machine (Ubuntu). import javax.swing.JOptionPane; public class JContPaneTest { public static void main(String[] args)…
sprabhakaran
  • 1,615
  • 5
  • 20
  • 36
9
votes
5 answers

encoding to UTF-8 in email

I have a client that is receiving email incorrectly encoded. I am using the System.Net.Mail class and setting the body encoding to UTF-8. I have done a bit of reading and since I have to set the body of the email as a string encoding the data to a…
Rob Allen
  • 2,871
  • 2
  • 30
  • 48
9
votes
3 answers

UTF-8 encoding with form post and Spring Controller

I am trying to submit a form, which has UTF8 characters inside it. The form looks like this:
Bulbasaur
  • 696
  • 1
  • 14
  • 22
9
votes
2 answers

Reading/writing/printing UTF-8 in C++11

I have been exploring C++11's new Unicode functionality, and while other C++11 encoding questions have been very helpful, I have a question about the following code snippet from cppreference. The code writes and then immediately reads a text file…
Ephemera
  • 8,672
  • 8
  • 44
  • 84
9
votes
2 answers

How to emulate word boundary when using unicode character properties?

From my previous questions Why under locale-pragma word characters do not match? and How to change nested quotes I learnt that when dealing with UTF-8 data you can't trust \w as word-char and you must use the Unicode character property \p{Word}. Now…
w.k
  • 8,218
  • 4
  • 32
  • 55
9
votes
1 answer

UTF-16 to UTF-8 conversion in JavaScript

I have Base64-encoded data that is in UTF-16. I am trying to decode the data, but most libraries only support UTF-8. I believe I have to drop the null bytes but I am unsure how. Currently I am using David Chambers' polyfill for Base64, but I have also…
Don P
  • 570
  • 1
  • 5
  • 12
9
votes
1 answer

Incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string) on Heroku

I have a Rails application where I use regex-based rules to categorize transactions. In my seeds.rb, I create some categories and rules, then import transactions from a CSV file (also utf8-encoded) and allow them to be categorized. This process…
cayblood
  • 1,838
  • 1
  • 24
  • 44