Questions tagged [utf-8]

UTF-8 is a character encoding that describes each Unicode code point using a byte sequence of one to four bytes. It is backwards-compatible with ASCII while still supporting representation of all Unicode code points.

UTF-8 is a variable-width character encoding that can represent every Unicode code point as a sequence of one to four bytes.

UTF-8 is the most widely used character encoding, and is recommended for use on the Internet. It is the standard character encoding on Linux and other recent Unix-like operating systems. It was designed to be backwards-compatible with ASCII while still supporting representation of all Unicode code points.
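
Both properties can be checked with Python's built-in `str.encode`. This is only a small sketch: the sample characters and the variable names are chosen for illustration and do not come from the tag description.

```python
# Sample characters chosen for illustration (written as escapes so the
# source file itself stays pure ASCII): A, e-acute, euro sign, an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes in each character's UTF-8 encoding.
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

# Pure-ASCII text yields identical bytes under ASCII and UTF-8.
ascii_text = "plain ASCII text"
ascii_compatible = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

for ch, n in zip(samples, byte_lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s): {ch.encode('utf-8').hex()}")
print("ASCII-compatible:", ascii_compatible)
```

The four samples fall in the one-, two-, three- and four-byte ranges respectively.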

The algorithm for encoding code points in UTF-8 is described in RFC 3629.
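
As a rough illustration of that algorithm, the sketch below encodes a single code point by hand. The function name `utf8_encode` is hypothetical, and the code is a minimal reading of the bit layout in RFC 3629, not a replacement for a language's built-in encoder.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside U+0000..U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        # RFC 3629 prohibits encoding the UTF-16 surrogate range.
        raise ValueError("surrogate code points cannot be encoded")

    if code_point < 0x80:
        # 1 byte: 0xxxxxxx (identical to ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Compare against Python's built-in encoder for a few sample code points.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
```

The leading byte's high bits announce the sequence length, and every continuation byte starts with `10`, which is what lets a decoder resynchronize in the middle of a stream.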

22178 questions
8
votes
2 answers

Hive Utf-8 Encoding number of characters supported?

Hi, the problem is as follows: the data I want to insert into a Hive table contains Latin words and is in UTF-8 encoded format, but Hive still does not display it properly. Actual Data:- Data Inserted in hive I changed the encoding of the table to…
Chetan Pulate
  • 503
  • 1
  • 7
  • 21
8
votes
2 answers

Jekyll says Liquid Exception: invalid byte sequence in US-ASCII in documentation.html

I'm trying to run jekyll build on GitLab CI. This is my .gitlab-ci.yml: pages: script: - export LC_ALL=en_US.UTF-8 - export LANG=en_US.UTF-8 - gem install jekyll - jekyll build --destination public artifacts: paths: -…
Fez Vrasta
  • 14,110
  • 21
  • 98
  • 160
8
votes
2 answers

What encoding scheme should be used in a web project?

We are building a (Java) web project with Eclipse. By default Eclipse uses Cp1252 encoding on Windows machines (which we use). As we also have developers in China (in addition to Europe), I started to wonder if that is really the encoding to use. My…
Tuukka Mustonen
  • 4,722
  • 9
  • 49
  • 79
8
votes
1 answer

R: Change character encoding of columns in data frame

I'm investigating how the character encoding affects sorting. My question here is: How I can change a single column of a data frame to a different character encoding? For context, I will include several extra steps at the bottom. 1) Create the data…
Bobby
  • 1,585
  • 3
  • 19
  • 42
8
votes
3 answers

Are 6 octet UTF-8 sequences valid?

Can UTF-8 encode 5 or 6 byte sequences, allowing all Unicode characters to be encoded? I'm getting conflicting standards. I need to be able to support every Unicode character, not just those in the U+0000..U+10FFFF range. (All quotes are from RFC…
Patrick Niedzielski
  • 1,194
  • 1
  • 8
  • 21
8
votes
2 answers

The ultimate emoji encoding scheme

This is my environment: Client -> iOS App, Server ->PHP and MySQL. The data from client to server is done via HTTP POST. The data from server to client is done with json. I would like to add support for emojis or any utf8mb4 character in general.…
8
votes
2 answers

Werkzeug raises BrokenFilesystemWarning

I get the following error when I send form data to my Flask app. It says it will use the UTF-8 encoding, but the locale is already UTF-8. What does this error…
Arti
  • 7,356
  • 12
  • 57
  • 122
8
votes
1 answer

Python reversing an UTF-8 string

I'm currently learning Python and as a Slovenian I often use UTF-8 characters to test my programs. Normally everything works fine, but there is one catch that I can't overcome. Even though I've got the encoding declared at the top of the file it fails…
Denis Črnič
  • 117
  • 2
  • 9
8
votes
5 answers

Illegal UTF-8 sequence connecting with postgreSQL database

I have the following code to connect to the database String host = "jdbc:postgresql://localhost:5432/name"; String username = "user"; String password = "pass"; Connection c = null; try { …
8
votes
4 answers

polish characters utf8 dont show right

Currently my site supports English, Portuguese, Swedish and Polish. But for some reason some Polish characters don't show right, like Zal�z konto; it should look like this: Zalóz konto. I have this: // Send the Content-type header in case the web server…
Remy
  • 89
  • 1
  • 1
  • 2
8
votes
5 answers

Laravel 5.1 utf-8 saving to database

I'm trying to save a record to the database. When I get a value from an input and save it to the database there is no problem, like: $request->input('name') is an input with value of 'سلام' $provider->name = $request->input('name'); $provider->copyright_email =…
A. Najafi
  • 103
  • 1
  • 1
  • 4
8
votes
2 answers

Searching a SQLite database which contains cyrillic data

I have a problem searching my SQLite database, which contains data written with Cyrillic characters. If the keyword is also Cyrillic, then everything is OK, but if not, then I can't get the result in my Android application. Does anyone have an…
user383295
  • 81
  • 2
8
votes
4 answers

Problems displaying French accented characters in UTF-8

I'm working on a French language site built in CakePHP. I have tried multiple functions to try and convert the text into UTF-8 and display properly, but have had no success so far - any accented letters are displaying as a black diamond with a…
igniteflow
  • 8,404
  • 10
  • 38
  • 46
8
votes
2 answers

Encoding servlets with UTF-8 on WildFly

I used to run my JavaEE applications on GlassFish server, and there was no problem with the encoding type (UTF-8) since I added the following property in JVM Settings of the server: file.encoding = UTF-8 Now, I'm using WildFly server instead, and…
Samir
  • 111
  • 1
  • 1
  • 6
8
votes
3 answers

properly logging unicode & utf-8 exceptions in python 2

I'm trying to log various exceptions from libraries in python 2.7. I find that sometimes the exceptions contain a unicode string and sometimes a utf8 bytestring. I thought that logging.exception(e) was the right approach to log them, but the…
Jörn Hees
  • 3,338
  • 22
  • 44