Questions tagged [utf-8]

UTF-8 is a variable-width character encoding of the Unicode character set in which each character is encoded as one to four bytes. Unlike some other encodings such as UTF-16, UTF-8 is backward compatible with 7-bit ASCII, and can be processed to some degree by applications that are only aware of bytes.
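
As a rough illustration of the points above, the following Python sketch prints how many bytes a few characters occupy in UTF-8 and checks that pure ASCII text encodes to the same bytes either way. The sample characters and strings are arbitrary choices made for this example, not taken from any particular question.

```python
# Sample characters chosen for illustration: ASCII, accented Latin, CJK, emoji.
samples = ["A", "é", "日", "😀"]

# Number of bytes each character occupies when encoded as UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in samples]

for ch, n in zip(samples, lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s): {ch.encode('utf-8').hex()}")

# Text that is pure 7-bit ASCII produces identical bytes in ASCII and UTF-8.
ascii_text = "plain ASCII"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# A byte-oriented search still finds an ASCII substring inside UTF-8 data,
# but the offset it reports counts bytes, not characters.
text = "naïve café"
byte_offset = text.encode("utf-8").find(b"caf")
char_offset = text.find("caf")
print(same_bytes, byte_offset, char_offset)
```

The difference between `byte_offset` and `char_offset` is the kind of detail that byte-only applications get wrong once non-ASCII text appears.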

Full UTF-8 support for searching, collation, word parsing, and similar operations requires support for Unicode concepts such as characters, normalisation, and supplementary characters. Many application and OS problems with "special characters", such as accented European letters or the ideographs used in Japanese and Chinese, derive from mismatched character encodings.
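
The Python sketch below is one way to see these concepts in practice, using only the standard library. It compares two representations of "é" before and after normalisation, shows that a supplementary character needs two UTF-16 code units, and reproduces a typical mismatch by decoding UTF-8 bytes as Latin-1. The variable names are made up for this example, and Latin-1 is assumed as the "wrong" encoding purely for illustration.

```python
import unicodedata

# The same visible letter, as one code point and as base letter + combining accent.
composed = "\u00e9"
decomposed = "e\u0301"
equal_raw = composed == decomposed
equal_nfc = unicodedata.normalize("NFC", decomposed) == composed

# A supplementary character (outside the Basic Multilingual Plane) is one
# code point but takes two 16-bit code units in UTF-16.
emoji = "\U0001F600"
utf16_units = len(emoji.encode("utf-16-le")) // 2

# Mismatched encodings: UTF-8 bytes read as if they were Latin-1.
original = "tämä"
mojibake = original.encode("utf-8").decode("latin-1")

# The damage is reversible as long as no bytes were lost along the way.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(equal_raw, equal_nfc, utf16_units, mojibake, repaired)
```

Comparing or searching text without normalising it first is a common reason why two strings that look identical fail to match.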

104 questions
2
votes
1 answer

motd with utf-8 encoding in Ubuntu Server 10.10

I've just changed the default motd in Ubuntu (I edited the /etc/update-motd.d/* files) and added a string in Polish: echo "Aby uzyskać dodatkowe informacje i przykładowe skrypty wpisz:" but the autogenerated /etc/motd is missing the accented characters: Aby…
klew
  • 723
  • 2
  • 11
  • 16
2
votes
3 answers

Can I convert my database/script to UTF-8?

How can I convert a database to support UTF-8 and convert its old data from whatever encoding it is in to UTF-8? Extra info: I'm running a server which has many websites on it, and one of them is running WHMCS (PHP script to manage hosting…
1
vote
2 answers

How can Apache serve files with UTF-8 characters?

From the browser I'm trying to access the file https://example.com/86/86454cff-556a-4162-aa65-433158c133f4/Informacja+kwartalna++III+kwartał+2016+r. and I'm getting a 404 error. When I check the filesystem, the file exists with encoded chars:…
Kangur
  • 241
  • 3
  • 6
1
vote
0 answers

Exim mainlog character encoding

In my exim4 installation I keep getting both UTF-8 encoded and extended ANSI encoded mainlog entries containing strings like "tämä" (correctly in UTF-8) and "t\xe4m\xe4" (ANSI). The latter escaped markings are one-byte codes, escaping done for clarity…
karvonen
  • 111
  • 1
1
vote
2 answers

Handle UTF-8 file names in CentOS 7 & Apache 2.4

My images whose names contain accented letters (é, à etc.) are not accessible via Apache (404 error). I think it's not related to Apache. I checked the configuration file: AddDefaultCharset UTF-8 When I connect via PuTTY, and run the command "ls…
AFA Med
  • 597
  • 2
  • 6
  • 15
1
vote
0 answers

LDAP at IBM XIV

I have a problem with authenticating users over LDAP (Active Directory) on Windows 2008 R2 with IBM XIV. After troubleshooting I've found that the problems are caused by Polish letters in CommonName (distinguishedName contains CN). Users without Polish letters in…
Curl User
  • 43
  • 1
  • 8
1
vote
0 answers

How to pass raw Ansible params to Windows cmd without any encoding/decoding?

I'm trying to set permissions on a Windows directory like this # ansible example.com -m raw -a 'icacls D:\somedir\ /grant "! ЗАО. Руководство":F' -vvvvv and get a Windows error about invalid parameters No config file found; using defaults Loaded…
Vladimir Martsul
  • 132
  • 1
  • 2
  • 8
1
vote
0 answers

How to create a new MySQL charset?

I am trying to configure roundcube webmail and I get the error "Can't initialize character set utf8 (path: /usr/share/mysql/charsets/))" I fixed this before on my website by adding ;charset=UTF-8 to the end of the "mysql:host=" line for my db…
Dan Hastings
  • 706
  • 1
  • 13
  • 24
1
vote
0 answers

Cygwin SSHD and UTF-8 BOM

The Cygwin ssh server configuration file "config_sshd" seems to need to be written without a byte order mark. The server will accept UTF-8 but obviously becomes confused if it sees what it thinks is junk data at the top of the file. Is there any way…
user215539
1
vote
1 answer

Error in a collation for German characters

I would like to use the collation utf8_german2_ci on my Linux server. The problem is that I want to use the character ß, which I think is not supported in other collations. So I get this error ERROR 1273 (HY000): Unknown collation: utf8_german2_ci,…
1
vote
0 answers

IIS 7.5 won't serve Korean characters in my classic ASP application

I have a classic ASP website hosted in IIS 7.5, using a classic application pool pipeline. I have all my .NET Globalization settings set to UTF-8 encoding. However, when my site outputs Korean characters that it is getting directly from a database,…
Kemmis
  • 131
  • 1
  • 2
1
vote
1 answer

Is writing/reading speed affected by the name of the files?

How can I find out, or does anybody know, whether it is faster to store/read a file with Spanish/Greek/Cyrillic/etc. characters in its name, like mi-foto-españa-oíóáaç.jpg, than one named mi-foto-espana-oioaac.jpg?
w0rldart
  • 217
  • 1
  • 2
  • 14
1
vote
1 answer

How to access filenames with accents (show up as ?) on Apache server?

I manage a website that is based on directory listings (dessin.acswift.com). The website is in French, and many of the URLs contain accents: /leçon 1 - identification & vocabulaire.html I would like to be able to work on the site using SSH and…
Andy Swift
  • 87
  • 2
  • 13
1
vote
1 answer

Meteor server Unicode channel problem

When trying to subscribe to a channel named "public" with Meteor, a specialised web server for real-time push using COMET, I get the desired response: When the http request is: GET…
Tom
  • 13
  • 3
1
vote
2 answers

Recommendation for locale on server: ISO (latin) or UTF-8

What is the recommended locale setting on a server that is used as a web and database server? Are there any drawbacks to not using UTF-8 as the default?
DrDol
  • 303
  • 1
  • 7