Questions tagged [character-set]

A character set maps a set of characters to specific numeric values, e.g. ASCII, UTF-8 and ISO-8859-1.

Modern computer languages, editors and tools facilitate encoding and decoding of data between internal representations of data and specific character sets. Examples include ASCII, UTF-8 and ISO-8859-1.
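As a minimal sketch of that encode/decode step, the Python snippet below converts one internal (Unicode) string into two of the character sets named above and back again. The sample string and variable names are illustrative assumptions, not taken from any question on this page.

```python
# Python's str is the internal representation; bytes are the encoded form.
text = "café"

# Encoding: the same character can map to different byte sequences.
utf8_bytes = text.encode("utf-8")          # 'é' becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("iso-8859-1")   # 'é' becomes one byte: 0xE9

# Decoding with the matching character set restores the original text.
from_utf8 = utf8_bytes.decode("utf-8")
from_latin1 = latin1_bytes.decode("iso-8859-1")

# Decoding with the wrong character set does not fail here, but it
# silently produces garbled text (often called mojibake).
garbled = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes, latin1_bytes, from_utf8, from_latin1, garbled)
```

The last line is the usual symptom behind many questions under this tag: the bytes are intact, but they are interpreted with a character set other than the one used to write them.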

Choose an appropriate character set for transmitting and persisting data, particularly text that can contain accented or other non-ASCII characters (as in European languages such as French or German) or that is written in a completely different script (such as Japanese); see internationalisation (also referred to as i18n).
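One rough way to check whether a character set suits a given text is to try encoding the text with it. The sketch below assumes Python; `can_encode` and the sample strings are hypothetical names and data introduced only for illustration.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text is representable in charset.

    Hypothetical helper for illustration; it relies only on str.encode.
    """
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative samples: two European languages and one non-Latin script.
samples = {
    "French": "déjà vu",
    "German": "Straße",
    "Japanese": "日本語",
}

for charset in ("ascii", "iso-8859-1", "utf-8"):
    supported = [lang for lang, s in samples.items() if can_encode(s, charset)]
    print(charset, "can represent:", supported)
```

In this sketch only UTF-8 covers all three samples, which is one reason it is a common default when the language of the stored or transmitted text is not known in advance.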

120 questions
1
vote
0 answers

Convert Character Set from WIN1252 to UTF8 - Firebird 3

I'm facing problems trying to convert a Firebird 3 database with character set WIN1252 to UTF8. I've performed the following procedures: Extract the DDL from the database and the definitions, so I created the new database with UTF8 Character Set,…
Marcoscdoni
  • 955
  • 2
  • 11
  • 31
1
vote
2 answers

MySQL REGEXP word boundary detection with german umlauts when using BINARY Operator

I made a strange discovery. If I execute the following SQL-Command: SELECT 'Konzessionäre' REGEXP '[[:<:]]Konzession[[:>:]]' it gives me the result - as expected - 0 But if I do the same together with the BINARY operator: SELECT BINARY…
QWERTZ
  • 11
  • 2
1
vote
1 answer

Weird problems when inserting Chinese characters into MySQL via Windows cmd

My Environment System: Windows 7 64-bit Database: MySQL 5.7.20 64-bit Locale: Chinese cmd code page: CP936 MySQL setting system variables: table info: My Problem Under the environment…
1
vote
0 answers

MySQL InnoDB tables don't accept emoji with utf8mb4 encoding

I have been trying to get my DB to accept emoji by following the steps from this tutorial, as it appears to be the one that is linked to the most. https://mathiasbynens.be/notes/mysql-utf8mb4 Even with a test DB I can't get this to work and so far I…
Paul
  • 328
  • 1
  • 5
  • 17
1
vote
1 answer

What is the Basic Latin character set used by ISO Standards Catalogue 01.140.10?

I have a reference to "the Basic Latin character set used by ISO Standards Catalogue 01.140.10" and I need to know the exact set of code points. Without going through all the standards found in Standards Catalogue 01.140.10 can I find this Basic…
Michael
  • 95
  • 6
1
vote
0 answers

Why does mysql_client_encoding() return "latin1" when MySQL is configured with "UTF8"?

When I run SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%' in MySQL, I get the following variables: Variable_name Value character_set_client utf8 character_set_connection …
ProgrammerGirl
  • 3,157
  • 7
  • 45
  • 82
1
vote
2 answers

Character set woes

I have a small ajax application built with php. Using phpMyAdmin I have set a mysql database to utf-8, and have imported a textfile containing utf-8 data into it. This worked fine on a windows machine with easyphp, after adding…
Joshxtothe4
  • 4,061
  • 10
  • 53
  • 83
1
vote
1 answer

Character set on row inserted into MySQL via SQLAlchemy changes depending on machine?

I use the following Python code to insert a row into a MySQL table: city = City() city.country_id = connection.globe.session.query(Country).\ filter(Country.code == row[1]).one().id city.name =…
1
vote
1 answer

Does schema validation always check an XML file for conformance to the XML 1.0 character set?

Does schema validation always check an XML file for conformance to the XML 1.0 character set as per: https://www.w3.org/TR/REC-xml/#charsets ...or does it depend on the XML library you are using?
Michael
  • 285
  • 5
  • 14
1
vote
1 answer

How to replace all spaces which are not actually spaces with regex in PHP

I have the following string which I want to 'clean' from multiple whitespaces: $string = "This is a test string"; //Using utf8_decode Not a big deal right? However, the string is not 'cleaned' after using: $string = preg_replace('/\s+/', ' ',…
TVA van Hesteren
  • 1,031
  • 3
  • 20
  • 47
1
vote
1 answer

How to display IBM mainframe charset in IntelliJ (EBCDIC?)

I'm using IntelliJ and when I open data files in the editor, I can use the "character set" selector at the bottom right of the window to reload the file and display it in the appropriate charset For example, I can switch between UTF-8, ISO-8859-1…
vikingsteve
  • 38,481
  • 23
  • 112
  • 156
0
votes
0 answers

Regular expressions, Greek characters and the *-quantifier doesn't work (but the +-quantifier does)?

I use this regular expression [\p{Greek}] to match any Greek character. It works as expected and matches the first Greek character on the line. However, I want to match all Greek characters that follow that first character but the *-quantifier…
d-b
  • 695
  • 3
  • 14
  • 43
0
votes
2 answers

Replace in a string all characters outside the set Windows-1252

Having to maintain old programs written in VB6, I find myself having this issue. I need to find an efficient way to search a string for all characters OUTSIDE the Windows-1252 set and replace them with "_". I can do this in C# So far I have done…
matti157
  • 1,288
  • 2
  • 13
  • 26
0
votes
0 answers

Find character set encoding of a column in a spark dataframe

I have a dataframe built using pyspark. It has 3 columns "col1", "col2", "col3". I want to find character set encoding for "col1". How can I achieve this?
0
votes
0 answers

Java SpringFramework HTTPRequest unicode character problem

I'm trying to get the content of an online page through SpringFramework using this procedure public HttpReply httpRequest(final String uri, final HttpMethod method, final Class expectedReturnType, final…
Malignus
  • 115
  • 1
  • 13