Questions tagged [codepoint]

A code point is any of the numeric values that make up the Unicode codespace.

A code point may represent a character or have another meaning; the standard defines seven fundamental classes of code points: Graphic, Format, Control, Private-Use, Surrogate, Noncharacter, and Reserved.
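As a rough illustration of these classes, the sketch below uses Python's standard `unicodedata` module to print the general category of a few sample code points. The helper name `describe` and the sample values are chosen only for this example. The general category is just an approximation of the seven classes: for instance, `Cn` covers both Noncharacter and Reserved code points.

```python
import unicodedata


def describe(cp: int) -> str:
    """Return 'U+XXXX <general category>' for an integer code point."""
    return f"U+{cp:04X} {unicodedata.category(chr(cp))}"


# ord() maps a one-character string to its code point; chr() is the inverse.
assert ord("A") == 0x41
assert chr(0x2591) == "\u2591"

# Sample code points (hypothetical picks for this sketch):
# letter, zero-width joiner, line feed, private use, surrogate, noncharacter.
for cp in (0x41, 0x200D, 0x0A, 0xE000, 0xD800, 0xFFFF):
    print(describe(cp))
```

Here `Lu` is a Graphic code point, `Cf` is Format, `Cc` is Control, `Co` is Private-Use, `Cs` is Surrogate, and `Cn` is unassigned (Noncharacter or Reserved).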

116 questions
0
votes
1 answer

Searching for Unicode code points by name

I need a way to search for Unicode code points by the name of the code point in Swift/Objective-C on iOS. So if a user types "shade" it would find code points containing the word shade, like U+2591 through U+2593. What would be the most efficient…
Addison
0
votes
2 answers

Sorting sets of non-latin characters in the order of a string?

I am using the following code for sorting: letters = '세븐일레븐' old = [('세븐', 8), ('븐', 2), ('일', 5), ('레', 4)] new = sorted(old, key=lambda x: letters.index(x[0])) For non-latin characters, the output is the same as the input: [('세븐', 8), ('븐', 2),…
thomascrha
0
votes
0 answers

Emoticons in python string - \xF0\x9F\x92\x96 \xF0

_mysql_exceptions.Warning: Incorrect string value: '\xF0\x9F\x92\x96 \xF0...' for column 'title' at row 1 s = "This is my string. Über! 0\x9F\x92\x96 \xF0" How can I remove only this value -> 0\x9F\x92\x96 \xF0 from this string?(or encode…
pawss
0
votes
2 answers

How to convert U+XXX to the actual unicode character (in the native script)

I have a list of code points (U+XXXX) that I need to convert into real characters. My code points are for UTF-8. I've scoured the previous mentions of unicode and don't see how to do that. I can strip U+XXXX to get the number (XXXX), but then what?…
wdchild
0
votes
2 answers

How to Convert string to utf-8 codepoint in php

I want to convert a string like: alnassre will be 0061006c006e00610073007300720065 عربي will be 063906310628064a a will be 0061 using PHP as what is going in the link http://www.bareedsms.com/tools/UniCodeConverter.aspx
Mansour Alnasser
0
votes
1 answer

Mysql convert unicode code point to utf-8 character

I was using CHAR(code_point USING ucs2) to convert a Unicode code point to a UTF-8 character, but it's giving me unexpected results above code point 0x00ff. It gives me the character Ā (code point 0x0100) for code points 0x0100 to 0x01FF, and…
Adee
-1
votes
1 answer

How to convert codepoint of one charset to another in Java?

I am trying to convert codepoints from one charset to another in Java. For example character ř is 248 in windows-1250, 345 in unicode. So I have source charset and source codepoint and target charset and want to calculate target codepoint. This may…
-1
votes
1 answer

How can I add a '#' symbol/emoji in a string such that it doesn't split the string when the `split('#')` method is called on it?

Is there a way in JavaScript to display a symbol similar to '#' in a string such that "Enter #time to check time".split('#') doesn't break it into pieces? It should return a complete string instead of ['Enter ','time to check time']. I tried using…
Deepak Terse
-1
votes
2 answers

How to get 5 characters of any encoding Java-string?

Problem How can I get only 5 characters of the string if sometimes encoding looks like "UTF-8", "UTF-16" and "ASCII"? Note: some of the test inputs have emoji. Code public String truncate(String input) { if (input.codePointCount(0,…
-3
votes
1 answer

In unicode standard, why does U+12ca = 0x12ca? Where does the 0 come from and how does 0x12ca = 4810 decimal

I'm learning about Unicode basics and I came across this passage: "The Unicode standard describes how characters are represented by code points. A code point is an integer value, usually denoted in base 16. In the standard, a code point is written…
ABC
-3
votes
1 answer

C++ Unicode: Bytes, Code Points and Graphemes

So, I'm building a scripting language and one of my goals is convenient string operations. I tried some ideas in C++. String as sequences of bytes and free functions that return vectors containing the code-points indices. A wrapper class that…
João Pires