Questions tagged [codepoint]

A code point is a numeric value in the Unicode codespace, the range of integers from 0 to 0x10FFFF.

A code point may represent a character or have another meaning. The standard defines seven fundamental classes of code points: Graphic, Format, Control, Private-Use, Surrogate, Noncharacter, and Reserved.
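As a rough illustration of the seven classes, the sketch below maps a code point to a class using the General_Category value reported by Python's standard `unicodedata` module. The function name `code_point_class` is hypothetical, and the result for unassigned code points depends on the Unicode version bundled with the Python interpreter, so treat it as an approximation rather than a reference implementation.

```python
import unicodedata


def code_point_class(cp: int) -> str:
    """Approximate the basic code point class for an integer code point.

    Hypothetical helper: it derives the class from General_Category,
    following the class-to-category mapping in the Unicode Standard.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("value is outside the Unicode codespace")

    # Noncharacters: U+FDD0..U+FDEF plus the last two code points of every plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "Noncharacter"

    category = unicodedata.category(chr(cp))
    if category == "Cc":
        return "Control"
    if category in ("Cf", "Zl", "Zp"):
        return "Format"
    if category == "Cs":
        return "Surrogate"
    if category == "Co":
        return "Private-Use"
    if category == "Cn":
        # Unassigned in the Unicode version this interpreter ships with.
        return "Reserved"
    # Letters, marks, numbers, punctuation, symbols, and space separators.
    return "Graphic"


for cp in (0x41, 0x0A, 0x200D, 0xE000, 0xD800, 0xFFFE):
    print(f"U+{cp:04X}: {code_point_class(cp)}")
```

The noncharacter check comes first because noncharacters also report the category `Cn`, which would otherwise make them indistinguishable from reserved code points.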

116 questions
3
votes
1 answer

Identify if a Unicode code point represents a character from a certain script such as the Latin script?

Unicode categorizes characters as belonging to a script, such as the Latin script. How do I test whether a particular character (code point) is in a particular script?
Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
3
votes
2 answers

Can I check if a Unicode codepoint will be displayable under Android?

My current project uses some symbols which are absent in many fonts. So far I've found some iOS renders as a box so I'm expecting sooner or later to find one that Android can't render too, though so far it's done better than iOS in this regard. Is…
hippietrail
  • 15,848
  • 18
  • 99
  • 158
3
votes
2 answers

UTF-16 reserved codepoints

Why does UTF-16 have a reserved range in the UCS database? UTF-16 is just a way to represent a character's scalar value using one or two unsigned 16-bit units; the layout of these values shouldn't be related to the character scalar value because we should apply some…
Muhammad
  • 1,598
  • 3
  • 19
  • 31
3
votes
2 answers

Creating a UTF-8 string from hexadecimal code

In C++, it's possible to create a UTF-8 string using this kind of notation: "\uD840\uDC50". However, this doesn't work in PHP. Is there a similar notation? If not, is there any built-in way to create a UTF-8 string knowing its Unicode code point?
laurent
  • 88,262
  • 77
  • 290
  • 428
3
votes
2 answers

How does one allow a subset of UNICODE codepoints in input validation?

I am creating a service that could "go international" to non-English speaking markets. I do not want to restrict a username to the ASCII range of characters but would like to allow a user to specify their "natural" username. OK, use UNICODE (and…
z8000
  • 3,715
  • 3
  • 29
  • 37
2
votes
1 answer

Different results using codepoint() with input arguments with \dot

I am trying to see whether the \dot operator can be detected from a symbol in Julia, here is what I have tried: The following two blocks return different results julia> [codepoint(i) for i in string(:ẋ)] 1-element Vector{UInt32}: …
pppplight
  • 37
  • 4
2
votes
1 answer

What does the "?" operator do in Elixir?

The Ecto source code makes use of expressions ?0, ?1, etc. You can see how they evaluate: iex(14)> ?0 48 iex(15)> ?1 49 iex(16)> ?2 50 What does that mean though? This is very hard to search for. What does the ? actually do?
Freedom_Ben
  • 11,247
  • 10
  • 69
  • 89
2
votes
3 answers

Generate a String object from a List of code point integers?

I have a List< Integer > whose integer values are Unicode code point numbers. How do I construct a String object of characters determined by those code points? For example: List < Integer > codePoints = List.of( 100, 111, 103, 128054 ) ; ……
Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
2
votes
1 answer

In PHP PCRE syntax, how does one specify a multi-codepoint Unicode character/"emoji"?

Code: var_dump(preg_replace('#\x{1F634}#u', '', 'This is the sleeping emoji: ')); var_dump(preg_replace('#\x{1F1FB 1F1F3}#u', '', 'This is the Vietnam flag: ')); Expected output: string(28) "This is the sleeping emoji: " string(33) "This is the…
user379490
  • 159
  • 5
2
votes
1 answer

Gforth - How to get codepoints of a string?

I know that gforth stores characters as their codepoints in the stack, but the material I'm learning from doesn't show any word that helps to convert each character to codepoint. I also want to sum the codepoints of the string. What should I use to…
Razetime
  • 216
  • 3
  • 19
2
votes
0 answers

Why Unicode codepoints are needed to use AHK hotstrings in WSL

AutoHotKey cannot insert numbers on the WSL unless I use codepoints I would like to use python3 every time I use pipenv. For that, I need to insert: pip --python /usr/bin/python3 etc.. However, I don't want to type --python /usr/bin/python3 every…
2
votes
0 answers

Does PHP offer a way to determine if a Unicode codepoint belongs to a particular language, not just a particular script?

The Latin script supports many languages, and I would like to make sure that input characters are within a language (e.g. English or German), not just within the Latin script. Unicode is divided into blocks and blocks are not necessarily language…
2
votes
1 answer

How can I make conversion between bytes and Unicode in Dart?

I tried to implement Irn's answer in "How to work with char types in Dart? (Print alphabet)" but didn't understand exactly how to do it. Example: in my Dart code, İ (capital I with dot above) is represented as byte[304] and I have to replace this with server byte[152]…
Nick
  • 4,163
  • 13
  • 38
  • 63
2
votes
1 answer

Python fonttools: Check if font supports multi codepoint emoji

I'm trying to check if a font has a glyph for a multi codepoint emoji like "‍♂️", "‍" or "" in Python 3.x. For single codepoint emoji like "" or "" I'm able to validate their support via the following code using Python fonttols: from fontTools.ttLib…
COM8
  • 271
  • 1
  • 10
2
votes
2 answers

In Java, how are Unicode chars and Java UTF-16 codepoints handled?

I'm struggling with Unicode characters in Java 10. I'm using the java.text.BreakIterator class. For this output: myString="a𝓞b" hex=0061d835dcde0062 myString.length()=4 myString.codePointCount(0,s.length())=3 BreakIterator output: a …
Bcwilmot
  • 51
  • 4