Questions tagged [codepoint]

A code point is any of the numeric values that make up the Unicode codespace (U+0000 through U+10FFFF).

A code point may represent a character or have another meaning. The standard defines seven fundamental classes of code points: Graphic, Format, Control, Private-Use, Surrogate, Noncharacter, and Reserved.
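As a rough illustration of the seven classes, here is a minimal Python sketch that maps a code point to one of them using the General_Category values exposed by the standard `unicodedata` module. The function name `code_point_class` is hypothetical, and the mapping from General_Category to class is an approximation of the table in the Unicode Standard, not an official API. The "Reserved" result depends on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata


def code_point_class(cp: int) -> str:
    """Return the fundamental class of a code point (illustrative sketch)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not in the Unicode codespace: %r" % cp)

    # Noncharacters: U+FDD0..U+FDEF plus the last two code points of
    # every plane (U+xxFFFE and U+xxFFFF).
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "Noncharacter"

    category = unicodedata.category(chr(cp))
    if category == "Cc":
        return "Control"
    if category in ("Cf", "Zl", "Zp"):
        return "Format"
    if category == "Co":
        return "Private-Use"
    if category == "Cs":
        return "Surrogate"
    if category == "Cn":
        # Unassigned in the Unicode version known to this interpreter.
        return "Reserved"
    # Letters, marks, numbers, punctuation, symbols, and space separators.
    return "Graphic"


for cp in (0x41, 0x0A, 0x200B, 0xE000, 0xD800, 0xFFFE):
    print("U+%04X" % cp, code_point_class(cp))
```

The noncharacter check runs first because those code points also report the general category `Cn`, which would otherwise be classified as Reserved.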

116 questions
1
vote
0 answers

How can I get the official Unicode name of a character in JavaScript from either its string or numeric value?

What are my options in JavaScript when I want to find the Unicode name for an arbitrary codepoint? Maybe there are some modules I can use, web APIs I can call, or people have previously rolled their own way to do this? It should work with all planes…
hippietrail
  • 15,848
  • 18
  • 99
  • 158
1
vote
1 answer

How to convert UTF16 surrogate pairs to equivalent HEX codepoint in PHP?

I am making an application where chat messages are sent from an iOS app, and the admin can view the chat from an admin panel built in PHP. From the DB, I get chat messages like this: Hi, Jax\ud83d\ude1b\ud83d\ude44! can we go for a coffee? I…
Saswat
  • 12,320
  • 16
  • 77
  • 156
1
vote
3 answers

If I use Java 8's String.codePoints to get an array of int codePoints, is it true that the length of the array is the count of characters?

Given a String string in Java, does string.codePoints().toArray().length reflect the length of the String in terms of the actual characters that a human would find meaningful? In other words, does it smooth over escape characters and other artifacts…
tacos_tacos_tacos
  • 10,277
  • 11
  • 73
  • 126
1
vote
0 answers

Ruby: some emoji Unicode icons are not displayed

I'm using Ruby 2.2.2 and Emoji, but for some reason some icons are not displayed. For example (from http://apps.timwhitlock.info/emoji/tables/unicode): Unicode: U+26F5 Bytes (UTF-8): \xE2\x9B\xB5 Description: SAILBOAT. Maybe someone knows how I can fix…
Oleh Sobchuk
  • 3,612
  • 2
  • 25
  • 41
1
vote
0 answers

Notepad++ displays code point values instead of characters in html files converted from docs

I have two Word documents, which I converted to HTML using Word's conversion (Save as HTML page). The content is in Hebrew and English in both documents. Afterwards I opened both documents with Notepad++: In the first document everything was…
Anorflame
  • 376
  • 3
  • 12
1
vote
1 answer

Differentiate between symbol, number and letter code points in Unicode?

Unicode has a huge number of code points. How can I check whether a code point is a symbol (like "!" or "☭"), a number (like "4" or "৯"), a letter (like "a" or "え") or a control character (usually not displayed directly)? Is there any logic behind…
API-Beast
  • 723
  • 5
  • 10
1
vote
2 answers

How to establish the codepoint of encoded characters?

Given a stream of bytes (that represent characters) and the encoding of the stream, how would I obtain the code points of the characters? InputStreamReader r = new InputStreamReader(bla, Charset.forName("UTF-8")); int whatIsThis = r.read(); What…
Vitaliy
  • 8,044
  • 7
  • 38
  • 66
1
vote
2 answers

How to save Unicode codepoint as character, not codepoint in Python

Is there a way to save a Unicode string into JSON that allows for Unicode codepoints to be replaced with their actual characters? For instance, having a dict like this ported into JSON...: dict1[u'N\u00e1utico'] = 2 ...instead of having it dumped…
user1549620
  • 123
  • 2
  • 7
1
vote
3 answers

How to encode a long string of hex values in unicode easily in python

I have hex code point values for a long string. For a short one, the following is fine: msg = unichr(0x062A) + unichr(0x0627) + unichr(0x0628) print msg However, since unichr's alternate API unicode() does exist, I thought there must be a way to pass…
fkl
  • 5,412
  • 4
  • 28
  • 68
0
votes
1 answer

How do I split a unicode string on code points in python? (eg. \u00B7 or \u2022)?

I tried everything I could think of... 1. unicode_obj.split('\u2022') 2. re.split(r'\u2022', unicode_object) 3. re.split(r'(?iu)\u2022', unicode_object) Nothing worked. The problem is that I want to split on special characters. Example string:…
aniketd
  • 385
  • 1
  • 3
  • 15
0
votes
1 answer

Using code point to render outlined icon instead of filled from Material Design Google Fonts

I have a Place icon from Material Icons Google Fonts that is rendered using a before pseudo-selector and the code point e55f. The problem is that the filled version is being rendered: Instead of the outlined version I want: Is there a way I can…
Ricardo Castañeda
  • 5,746
  • 6
  • 28
  • 41
0
votes
2 answers

Java - convert 2 code points char to a single code point char

I am processing text that I then have to link to files. The text has ä (Unicode code points 97 + 776) but the FS has the file written as ä (Unicode code point 228). Is there a way to convert 97 + 776 to 228? I believe these should be surrogate pairs…
machekj
  • 65
  • 8
0
votes
2 answers

What is a syntax for generating the Unicode hexadecimal value of a character in Julia as a `String`?

What is a syntax for generating the Unicode hexadecimal value of a character in Julia as a String? To generate the Unicode hexadecimal value of a character as a UInt32, one can execute codepoint(''). Example julia>…
0
votes
2 answers

How to find UTF-8 codes via LIKE '%\xC2\xA0%'?

I have a column that contains NO-BREAK SPACE (\xC2\xA0) instead of SPACE and I need to find those rows. Copy-pasting works: SELECT PRODUCT_NAME FROM TABLE t WHERE PRODUCT_NAME LIKE '% %' but using the code points does not: SELECT PRODUCT_NAME FROM…
Vega
  • 2,661
  • 5
  • 24
  • 49
0
votes
2 answers

Characters and digits of Chapter four of the Unicode Standard

In a language specification, there is name-start-character= '_' | '\' | ? any code points which are characters as defined by the Unicode character properties, chapter four of the Unicode Standard ?; Could anyone tell me how to correctly represent…
SoftTimur
  • 5,630
  • 38
  • 140
  • 292