I am looking for a way to count ligatures as single units, the way they are displayed to the user, e.g. https://www.compart.com/en/unicode/U+FEFB.
When this character is typed (G on an Arabic keyboard), it is inserted in its decomposed form, i.e. U+0644 U+0627.
I'm able to decompose U+FEFB with:
escape(String.fromCodePoint(0xFEFB).normalize("NFKD")) // '%u0644%u0627'
Is there a way to compose U+0644 U+0627 into U+FEFB?
Why doesn't this work?
escape(String.fromCodePoint(0x0644, 0x0627).normalize("NFKC"))
The only idea I had was to iterate over the Unicode ranges I'm interested in, decompose each character, and build a map, but I'm hoping there's a better way.
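For reference, here is a rough sketch of that map idea. The names `buildLigatureMap`, `composeLigatures` and `ligatureMap` are made up for illustration, and the range U+FEF5..U+FEFC (the lam-alef ligatures in Arabic Presentation Forms-B) is just an assumption about which characters matter. When several presentation forms share one decomposition (isolated and final forms do), the first one found is kept.

```javascript
// Hypothetical helper: map each NFKD decomposition in [start, end]
// back to the first presentation-form character that produces it.
function buildLigatureMap(start, end) {
  const map = new Map();
  for (let cp = start; cp <= end; cp++) {
    const ch = String.fromCodePoint(cp);
    const decomposed = ch.normalize("NFKD");
    // Keep only characters that expand into several code points.
    if ([...decomposed].length > 1 && !map.has(decomposed)) {
      map.set(decomposed, ch);
    }
  }
  return map;
}

// Hypothetical helper: replace every known decomposition with its ligature.
// Longer sequences are tried first so that lam + alef + combining mark
// is not shadowed by plain lam + alef.
function composeLigatures(text, map) {
  const keys = [...map.keys()].sort((a, b) => b.length - a.length);
  let out = text;
  for (const key of keys) {
    out = out.split(key).join(map.get(key));
  }
  return out;
}

// Assumed range: lam-alef ligatures only (U+FEF5..U+FEFC).
const ligatureMap = buildLigatureMap(0xfef5, 0xfefc);
```

With this, `[...composeLigatures(text, ligatureMap)].length` counts each lam-alef pair as one unit. The input is not normalized first, so text that uses precomposed alef variants (for example U+0622) would need extra handling.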