
I am currently writing a simple bitmap font generator using CoreGraphics and CoreText. I am retrieving the kerning table of a font with:

CFDataRef kernTable = CTFontCopyTable(m_ctFontRef, kCTFontTableKern, kCTFontTableOptionNoOptions);

and then parsing it, which works fine. The kerning pairs give me glyph indices (i.e. CGGlyph), and I need to translate them to Unicode (i.e. UniChar), which unfortunately does not seem to be easy. The closest I got was using:

CGFontCopyGlyphNameForGlyph

to retrieve the glyph name of the CGGlyph, but I don't know how to convert the name to Unicode, as the names are really just strings such as quoteleft. Another thing I thought about was parsing the kCTFontTableCmap myself to manually map each glyph to its Unicode code point, but that seems like a ton of extra work for the task. Is there any simple way of doing this?

Thanks!

moka

1 Answer


I don't know a direct method to get the Unicode for a given glyph, but you could build a mapping in the following way:

  • Get all characters of the font with CTFontCopyCharacterSet().
  • Map all these Unicode characters to their glyph with CTFontGetGlyphsForCharacters().
  • For each Unicode character and its glyph, store the mapping glyph -> Unicode in a dictionary.
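
The last step above can be sketched in plain C as follows. This is only a minimal illustration, not a definitive implementation: build_glyph_to_char and NO_CHAR are hypothetical names, uint16_t stands in for both UniChar and CGGlyph (which are 16-bit unsigned types), and the two input arrays are assumed to be the parallel character and glyph arrays you would pass to and get back from CTFontGetGlyphsForCharacters(). Characters outside the Basic Multilingual Plane are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no single character maps to this glyph".
   U+FFFF is a Unicode noncharacter, so it cannot collide with real text. */
#define NO_CHAR 0xFFFFu

/* Hypothetical helper: inverts a character -> glyph mapping.
   chars[i] is a UTF-16 code unit and glyphs[i] is the glyph reported for it.
   glyph_to_char must have room for glyph_count entries.
   Returns the number of glyphs that received a character. */
static size_t build_glyph_to_char(const uint16_t *chars,
                                  const uint16_t *glyphs,
                                  size_t count,
                                  uint16_t *glyph_to_char,
                                  size_t glyph_count)
{
    assert(chars != NULL && glyphs != NULL && glyph_to_char != NULL);

    size_t mapped = 0;

    /* Start with every glyph unmapped. */
    for (size_t g = 0; g < glyph_count; ++g) {
        glyph_to_char[g] = NO_CHAR;
    }

    for (size_t i = 0; i < count; ++i) {
        uint16_t g = glyphs[i];

        /* Glyph 0 means the character has no glyph; also skip
           out-of-range indices defensively. */
        if (g == 0 || g >= glyph_count) {
            continue;
        }
        /* Several characters may share one glyph; keep the first one seen. */
        if (glyph_to_char[g] != NO_CHAR) {
            continue;
        }
        glyph_to_char[g] = chars[i];
        ++mapped;
    }
    return mapped;
}
```

Glyphs that are only reachable through substitution (ligatures, for example) stay at NO_CHAR, so kerning pairs that involve them can be skipped when translating the table. Keeping the first character seen for a shared glyph is an arbitrary choice; a bitmap font generator might prefer to emit the kerning pair once per character instead.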
Martin R
  • I upvoted your answer, as that seems to be one way to go about it that is most certainly simpler than parsing kCTFontTableCmap. I will accept this as the answer if nothing better comes up! Thanks a ton! – moka Aug 14 '13 at 18:52
  • This is likely the correct direction. The main problem is that glyph->character is not a 1:1 conversion. For instance, there may be a single glyph for a ligature like "fi". There is no such character, so what would you map it to? Martin's approach is reasonable in that it gives you the subset of glyphs that actually have a 1:1 character mapping. – Rob Napier Aug 14 '13 at 20:39
  • Don't forget that it is legal for a character to map to no glyph (you may get 0 in the returned array). And it is legal for multiple characters to map to the same glyph (though I haven't seen that happen personally). Fonts may have special rules for combining characters. It is legal to represent å (U+00E5, Latin Small Letter A with ring above) as one glyph or two. So you may not always get exactly what you'd expect when mapping between characters and glyphs. – Rob Napier Aug 14 '13 at 20:53
  • Thanks Rob Napier for your clarification, that makes perfect sense. I think I can ignore ligatures for now. I accepted this as the answer now, as I am pretty sure that this is the way to go about it! Thanks guys! – moka Aug 14 '13 at 20:59
  • @RobNapier Aren't ligatures special unicode chars though? So for ligatures that are part of unicode, these should still be 1 : 1 conversions I think. Still I guess you are right that this is not guaranteed for all cases. – moka Aug 14 '13 at 21:11
  • Ligatures are not special Unicode characters. While there are a small number of ligatures built into Unicode (Œ for instance), the majority are not. The most common English ligature, fi, is not part of Unicode. And fonts are free to generate arbitrary ligatures (and do). The Zapfino font is famous for its seven-character "Zapfino" ligature (http://commons.wikimedia.org/wiki/File:Zapfino_ligature_demo.png; that's one glyph). And of course it goes the other way. My favorite Unicode character, ﷽ (U+FDFD, Basmala), could certainly be decomposed into many glyphs. – Rob Napier Aug 14 '13 at 21:38
  • Perchance, is it possible to "get all characters of the font with CTFontCopyCharacterSet()" in Swift? ... for macOS? https://stackoverflow.com/questions/56782339/how-to-get-all-characters-of-the-font-with-ctfontcopycharacterset-in-swift {Something cross-platform would be outstanding... https://stackoverflow.com/questions/56782036/cgfont-and-ctfont-functionality-in-portable-swift-e-g-ubuntu-etc ... however, I'm guessing that reasonably convenient solutions may not currently exist.} – marc-medley Jun 27 '19 at 00:08