I am debugging some DirectWrite code I have written, as I have run into issues when testing with non-English characters, mostly with getting the proper glyph indices back for text that spans multiple Unicode code points.

EDIT: After further research, I believe the issue is diacritics: the extra character should be combined with the base character somehow. The isDiacritic field of DWRITE_SHAPING_GLYPH_PROPERTIES does return 1 for the last Unicode code point. However, the shaping process does not seem to take this into account at all: GetGlyphPlacements returns 0 for both the advance and the offset of the diacritic glyph. The LSB is around -5, but that is not enough to move the glyph to the correct position. Does anyone know where in the shaping process DirectWrite is supposed to take diacritics into account, and how?

Consider this character: œ̃

It is displayed as one character (in most text editors), but it consists of two code points: U+0153 U+0303
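
To make the two-code-point claim concrete, here is a minimal sketch using only Python's standard library (it is not part of the DirectWrite code in question). It lists the code points, counts the UTF-16 code units that would be passed as the text length, and checks whether NFC normalization can merge the pair into a single precomposed character. It cannot, so the combining has to happen during shaping and placement rather than at the text level.

```python
import unicodedata

text = "\u0153\u0303"  # base letter followed by a combining mark

# Inspect each code point: name and canonical combining class
# (0 = base character, non-zero = combining mark).
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} "
          f"combining={unicodedata.combining(ch)}")

# Number of UTF-16 code units, i.e. the text length DirectWrite expects.
utf16_units = len(text.encode("utf-16-le")) // 2

# NFC composes base + mark only when a precomposed code point exists.
nfc = unicodedata.normalize("NFC", text)
print(utf16_units, len(nfc))
```

Since normalization leaves both code points in place, receiving two glyph indices for this text is not by itself a sign of a bug.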

How do I account for this in GetGlyphs(), since they are separate codepoints? In my code, it is returning two different indices (177, 1123), and one cluster (0, 0).

This is what ends up getting rendered:

image

This is consistent with both code points being rendered individually, but not with the actual character. The actual glyph count returned by GetGlyphs() is 2.

My questions are as follows:

  1. Should this be returning one index from GetGlyphs()?

  2. Should I even be getting one index, or is there some magic involved with two different indices, where at some stage in the process they are combined in the glyph run?

  3. If I should be getting one index, in what process/functions are these indices combined? Perhaps there is a bug in my ScriptAnalysis? I am trying to narrow down where the issue may be.

  4. Should I be using the length of the characters and not include codepoints?

I apologize as I am not super knowledgeable about fonts/Unicode and the inner workings of the whole shaping process.

Here is some of my code for the process I use to get the indices and advances:

text_length = len(text.encode('utf-16-le')) // 2
text_buffer = create_unicode_buffer(text, text_length)

self._text_analysis.GenerateResults(self._analyzer, text_buffer, len(text_buffer))

# Estimate of the maximum glyph count, using the formula from Microsoft's documentation.
max_glyph_size = int(3 * text_length / 2 + 16)

length = text_length
clusters = (UINT16 * length)()
text_props = (DWRITE_SHAPING_TEXT_PROPERTIES * length)()
indices = (UINT16 * max_glyph_size)()
glyph_props = (DWRITE_SHAPING_GLYPH_PROPERTIES * max_glyph_size)()
actual_count = UINT32()

self._analyzer.GetGlyphs(text_buffer,
                         len(text_buffer),
                         self.font.font_face,
                         False,  # isSideways
                         False,  # isRightToLeft
                         self._text_analysis.script,  # scriptAnalysis
                         None,  # localeName
                         None,  # numberSubstitution
                         None,  # features
                         None,  # featureRangeLengths
                         0,  # featureRanges (count)
                         max_glyph_size,  # maxGlyphCount
                         clusters,  # clusterMap
                         text_props,  # textProps
                         indices,  # glyphIndices
                         glyph_props,  # glyphProps
                         byref(actual_count)  # actualGlyphCount
                     )

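# Placement arrays hold one entry per glyph, so size them by the glyph count
# returned by GetGlyphs(), not by the text length.
glyph_count = actual_count.value
advances = (FLOAT * glyph_count)()
offsets = (DWRITE_GLYPH_OFFSET * glyph_count)()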
self._analyzer.GetGlyphPlacements(text_buffer,
                                  clusters,
                                  text_props,
                                  text_length,
                                  indices,
                                  glyph_props,
                                  actual_count,
                                  self.font.font_face,
                                  self.font.font_metrics.designUnitsPerEm,  # fontEmSize: advances and offsets come back at this scale
                                  False, False,  # isSideways, isRightToLeft
                                  self._text_analysis.script,
                                  self.font.locale,
                                  None,
                                  None,
                                  0,
                                  advances,
                                  offsets)

EDIT: Here is the rendering code:

def render_single_glyph(self, font_face, indice, advance, offset, metrics):
    """Renders a single glyph using D2D DrawGlyphRun"""
    glyph_width, glyph_height, lsb, font_advance = metrics

    # Slicing an array turns it into a python object. Maybe a better way to keep it a ctypes value?
    new_indice = (UINT16 * 1)(indice)
    new_advance = (FLOAT * 1)(advance)

    run = self._get_single_glyph_run(font_face,
                                     self.font._real_size,
                                     new_indice,  # indice,
                                     new_advance,  # advance,
                                     pointer(offset),  # offset,
                                     False,
                                     False)


    offset_x = 0
    if lsb < 0:
        # Negative LSB: we shift the layout rect to the right
        # Otherwise we will cut the left part of the glyph
        offset_x = math.ceil(abs(lsb))

    font_height = (self.font.font_metrics.ascent + self.font.font_metrics.descent) * self.font.font_scale_ratio

    # Create new bitmap.
    self._create_bitmap(int(math.ceil(glyph_width)),
                        int(math.ceil(font_height)))

    # This offsets the characters if needed.
    point = D2D_POINT_2F(offset_x, int(math.ceil(font_height)))

    self._render_target.BeginDraw()

    self._render_target.Clear(transparent)

    self._render_target.DrawGlyphRun(point,
                                     run,
                                     self.brush,
                                     DWRITE_MEASURING_MODE_NATURAL)

    self._render_target.EndDraw(None, None)
    image = wic_decoder.get_image(self._bitmap)

    glyph = self.font.create_glyph(image)
    glyph.set_bearings(self.font.descent, offset_x, round(advance * self.font.font_scale_ratio))  # baseline, lsb, advance
    return glyph

Charlie

1 Answer


The shaping process is controlled by your input, which is (text, font, locale, script, user features). All of that affects the results you get. To answer your questions specifically:

Should this be returning one index from GetGlyphs()?

That's mostly defined by your font.

Should I even be getting one index, or is there some magic involved with two different indices, where at some stage in the process they are combined in the glyph run?

GetGlyphs() operates on a single run. Glyphs are free to form a cluster according to shaping rules defined per script, and according to transformations defined in the font.
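
As an illustration of how the cluster map from GetGlyphs() can be read: each entry maps one UTF-16 code unit to the index of the first glyph in its cluster, so a map of (0, 0) with a glyph count of 2 describes a single cluster of two code units and two glyphs. The sketch below is plain Python with no DirectWrite calls. The helper name `clusters_from_map` is hypothetical, and it assumes a left-to-right run in which the map is non-decreasing.

```python
def clusters_from_map(cluster_map, glyph_count):
    """Group text positions and glyphs into clusters.

    Returns a list of ((text_start, text_end), (glyph_start, glyph_end))
    tuples, using half-open ranges.
    """
    clusters = []
    text_len = len(cluster_map)
    start = 0
    for i in range(1, text_len + 1):
        # A cluster ends at the end of the text or where the map value changes.
        if i == text_len or cluster_map[i] != cluster_map[start]:
            glyph_start = cluster_map[start]
            # The cluster's glyphs run up to the first glyph of the next cluster.
            glyph_end = cluster_map[i] if i < text_len else glyph_count
            clusters.append(((start, i), (glyph_start, glyph_end)))
            start = i
    return clusters


# The values reported in the question: two code units, two glyphs, one cluster.
print(clusters_from_map([0, 0], 2))
```

For the values in the question this yields one cluster covering text positions 0 to 2 and glyphs 0 to 2, which matches a base glyph plus a separately positioned diacritic glyph.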

If I should be getting one index, in what process/functions are these indices combined? Perhaps there is a bug in my ScriptAnalysis? I am trying to narrow down where the issue may be.

Basically, if your input arguments are correct, you get what you get as output; you can't really control the core of it. What you can do is test the output for the same text and font on Uniscribe, on CoreText (macOS), and on Chromium/Firefox (HarfBuzz) to see if they differ.

Should I be using the length of the characters and not include codepoints?

I didn't get this one.

bunglehead
  • "Shaping process is controlled by your input which is (text,font,locale,script,user features)." I have tried multiple fonts and many different script and locale options, and I have yet to get accurate placements. The output does report them as a single cluster, but the placements don't line up correctly: negative LSB, 0 advance, 0 offset. Which metric should change to account for the placement of combining diacritics? – Charlie Mar 15 '21 at 14:53
  • Placement information after shaping is conveyed only by the advances and offsets arrays. For diacritics it's normal to have 0 advances, and offsets are used to position them. For your text example, running with Tahoma on Windows 10, I get an offset of (-301,-380) for the diacritic with emsize == 2048.0. Feel free to post an extended code sample; maybe the rendering part is off somehow. – bunglehead Mar 15 '21 at 17:45
  • The offsets are definitely the issue then. I have always received 0's for the values in `DWRITE_GLYPH_OFFSET`, so I never implemented them. It seems, however, that they affect where `DrawGlyphRun` draws the character. I have updated the question with the rendering code I am using. I render each glyph individually to a bitmap so it can be reused. I think the offset is affecting where the glyph lands on the bitmap. From testing, it seems to be a 1:1 ratio with pixels: for instance, -5 in the `advanceOffset` value moves it -5 pixels on the bitmap, so -301 seems way off. – Charlie Mar 15 '21 at 20:50
  • Should all glyphs in a cluster be rendered at exactly the same position, adjusted only by their offsets? Or should the first glyph's advance still affect the subsequent glyph even if they are part of the same cluster? I am trying to figure out the best way to implement the offsets. Thank you for the responses. – Charlie Mar 15 '21 at 20:52
  • -5 vs -301 is accounted for by the difference in selected font size: for my test I used a font size of 2048.0, and you most likely used a lower value. Regarding clusters, no, they are not important here. All you need to know is the glyphs array, the advances array, and the array of offset vectors. Take a look at IDWriteFactory4::ComputeGlyphOrigins(), which returns origins for every glyph. – bunglehead Mar 16 '21 at 05:31
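
The last two comments can be sketched in a few lines of plain Python. This is not a binding to IDWriteFactory4::ComputeGlyphOrigins() but a hand-rolled approximation of the idea for a horizontal, left-to-right run. The function name `glyph_origins` and the base glyph's advance of 1000 are assumptions made for illustration. Each glyph is drawn at the running pen position plus its own offset, and only the advance moves the pen. A positive `ascenderOffset` moves a glyph up, which means a smaller y in a y-down coordinate system. The `font_size / em_size` factor covers the -5 versus -301 difference.

```python
def glyph_origins(advances, offsets, font_size, em_size, baseline=(0.0, 0.0)):
    """Compute a drawing origin for each glyph of a horizontal LTR run.

    advances: per-glyph advances, expressed at em_size scale
    offsets:  per-glyph (advanceOffset, ascenderOffset) pairs, same scale
    Returns (x, y) origins at font_size scale, with y growing downwards.
    """
    scale = font_size / em_size
    pen_x, base_y = baseline
    origins = []
    for advance, (advance_offset, ascender_offset) in zip(advances, offsets):
        x = pen_x + advance_offset * scale
        y = base_y - ascender_offset * scale  # positive ascender offset = up
        origins.append((x, y))
        pen_x += advance * scale  # offsets never move the pen, advances do
    return origins


# Hypothetical base-glyph advance of 1000 units; the diacritic uses the
# (-301, -380) offset and zero advance quoted in the comments above.
advances = [1000, 0]
offsets = [(0, 0), (-301, -380)]

print(glyph_origins(advances, offsets, font_size=2048, em_size=2048))
print(glyph_origins(advances, offsets, font_size=34, em_size=2048))
```

At a hypothetical font size of 34, the -301 offset scales to roughly -5, the same magnitude as the value observed in the question. The base glyph's advance does move the pen before the diacritic is placed, and the diacritic's negative advance offset then pulls it back over the base glyph.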