I am writing a program that renders text to an image and draws bounding boxes around characters using Pango, Cairo and PangoCairo. I am using gtk-rs, the Rust bindings for these libraries.
After laying out the text, I split it into graphemes using unicode_segmentation and find the position of each grapheme using index_to_pos, which wraps pango_layout_index_to_pos. Here is the code I wrote to compute these bounding boxes.
use unicode_segmentation::UnicodeSegmentation;

use crate::ImageDims;

/// Bounding box of one grapheme, in pixels relative to the cairo surface.
#[derive(Debug)]
pub struct BoundingBox {
    pub x: i32,
    pub y: i32,
    pub height: i32,
    pub width: i32,
    pub akshara: String,
}

pub type BoundingBoxes = Vec<BoundingBox>;

pub fn get_bounding_boxes(layout: pango::Layout, dims: ImageDims) -> BoundingBoxes {
    let mut boxes = BoundingBoxes::new();
    let text = layout.text().unwrap();

    // grapheme_indices yields (byte offset, grapheme) pairs;
    // index_to_pos also takes a byte index into the layout text.
    for (idx, grapheme) in text.grapheme_indices(true) {
        let rect = layout.index_to_pos(idx as i32);
        boxes.push(BoundingBox {
            x: rect.x(),
            y: rect.y(),
            height: rect.height(),
            width: rect.width(),
            akshara: grapheme.to_string(),
        });
    }

    // adjust the values for the cairo context:
    // convert Pango units to pixels and offset by the image padding
    boxes.iter_mut().for_each(|b| {
        b.x = b.x / pango::SCALE + dims.padding;
        b.y = b.y / pango::SCALE + dims.padding;
        b.width = b.width / pango::SCALE;
        b.height = b.height / pango::SCALE;
    });

    boxes
}
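The unit conversion at the end of that function can be looked at in isolation. Below is a minimal standalone sketch of it, assuming pango::SCALE is 1024 Pango units per pixel. The names PANGO_SCALE, pango_to_px_trunc, pango_to_px_round and to_pixel_box are hypothetical helpers made up for illustration and are not part of gtk-rs. The rounding variant mirrors the arithmetic of Pango's C macro PANGO_PIXELS, whereas my function above truncates.

```rust
/// Pango units per pixel (assumed to equal `pango::SCALE`, i.e. 1024).
const PANGO_SCALE: i32 = 1024;

/// Truncating conversion, as in `b.x / pango::SCALE` above.
/// Rust integer division truncates toward zero.
fn pango_to_px_trunc(units: i32) -> i32 {
    units / PANGO_SCALE
}

/// Round-to-nearest conversion, same arithmetic as the C macro
/// `PANGO_PIXELS(d)`, which is `((int)(d) + 512) >> 10`.
fn pango_to_px_round(units: i32) -> i32 {
    (units + PANGO_SCALE / 2).div_euclid(PANGO_SCALE)
}

/// Hypothetical helper: converts an (x, y, width, height) rectangle from
/// Pango units to pixels and shifts its origin by `padding` pixels.
fn to_pixel_box(
    x: i32,
    y: i32,
    width: i32,
    height: i32,
    padding: i32,
) -> (i32, i32, i32, i32) {
    (
        pango_to_px_round(x) + padding,
        pango_to_px_round(y) + padding,
        pango_to_px_round(width),
        pango_to_px_round(height),
    )
}
```

With these helpers, a width of 1536 Pango units becomes 1 pixel when truncated and 2 pixels when rounded, and truncation maps any extent smaller than 1024 units to 0.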
However, in the rendered image some of the characters do not have bounding boxes at all, for example । on the last line and ए in the last word. There are other anomalies as well, such as भी in the third word.
[Image: Some characters do not have bounding boxes]
How do I fix this?