I have some Burmese text that was split into individual characters so that any characters outside the relevant Unicode block could be detected and removed, e.g. stripping Latin characters out of Burmese text. The result is that the grapheme clusters (if I am using the correct term) appear to have been separated, like this:
ေမာင္ေကာင္းၫိႈ႕မွဴးႏိုင္
I believe that wherever a dotted circle appears, the two characters should be combined into one Unicode character rather than left as two.
Correctly rendered Burmese shouldn't have these dotted circles. For example:
ယနေ့ မြန်မာမှုအဖြစ် ပုံဖော်ပေးခဲ့သည့် ယဉ်ကျေးမှုမှာ နှစ်ပေါင်း အတော်အတန်ကြာမြင့်နေပြီဖြစ်ကြောင်း
Any ideas on how this could be fixed?
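
For reference, the filtering step described above was roughly along these lines. This is only a minimal sketch in Python, not the exact code that was run: the function name `keep_myanmar` is made up for illustration, and it assumes that "the relevant Unicode block" means the main Myanmar block (U+1000 to U+109F) and that whitespace should be kept.

```python
# Hypothetical reconstruction of the filtering step; the real code may differ.
# Assumption: only the main "Myanmar" block (U+1000..U+109F) counts as Burmese.
MYANMAR_BLOCK = range(0x1000, 0x10A0)


def keep_myanmar(text: str) -> str:
    """Drop every code point outside the Myanmar block, keeping whitespace."""
    return "".join(
        ch for ch in text
        if ord(ch) in MYANMAR_BLOCK or ch.isspace()
    )


# Latin letters are removed; Myanmar code points and spaces are kept.
sample = "abc \u1019\u103c\u1014\u103a\u1019\u102c xyz"
print(keep_myanmar(sample))
```

As far as I can tell, a filter like this only deletes code points and leaves the remaining ones in their original order, so I am not sure whether the splitting itself is what produces the dotted circles.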