I'm building a multilingual application with Qt, in Eclipse on Linux. In Thai, line breaks don't seem to be well supported on my controller (I'm still not sure why). As a workaround, the following algorithm inserts a zero-width space (\u200b) between Thai characters (except between a character and its accents), so that line breaks can occur. However, my controller now takes 12 minutes to boot in Thai (12 minutes before showing the opening QString message). The function overrides QTranslator::translate from Qt, so that I can add zero-width spaces to each translated QString.

My question is the following: am I manipulating Unicode and UTF-8 characters correctly in the QString? Edit: Is the rendering of Thai symbols a Qt issue? Thanks a lot!

QString EditTranslation::translate(const char *context, const char *sourceText, const char *disambiguation) const
{
    QString translatedString = QTranslator::translate(context, sourceText, disambiguation);

    if (SystemSettingsService->getLanguageType() != ISystemSettingsService::Thai)
        return translatedString;

    // Important block starts here *********************

    QString translatedStringModified;

    for (QString::const_iterator i(translatedString.begin()); i != translatedString.end(); ++i) {
        translatedStringModified.append(i->unicode());

        int unicode = (i + 1)->unicode();

        if ((unicode > 3584 && unicode < 3634 && unicode != 3633) || (unicode > 3646 && unicode < 3655) || (unicode > 3662 && unicode < 3676)) {
            translatedStringModified.append(QString::fromUtf8("\u200b")); // Zero-width space is added
        }
    }

    // ************************************

    return translatedStringModified;
}
Benjamin

1 Answer

In Thai, spaces are used between clauses and sentences rather than between words; this is why line breaking is poor in your original problem.

To solve this properly, you should use an algorithm that finds the boundaries between words, such as the one in libthai, and insert breaks there instead of arbitrarily inserting them between consonants.
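As a rough illustration of inserting breaks only at word boundaries, here is a minimal sketch in standard C++. It assumes some word breaker (libthai or similar) has already produced the boundary positions; the function name `insertAtWordBoundaries`, the `breakOffsets` parameter, and the use of `std::u16string` in place of `QString` are all hypothetical choices made so the example stands alone.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// U+200B ZERO WIDTH SPACE, used here as the break opportunity.
const char16_t kWordBreakMark = 0x200B;

// Inserts a zero-width space before every index listed in breakOffsets.
// Assumption: breakOffsets is sorted in ascending order and each entry is the
// index of the first code unit of a word, as reported by a word breaker.
// Offset 0 and offsets past the end of the text are ignored.
std::u16string insertAtWordBoundaries(const std::u16string& text,
                                      const std::vector<std::size_t>& breakOffsets)
{
    std::u16string out;
    out.reserve(text.size() + breakOffsets.size());

    std::size_t nextBreak = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Skip offsets that are already behind us (offset 0 or duplicates).
        while (nextBreak < breakOffsets.size() && breakOffsets[nextBreak] < i)
            ++nextBreak;

        if (i > 0 && nextBreak < breakOffsets.size() && breakOffsets[nextBreak] == i) {
            out.push_back(kWordBreakMark);
            ++nextBreak;
        }
        out.push_back(text[i]);
    }
    return out;
}
```

With this approach a two-word phrase gets a single zero-width space instead of one between nearly every pair of characters, so the text layout has far fewer break candidates to consider.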

Inserting a space between every consonant is probably one of the reasons for the slowness.

In the meantime, the ranges you are using are wrong: you shouldn't break after a preceding vowel (one that is written before its consonant). This code snippet should clear it up for you:

int current = i->unicode();
int next = (i + 1)->unicode();

if ((next >= 0xE31 && next <= 0xE34) || (next >= 0xE47 && next <= 0xE4E) || (current >= 0xE40 && current <= 0xE46))
{
    // Don't insert space
}
else
{
    // Insert space
}
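To make the rule concrete, here is a self-contained sketch of the whole loop in standard C++, using `std::u16string` instead of `QString` so it can be compiled on its own. It is only an illustration under a few assumptions that are not in the snippet above: the first range is extended to U+0E3A so that the remaining above and below vowels also stay attached, a space is only inserted when both neighbours are in the Thai block (U+0E01 to U+0E5B), and the lookahead is bounds-checked so the last character is never followed by a read past the end. The helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>

namespace {

const char16_t kZeroWidthSpace = 0x200B;

// Assumption: treat U+0E01..U+0E5B as "Thai" for the purpose of this sketch.
bool isThai(char16_t c) { return c >= 0x0E01 && c <= 0x0E5B; }

// Characters that should not be separated from the character before them.
// Assumption: U+0E31..U+0E3A (extended from 0xE34) and U+0E47..U+0E4E.
bool attachesToPrevious(char16_t c)
{
    return (c >= 0x0E31 && c <= 0x0E3A) || (c >= 0x0E47 && c <= 0x0E4E);
}

// Characters that should not be separated from the character after them
// (same U+0E40..U+0E46 range as the snippet above).
bool attachesToNext(char16_t c) { return c >= 0x0E40 && c <= 0x0E46; }

} // namespace

// Returns a copy of text with U+200B inserted between Thai characters,
// except where the rule above says the pair must stay together.
std::u16string insertBreakOpportunities(const std::u16string& text)
{
    std::u16string out;
    out.reserve(text.size() * 2); // worst case: one space per character

    for (std::size_t i = 0; i < text.size(); ++i) {
        out.push_back(text[i]);

        if (i + 1 == text.size())
            break; // no lookahead past the last character

        const char16_t current = text[i];
        const char16_t next = text[i + 1];

        if (!isThai(current) || !isThai(next))
            continue; // leave non-Thai text untouched
        if (attachesToPrevious(next) || attachesToNext(current))
            continue; // don't insert space

        out.push_back(kZeroWidthSpace); // insert space
    }
    return out;
}
```

In a Qt build the same loop could presumably be written against `QString` and `QChar`, appending a single `QChar(0x200B)` rather than constructing a new string with `QString::fromUtf8` on every iteration.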
koan
  • Thank you for your answer! I agree that it would be a good idea to use a library such as _libthai_ to find word boundaries, and that it would probably reduce the slowness of the application since there would be fewer zero-width spaces to insert. However, it doesn't change the fact that it shouldn't take 12 minutes to print out such a QString. Furthermore, if I apply my algorithm to an English, French, or Japanese QString, for example, it works properly. I wonder if **the rendering of Thai symbols is a Qt issue?** – Benjamin Feb 13 '16 at 04:03
  • I would say that rendering of Thai letters is not an issue in Qt. The fact is that you are trying to hack around an issue without knowing what Qt's line-breaking algorithm is, and there is an unfortunate interaction. Maybe you can get your translators to insert spaces to break up long phrases or to use less verbose versions of their text? If you can't change your method, then report the issue on the Qt bug tracker. – koan Feb 13 '16 at 17:28
  • Thanks for your help! I will report the issue on the Qt bug tracker. I don't think it should take that long even if I insert a space or any Latin char between each Thai char. In English, for example, I don't know what the algo is to break up lines either, and my algo works properly if I test it. – Benjamin Feb 13 '16 at 19:06