Given the string:

QString unformatted =
   "Some non arabic text"
   "بعض النصوص العربية"
   "Another non arabic text"
   "النص العربي آخر";

How can I get the following result using QRegExp or some other way?

"<p>Some non arabic text</p>"
"<p dir='rtl'>بعض النصوص العربية</p>"
"<p>Another non arabic text</p>"
"<p dir='rtl'>النص العربي آخر</p>";

Thanks!

STF
Erik Kianu
  • The QRegExp class provides pattern matching using regular expressions. A regular expression, or "regexp", is a pattern for matching substrings in a text. – eyllanesc Jan 03 '17 at 17:54
  • This is useful in many contexts, e.g., Validation: A regexp can test whether a substring meets some criteria, e.g. is an integer or contains no whitespace. Searching: A regexp provides more powerful pattern matching than simple substring matching, e.g., match one of the words mail, letter or correspondence, but none of the words email, mailman, mailer, letterbox, etc. – eyllanesc Jan 03 '17 at 17:56
  • Search and Replace: A regexp can replace all occurrences of a substring with a different substring, e.g., replace all occurrences of & with &amp; except where the & is already followed by an amp;. String Splitting: A regexp can be used to identify where a string should be split apart, e.g. splitting tab-delimited strings. – eyllanesc Jan 03 '17 at 17:56
  • read this: http://doc.qt.io/qt-4.8/qregexp.html – eyllanesc Jan 03 '17 at 17:56
  • Search [non arabic block] and replace with <p>[non arabic block]</p>, and search [arabic block] and replace with <p dir='rtl'>[arabic block]</p>. Is it possible or not? – Erik Kianu Jan 03 '17 at 18:04
  • try with my solution – eyllanesc Jan 03 '17 at 18:57
  • Those string literals may not contain what you think they do -- always wrap them in the [QStringLiteral](http://doc.qt.io/qt-5/qstring.html#QStringLiteral) macro to ensure compatibility with Qt. – MrEricSir Jan 03 '17 at 22:24

1 Answer

A function that finds the runs of Arabic text and wraps each one in a right-to-left paragraph:

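#include <QRegExp>
#include <QString>

// Wraps every run of Arabic text in <p dir='rtl'>...</p>.
// Non-Arabic text is copied through unchanged.
QString split_arabic(const QString &text)
{
    // One Arabic character followed by any number of Arabic characters or spaces.
    QRegExp rx("[\u0600-\u065F\u066A-\u06EF\u06FA-\u06FF][ \u0600-\u065F\u066A-\u06EF\u06FA-\u06FF]*");

    QString result;
    int last = 0; // end of the previous match
    int pos = 0;

    // Build the result in a single pass, so that a phrase that occurs
    // more than once is not wrapped twice.
    while ((pos = rx.indexIn(text, last)) != -1) {
        result += text.mid(last, pos - last);           // text before the match
        result += "<p dir='rtl'>" + rx.cap(0) + "</p>"; // the Arabic run
        last = pos + rx.matchedLength();
    }
    result += text.mid(last); // remaining non-Arabic tail

    return result;
}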
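
Note that split_arabic only wraps the Arabic runs, while the question also asks for a plain <p> around the non-Arabic ones. One possible way to get both is to walk the text once and start a new paragraph whenever the script changes. The following is only a rough sketch in standard C++ without Qt, so it is not part of the original answer: isArabic and wrapRuns are hypothetical names, the code point ranges are simply the ones from the regular expression above, and a space is assumed to belong to whichever run it appears in. With Qt 5.5 or later a QString could be converted with QString::toStdU32String() and back with QString::fromStdU32String().

```cpp
#include <string>

// Same code point ranges as the regular expression in the answer
// (assumption: these ranges are good enough for the text at hand).
bool isArabic(char32_t c)
{
    return (c >= 0x0600 && c <= 0x065F) ||
           (c >= 0x066A && c <= 0x06EF) ||
           (c >= 0x06FA && c <= 0x06FF);
}

// Hypothetical helper: wraps each Arabic run in <p dir='rtl'> and
// each non-Arabic run in <p>.
std::u32string wrapRuns(const std::u32string &text)
{
    std::u32string out;
    std::u32string run;
    bool runIsArabic = false;
    bool runHasScript = false; // false while the run holds only spaces

    // Emits the current run as one paragraph and starts a new one.
    auto flush = [&]() {
        if (run.empty())
            return;
        out += runIsArabic ? U"<p dir='rtl'>" : U"<p>";
        out += run;
        out += U"</p>";
        run.clear();
        runHasScript = false;
    };

    for (char32_t c : text) {
        if (c != U' ') {
            const bool arabic = isArabic(c);
            // A change of script closes the current paragraph.
            if (runHasScript && arabic != runIsArabic)
                flush();
            runIsArabic = arabic;
            runHasScript = true;
        }
        run += c; // spaces stay in the run they appear in
    }
    flush();
    return out;
}
```

For the string from the question this should produce the four paragraphs that were asked for, joined without separators. Punctuation and digits count as non-Arabic in this sketch, so they would start a new paragraph inside an Arabic sentence; treating them like the space is one way to avoid that.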

Example:

QString unformatted =
            "Some non arabic text"
            "بعض النصوص العربية"
            "Another non arabic text"
            "النص العربي آخر";


qDebug()<<unformatted;
qDebug()<<split_arabic(unformatted);

Output:

"Some non arabic textبعض النصوص العربيةAnother non arabic textالنص العربي آخر"
"Some non arabic text<p dir='rtl'>بعض النصوص العربية</p>Another non arabic text<p dir='rtl'>النص العربي آخر</p>"
eyllanesc