I'm using the following function to highlight a given word, and it works fine for English text:

function highlight(str, toBeHighlightedWord)
{
    toBeHighlightedWord = "(\\b" + toBeHighlightedWord.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1") + "\\b)";
    var r = new RegExp(toBeHighlightedWord, "igm");
    str = str.replace(/(>[^<]+<)/igm, function(a){
        return a.replace(r, "<span color='red' class='hl'>$1</span>");
    });
    return str;
}

but it does not work for Arabic text.

How can I modify the regex so that it also matches Arabic words, including Arabic words with tashkel? Tashkel marks are characters inserted between the original letters. For example, "محمد" is written without tashkel and "مُحَمَّدُ" with tashkel. The tashkel decorates the word, but these small marks are themselves characters.

Hager Aly
    You might consider http://xregexp.com/ / https://github.com/slevithan/xregexp for an advanced JS regex engine that can deal with Unicode, among many other things. – Tomalak Jun 14 '14 at 09:46

1 Answer

In JavaScript, the word boundary \b is defined only relative to the characters [a-zA-Z0-9_], so it does not work around Arabic letters. A lookbehind assertion is not an option here either, since JavaScript did not support that feature when this was written (it was only added in ES2018).
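
To make the \b limitation concrete, here is a small check one could run. The sample sentences are made up for illustration; the point is only that \b never fires between a space and an Arabic letter, because neither side is in [a-zA-Z0-9_].

```javascript
// \b marks a transition between a \w character and a non-\w character,
// and \w is [A-Za-z0-9_].
var latin = /\bcat\b/.test("a cat sat");        // true: "cat" is surrounded by spaces
var arabic = /\bمحمد\b/.test("قال محمد ذلك");   // false: Arabic letters are not \w
console.log(latin, arabic);
```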

The way to solve the problem is to "emulate" a word boundary. For the left boundary, use a negated character class that lists the characters allowed in a word (because the class is negated, it matches only characters that cannot be part of a word), and put it in a capturing group together with an alternative for the start of the string. For the right boundary, a negative lookahead is much simpler.

// Left boundary: a non-word character (or start of string), captured in group 1.
// Right boundary: a negative lookahead for any word character.
toBeHighlightedWord = "([^\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF]|^)("
              + toBeHighlightedWord.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1")
              + ")(?![\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF])";
var r = new RegExp(toBeHighlightedWord, "ig");
str = str.replace(/(>[^<]+<)/g, function(a){
    return a.replace(r, "$1<span color='red' class='hl'>$2</span>");
});

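The character ranges used here come from three blocks of the Unicode table:

  • Arabic (U+0600 to U+06FF)
  • Arabic Presentation Forms-A (U+FB50 to U+FDFF)
  • Arabic Presentation Forms-B (U+FE70 to U+FEFF)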

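Note that the new capturing group for the left boundary changes the replacement pattern: the boundary character is now $1 and has to be written back, and the highlighted word is $2.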
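
Putting the pieces together, a complete function might look like the sketch below. The boundary emulation is the one described above. The third parameter, ignoreDiacritics, and the helper diacriticTolerant are hypothetical additions that are not part of this answer: they address the tashkel part of the question by allowing optional marks after every letter, under the assumption that tashkel means the combining marks U+064B to U+0652.

```javascript
// Word characters: \w plus the three Arabic ranges used above.
var WORD = "\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF";
// Assumption: tashkel marks are the combining marks U+064B..U+0652.
var DIACRITICS = "\\u064B-\\u0652";

// Escape regex metacharacters (same character list as the original code).
function escapeRegExp(s) {
    return s.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1");
}

// Hypothetical helper: strip marks from the search word, then allow
// any number of marks after each remaining character.
function diacriticTolerant(word) {
    var bare = word.replace(new RegExp("[" + DIACRITICS + "]", "g"), "");
    return bare.split("").map(function (ch) {
        return escapeRegExp(ch) + "[" + DIACRITICS + "]*";
    }).join("");
}

function highlight(str, word, ignoreDiacritics) {
    var core = ignoreDiacritics ? diacriticTolerant(word) : escapeRegExp(word);
    // Group 1: left boundary (non-word character or start of string).
    // Group 2: the word. The lookahead is the right boundary.
    var r = new RegExp("([^" + WORD + "]|^)(" + core + ")(?![" + WORD + "])", "ig");
    // Only touch text that sits between tags.
    return str.replace(/>[^<]+</g, function (text) {
        return text.replace(r, "$1<span class='hl'>$2</span>");
    });
}
```

Calling highlight(html, "محمد") highlights only the undecorated spelling, while highlight(html, "محمد", true) should also wrap "مُحَمَّدُ". As in the original code, only text located between a ">" and a "<" is searched, so the input has to be wrapped in at least one tag.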

Casimir et Hippolyte
  • Is there a working example of the above function, or an explanation of how to use it to replace Arabic words wrapped inside a div tag? – Learning Apr 13 '15 at 03:17
  • How can I make it work with the following example: http://jsfiddle.net/u3k01bfw/13/? In my case it doesn't match all keywords. – Learning Apr 13 '15 at 04:22