I need to replace three Hebrew Unicode characters with three other Hebrew Unicode characters. I looked into PHP syntax and searched around, and this is the best I could write. It works and does the job.

Before I turn this into a small function, I'd like to know whether this is the best way to replace one Unicode character with another in PHP.

Is there a better syntax in PHP for this?

$re1 = '/[\x{05B1}]/u';
$re2 = '/[\x{05B2}]/u';
$re3 = '/[\x{05B3}]/u';

$subst1 = json_decode('"\u05B6"');
$subst2 = json_decode('"\u05B0"');
$subst3 = json_decode('"\u05B8"');

//Replace (Niqqud with Cantillation) with (just Niqqud)
$bible_content = preg_replace($re1, $subst1, $bible_content);
$bible_content = preg_replace($re2, $subst2, $bible_content);
$bible_content = preg_replace($re3, $subst3, $bible_content);

Starting input for $bible_content:

וַ/יִּקְרָא אֱלֹהִים לָ/אוֹר יוֹם וְ/לַ/חֹשֶׁךְ קָרָא לָיְלָה וַ/יְהִי עֶרֶב וַ/יְהִי בֹקֶר יוֹם אֶחָד׃ אַשְׁרֵי הָ/אִישׁ אֲשֶׁר לֹא הָלַךְ בַּ/עֲצַת רְשָׁעִים וּ/בְ/דֶרֶךְ חַטָּאִים לֹא עָמָד וּ/בְ/מוֹשַׁב לֵצִים לֹא יָשָׁב׃ חֳ

Expected output for $bible_content:

וַ/יִּקְרָא אֶלֹהִים לָ/אוֹר יוֹם וְ/לַ/חֹשֶׁךְ קָרָא לָיְלָה וַ/יְהִי עֶרֶב וַ/יְהִי בֹקֶר יוֹם אֶחָד׃ אַשְׁרֵי הָ/אִישׁ אְשֶׁר לֹא הָלַךְ בַּ/עְצַת רְשָׁעִים וּ/בְ/דֶרֶךְ חַטָּאִים לֹא עָמָד וּ/בְ/מוֹשַׁב לֵצִים לֹא יָשָׁב׃ חָ

  • Your code doesn't seem to work when I test it online. Do you have any other charset declarations (or similar) prior to this code block? Online test: http://sandbox.onlinephpfunctions.com/code/00399e52af826fdd448679ed568e01d842e81920 – mickmackusa Jun 13 '17 at 00:48
  • I edited the input and output. The forward slashes are OK to be there. – Regina Hong Jun 13 '17 at 00:56
  • The third one was too difficult to find, so I just put one letter as an example at the very end. Now it should have all 3 characters. – Regina Hong Jun 13 '17 at 01:27
  • It seems the answer to this can be found in another question at https://stackoverflow.com/questions/3140734/unicode-preg-replace-problem-in-php. The solution there also seems cleaner. – geek3point0 Jun 13 '17 at 00:10

1 Answer

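PHP 7.0 added the \u{...} escape for Unicode code points in double-quoted string literals, so you no longer need json_decode to build the characters. You can then use the strtr function for the replacements. Pass it an array of pairs: the three-argument form translates byte by byte, which is not safe for multibyte UTF-8 characters.

$map = ["\u{05B1}" => "\u{05B6}", "\u{05B2}" => "\u{05B0}", "\u{05B3}" => "\u{05B8}"];

echo strtr($bible_content, $map) . "\n";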

Now, I can't read Hebrew (or even make it flow properly RTL, apparently :P ), so you'll have to judge whether it did the right thing or not.
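
Since the question mentions wrapping this in a small function, here is one possible sketch. The function name replace_hebrew_points is made up for illustration, and the code assumes PHP 7.0 or later and UTF-8 input. It uses the array form of strtr, which matches whole strings rather than single bytes, so characters that merely share a trailing byte with the targets (U+05F3, for example) are left alone.

```php
<?php
declare(strict_types=1);

// Hypothetical helper name; the mapping is the one from the question.
function replace_hebrew_points(string $text): string
{
    // Keys and values are complete UTF-8 sequences, so strtr()
    // replaces whole characters instead of individual bytes.
    static $map = [
        "\u{05B1}" => "\u{05B6}",
        "\u{05B2}" => "\u{05B0}",
        "\u{05B3}" => "\u{05B8}",
    ];

    return strtr($text, $map);
}

// Small sample: three pointed letters plus U+05F3, which must stay unchanged.
$sample = "\u{05D0}\u{05B1} \u{05D0}\u{05B2} \u{05D7}\u{05B3} \u{05F3}";
echo replace_hebrew_points($sample), "\n";
```

Calling replace_hebrew_points($bible_content) should then do the same job as the three preg_replace calls in the question, in a single pass over the string.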

Amadan