
I am trying to split Arabic text extracted from a .docx file using preg_split. The text sample:

صادر في16  الفقرة الأولى :المتعلق (ج. ر. بتاريخ 8 ذو القعدة 1435 -ويجوز بالتالي إصدار الأمرالفقرة4: القعدة 

I need to split my text on either of the following delimiters: المادة\s\d\:\n or المادة الأولى\s\:\n. I thought of using a pattern with two alternatives, like so:

$pattern = "/(الفقرة \s\d\:\n)|(الفقرة الأولى\s\:\n)/";

$splitted_para_arr = preg_split($pattern,$content,null,PREG_SPLIT_NO_EMPTY);

The result I get is an array with a single, unsplit element:

Result:


Array
(
  [0] => 
صادر في16  الفقرة الأولى :المتعلق (ج. ر. بتاريخ 8 ذو القعدة 1435 -ويجوز بالتالي إصدار الأمرالفقرة4: القعدة 
)

I am using Sublime Text as my editor and XAMPP as my web server.


E.Sophia
  • When you deal with non-ASCII characters, use the u modifier (otherwise, the regex engine will see your string as a sequence of single-byte characters). Another thing: are you sure there are newlines in your string? – Casimir et Hippolyte Jun 15 '16 at 11:57
  • You need the `/u` modifier by all means, and you can use alternation: `$pattern = '/(الفقرة \s\d|الفقرة الأولى\s):\n/u';` – Wiktor Stribiżew Jun 15 '16 at 11:57
  • I edited my pattern to `$pattern = "/(الفقرة\s\d\:\n)|(الفقرة الأولى\s\:\n)/u";` and I am still getting the same result – E.Sophia Jun 15 '16 at 12:00
  • Your line does not seem to have newlines, try `$pattern = '/الفقرة\s*\d:\r?\n?|الفقرة الأولى\s*:\r?\n?/u';`. Also, what is the expected output? And what is the exact input? (note that `\r?\n?` can be replaced with `\R*`) – Wiktor Stribiżew Jun 15 '16 at 12:54
  • When I use only the first half of it, and also remove the : and \R, the text is split: `$pattern = '/الفقرة\s*\d/u';`. But with the full pattern you suggested, it is not split – E.Sophia Jun 15 '16 at 13:56
  • Do you have `:` as the ASCII colon or half- or full-width colon? Please check. Please paste the full exact string you test against as a PHP string literal *into the question itself*. Or check by yourself at [r12a.github.io/apps/conversion](http://r12a.github.io/apps/conversion/). – Wiktor Stribiżew Jun 15 '16 at 20:33
  • I edited the code; the text sample is the same text I get in the array result – E.Sophia Jun 15 '16 at 22:46
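
The suggestions in the comments (the `u` modifier, optional whitespace, an optional line break) can be combined into one pattern. The following is only a sketch, not a confirmed fix: the function name `split_on_paragraph_headings` is hypothetical, the file is assumed to be saved as UTF-8, and the colon is assumed to be either the ASCII colon or the full-width colon (U+FF1A). Whether it works on the real .docx text depends on the exact characters in that text, which the thread never establishes.

```php
<?php
// Hypothetical helper: split text on "الفقرة" headings such as
// "الفقرة الأولى :" or "الفقرة4:".
function split_on_paragraph_headings(string $text): array
{
    // u            : treat pattern and subject as UTF-8, not single bytes
    // \s*          : whitespace around the number/ordinal is optional
    // \p{Nd}+      : one or more decimal digits in any script
    // [:\x{FF1A}]  : ASCII colon or full-width colon (an assumption)
    // \R?          : an optional trailing line break
    $pattern = '/الفقرة\s*(?:الأولى|\p{Nd}+)\s*[:\x{FF1A}]\R?/u';

    // -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty pieces.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on failure (for example, invalid UTF-8).
    return $parts === false ? [] : $parts;
}

// The sample text from the question, pasted as a PHP string literal.
$content = "صادر في16  الفقرة الأولى :المتعلق (ج. ر. بتاريخ 8 ذو القعدة 1435 -ويجوز بالتالي إصدار الأمرالفقرة4: القعدة ";

print_r(split_on_paragraph_headings($content));
```

With the sample as pasted in the question, both headings are matched without requiring `\n`, so the text is cut at each of them. If the real input still comes back unsplit, the characters in it probably differ from the ones typed in the pattern.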

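To check which colon (or other look-alike character) the input really contains, as asked in the comments, the code points can be printed directly in PHP. This is a minimal sketch: `dump_code_points` is a hypothetical helper, and it assumes PHP 7.2 or later with the mbstring extension, since it relies on `mb_ord`.

```php
<?php
// Hypothetical helper: return the code point of every character in a
// UTF-8 string, formatted as U+XXXX.
function dump_code_points(string $text): array
{
    // With the u modifier, the empty pattern splits between characters
    // rather than between bytes. Fall back to [] if the split fails.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $out = [];
    foreach ($chars as $char) {
        $out[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
    }
    return $out;
}

// An ASCII colon shows up as U+003A, a full-width colon as U+FF1A.
print_r(dump_code_points("4:"));
print_r(dump_code_points("4\u{FF1A}"));
```

Running this on a short slice of the real text around a heading shows whether the colon, digits, and spaces are the characters the pattern expects.
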
0 Answers