my aim

For the "author name" input, I want to make sure that only spaces and UTF-8 letters are entered. My website's language is Turkish, and the Turkish alphabet has non-English characters.

my weird issue

This regex works on rubular.com:
Input string "Selim Çınar": matches.
Input string "Selim Çınar 12": does not match.
Regex: /^[\p{L} ]+$/u

Then I created trial.php on my website and ran the code below.

1

echo '<br /><br /><br />';
    $str ='Selim Cinar';
    if (!preg_match("/^[\p{L} ]+$/u", $str))
    {echo 'no, not only utf-8 letters and spaces';}
    else {$str.' yes utf-8 letters and spaces';}
echo '<br /><br /><br />';

result for code above : an empty page, with only <br /> tags in the page source

2

echo '<br /><br /><br />';
    $str ='Selim Çınar'; //includes Tr characters
    if (!preg_match("/^[\p{L} ]+$/u", $str))
    {echo 'no, not only utf-8 letters and spaces';}
    else {$str.' yes utf-8 letters and spaces';}
echo '<br /><br /><br />';

result for code above : an empty page, with only <br /> tags in the page source

3

code source: http://php.net/manual/en/function.ctype-alpha.php

    $str ='Selim Çınar'; //includes Tr characters
    $str =trim($str);
    $str = str_replace(' ', '', $str);
    setLocale(LC_CTYPE, 'TR_tr.UTF-8');
    if (ctype_alpha($str)) {echo 'yes utf-8 letters';}
    else {echo 'no, not only utf-8 letters';}

result for code above : no, not only utf-8 letters

4

code source: http://php.net/manual/en/function.ctype-alpha.php

    $str ='Selim Cinar';
    $str =trim($str);
    $str = str_replace(' ', '', $str);
    setLocale(LC_CTYPE, 'TR_tr.UTF-8');
    if (ctype_alpha($str)) {echo 'yes utf-8 letters';}
    else {echo 'no, not only utf-8 letters';}

result for code above : yes utf-8 letters

my phpinfo

PHP Version 5.4.10
Apache 2.0 Handler
Apache API version: 20051115
PCRE (Perl Compatible Regular Expressions) Support enabled
PCRE Library Version 8.20 2011-10-21

about trial.php

trial.php is pure PHP, with no HTML header declaration.

My Questions

  1. Why am I getting an empty page for case 1 and case 2? (UPDATE: solved by MikeM below.)
  2. Why doesn't case 3 recognize "SelimÇınar" as UTF-8 letters? Is my code wrong (the setLocale part, maybe)?

UPDATE

Question 1 is solved by MikeM.

Question 2 is still open.

Andre Chenier

2 Answers

You are only getting an empty page for cases 1 and 2 because the regex successfully matches $str and therefore the else branch is executed, but there is no echo there so nothing is printed.
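
As a minimal sketch of that fix, the check from cases 1 and 2 can be wrapped in a function that returns a message on both branches. The function name `describe_name` is hypothetical and not part of the question's code.

```php
<?php
// Hypothetical helper (not from the question): wraps the question's regex
// and returns a message on BOTH branches.
function describe_name($str)
{
    // \p{L} matches any Unicode letter; the /u modifier makes PCRE treat
    // the pattern and the subject as UTF-8.
    if (!preg_match('/^[\p{L} ]+$/u', $str)) {
        return 'no, not only utf-8 letters and spaces';
    }
    // The original else branch built this string but never echoed it.
    return $str . ' yes utf-8 letters and spaces';
}

echo describe_name('Selim Cinar'), "<br />\n";
echo describe_name('Selim Cinar 12'), "<br />\n";
```

In the original snippets, putting `echo` in front of `$str.' yes utf-8 letters and spaces'` has the same effect.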

I don't know the answer to your second question. The setLocale looks okay to me, but its behaviour is system dependent.
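
One way to see the system dependence is to check what `setlocale` returns: it gives back `false` when none of the requested locales is installed. The sketch below is only illustrative. It tries the conventional spelling `tr_TR.UTF-8` first and then the question's `TR_tr.UTF-8`, and the function name `try_turkish_locale` is hypothetical. It also prints the byte length of "Ç", since `ctype_alpha` examines a string one byte at a time, which may be relevant to case 3.

```php
<?php
// Hypothetical helper: tries to switch LC_CTYPE to a Turkish UTF-8 locale.
function try_turkish_locale()
{
    // setlocale() tries each name in turn and returns the one that was set,
    // or false when none of them is available on this system.
    return setlocale(LC_CTYPE, 'tr_TR.UTF-8', 'TR_tr.UTF-8');
}

$result = try_turkish_locale();
if ($result === false) {
    echo "No Turkish UTF-8 locale is available on this system\n";
} else {
    echo "LC_CTYPE is now: $result\n";
}

// In UTF-8, "Ç" is stored as two bytes, so a byte-wise check such as
// ctype_alpha() never sees it as a single letter.
$letter = "\xC3\x87"; // UTF-8 bytes for "Ç"
echo strlen($letter), " bytes\n";
```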

MikeM

Do not use setlocale or utf8_decode. Your problem is very simple: your PHP source files are not saved in UTF-8. How a file is saved depends on your text editor.

This is what WILL work once your file is saved correctly:

$str = 'Selim Çınar'; //Since this is a string literal, its encoding is determined by
                     //how this source file was saved
if (preg_match("/^[\p{L} ]+$/u", $str)) {
    echo 'yes only utf-8 letters or spaces';
} else {
    echo 'no, not only utf-8 letters or spaces';
}
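
To check whether a string really is UTF-8 before matching, one option is a sketch like the following. The helper name `is_valid_utf8` is hypothetical, and the byte sequences are written out explicitly so the result does not depend on how this file itself is saved.

```php
<?php
// Hypothetical helper (not from the answer): reports whether $str is valid UTF-8.
function is_valid_utf8($str)
{
    // With the /u modifier, preg_match() returns false (not 0) when the
    // subject is not valid UTF-8; the empty pattern otherwise always matches.
    return preg_match('//u', $str) === 1;
}

$utf8   = "Selim \xC3\x87\xC4\xB1nar"; // "Selim Çınar" encoded as UTF-8
$latin5 = "Selim \xC7\xFDnar";         // the same name encoded as ISO-8859-9

var_dump(is_valid_utf8($utf8));
var_dump(is_valid_utf8($latin5));
```

If the second form is what the source file contains, the `/u` regex above cannot match it, whatever the locale is.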
Esailija
  • I am not sure why you are so sure the OP's source files are not UTF-8. Even as UTF-8, the code for the OP's second question gives `'no, not only utf-8 letters'` and he wants to know why this is. – MikeM Mar 29 '13 at 12:23
  • @MikeM then even you don't have your source files as UTF-8, it does not print anything for me. That's because there is no `echo` in the else. – Esailija Mar 29 '13 at 12:43
  • @MikeM hmm, wait, are you referring to the snippet `2.` or the final snippet in the question? – Esailija Mar 29 '13 at 12:46
  • @MikeM I see, I have nothing on that because it makes no sense to use `ctype_*` functions in the first place. :) – Esailija Mar 29 '13 at 12:54
  • @Esailija ok, then I understand that if I declare HTML lang is utf-8 then I won't have case 3 and case 4 which correspond to my question 2. Thank you. – Andre Chenier Mar 29 '13 at 12:58
  • @AndreChenier just don't use anything that depends on `setlocale`, there are always alternatives that just work. – Esailija Mar 29 '13 at 12:59