  1. The first question is this:

I am using http://www.phpliveregex.com/ to check that my regex is right, and it finds more than one matching line.

I am running this regex:

$lines = explode('\n', $text);
foreach($lines as $line) {
    $matches = [];
    preg_match("/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+\-[0-9]+T[0-9]+/uim", $line, $matches);

    print_r($matches);
}

on the $text which looks like this: http://pastebin.com/9UQ5wNRu

The problem is that only one match is printed:

Array
(
     [0] => 3Bajus StanislavS2415079249-2615T01
)

Why is it doing this? Any ideas what could fix the problem?

  2. The second question:

Maybe you've noticed the non-ASCII letters of the Slovak language inside the text (from pastebin). How can I match those characters and select the users who have this format:

{number}{first_name}{space}{last_name}{id_number}

How can I do that?

OK, the first issue is fixed. Thank you, @chris85. I should have used preg_match_all and run it on the whole text. Now I get an array of all students who have only non-Slovak (English) letters in their names.

durisvk

2 Answers


preg_match is for one match. You need to use preg_match_all for a global search.
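To see the difference on a small input, here is a minimal sketch. The sample lines are made up to mimic the format in the question (they are not the pastebin data), and the variable names are only illustrative. It also checks one likely contributing factor: the question's `explode('\n', $text)` uses single quotes, and PHP does not interpret `\n` inside single quotes, so the text is probably never split and the loop runs once over the whole text.

```php
<?php
// Hypothetical sample in the question's format (not the real pastebin data).
$text = "3Bajus StanislavS2415079249-2615T01\n"
      . "4Novak PeterS2415079250-2615T02\n"
      . "not a student line";

// The question's pattern, unchanged apart from the unneeded escape on "-".
$pattern = '/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+-[0-9]+T[0-9]+/uim';

// preg_match stops at the first match and returns 0 or 1.
$firstCount = preg_match($pattern, $text, $first);

// preg_match_all keeps scanning and returns the number of full matches.
$allCount = preg_match_all($pattern, $text, $all);

// Single quotes keep \n as a literal backslash and "n", so nothing is split.
$unsplit = explode('\n', $text);
// Double quotes produce a real newline character.
$split = explode("\n", $text);

print_r($first);   // the first full match only
print_r($all[0]);  // every full match
```

With preg_match_all the full matches are collected in `$all[0]`, so no manual splitting into lines is needed.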

[A-Z] does not include any characters outside that range. Since you are using the i modifier, that character class is actually [A-Za-z], which may or may not be what you want. You can use \p{L} in its place to match letters from any language.
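A short sketch of that difference, assuming the source file and the input are UTF-8 encoded. The name below is a hypothetical Slovak-looking example, not taken from the pastebin.

```php
<?php
// Hypothetical name containing letters outside A-Z / a-z.
$name = 'Ďurišová';

// ASCII-only classes: no match, because Ď, š and á are outside the ranges,
// even with the i modifier.
$asciiOnly = preg_match('/^[A-Z][a-z]+$/ui', $name);

// \p{L} matches a letter from any script; the u modifier makes PCRE
// treat both pattern and subject as UTF-8.
$anyLetter = preg_match('/^\p{L}+$/u', $name);

var_dump($asciiOnly, $anyLetter);
```

The u modifier matters here: without it the subject is processed byte by byte, so multi-byte letters are not handled as single characters.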

Demo: https://regex101.com/r/L5g3C9/1

So your PHP code should just be:

preg_match_all("/^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$/uim", $text, $matches);
print_r($matches);
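If the individual fields are needed, one possible extension is to add named groups to the same pattern. This is only a sketch: the group names follow the {number}{first_name}{space}{last_name}{id_number} layout from the question, the assumption that id_number is the whole `S…-…T…` part is mine, and the sample lines are invented.

```php
<?php
// Hypothetical sample lines; the second one contains Slovak letters.
$text = "3Bajus StanislavS2415079249-2615T01\n"
      . "5Ďuriš ŠtefanS2415079251-2615T03";

// Same pattern as above with named groups added (group names are assumptions).
$pattern = '/^(?<number>[0-9]+)(?<first_name>\p{L}+) (?<last_name>\p{L}+)'
         . '(?<id_number>S[0-9]+-[0-9]+T[0-9]+)$/uim';

// PREG_SET_ORDER returns one array per matched line instead of one per group.
preg_match_all($pattern, $text, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    echo $row['number'], ' | ', $row['first_name'], ' | ',
         $row['last_name'], ' | ', $row['id_number'], "\n";
}
```

Because `\p{L}+` is greedy it first swallows the `S` that starts the ID, then backtracks one character so the rest of the pattern can match.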
chris85
  • Wow. How dumb of me. Had the same problem and didn't know what was wrong. The global modifier isn't allowed, so I thought it was the default; staring at the manual page of preg_match didn't help me. Maybe I should've looked further down at the similar functions. ^^ – Dennis98 Oct 15 '17 at 22:42
  • @Dennis98 If you find this answer useful you could give it an upvote – chris85 Oct 16 '17 at 01:42
  • Hmm, interesting.. Normally I do that, I haven't here or didn't have internet connection then. And it's also interesting, that you noticed I didn't upvote - normally you don't see who upvoted and who not.. – Dennis98 Oct 18 '17 at 16:17

You can also use T-Regx library:

pattern("^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$", 'uim')->match($text)->all();
Danon