
I have the string below. It contains mentions of three users in the format @[ID:username__FULLNAME], and I want to extract them. I have tried the code below, but I am not getting the desired results.

ID is an integer.
username and FULLNAME may contain digits, letters, and all kinds of special characters.


$string = 'Hi @[4232:mark__MΛRK ATTLEY] how are you ? 
    Hi @[4232:ryan__RYΛN вυηту] how are you ? 
    Hi @[4232:david__DΛVID शाहिद ] how are you ? 
    ';

My PHP CODE:

$pattern = "|(?:(@\[[0-9]+:[\s\S(?!\])]+\]*))|";
preg_match_all($pattern, $string, $mentionList, PREG_PATTERN_ORDER);
print_r($mentionList);

Current Result:

Array
(
    [0] => Array
        (
            [0] => @[4232:mark__MΛRK ATTLEY] how are you ? 
    Hi @[4232:ryan__RYΛN вυηту] how are you ? 
    Hi @[4232:david__DΛVID शाहिद] how are you ? 

        )

    [1] => Array
        (
            [0] => @[4232:mark__MΛRK ATTLEY] how are you ? 
    Hi @[4232:ryan__RYΛN вυηту] how are you ? 
    Hi @[4232:david__DΛVID शाहिद] how are you ? 

        )

)

Expected Result:

Array
(
    [0] => Array
        (
            [0] => @[4232:mark__MΛRK ATTLEY]
            [1] => @[4232:ryan__RYΛN вυηту]
            [2] => @[4232:david__DΛVID शाहिद ]
        )

)

Can someone help me get the desired results?

Thanks.

Aefits

3 Answers


You can use this regex with 3 captured groups:

/@\[(\d+):(\S+)\h+(\S+)\h*\]/

RegEx Demo

RegEx Explanation:

  • @: Match literal @
  • \[: Match literal [
  • (\d+): Match 1+ digits and capture it in group #1 for id
  • :: Match literal :
  • (\S+): Match 1+ non-whitespace characters and capture it in group #2 for firstName
  • \h+: Match 1 or more horizontal whitespaces
  • (\S+): Match 1+ non-whitespace characters and capture it in group #3 for lastName
  • \h*: Match 0 or more horizontal whitespaces
  • \]: Match literal ]
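
As a rough sketch of how this pattern could be applied to the string from the question: the `/u` modifier and the `PREG_SET_ORDER` flag are assumptions added here (they are not part of the answer), chosen so that the non-ASCII names are treated as UTF-8 and each mention comes back with its own groups.

```php
<?php
// Input string from the question (single-quoted, so nothing is interpolated).
$string = 'Hi @[4232:mark__MΛRK ATTLEY] how are you ?
    Hi @[4232:ryan__RYΛN вυηту] how are you ?
    Hi @[4232:david__DΛVID शाहिद ] how are you ?
    ';

// Pattern from the answer; the /u modifier is an addition (UTF-8 mode).
$pattern = '/@\[(\d+):(\S+)\h+(\S+)\h*\]/u';

// PREG_SET_ORDER returns one sub-array per mention instead of one per group.
preg_match_all($pattern, $string, $mentions, PREG_SET_ORDER);

foreach ($mentions as $m) {
    // $m[0] is the whole mention; $m[1] to $m[3] are the three captured groups.
    printf("%s | id=%s | %s | %s\n", $m[0], $m[1], $m[2], $m[3]);
}
```

Note that the pattern expects exactly two whitespace-separated chunks after the colon, so a mention whose name part has no space, or more than one space-separated word after the first, would not match without adjusting it.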
anubhava
  • Although I am getting the expected results with your regex, I am also getting some additional data that I don't need. Edwin's regex is perfect for me. TY anyway. – Aefits Jan 18 '18 at 11:35
  • See my comment on that answer. What is the point of defining `@[id:firstName lastName]` and then allowing just `@[]` as valid input? – anubhava Jan 18 '18 at 11:37
  • Yup, seen. Thanks for pointing out that mistake. I replaced * with + to solve it – Aefits Jan 18 '18 at 11:41

You can use the following regex: @\[.+\] (demo). It matches everything inside the [] plus the leading @.

Check this working php demo
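
A minimal sketch of running this pattern against the question's input, assuming the string is stored in `$string`. The `/u` modifier is an addition made here for UTF-8 input and is not part of the answer.

```php
<?php
// Input string from the question.
$string = 'Hi @[4232:mark__MΛRK ATTLEY] how are you ?
    Hi @[4232:ryan__RYΛN вυηту] how are you ?
    Hi @[4232:david__DΛVID शाहिद ] how are you ?
    ';

// "." does not match a newline by default, so each line is matched on its own.
$pattern = '/@\[.+\]/u';

// With the default PREG_PATTERN_ORDER, index 0 holds the full matches.
preg_match_all($pattern, $string, $mentionList);

print_r($mentionList[0]);
```

This yields one match per mention here because every line of the sample contains a single `]`; a line with a second closing bracket would extend the match up to that bracket.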

Edwin
  • It will also match `@[4232:mark__MΛRK ATTLEY] []` or just `@[]` – anubhava Jan 18 '18 at 11:35
  • There isn't any requirement not to match that. What if, in `@[4232:mark__MΛRK ATTLEY] []`, the username is `mark__MΛRK ATTLEY] [`? Maybe it is. – Edwin Jan 18 '18 at 11:40
  • I updated the original to @\[.+\], which resolved the [] and @[] issue – Aefits Jan 18 '18 at 11:41
  • @elegant-user What about `@[4232:mark__MΛRK ATTLEY] i have array[10]` without the ungreedy quantifier (`?`)? https://regex101.com/r/T5Ts0m/5 – splash58 Jan 18 '18 at 11:54
  • @anubhava As I said, that's not specified in the question, and your regex is also not 100% right (see the example with the `] [`). Sure, both can be improved, but only with more input. Judging by the expected result, only the `@` and what's inside the `[ ]` seem to be needed. Cheers – Edwin Jan 18 '18 at 11:58
  • @Edwin Now the only difference between the expression in my comment and yours is the question mark, which makes mine ungreedy. As a result, my expression matches up to the first closing bracket, while yours matches up to the last. I think the first behaviour is more suitable because it doesn't depend on additional text content – splash58 Jan 18 '18 at 13:30

Not sure if this will give you the exact output you are looking for, but your regex is a bit too greedy. You can simplify it like this: (?:@\[[0-9]+.+?])

This should return each mention as a separate match.

Not sure if the non-capturing group is needed, so it could be simplified down to (@\[[0-9]+.+?]) or possibly even (@\[.+?]).
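
To illustrate what "too greedy" means in practice, here is a small sketch comparing a greedy `.+` with the lazy `.+?` suggested above. The sample text is the one raised in the comments on another answer; the `/u` modifier and the variable names are assumptions made for this example.

```php
<?php
// A mention followed by unrelated square brackets on the same line.
$text = 'Hi @[4232:mark__MΛRK ATTLEY] i have array[10]';

// Greedy: ".+" runs to the end of the line, then backtracks to the LAST "]".
preg_match_all('/@\[.+\]/u', $text, $greedy);

// Lazy: ".+?" expands one character at a time and stops at the FIRST "]".
preg_match_all('/@\[[0-9]+.+?]/u', $text, $lazy);

print_r($greedy[0]); // single match: the mention plus " i have array[10]"
print_r($lazy[0]);   // single match: just the mention
```

The lazy form keeps the match independent of whatever text follows the mention, at the cost of cutting a name short if the name itself ever contains a `]`.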

KillerX