
How can I match "Any Group" when it is repeated as "ANY GROUP" or "ANYGROUP"?

$string = "Foo Bar (Any Group - ANY GROUP Baz)
           Foo Bar (Any Group - ANYGROUP Baz)";

so that both lines come back as "Foo Bar (Any Group - Baz)"?

The separator would always be -

This question extends "Regex/PHP Replace any repeating word group".

The pattern below matches "Any Group - ANY GROUP", but not the case where the repeat is written without a space ("ANYGROUP").

$result = preg_replace(
    '%
    (                 # Match and capture
     (?:              # the following:...
      [\w/()]{1,30}   # 1-30 "word" characters
      [^\w/()]+       # 1 or more non-word characters
     ){1,4}           # 1 to 4 times
    )                 # End of capturing group 1
    ([ -]*)           # Match any number of intervening characters (space/dash)
    \1                # Match the same as the first group
    %ix',             # Case-insensitive, verbose regex
    '\1\2', $subject);
Martin

2 Answers


This is ugly (as I said it would be), but it should work:

$result = preg_replace(
    '/((\b\w+)\s+)               # One repeated word
    \s*-\s*
    \2
    |
    ((\b\w+)\s+(\w+)\s+)         # Two repeated words
    \s*-\s*
    \4\s*\5
    |
    ((\b\w+)\s+(\w+)\s+(\w+)\s+) # Three
    \s*-\s*
    \7\s*\8\s*\9
    |
    ((\b\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+)  # Four
    \s*-\s*
    \11\s*\12\s*\13\s*\14\b/ix', 
    '\1\3\6\10-', $subject);
Tim Pietzcker
  • Thanks. Yes, 1-4 words in "Any Group" are fine. Sorry for not being clear about that. – Martin Nov 04 '12 at 11:59
  • I changed every `\s*` to `\s+`, though; otherwise `Foo Bar (Any Group - ANY GROUPXYZ Baz)` causes problems. – Martin Nov 04 '12 at 12:04
  • What do you mean by "problems"? What should be the result in this case? Chances are that there is a better solution than using `\s+` (I made the spaces optional for a reason). – Tim Pietzcker Nov 04 '12 at 12:40
  • `Foo Bar (Any Group - ANY GROUPXYZ Baz)` becomes `Foo Bar (Any Group - XYZ Baz)`, but it should stay untouched because the second part is not the same. `\s+` prevents that because it requires a blank after ANY GROUP; `\s*` makes the blank optional. – Martin Nov 04 '12 at 17:14
  • OK (you could have specified that you only wanted to match entire words), that's easy to achieve with word boundary anchors: `\b` matches only at the start and end of a word. – Tim Pietzcker Nov 04 '12 at 18:39
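
The word-boundary suggestion from the last comment could look roughly like the sketch below. It keeps the optional `\s*` between the repeated words (so `ANYGROUP` still matches) and adds `\b` after each repeated group so that `ANY GROUPXYZ` is left alone. The function name `dedupe_repeated_group` is made up for illustration, and placing `\b` at the end of every alternative is an assumption about what the comment intends, not code posted by the answerer.

```php
<?php
// Sketch only: the alternation from the answer above, with a trailing \b
// added to every branch. dedupe_repeated_group() is a hypothetical name.
function dedupe_repeated_group(string $subject): string
{
    return preg_replace(
        '/((\b\w+)\s+)                        # One repeated word
        \s*-\s*
        \2\b
        |
        ((\b\w+)\s+(\w+)\s+)                  # Two repeated words
        \s*-\s*
        \4\s*\5\b
        |
        ((\b\w+)\s+(\w+)\s+(\w+)\s+)          # Three
        \s*-\s*
        \7\s*\8\s*\9\b
        |
        ((\b\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+)  # Four
        \s*-\s*
        \11\s*\12\s*\13\s*\14\b/ix',
        '\1\3\6\10-',   // only the branch that matched contributes text
        $subject
    );
}

// The two inputs from the question plus the case raised in the comments.
$samples = [
    'Foo Bar (Any Group - ANY GROUP Baz)',
    'Foo Bar (Any Group - ANYGROUP Baz)',
    'Foo Bar (Any Group - ANY GROUPXYZ Baz)',
];

foreach ($samples as $sample) {
    echo dedupe_repeated_group($sample), PHP_EOL;
}
```

With the trailing `\b`, the first two samples should collapse to "Foo Bar (Any Group - Baz)", while the third should come back unchanged because `GROUP` is followed by more word characters.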

A solution that handles up to 6 words:

$result = preg_replace(
    '/
     (\(\s*)
     (([^\s-]+)
      \s*?([^\s-]*)
      \s*?([^\s-]*)
      \s*?([^\s-]*)
      \s*?([^\s-]*)
      \s*?([^\s-]*))
     (\s*\-\s*)
     \3\s*\4\s*\5\s*\6\s*\7\s*\8\s*
     /ix',
     '\1\2\9',
     $string);

Check this demo.
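
Since the demo link is not reproduced here, a minimal harness along these lines can be used to try the pattern against the strings from the question. The pattern and replacement are copied from the snippet above; the wrapper name `collapse_up_to_six_words` is hypothetical.

```php
<?php
// Hypothetical wrapper around the pattern from this answer, for local testing.
function collapse_up_to_six_words(string $string): string
{
    return preg_replace(
        '/
         (\(\s*)                  # 1: opening parenthesis
         (([^\s-]+)               # 2: whole group, 3: first word
          \s*?([^\s-]*)           # 4-8: up to five more optional words
          \s*?([^\s-]*)
          \s*?([^\s-]*)
          \s*?([^\s-]*)
          \s*?([^\s-]*))
         (\s*\-\s*)               # 9: the dash separator
         \3\s*\4\s*\5\s*\6\s*\7\s*\8\s*
         /ix',
        '\1\2\9',
        $string
    );
}

// Input taken from the question (two lines in one string).
$string = "Foo Bar (Any Group - ANY GROUP Baz)
           Foo Bar (Any Group - ANYGROUP Baz)";

echo collapse_up_to_six_words($string), PHP_EOL;
```

Note that this pattern anchors on the opening parenthesis and has no word-boundary check after the repeated part, so it is worth testing it against inputs such as the `ANY GROUPXYZ` case discussed under the other answer before relying on it.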

Ωmega