  1. I'm converting a user input string (successfully but...)
  2. I would like to ignore characters wrapped in braces
  3. Also remove the braces in the final output

So for instance if I have this string:

$string = "[ABC] This & Text";

function make_post_type($string) {
  $needle   = array('-', ' ');
  $clean    = preg_replace("/[^a-zA-Z0-9_\s]/", "", strtolower($string)); // Remove special characters
  $haystack = preg_replace('!\s+!', ' ', $clean); // Now remove extra spaces

  return str_replace($needle, '_', $haystack);
}

returns abc_this_text

I would like to return ABC_this_text

Aaron
  • Using strtolower twice on the same string may have something to do with it.... – Andreas Oct 17 '18 at 17:47
  • @Andreas That was a mistake, code updated ;) – Aaron Oct 17 '18 at 17:50
  • I would split the job into two parts: first detect the string in the brackets to build the new output, then use **[^a-zA-Z\[\]]+** to replace everything else with **_**. I don't know much PHP, so that part is on you. – lucas_7_94 Oct 17 '18 at 18:03

4 Answers


You may use this regex code in preg_replace_callback:

function replc($str) {
   return preg_replace_callback (
      '/\[([^]]*)\]|{([^}]*)}|([^][{}]+)/',
      function ($m) {
         return (isset($m[1])?$m[1]:"") .
                (isset($m[2])?$m[2]:"") .
                 (isset($m[3]) ?
                 preg_replace('/\W+/', '_', strtolower($m[3])) : "");
      },
      $str
   );
}

Call it as:

echo replc( "[ABC] This & Text" );
ABC_this_text

echo replc( "abc.Xyz {PQR} Foo-Bar [ABC] This & Text" );
abc_xyz_PQR_foo_bar_ABC_this_text

1st RegEx Details:

  • \[([^]]*)\]: If we encounter [...] then capture inner part in group #1
  • |: OR
  • {([^}]*)}: If we encounter {...} then capture inner part in group #2
  • |: OR
  • ([^][{}]+): Match 1+ characters that are not [, ], { or } and capture them in group #3

2nd RegEx:

  • \W+: match 1+ non-word character to be replaced by _
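To see the three alternatives at work, here is a minimal sketch that labels each match by the branch that produced it. The helper name describe_branches is hypothetical and not part of the answer; it reuses the same pattern and relies on PHP leaving trailing unmatched groups out of each match array.

```php
<?php
// Hypothetical helper (not from the answer): report which branch of the
// pattern produced each match.
function describe_branches(string $str): array {
    $pattern = '/\[([^]]*)\]|{([^}]*)}|([^][{}]+)/';
    preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);

    $out = [];
    foreach ($matches as $m) {
        if (isset($m[3])) {
            $out[] = 'plain:' . $m[3];      // third alternative: free text
        } elseif (isset($m[2])) {
            $out[] = 'braces:' . $m[2];     // second alternative: {...}
        } else {
            $out[] = 'brackets:' . $m[1];   // first alternative: [...]
        }
    }
    return $out;
}

print_r(describe_branches("{PQR} Foo [ABC]"));
```

Only the pieces labelled plain would go through strtolower and the \W+ replacement; the other two are returned untouched.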
anubhava

One solution is to split the string into an array of words.
If a word is wrapped in [], only strip the brackets; otherwise remove the special characters and apply strtolower as before.

Then implode the array back into a string and return it.

$string = "[ABC] This & Text";
echo make_post_type($string);


function make_post_type($string) {
  $needle = array('-', ' ');
  $arr    = explode(" ", $string);
  foreach ($arr as &$a) {
     // A word is kept as-is (minus the brackets) only when it is fully wrapped in [ ]
     if ($a !== '' && $a[0] === "[" && substr($a, -1) === "]") {
        $a = substr($a, 1, -1);
     } else {
        $a = preg_replace("/[^a-zA-Z0-9_\s]/", "", strtolower($a)); // Remove special characters
     }
  }
  unset($a); // Break the reference left over from the foreach
  $string = preg_replace('!\s+!', ' ', implode(" ", $arr)); // Now remove extra spaces

  return str_replace($needle, '_', $string);
}

https://3v4l.org/rZlaP

Andreas

You might use preg_match_all and use 2 capturing groups.

\[([A-Z]+)\]|(\w+)

Use array_reduce to check the capturing groups by index and finally implode using an underscore:

For example:

$re = '/\[([A-Z]+)\]|(\w+)/';
$string = "[ABC] This & Text";
preg_match_all($re, $string, $matches, PREG_SET_ORDER, 0);

echo implode('_', array_reduce($matches, function($carry, $item){
    if ($item[1] === "") {
        $carry[] = strtolower($item[2]);
        return $carry;
    }
    $carry[] = $item[1];
    return $carry;

}, [])); // ABC_this_text

Explanation

  • \[([A-Z]+)\] Match [, capture 1+ uppercase characters in a group and match ]. To match everything between the brackets you could use \[([^]]+)\] instead.
  • | Or
  • (\w+) Capture in a group 1+ word characters. If you want to match more than \w, you could use a character class and add what you want to match, for example [\w!?]+
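As a rough sketch of the two variations mentioned in the explanation, the function below swaps in \[([^]]+)\] and [\w!?]+. The name bracket_aware_slug is hypothetical, and a plain foreach stands in for array_reduce to keep it short.

```php
<?php
// Hypothetical variant (not from the answer) using the broader patterns.
function bracket_aware_slug(string $input): string {
    // Group 1: anything between [ and ]; group 2: word characters plus ! and ?.
    preg_match_all('/\[([^]]+)\]|([\w!?]+)/', $input, $matches, PREG_SET_ORDER);

    $parts = [];
    foreach ($matches as $m) {
        // Group 1 is an empty string when the second alternative matched.
        $parts[] = $m[1] !== '' ? $m[1] : strtolower($m[2]);
    }
    return implode('_', $parts);
}

echo bracket_aware_slug("[Ab-1] This & Text?"); // Ab-1_this_text?
```

With the original [A-Z]+ the bracketed part Ab-1 would not match as a unit, which is the case the broader class is meant to cover.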

Regex demo | Php demo

The fourth bird

You could reduce the number of steps by looking at the problem another way: first match only the parts you need, then replace runs of whitespace and dashes with _ at the end:

function make_post_type($s) {
    preg_match_all("~({[^{}]*}|\[[^][]*])|[\w\s]+~", $s, $m);
    $s = '';
    foreach($m[0] as $k => $v) {
        $s .= $m[1][$k] ? substr($v, 1, -1) : strtolower($v);
    }
    return preg_replace('~[-\s]+~', '_', $s);
}

I enclosed {[^{}]*}|\[[^][]*] in parentheses so that (bool) $m[1][$k] can be checked later. It tells whether the current match in the iteration was captured by that group; if so, one leading and one trailing character are stripped from it.
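To make the role of $m[1] more concrete, here is a small sketch that pairs every full match with the result of that boolean check. The helper tag_matches is a hypothetical name used only for illustration; the pattern is the one from the function above, written in single quotes.

```php
<?php
// Hypothetical helper (not from the answer): pair each full match with a
// flag saying whether capture group 1 (the bracketed/braced part) matched.
function tag_matches(string $s): array {
    preg_match_all('~({[^{}]*}|\[[^][]*])|[\w\s]+~', $s, $m);

    $tagged = [];
    foreach ($m[0] as $k => $v) {
        // $m[1][$k] is "" when the [\w\s]+ alternative matched instead.
        $tagged[] = [$v, (bool) $m[1][$k]];
    }
    return $tagged;
}

var_dump(tag_matches("[ABC] This"));
```

Entries flagged true are the ones that get substr($v, 1, -1); the rest are lowercased.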

See live demo here

revo