First, since you already have working code that you want to improve, consider posting your question on Code Review instead of Stack Overflow next time.
Let's start by improving your original approach:
$result = preg_replace_callback('~"[^"]*"\s*:~', function ($m) {
return preg_replace_callback('~_+(.?)~', function ($n) {
return strtoupper($n[1]);
}, strtolower($m[0]));
}, $str);
pro: patterns are relatively simple and the idea is easy to understand.
cons: nested preg_replace_callback calls may hurt the eyes.
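To see what this nested version does, here is a minimal, self-contained run. The sample $str is an assumption made up for illustration (it is not taken from the question); it mixes an upper-case key, a doubled underscore, and a string value that contains an underscore.

```php
<?php
// Hypothetical sample input, invented for this illustration.
$str = '{"FIRST_NAME": "John_Doe", "last__name": 42}';

// Outer callback: receives each quoted key together with its trailing colon.
$result = preg_replace_callback('~"[^"]*"\s*:~', function ($m) {
    // Inner callback: drops each run of underscores and upper-cases
    // the character that follows it (if any).
    return preg_replace_callback('~_+(.?)~', function ($n) {
        return strtoupper($n[1]);
    }, strtolower($m[0]));
}, $str);

echo $result, PHP_EOL; // {"firstName": "John_Doe", "lastName": 42}
```

The value "John_Doe" is left alone because its closing quote is followed by a comma, not by a colon.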
After this warm-up exercise for the eyes, we can try a \G-based pattern approach:
$pattern = '~(?|\G(?!^)_([^_"]*)|("(?=[^"]*"\s*:)[^_"]*))~';
$result = preg_replace_callback($pattern, function ($m) {
return ucfirst(strtolower($m[1]));
}, $str);
pro: the code is shorter, with no need for two preg_replace_callback calls.
cons: the pattern is far more complicated.
notice: when you write a long pattern, nothing prevents you from using free-spacing mode (the x modifier) and adding comments:
$pattern = '~
(?| # branch reset group: capture groups in each alternative share the same numbers
\G # contiguous to the last successful match
(?!^) # but not at the start of the string
_
( [^_"]* ) # capture group 1
|
( # capture group 1
"
(?=[^"]*"\s*:) # lookahead to check if it is the "key part"
[^_"]*
)
)
~x';
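As a quick sanity check, the compact and the commented patterns can be run side by side. This is only a sketch: the sample $str and the names $compact, $verbose and $toCamel are made up for this illustration.

```php
<?php
// Hypothetical sample input, invented for this illustration.
$str = '{"FIRST_NAME": "John_Doe", "last__name": 42}';

$compact = '~(?|\G(?!^)_([^_"]*)|("(?=[^"]*"\s*:)[^_"]*))~';

$verbose = '~
    (?|                    # branch reset group
        \G (?!^)           # contiguous to the last match, not at the string start
        _
        ( [^_"]* )         # capture group 1
      |
        (                  # capture group 1
            "
            (?=[^"]*"\s*:) # lookahead: are we in the key part?
            [^_"]*
        )
    )
~x';

// Same callback for both patterns: group 1 never contains the underscore.
$toCamel = function ($m) {
    return ucfirst(strtolower($m[1]));
};

$a = preg_replace_callback($compact, $toCamel, $str);
$b = preg_replace_callback($verbose, $toCamel, $str);

var_dump($a === $b); // bool(true)
echo $a, PHP_EOL;    // {"firstName": "John_Doe", "lastName": 42}
```

Whitespace inside a character class stays literal in x mode, which is why the classes are written without inner spaces.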
Are there compromises between these two extremes, and which one is best? Two suggestions:
$result = preg_replace_callback('~"[^"]+"\s*:~', function ($m) {
return array_reduce(explode('_', strtolower($m[0])), function ($c, $i) {
return $c . ucfirst($i);
});
}, $str);
pro: minimal use of regex.
cons: still needs two callback functions, although this time the second one is called by array_reduce rather than by preg_replace_callback.
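The explode/array_reduce step may be easier to follow in isolation. The sketch below applies it to a single, already lower-cased key; the value of $key is a made-up example, and $parts and $camel are names introduced only for this illustration.

```php
<?php
// One already lower-cased match, as the outer callback would see it (made-up example).
$key = '"last__name":';

$parts = explode('_', $key); // ['"last', '', 'name":']

// array_reduce gets no initial value here, so $c starts as null;
// concatenating null with a string behaves like concatenating ''.
$camel = array_reduce($parts, function ($c, $i) {
    return $c . ucfirst($i);
});

echo $camel, PHP_EOL; // "lastName":
```

The first fragment starts with a double quote, so ucfirst leaves it unchanged, and the empty fragment produced by the doubled underscore contributes nothing.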
$result = preg_replace_callback('~["_][^"_]*(?=[^"]*"\s*:)~', function ($m) {
return ucfirst(strtolower(ltrim($m[0], '_')));
}, $str);
pro: the pattern is relatively simple and the callback function stays simple too. It looks like a good compromise.
cons: the pattern isn't very restrictive (but should suffice for your use case).
pattern description: the pattern looks for a _ or a " and matches the following characters that are neither _ nor ". A lookahead assertion then checks that these characters are inside the key part by looking for a closing quote followed by a colon. The match always looks like _aBc or "aBc (underscores are trimmed on the left in the callback function, and a leading " is left unchanged by ucfirst).
pattern details:
["_] # one " or _
[^"_]* # zero or more characters that aren't " or _
(?= # open a lookahead assertion (followed by)
[^"]* # all that isn't a "
" # a literal "
\s* # optional whitespace
: # a literal :
) # close the lookahead assertion
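To make the description above concrete, the following sketch first lists the raw fragments the pattern matches, then applies the callback. The sample $str is an assumption invented for this illustration.

```php
<?php
// Hypothetical sample input, invented for this illustration.
$str = '{"FIRST_NAME": "John_Doe", "last_name": 42}';
$pattern = '~["_][^"_]*(?=[^"]*"\s*:)~';

// List the raw fragments the pattern matches, before any callback runs.
preg_match_all($pattern, $str, $matches);
print_r($matches[0]); // "FIRST, _NAME, "last, _name

$result = preg_replace_callback($pattern, function ($m) {
    // Drop the leading underscore (if any), lower-case, then capitalize.
    return ucfirst(strtolower(ltrim($m[0], '_')));
}, $str);

echo $result, PHP_EOL; // {"firstName": "John_Doe", "lastName": 42}
```

Fragments of "John_Doe" are not matched because the next double quote after them is followed by a comma rather than a colon.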
There's no good answer and what looks simple or complicated really depends on the reader.