I have this string:
"Common Waxbill - Estrilda astrild"
How can I write 2 separate regexes for the words before and after the hyphen? The output I would want is:
"Common Waxbill"
and
"Estrilda astrild"
This is quite simple:
.*(?= - ) # matches everything before " - "
(?<= - ).* # matches everything after " - "
See this tutorial on lookaround assertions.
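As a minimal sketch of how these two patterns might be applied, here they are run through PHP's preg_match(). The choice of PHP and the variable names ($s, $before, $after) are assumptions for illustration, since the question does not name a language.

```php
<?php
// Sample input from the question.
$s = "Common Waxbill - Estrilda astrild";

// Lookahead: match everything that is followed by " - ".
preg_match('/.*(?= - )/', $s, $m);
$before = $m[0];

// Lookbehind: match everything that is preceded by " - ".
preg_match('/(?<= - ).*/', $s, $m);
$after = $m[0];

echo $before, "\n", $after, "\n";
```

Because the spaces are part of the lookaround, neither match includes the padding around the hyphen.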
If you cannot use look-behinds, but your string is always in the same format and cannot contain more than a single hyphen, you could use
^[^-]*[^ -]
for the first one and
\w[^-]*$
for the second one (or
[^ -][^-]*$
if the first non-space character after the hyphen is not necessarily a word character).
A little bit of explanation:
^[^-]*[^ -]
matches the start of the string (anchor ^), followed by any number of characters that are not a hyphen, and finally a character that is neither a hyphen nor a space (just to exclude the last space from the match).
[^ -][^-]*$
takes the same approach, but the other way around: it first matches a character that is neither a space nor a hyphen, followed by any number of characters that are not a hyphen, and finally the end of the string (anchor $).
\w[^-]*$
is basically the same, but it uses the stricter \w instead of [^ -]. This is again used to exclude the whitespace after the hyphen from the match.
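The lookbehind-free patterns can be checked the same way. This is only a sketch, assuming PHP's preg_match() and illustrative variable names; any regex engine with ordinary character classes should behave the same.

```php
<?php
// Sample input from the question (exactly one hyphen).
$s = "Common Waxbill - Estrilda astrild";

// Before the hyphen: non-hyphen characters from the start of the string,
// ending on a character that is neither a space nor a hyphen.
preg_match('/^[^-]*[^ -]/', $s, $m);
$before = $m[0];

// After the hyphen: a character that is neither a space nor a hyphen,
// then non-hyphen characters up to the end of the string.
preg_match('/[^ -][^-]*$/', $s, $m);
$after = $m[0];

// Stricter variant that requires a word character first.
preg_match('/\w[^-]*$/', $s, $m);
$after_word = $m[0];

echo $before, "\n", $after, "\n", $after_word, "\n";
```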
Another solution is to string split on the hyphen and remove white space.
The main challenge of your Question is that you want two separate items. This means that your process is dependent on another language. RegEx itself does not parse or separate a string; it only explains what we are looking for. The language you are using will make the actual separation. My answer gets your results in PHP, but other languages should have comparable solutions.
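To illustrate that division of labour, here is a small sketch in which the pattern only describes the separator and a PHP function does the actual splitting. Using preg_split() here is an assumption made for illustration, not something the question requires.

```php
<?php
// Sample input from the question.
$s = "Common Waxbill - Estrilda astrild";

// The pattern describes the separator: a hyphen with optional
// whitespace on either side. preg_split() performs the separation.
$parts = preg_split('/\s*-\s*/', $s);

// $parts[0] is the text before the hyphen, $parts[1] the text after it.
echo $parts[0], "\n", $parts[1], "\n";
```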
If you want to just do the job in your Question, and if you're using PHP...
explode("-", $list);
-> $array[]
This is useful if your list is longer than two items:
<?php
// Generate our list
$list = "Common Waxbill - Estrilda astrild";
$item_arr = explode("-", $list);
// Iterate each
foreach($item_arr as $item) {
echo $item.'<br>';
}
// See what we have
echo '
<pre>Access array directly:</pre>'.
'<pre>'.$item_arr[0].'x <--notice the trailing space</pre>'.
'<pre>'.$item_arr[1].' <--notice the preceding space</pre>';
...You could clean up each item and reassign it to a new array with trim(). This would get the text your Question asked for (no extra spaces before or after)...
// Create a workable array
$i=0; // Start our array key counter
foreach($item_arr as $item) {
$clean_arr[$i++] = trim($item);
}
// See what we have
echo '
<pre>Access after cleaning:</pre>'.
'<pre>'.$clean_arr[0].'x <--no space</pre>'.
'<pre>'.$clean_arr[1].' <--no space</pre>';
?>
Output:
Common Waxbill
Estrilda astrild
Access array directly:
Common Waxbill x <--notice the trailing space
Estrilda astrild <--notice the preceding space
Access after cleaning:
Common Waxbillx <--no space
Estrilda astrild <--no space
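If the intermediate steps are not needed, the split and the cleanup can probably be combined. The following is a compact sketch of the same idea, assuming array_map() is acceptable in place of the explicit loop:

```php
<?php
// Same list as above.
$list = "Common Waxbill - Estrilda astrild";

// explode() splits on the hyphen; array_map() applies trim() to each piece.
$clean_arr = array_map('trim', explode('-', $list));

echo $clean_arr[0], "\n", $clean_arr[1], "\n";
```

This also works for lists with more than two items, since every piece is trimmed.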
substr(strrpos()) & substr(strpos())
This is useful if your list will only have two items:
<?php
// Generate our list
$list = "Common Waxbill - Estrilda astrild";
// Start splitting
$first_item = trim(substr($list, 0, strpos($list, '-')));
$second_item = trim(substr($list, strrpos($list, '-') + 1));
// See what we have
echo "<pre>substr():</pre>
<pre>$first_item</pre>
<pre>$second_item</pre>
";
?>
Output:
substr():
Common Waxbill
Estrilda astrild
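Note that strrpos() and strpos() are different: strpos() returns the position of the first hyphen, while strrpos() returns the position of the last one. With a single hyphen in the string, both return the same position.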
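The difference between strpos() and strrpos() only shows up when the string contains more than one hyphen. Here is a small sketch with a made-up three-item string (the string "A - B - C" is an assumption for illustration):

```php
<?php
// Hypothetical list with two hyphens.
$list = "A - B - C";

$first_pos = strpos($list, '-');   // position of the first hyphen
$last_pos  = strrpos($list, '-');  // position of the last hyphen

// Text before the first hyphen and text after the last hyphen.
$head = trim(substr($list, 0, $first_pos));
$tail = trim(substr($list, $last_pos + 1));

echo "$first_pos $last_pos $head $tail\n";
```

Both functions return false when no hyphen is found, so real code would likely want to check for that before calling substr().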
If you're not using PHP, but you want to do the job in some other language without depending on RegEx, knowing the language would be helpful.
Generally, programming languages come with tools for jobs like this out of the box, which is part of why people choose the languages they do.