Okay, you accepted an indirect answer after I asked for question improvement in a comment under the question. I'll interpret this to mean that you have no intention of clarifying the question further and that the other answer works as desired. For this reason, I'll offer a single regex solution so that you don't need iterated regex filtering after making an initial regex extraction.
For your limited sample data, your requirement boils down to:
Match whole "words" (visible characters separated by spaces) which:
- consist only of digits and uppercase letters, with at least one digit, and
- are between 4 and 20 characters long.
You can subsequently eliminate duplicate matches with array_unique() if desired.
Code: (Demo)
$str = '-9 Cycles 3 Temperature Levels Steam Sanitizet+ -Sensor Dry | ALSO AVAILABLE (PRICES MAY VARY) |- White - 1258843 - DVE45R6100W {+ Platinum - 1501 525 - DVE45R6100P desirable: 1258843 DVE45R6100W';
if (preg_match_all('~\b(?:[A-Z]{4,20}(*SKIP)(*FAIL)|[A-Z\d]{4,20})\b~', $str, $m)) {
    var_export(array_unique($m[0]));
}
Output:
array (
  0 => '1258843',
  1 => 'DVE45R6100W',
  2 => '1501',
  3 => 'DVE45R6100P',
)
Pattern Breakdown:
\b                            #the zero-width position between a character matched by \W and a character matched by \w
(?:                           #start non-capturing group
  [A-Z]{4,20}(*SKIP)(*FAIL)   #match and disqualify a run of 4 to 20 uppercase letters (the all-letter words)
  |                           #or
  [A-Z\d]{4,20}               #match between 4 and 20 digits or uppercase letters
)                             #end non-capturing group
\b                            #the zero-width position between a character matched by \W and a character matched by \w
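As a minimal sketch, the breakdown above can be kept inside the pattern itself by using the x (extended) modifier, which ignores pattern whitespace and treats # as the start of a comment. The input string here is a shortened, illustrative excerpt of the sample data, not the full string.

```php
<?php
// Shortened excerpt of the sample data (illustrative only).
$excerpt = 'White - 1258843 - DVE45R6100W ALSO 525';

// Same pattern as above, written with the x modifier so comments can live inline.
// The comments must not contain the pattern delimiter or a single quote.
$pattern = '~
    \b                              # word boundary
    (?:                             # start non-capturing group
        [A-Z]{4,20}(*SKIP)(*FAIL)   # disqualify a run of 4 to 20 uppercase letters
      |                             # or
        [A-Z\d]{4,20}               # 4 to 20 digits or uppercase letters
    )                               # end non-capturing group
    \b                              # word boundary
~x';

preg_match_all($pattern, $excerpt, $m);
var_export($m[0]);
```

With this excerpt, "White" and "525" fail the character and length rules, "ALSO" is discarded by the skip-fail branch, and the two codes are kept.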
Here are a couple of alternative regex patterns for comparison. The first two use a lookahead to require at least one digit; the third, which doesn't use any lookarounds, is the "skip-fail" pattern from above, which disqualifies purely alphabetical "words".
- 437 steps:
\b(?=\S*\d)[A-Z\d]{4,20}\b
- 325 steps:
\b(?=[A-Z]*\d)[A-Z\d]{4,20}\b
- 298 steps:
\b(?:[A-Z]{4,20}(*SKIP)(*FAIL)|[A-Z\d]{4,20})\b
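To check that the three patterns agree on the sample data, here is a rough sketch that runs each one and collects the unique matches. The array keys (lookahead_any, lookahead_upper, skip_fail) and the extra input ABCD12 are my own illustrative names and data, not part of the question.

```php
<?php
// Sample string copied from the answer above.
$str = '-9 Cycles 3 Temperature Levels Steam Sanitizet+ -Sensor Dry | ALSO AVAILABLE (PRICES MAY VARY) |- White - 1258843 - DVE45R6100W {+ Platinum - 1501 525 - DVE45R6100P desirable: 1258843 DVE45R6100W';

// The three patterns listed above; the keys are illustrative labels.
$patterns = [
    'lookahead_any'   => '~\b(?=\S*\d)[A-Z\d]{4,20}\b~',
    'lookahead_upper' => '~\b(?=[A-Z]*\d)[A-Z\d]{4,20}\b~',
    'skip_fail'       => '~\b(?:[A-Z]{4,20}(*SKIP)(*FAIL)|[A-Z\d]{4,20})\b~',
];

// Collect the unique matches per pattern, re-indexed for easy comparison.
$results = [];
foreach ($patterns as $label => $pattern) {
    preg_match_all($pattern, $str, $m);
    $results[$label] = array_values(array_unique($m[0]));
}
var_export($results);

// Hypothetical input that is NOT in the sample data: four letters, then digits.
$edge = [];
foreach ($patterns as $label => $pattern) {
    $edge[$label] = preg_match($pattern, 'ABCD12'); // 1 = match, 0 = no match
}
var_export($edge);
```

On the sample string, all three patterns return the same four values. They are not interchangeable in general, though: for a word such as ABCD12, the lookahead patterns match, while the skip-fail pattern consumes ABCD, fails, and resumes in the middle of the word where there is no word boundary, so the word is lost. If your real data can contain such words, placing a \b before (*SKIP) should avoid this.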
The equivalent non-regex process (which I do not endorse) is: (Demo)
$result = [];                               // initialize so the final call is safe when nothing qualifies
foreach (explode(' ', $str) as $word) {
    $length = strlen($word);
    if ($length >= 4                        // has 4 characters or more
        && $length <= 20                    // has 20 characters or fewer
        && !isset($result[$word])           // not yet in result array
        && ctype_alnum($word)               // comprised of digits and/or letters only
        && !ctype_alpha($word)              // is not comprised solely of letters
        && $word === strtoupper($word)      // has no lowercase letters
    ) {
        $result[$word] = $word;
    }
}
var_export(array_values($result));