I'm trying to write a regex, and it is blowing my mind. How can I make the regex below reject a string that contains two consecutive numbers?

/^([\p{L}\p{N}\. ]+)(, ?| )([0-9-]+[a-z]?)(, ?| |$)(.*)/iu

Valid examples:

Text Text 123 anything
Text Text, 123, anything
Text Text 123B anything
Text Text, 123B, anything
Text 123 anything
Text 123B anything
Text, 123, anything
Text, 123B, anything
987 123 anything
987 123B anything
987, 123, anything
987, 123B, anything

(Need to be) Invalid examples:

Text Text 456 123 anything
Text Text, 456, 123, anything
Text Text 456 123B anything
Text Text, 456, 123B, anything
Text 456 123 anything
Text 456 123B anything
Text, 456, 123, anything
Text, 456, 123B, anything
987 456 123 anything
987 456 123B anything
987, 456, 123, anything
987, 456, 123B, anything

But as you can see, my regex currently accepts every example above, including the ones that should be invalid: https://regex101.com/r/6t5Oq5/4
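
To see why the second block also matches, it can help to print the capture groups. The sketch below is an approximation in Python, not the original PCRE-style pattern: Python's built-in `re` module has no `\p{L}` or `\p{N}`, so `[^\W_]` stands in for "letter or digit" (an assumption), and the names `PATTERN` and `groups_for` are illustrative only.

```python
import re

# Approximate port of the question's regex. Python's re lacks \p{L} and \p{N},
# so [^\W_] (any word character except underscore) is used instead (assumption).
PATTERN = re.compile(
    r"^((?:[^\W_]|[. ])+)(, ?| )([0-9-]+[a-z]?)(, ?| |$)(.*)",
    re.IGNORECASE,
)

def groups_for(text):
    """Return the five capture groups as a tuple, or None if there is no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

if __name__ == "__main__":
    for s in ("Text Text 456 123 anything", "Text Text, 456, 123, anything"):
        print(repr(s), "->", groups_for(s))
```

In the space-separated case the greedy first group absorbs `456`, because digits and spaces are allowed in it. In the comma-separated case the trailing `(.*)` absorbs `123, anything`. Either way the pattern finds a valid split, so the string is accepted.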

Requirements: The first group may contain letters or numbers. The second group can be a number, or a number followed by a letter. The third group can contain anything. All groups can be separated by commas or spaces, and the letters and numbers can be of any length. There cannot be consecutive numbers in the string, unless the number is in the first group or in the last group (anything).

What is the best way to do this?

  • No idea what your requirements are, so, all I can suggest is adding a lookahead like [here](https://regex101.com/r/htdxaY/1). To make sure you do not have any `([,][ ]?)([0-9-]+[a-z]?)` after `([,][ ]?)([0-9-]+[a-z]?)` – Wiktor Stribiżew Aug 20 '17 at 19:49
  • Hello @WiktorStribiżew , Your example did the trick, I'm going to do a few more tests, but apparently that's what I wanted! – Rodrigo Gomes Aug 20 '17 at 20:09
  • Hello @WiktorStribiżew , After a few tests, I saw that it was not what I wanted to do. I updated my question to get more accurate, and I put a few more examples. – Rodrigo Gomes Aug 20 '17 at 20:42
  • And what is the fail condition? – Wiktor Stribiżew Aug 20 '17 at 20:43
  • @WiktorStribiżew It's hard to explain, I do not even know if that's possible. There can not be consecutive numbers in the string, unless the number is in the first group or in the last group (anything). Maybe I need to create two regex rules for this. – Rodrigo Gomes Aug 20 '17 at 20:52
  • is `Text Text Text, 123, anything` a valid example? – shockawave123 Aug 20 '17 at 20:54
  • @shockawave123 yes. But "Text Text Text 456 123 anything" is not. – Rodrigo Gomes Aug 20 '17 at 20:55
  • Can you elaborate on what `anything` means? if you don't accept `Text, 456 {anything}` where `anything = '123 more text'` then i'm guessing `anything` doesn't mean `.*`. So what other restrictions are there with `anything`? Are you looking for one block of alphanumeric text? – shockawave123 Aug 20 '17 at 21:04
  • Also. Is there certainty that if the first block is separated using `,` then the other blocks will be separated by `,` as well? – shockawave123 Aug 20 '17 at 21:07
  • @shockawave123 Sorry for not explaining it clearly. Anything means anything at all. It may have commas, spaces, numbers, or letters. Yes, if the first block uses a comma, the others will also use it. But it may be that this does not use any commas, just space. – Rodrigo Gomes Aug 20 '17 at 21:17
  • Try https://regex101.com/r/1lCfQ8/1 – Wiktor Stribiżew Aug 20 '17 at 21:25
  • so correct me if i am wrong, but if anything can be "anything at all", then `anything` could equal `123 more text` right? so therefore `Text 456 123 more text` should be allowed since you are looking for strings in the format `{group1} {group2} {anything}` where `group1 = Text` `group2 = 456` and `anything = 123 more text`, but you say this is invalid. clearly I'm missing something here. – shockawave123 Aug 20 '17 at 21:27
  • @WiktorStribiżew I tried, but the group anything did not work as expected. Thanks for the suggestions. – Rodrigo Gomes Aug 20 '17 at 21:36
  • Guys, I'm asking for something impossible to do. So I give up, there's no way I can do that. But talking to you guys has helped me see that this is impossible. Thank you for your time. – Rodrigo Gomes Aug 20 '17 at 21:36

2 Answers

I'm not 100% sure about your exact rules, but this regex matches the first block of examples and not the second:

/^([a-z0-9]+,? )([0-9]+[a-z]?,? )([a-z0-9]+)$/

Demo here: http://regexr.com/3gjd7

Bananaapple
  • Hello @Bananaapple, Sorry for not explaining it in detail. The first group may have letters or numbers. The second group can have numbers or numbers followed by a letter, and the third group can have anything. All groups can be separated by commas or space. And all letters and numbers can be any size. – Rodrigo Gomes Aug 20 '17 at 20:05
  • I've updated my answer to better meet these criteria although I am unclear on "anything" - anything at all including spaces and commas or just any alphanumeric combo? My new answer assumes the latter. – Bananaapple Aug 20 '17 at 20:12
  • By "anything" I meant anything at all, including commas and spaces. Thanks for your help and suggestion! Apparently @WiktorStribiżew's example did the trick. I'm doing some testing right now. – Rodrigo Gomes Aug 20 '17 at 20:21
  • After my tests, @WiktorStribiżew's response did not work. I have now updated my question to be more precise, but it's hard to explain. The first group may have letters or numbers and form a sentence. But if it has a number after it, and before it, then it is invalid. Ex. "Text Text 456 123 anything" – Rodrigo Gomes Aug 20 '17 at 20:38

Based on what you posted, use this pattern: `^(\S+)(?=[^\d\r\n]+\d+[^\d\r\n]+$).*`

^                       # Start of string/line
(                       # Capturing Group (1)
  \S                    # <not a whitespace character>
  +                     # (one or more)(greedy)
)                       # End of Capturing Group (1)
(?=                     # Look-Ahead
  [^\d\r\n]             # Character not in [\d\r\n] Character Class
  +                     # (one or more)(greedy)
  \d                    # <digit 0-9>
  +                     # (one or more)(greedy)
  [^\d\r\n]             # Character not in [\d\r\n] Character Class
  +                     # (one or more)(greedy)
  $                     # End of string/line
)                       # End of Look-Ahead
.                       # Any character except line break
*                       # (zero or more)(greedy)
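
A quick way to sanity-check this pattern is to run it over a few of the question's examples. This is a minimal sketch using Python's `re` module, which may differ from the regex flavor used in the original demo; the helper name `is_valid` and the sample lists are illustrative.

```python
import re

# The answer's pattern, unchanged. The lookahead requires that the rest of the
# line, after the characters consumed by (\S+), contains exactly one run of
# digits with non-digit text on both sides.
PATTERN = re.compile(r"^(\S+)(?=[^\d\r\n]+\d+[^\d\r\n]+$).*")

def is_valid(line):
    """Return True if the line matches the answer's pattern."""
    return PATTERN.match(line) is not None

# A subset of the question's examples.
VALID = [
    "Text Text 123 anything",
    "Text Text, 123, anything",
    "Text 123B anything",
    "Text, 123, anything",
    "987 123 anything",
    "987, 123B, anything",
]
INVALID = [
    "Text Text 456 123 anything",
    "Text Text, 456, 123, anything",
    "Text 456 123B anything",
    "Text, 456, 123, anything",
    "987 456 123 anything",
    "987, 456, 123B, anything",
]

if __name__ == "__main__":
    for line in VALID + INVALID:
        print(f"{line!r}: {'match' if is_valid(line) else 'no match'}")
```

Note that the lookahead also rejects lines whose "anything" part contains digits, such as `Text 123 room 5`, which the question's stated requirements appear to allow.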
alpha bravo
  • With the help of the above colleagues, I have been able to realize that what I want is impossible to accomplish. But your regex has come closer to that, and it works for the examples I've shown. So I am choosing your answer to end the discussion on the subject and to help other people who have a question similar to mine. Thank you! – Rodrigo Gomes Aug 21 '17 at 00:58