Question explanation

I have been trying to write a regex that passes lines in exactly this format:

"bob likes poo - whatever(&T(R)*HP#"
"  \t  \t  bob likes poo - *^RFVOG(IBHUO)B"

but fail on:

 "//bob likes poo - GV*(GF*("
 "# \t  bob likes poo - OHG(G(*"
 "bob does not like poo G&((HOUIHBO:"

The key points:

The line does NOT start with comment characters (# or //), may start with blanks (spaces or tabs), and must have something followed by the delimiter (" - "), followed by anything.

The corner cases are:

1) " \t   //this is still a comment - YGV^FV*"

should still fail.

2) "   /i_am//_no_/comment - FG&*G*&G"

should pass.

Random reasoning

Well, I have failed, which made me ask whether we can somehow specify a class that contains some characters but not others. For example,

[^abc]

just means any character that is not a, b or c. But how would we say "not abc, but 123"? We can't just put

[^abc123]

because that will exclude 1, 2 and 3 as well, and we can't do

[^abc]123

because that means the text must have 123 after some character that is not a, b or c, which is a total of 4 characters instead of the 1 we want. I have no idea if it is even possible. So there are 2 questions here in a sense.
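A small sketch of the character-class point, written in Python (an assumption; the question does not name a tool). A bracket expression always matches exactly one character, so "not a, b or c, but 1, 2 or 3" reduces to `[123]`, while `[^abc]123` describes a four-character sequence. The helper name `is_whole_match` is made up for illustration.

```python
import re

# [^abc] matches any single character except a, b or c; digits already qualify.
NOT_ABC = re.compile(r"[^abc]")
# [123] matches a single 1, 2 or 3, which rules out a, b and c automatically.
ONLY_123 = re.compile(r"[123]")
# [^abc]123 is a four-character sequence: one non-abc character, then "123".
NOT_ABC_THEN_123 = re.compile(r"[^abc]123")


def is_whole_match(pattern, text):
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


for ch in "a1x":
    print(ch, is_whole_match(NOT_ABC, ch), is_whole_match(ONLY_123, ch))
```

Excluding a two-character sequence such as // cannot be done with a single bracket expression, which is why the patterns further down fall back on alternation.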

My best bet so far is:

 "[[:blank:]]*[^[:blank:]]+( - ).*"

This gets the format matching right but does not account for the comments.
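To make the gap concrete, here is a rough Python check of that pattern against the sample lines from the question. Two assumptions: `[[:blank:]]` is rewritten as `[ \t]` because Python's `re` module has no POSIX bracket classes, and the pattern is applied as an unanchored search, the way grep looks for a match anywhere in the line. The names `CASES` and `accepted` are made up for illustration.

```python
import re

# Assumption: [[:blank:]] translated to [ \t] for Python's re module.
BEST_BET = re.compile(r"[ \t]*[^ \t]+( - ).*")

# (line, should_pass) pairs taken from the question.
CASES = [
    ("bob likes poo - whatever(&T(R)*HP#", True),
    ("  \t  \t  bob likes poo - *^RFVOG(IBHUO)B", True),
    ("//bob likes poo - GV*(GF*(", False),
    ("# \t  bob likes poo - OHG(G(*", False),
    ("bob does not like poo G&((HOUIHBO:", False),
    (" \t   //this is still a comment - YGV^FV*", False),
    ("   /i_am//_no_/comment - FG&*G*&G", True),
]


def accepted(line):
    """Unanchored search: a match anywhere in the line counts."""
    return BEST_BET.search(line) is not None


# Lines the pattern lets through although they should be rejected.
wrongly_accepted = [line for line, ok in CASES if accepted(line) and not ok]
for line in wrongly_accepted:
    print(repr(line))
```

Under these assumptions the three comment lines are the ones that slip through; the line without a delimiter is still rejected.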

EDIT

I have found the working solution. It works but it's ugly as hell:

 "[[:blank:]]*([^[:blank:]#]([^/].*)?|[^[:blank:]#/].*)( - ).*"

If anyone knows how to make it nicer, please tell me.
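A quick way to check the pattern above against every sample line, sketched in Python. Assumptions: `[[:blank:]]` is rewritten as `[ \t]` (Python's `re` has no POSIX bracket classes), and the pattern must match the whole line, which `re.fullmatch` enforces. The names `CASES` and `accepted` are made up for illustration.

```python
import re

# Assumption: [[:blank:]] translated to [ \t] for Python's re module.
WORKING = re.compile(r"[ \t]*([^ \t#]([^/].*)?|[^ \t#/].*)( - ).*")

# (line, should_pass) pairs taken from the question.
CASES = [
    ("bob likes poo - whatever(&T(R)*HP#", True),
    ("  \t  \t  bob likes poo - *^RFVOG(IBHUO)B", True),
    ("//bob likes poo - GV*(GF*(", False),
    ("# \t  bob likes poo - OHG(G(*", False),
    ("bob does not like poo G&((HOUIHBO:", False),
    (" \t   //this is still a comment - YGV^FV*", False),
    ("   /i_am//_no_/comment - FG&*G*&G", True),
]


def accepted(line):
    """Whole-line match: the pattern has to start at the first character."""
    return WORKING.fullmatch(line) is not None


for line, should_pass in CASES:
    status = "ok" if accepted(line) == should_pass else "MISMATCH"
    print(status, repr(line))
```

The whole-line assumption matters here: with an unanchored search the same pattern can start matching after the `#` or after the first `/`, so the comment lines would be accepted again.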

OneCricketeer
Infogeek
  • Umm, "not abc but must be 123" is same as "must be 123", or `[123]` as regexp... Or do I misunderstand that part of your question? – hyde Sep 17 '16 at 18:45
  • Is this what you want [`^[^\/#-]*(?:\/[^\/][^-]*)?-.*`](https://regex101.com/r/yW9xZ9/1)? – revo Sep 17 '16 at 20:12
  • What if you change non-capturing group `(?:...)` to a capturing group `(...)`? – revo Sep 17 '16 at 20:43
  • You can add spaces around `-` if they are essential parts of input string. `egrep` doesn't support non-capturing groups. That's it. I didn't know it at the first place. I think you don't need to escape slashes too. I just did it since they could be meaningful for RegEx engine. – revo Sep 17 '16 at 21:07
  • Why `- eouhfueo` shouldn't be matched? – revo Sep 17 '16 at 21:32
  • **DO NOT** vandalize your posts. They are to be useful to anyone here on StackOverflow. – OneCricketeer Oct 15 '16 at 19:41

1 Answer

After understanding more about the requirements from the comments, I came up with this RegEx:

^[[:blank:]]*(\/([^\/][^-]*|)|([[:blank:]]|^)[^[:blank:]\/#][^-]*) - .*

Matches: (screenshot of the matching lines; image not preserved)
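Since the screenshot is not available, here is a rough Python re-check of the RegEx above against the sample lines from the question. Assumptions: `[[:blank:]]` is rewritten as `[ \t]` because Python's `re` has no POSIX bracket classes, and the slash escapes are dropped since they are not needed there. The names `CASES` and `accepted` are made up for illustration.

```python
import re

# Assumption: [[:blank:]] translated to [ \t]; "\/" written as plain "/".
REVO = re.compile(r"^[ \t]*(/([^/][^-]*|)|([ \t]|^)[^ \t/#][^-]*) - .*")

# (line, should_pass) pairs taken from the question.
CASES = [
    ("bob likes poo - whatever(&T(R)*HP#", True),
    ("  \t  \t  bob likes poo - *^RFVOG(IBHUO)B", True),
    ("//bob likes poo - GV*(GF*(", False),
    ("# \t  bob likes poo - OHG(G(*", False),
    ("bob does not like poo G&((HOUIHBO:", False),
    (" \t   //this is still a comment - YGV^FV*", False),
    ("   /i_am//_no_/comment - FG&*G*&G", True),
]


def accepted(line):
    """Search is enough here because the pattern carries its own ^ anchor."""
    return REVO.search(line) is not None


for line, should_pass in CASES:
    status = "ok" if accepted(line) == should_pass else "MISMATCH"
    print(status, repr(line))
```

One side effect of `[^-]*` worth knowing: text that contains a hyphen before the delimiter, such as "bob-likes poo - x", is not matched by this pattern.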

By the way, I really don't know why bob likes p**!

revo
  • If bob likes it then there he goes. Updated regex. – revo Sep 18 '16 at 15:58
  • If you don't use a start-of-string anchor `^` in `grep`, the match may occur in the middle of the string, which is undesired. I wasn't sure whether `"` is used as the input wrapper, so in my proposed RegEx I excluded it and used `^` instead. I admit that I didn't pay attention to your working solution. I approached it differently from the beginning and didn't hesitate to type more. Your own solution is much cleaner and shorter. I would never have gone with mine if I had seen yours. Feel free to un-accept this answer. – revo Sep 18 '16 at 18:05