
I am learning regular expressions so that I can use them in a lex program. I've seen here that, in regular expressions:

'*' matches 0 or more occurrences of the pattern
'?' matches 0 or 1 occurrence of the pattern

I'm kind of confused by this. I mean:

  • If we can match 0 or more with '*', then why should we use the '?' metacharacter?
  • We define a float as: FL [0-9]*"."[0-9]+
  • Can we define it as: FL [0-9]?"."[0-9]+ for numbers like 0.999 or .999 (i.e., numbers with at most one digit before the radix point)?
  • Can anyone please explain this? Thank you in advance :).

    nwellnhof
    Vedant Terkar

    1 Answer


    If you want to match 0, 1, 2, 3, 4, 5, 6, or more occurrences, use *.

    If you only want to match 0 or 1 occurrences, use ?.

    For instance, consider this text: "________starts with whitespace"

    If I want to match all of the underscores at the beginning of that text, but I don't want to require that they be there (they're optional), I'd use _*.

    In contrast, if I want to match just an optional single + in (say) "+44 20 1234 5678", I'd use \+? (a literal + with the ? after it). That will match only a single + or nothing; it will not match multiple + characters.
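
The difference described above can be tried out with a small sketch. It uses Python's `re` module as a stand-in for the regex dialect, which is an assumption (the question is about lex); the helper `leading_match` and the sample strings are made up for illustration.

```python
import re

# Assumption: Python's re module stands in for the regex dialect in use.
underscores = re.compile(r"_*")      # zero or more underscores
optional_plus = re.compile(r"\+?")   # zero or one literal plus sign


def leading_match(pattern, text):
    """Return the part of text that pattern matches at the very start.

    Both patterns can match the empty string, so match() never returns None.
    """
    return pattern.match(text).group(0)


# '*' takes every underscore at the start, or nothing if there are none.
print(repr(leading_match(underscores, "____starts with underscores")))
print(repr(leading_match(underscores, "no leading underscores")))

# '?' takes at most one '+', even when more follow.
print(repr(leading_match(optional_plus, "+44 20 1234 5678")))
print(repr(leading_match(optional_plus, "++44 20 1234 5678")))
print(repr(leading_match(optional_plus, "44 20 1234 5678")))
```

Both quantifiers accept "nothing at all"; the only difference is the upper limit, which is unbounded for `*` and one for `?`.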

    T.J. Crowder
    • Thanks for the quick response. Will `FL [0-9]?"."[0-9]+` work for finding floats? – Vedant Terkar Dec 01 '13 at 07:48
    • @VedantTerkar: I don't know the specific dialect of regex you're using, so those quotes look odd to me. But that's probably not correct, because if I'm reading it right, it would *not* match `23.5` because you've only allowed a *single* digit before the `.`. So you'd probably want `*` rather than `?` there, so you match any number of digits before the `.`. – T.J. Crowder Dec 01 '13 at 07:49
    • I'm using flex and DevCPP on Windows 7, and `FL [0-9]*"."[0-9]+` works for me. But if I want to accept only one digit before the `.`, as you've said, can `FL [0-9]?"."[0-9]+` solve this? – Vedant Terkar Dec 01 '13 at 07:53
    • @VedantTerkar: Right, `[0-9]?` means "zero or one digits, but not two or more". `[0-9]*` means "zero or more digits (no limit, could be 42 of them)". Note that some languages require floats between `0` and `1` to be written with a leading `0` before the `.` (`0.5`, not `.5`), and others don't. I don't know which kind you're trying to validate. – T.J. Crowder Dec 01 '13 at 08:04
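
To compare the two float definitions side by side, here is a rough sketch in Python rather than flex. The translation of the flex literal `"."` to `\.` is an assumption, and the helper `accepts` is hypothetical. It checks whole strings with `fullmatch`, which is not the same as how a flex scanner consumes input, so treat it only as an illustration of what each pattern can describe.

```python
import re

# Assumption: flex's quoted "." (a literal dot) is written \. in Python's re.
any_digits = re.compile(r"[0-9]*\.[0-9]+")  # FL [0-9]*"."[0-9]+
one_digit = re.compile(r"[0-9]?\.[0-9]+")   # FL [0-9]?"."[0-9]+


def accepts(pattern, text):
    """True if the whole of text is described by pattern."""
    return pattern.fullmatch(text) is not None


# Print, for each sample, whether the '*' and '?' versions accept it.
for sample in ["0.999", ".999", "23.5", "5."]:
    print(sample, accepts(any_digits, sample), accepts(one_digit, sample))
```

Both versions accept `0.999` and `.999`, only the `*` version accepts `23.5`, and neither accepts `5.` because `[0-9]+` requires at least one digit after the dot.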