I think this works without any "fancy" regex features such as negative lookahead.
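^([0-35-9]|4(4|54)*([0-36-9]|5[0-357-9]))*(4(4|54)*5?)?$
That is:
- start
- any number of:
  - a single digit other than 4
  - or a 4, then any number of "4" or "54" (each leaves us having just seen a 4), then a way out: either a digit other than 4 or 5, or a 5 followed by a digit other than 4 or 6
- optionally, a tail for strings that end part-way through a possible "456": a 4, then any number of "4" or "54", then an optional 5
- end
The digit that follows a 4 or a 45 must not be a 4 that gets consumed as an ordinary digit; otherwise "4456" would slip through as "44" followed by "56".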
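One way to gain some confidence in a hand-built pattern like this is to compare it against a plain substring test on every short digit string. The following is only a sketch: the class name `PatternCheck`, the method `firstMismatch` and the `CANDIDATE` pattern (one lookahead-free pattern built from the three cases "no 4 yet", "just seen a 4", "just seen 45") are assumptions made for illustration; swap in whichever pattern you want to check.

```java
import java.util.regex.Pattern;

// Hypothetical helper for this sketch: brute-force checks a candidate pattern.
class PatternCheck {
    // Assumed candidate: a lookahead-free pattern that should reject "456".
    static final Pattern CANDIDATE = Pattern.compile(
        "^([0-35-9]|4(4|54)*([0-36-9]|5[0-357-9]))*(4(4|54)*5?)?$");

    // Returns the first digit string (shortest first) on which the pattern
    // disagrees with a plain substring test, or null if there is none
    // among strings of up to maxLen digits.
    static String firstMismatch(Pattern p, int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            int count = (int) Math.pow(10, len);
            for (int n = 0; n < count; n++) {
                // Zero-pad so that strings with leading zeros are covered too.
                String s = (len == 0) ? "" : String.format("%0" + len + "d", n);
                boolean expected = !s.contains("456");
                if (p.matcher(s).matches() != expected) {
                    return s;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Prints the first disagreement, or "null" if none is found.
        System.out.println(firstMismatch(CANDIDATE, 5));
    }
}
```

A check like this does not prove the pattern correct for all lengths, but disagreements caused by a mishandled repeated 4 (such as "4456") tend to show up within the first four or five digits.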
This is in keeping with the fact that a regular expression is equivalent to a finite state machine. We have explicitly dealt with three states: "not seen a 4", "seen a 4" and "seen a 45". If we wanted the forbidden string to be "4567", we would have to add a fourth state explicitly, making the pattern longer and the state machine bigger.
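The three states can also be written out by hand, which makes the transitions easier to see than they are in the pattern. This is a minimal sketch, not part of the original answer: the class name `No456Dfa`, the enum `State` and the method `accepts` are hypothetical. Note that a 4 seen in any state leads to "seen a 4", never back to the start state.

```java
// Hypothetical class for illustration: the three states written out by hand.
class No456Dfa {
    enum State { NO_4, SEEN_4, SEEN_45 }

    // Returns true if s consists only of digits and never contains "456".
    static boolean accepts(String s) {
        State state = State.NO_4;
        for (char c : s.toCharArray()) {
            if (c < '0' || c > '9') {
                return false; // only digit strings are considered
            }
            switch (state) {
                case NO_4:
                    state = (c == '4') ? State.SEEN_4 : State.NO_4;
                    break;
                case SEEN_4:
                    if (c == '4') {
                        state = State.SEEN_4;   // "44": still just seen a 4
                    } else if (c == '5') {
                        state = State.SEEN_45;
                    } else {
                        state = State.NO_4;
                    }
                    break;
                case SEEN_45:
                    if (c == '6') {
                        return false;           // "456" completed: reject
                    }
                    state = (c == '4') ? State.SEEN_4 : State.NO_4;
                    break;
            }
        }
        return true; // every state that can still be reached is accepting
    }

    public static void main(String[] args) {
        String[] samples = {"", "4", "45", "123", "4546", "456", "4456", "45456"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Extending the forbidden string to "4567" would mean adding a `SEEN_456` state and one more `case`, which mirrors how the pattern itself would grow.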
Whether this meets your needs depends on what the test is looking for: familiarity with advanced features of Java's regex dialect, or the ability to apply regular expressions universally (e.g. basic grep, bash).
Negative lookaheads allow you to express this more tersely.
^((?!456)\d)*$
That is (with start and end anchors around it), zero or more repetitions of a one-char pattern: (?!456)\d
which means "not the start of 456 (looking ahead without actually consuming), and matches a numeric character."
To process this, the regex engine only ever needs to look 3 chars ahead of the current character, making this an efficient one-pass way of meeting the requirement.
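In Java the negative lookahead is written `(?!456)`, and the backslash in `\d` has to be doubled inside a string literal. A minimal sketch of how this might be used, where the class name `Lookahead456` and the method `accepts` are hypothetical names chosen for illustration:

```java
import java.util.regex.Pattern;

// Hypothetical wrapper for illustration around the lookahead-based pattern.
class Lookahead456 {
    // (?!456) is a negative lookahead; "\\d" is the regex \d in a Java literal.
    static final Pattern NO_456 = Pattern.compile("^((?!456)\\d)*$");

    // Returns true if s is made up of digits and does not contain "456".
    static boolean accepts(String s) {
        return NO_456.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"", "45", "4546", "456", "4456", "123456789"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call, which matters if the check runs in a loop.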