I need a Java regex pattern to validate an input String: the input can contain 3 or more letters, followed by 7 or more digits. The total number of characters should be between 10 and 14.

I wrote a pattern and tested that it works. I built it from 2 sections: (1) a positive lookahead that checks the character format (3 or more letters followed by 7 or more digits), and (2) a positive lookahead that checks the overall length of the input string.

My pattern: (?=^[A-Z]{3,}[0-9]{7,}$)(?=^[A-Z0-9]{10,14}$)

When I use it in Java 8 with Matcher.matches(), it does not match; if I use matcher.find() instead, it gives me true.

I tried this pattern: (?=^[A-Z]{3,}[0-9]{7,}$)(?=^[A-Z0-9]{10,14}$) with Matcher.matches() and was expecting it to give me true, but it gives me false.

If I try this pattern with matcher.find(), it gives me true. However, I also have other patterns in use that don't have start and end anchors, so find() returns true for those patterns (a wrong result) if the input string contains other characters too. Because of those other patterns, I would prefer not to use find() unless it is necessary.

Inputs that should match: ROM1234567 ROMM1234567 ROM123456789

Inputs that should not match: RO1234567 RO123456 ROM123456 ROM123456789012
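
For reference, the behaviour described above can be reproduced with a small sketch like the one below. The class and method names (`LookaheadOnlyDemo`, `viaMatches`, `viaFind`) are made up for illustration; only the pattern and the sample inputs come from the question.

```java
import java.util.regex.Pattern;

class LookaheadOnlyDemo {
    // The pattern from the question: two lookaheads and nothing else.
    static final Pattern ORIGINAL =
            Pattern.compile("(?=^[A-Z]{3,}[0-9]{7,}$)(?=^[A-Z0-9]{10,14}$)");

    // matches() requires the match to span the whole input.
    static boolean viaMatches(String input) {
        return ORIGINAL.matcher(input).matches();
    }

    // find() is satisfied by any match anywhere, including an empty one.
    static boolean viaFind(String input) {
        return ORIGINAL.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "ROM1234567", "ROMM1234567", "ROM123456789",          // should be accepted
            "RO1234567", "RO123456", "ROM123456", "ROM123456789012" // should be rejected
        };
        for (String input : inputs) {
            System.out.println(input
                    + " matches()=" + viaMatches(input)
                    + " find()=" + viaFind(input));
        }
    }
}
```

With this pattern, `matches()` is false for every input, while `find()` is true only for the three inputs that should be accepted.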

Holger
carloska

2 Answers

2

Matcher.matches() checks whether the full string matches the provided pattern. But your pattern doesn't actually match any characters: lookaheads (and lookarounds in general) do not consume input.

You can either use a pattern that actually matches the string, like this:
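
To make the "do not consume input" point concrete, here is a minimal sketch that inspects what the lookahead-only pattern actually matches. The class and method names are illustrative assumptions, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroWidthDemo {
    // The lookahead-only pattern from the question.
    static final Pattern LOOKAHEADS_ONLY =
            Pattern.compile("(?=^[A-Z]{3,}[0-9]{7,}$)(?=^[A-Z0-9]{10,14}$)");

    // Number of characters consumed by the first match, or -1 if there is none.
    static int consumedLength(String input) {
        Matcher m = LOOKAHEADS_ONLY.matcher(input);
        return m.find() ? m.end() - m.start() : -1;
    }

    // lookingAt() anchors at the start but does not require reaching the end.
    static boolean matchesAtStart(String input) {
        return LOOKAHEADS_ONLY.matcher(input).lookingAt();
    }

    // matches() requires the match to cover the entire input.
    static boolean matchesWholeInput(String input) {
        return LOOKAHEADS_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "ROM1234567";
        System.out.println("consumed length: " + consumedLength(input));    // 0
        System.out.println("lookingAt():     " + matchesAtStart(input));    // true
        System.out.println("matches():       " + matchesWholeInput(input)); // false
    }
}
```

The match succeeds at position 0 but has length zero, so it can never cover a non-empty input, which is what `matches()` demands.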

^(?=[A-Z]{3,}[0-9]{7,}$)[A-Z0-9]{10,14}$

or

^(?=[A-Z]{3,}[0-9]{7,}$)(?=[A-Z0-9]{10,14}$).*

Demo of the first example here. Notice how it matches the full line, instead of an empty string at the beginning, as your attempt did.

Or use matcher.find(), since it looks for a substring and is perfectly happy with a pattern that matches an empty string at the beginning of the input.
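
As a quick check, the two consuming patterns above can be run against the sample inputs from the question with `matches()`. This is only a sketch; the class name `ConsumingPatternDemo` and the helper `isValid` are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

class ConsumingPatternDemo {
    // First variant: the lookahead checks the format, the body consumes 10 to 14 characters.
    static final Pattern FIRST =
            Pattern.compile("^(?=[A-Z]{3,}[0-9]{7,}$)[A-Z0-9]{10,14}$");

    // Second variant: both checks are lookaheads, and .* consumes the input.
    static final Pattern SECOND =
            Pattern.compile("^(?=[A-Z]{3,}[0-9]{7,}$)(?=[A-Z0-9]{10,14}$).*");

    static boolean isValid(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "ROM1234567", "ROMM1234567", "ROM123456789",            // expected: true
            "RO1234567", "RO123456", "ROM123456", "ROM123456789012" // expected: false
        };
        for (String input : inputs) {
            System.out.println(input
                    + " first=" + isValid(FIRST, input)
                    + " second=" + isValid(SECOND, input));
        }
    }
}
```

Both variants accept the three valid inputs and reject the four invalid ones, because each pattern now consumes the whole string after the lookaheads have passed.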

markalex
  • Side note: when you use `matches` without the look-ahead, like in your first example, `^` at the beginning and `$` at the end are unnecessary. – Holger Jun 22 '23 at 17:47
  • @Holger, I prefer them anyway, because such regexes are compatible with other methods (like mentioned `find()`), and `^` before lookahead makes it easier to understand. – markalex Jun 22 '23 at 17:58
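
The trade-off discussed in these comments can be illustrated with a short sketch. The unanchored variant below (outer `^` and `$` dropped, but the `$` inside the lookahead kept) is an assumption about what is meant, and the class name and sample input `abROM1234567` are made up for the example.

```java
import java.util.regex.Pattern;

class AnchorChoiceDemo {
    // Anchored pattern from the answer.
    static final Pattern ANCHORED =
            Pattern.compile("^(?=[A-Z]{3,}[0-9]{7,}$)[A-Z0-9]{10,14}$");

    // Same pattern without the outer ^ and $ (assumed variant for illustration).
    static final Pattern UNANCHORED =
            Pattern.compile("(?=[A-Z]{3,}[0-9]{7,}$)[A-Z0-9]{10,14}");

    public static void main(String[] args) {
        String clean = "ROM1234567";
        String prefixed = "abROM1234567"; // extra characters in front

        // With matches(), both variants behave the same.
        System.out.println(ANCHORED.matcher(clean).matches());      // true
        System.out.println(UNANCHORED.matcher(clean).matches());    // true
        System.out.println(ANCHORED.matcher(prefixed).matches());   // false
        System.out.println(UNANCHORED.matcher(prefixed).matches()); // false

        // With find(), only the anchored variant still rejects the prefixed input.
        System.out.println(ANCHORED.matcher(prefixed).find());      // false
        System.out.println(UNANCHORED.matcher(prefixed).find());    // true
    }
}
```

So the outer anchors are redundant under `matches()`, but they keep the pattern safe if it is later used with `find()`.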
1

You might try this as your regex:

^(?=.{10,14}$)[A-Z]{3,}[0-9]{7,}\z
  1. ^ - Matches the start of the string.
  2. (?=.{10,14}$) - Positive lookahead assertion that the string contains from 10 to 14 non-newline characters.
  3. [A-Z]{3,}[0-9]{7,} - Matches 3 or more letters followed by 7 or more digits.
  4. \z - Matches the very end of the string.

Note that I have used \z instead of $. In Java, $ (like \Z) also matches just before a line terminator at the end of the string, which presumably you do not want as part of the input. That is, the input should consist exclusively of alphanumeric characters. If you know that a newline character cannot be entered, or that one at the end of the line is acceptable, then use $ instead.
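
Because end-of-input anchors differ between regex flavours, it may be worth checking how they behave in `java.util.regex` specifically. The sketch below plugs `$`, `\Z` and `\z` into the same pattern and tests an input with a trailing newline using `find()`. The class name `EndAnchorDemo` and the helper `found` are hypothetical.

```java
import java.util.regex.Pattern;

class EndAnchorDemo {
    // Builds the pattern with the given end anchor and searches with find().
    static boolean found(String endAnchor, String input) {
        Pattern pattern = Pattern.compile(
                "^(?=.{10,14}$)[A-Z]{3,}[0-9]{7,}" + endAnchor);
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String plain = "ROM1234567";
        String withNewline = "ROM1234567\n";

        for (String anchor : new String[] {"$", "\\Z", "\\z"}) {
            System.out.println(anchor
                    + " plain=" + found(anchor, plain)
                    + " trailingNewline=" + found(anchor, withNewline));
        }
    }
}
```

In Java, `$` and `\Z` both accept the input with the trailing newline when used with `find()`, while `\z` rejects it. With `matches()` the distinction disappears, since the newline would have to be consumed by the pattern for the whole input to match.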

Booboo