I have an issue with the following example:

import java.util.regex.*;
class Regex2 {
    public static void main(String[] args) {
        Pattern p = Pattern.compile(args[0]);
        Matcher m = p.matcher(args[1]);
        boolean b = false;
        while(b = m.find()) {
            System.out.print(m.start() + m.group());
        }
    }
}

And the command line:

java Regex2 "\d*" ab34ef

Can someone explain to me why the result is 01234456?

The regex pattern is \d*. I thought it meant one or more digits, but the output contains more positions than there are characters in args[1].

thanks

Unmitigated
Karlo

1 Answer

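\d* matches zero or more digits, so it also matches the empty string wherever no digit is present: before each non-digit character and at the end of the input, after the last character. The matcher tries index 0 first, then index 1, and so on.

So, for the string ab34ef, it finds the following matches (m.start() is the index, m.group() is the matched text):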

Index    Group
  0        ""  (Before a)
  1        ""  (Before b)
  2        34  (Matches more than 0 digits this time)
  4        ""  (Before `e` at index 4)
  5        ""  (Before f)
  6        ""  (At the end, after f)
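
To check the table above, here is a minimal sketch that records the start index, end index, and text of every match. The class name EmptyMatchTrace and the helper trace are made up for this illustration; only Pattern, Matcher, find(), start(), end(), and group() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchTrace {
    // Hypothetical helper: returns one "start-end:'text'" entry per match.
    static List<String> trace(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // For an empty match, start() and end() are the same position.
            out.add(m.start() + "-" + m.end() + ":'" + m.group() + "'");
        }
        return out;
    }

    public static void main(String[] args) {
        // In a Java string literal the backslash is doubled: "\\d*" is the regex \d*.
        for (String entry : trace("\\d*", "ab34ef")) {
            System.out.println(entry);
        }
        // Prints:
        // 0-0:''
        // 1-1:''
        // 2-4:'34'
        // 4-4:''
        // 5-5:''
        // 6-6:''
    }
}
```

The empty matches have equal start and end indexes, while 34 spans 2 to 4. Concatenating each start index with its matched text gives 0, 1, 234, 4, 5, 6, which is the 01234456 from the question.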

If you use \d+ (written "\\d+" in a Java string literal), each match needs at least one digit, so you get just a single match: 34, at index 2.
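
A small side-by-side sketch of the two quantifiers, assuming the same loop as in the question but returning the text instead of printing it. The class name QuantifierCompare and the method run are illustrative names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierCompare {
    // Mirrors the question's loop: appends each match's start index and text.
    static String run(String regex, String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            sb.append(m.start()).append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Zero or more digits: empty matches are reported as well.
        System.out.println(run("\\d*", "ab34ef")); // prints 01234456
        // One or more digits: only the run of digits matches.
        System.out.println(run("\\d+", "ab34ef")); // prints 234
    }
}
```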

Rohit Jain
  • Does the result have something to do with the positions of the characters, or is the result the indexes of the characters? – Karlo Aug 22 '13 at 15:37
  • @Karlo. The result is the index of the character before which the empty string is matched. – Rohit Jain Aug 22 '13 at 15:38
  • @RohitJain how can it be that the `0` index refers to "Before a" while the `2` index is at the `34`? If `0` is _before a_, then wouldn't it be the `3` index that is at `34`? Is the scanner at one position and looking at the position to the left of it? Is the scanner positioned at a but looking at _Before a_? – sdc Sep 29 '16 at 09:24