0

I need help understanding the output of the code below. I can't work out what System.out.println(m.start() + m.group()); prints. Could someone explain it to me?

import java.util.regex.*;
class Regex2 {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d*");
        Matcher m = p.matcher("ab34ef");
        boolean b = false;
        while(b = m.find()) {
            System.out.println(m.start()  + m.group());
        }
    }
}

Output is:

0
1
234
4
5
6

Note that if I print only the start index with System.out.println(m.start());, the output is:

0
1
2
4
5
6
Duncan Jones
jai

3 Answers

5

Because you have included a * character, your pattern will match empty strings as well. When I change your code as I suggested in the comments, I get the following output:

0 ()
1 ()
2 (34)
4 ()
5 ()
6 ()

So you get an empty match at every position in the string that is not part of a run of digits, plus one non-empty match, 34, which is the run of digits itself. Use \\d+ if you want to match digits without also matching empty strings.
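The exact change suggested in the comments is not shown here, so the following is only a sketch of what it might look like: the group is wrapped in parentheses so that an empty match shows up as `()`. The class name `Regex2Delimited` and the helper `describeMatches` are made up for illustration. It also shows why the original code printed `234`: `m.start() + m.group()` is string concatenation, so the index 2 and the group 34 run together with no separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Regex2Delimited {
    // Hypothetical helper: returns one "start (group)" line per match.
    static List<String> describeMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            // The parentheses make an empty group visible as "()".
            lines.add(m.start() + " (" + m.group() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same pattern and input as in the question.
        for (String line : describeMatches("\\d*", "ab34ef")) {
            System.out.println(line);
        }
    }
}
```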

Duncan Jones
  • Hi, my string has 6 characters (indices 0 to 5), but it prints 6 as well. And how can I identify the empty string? I couldn't understand what you mean by an empty string, please explain. – jai Jan 03 '14 at 09:51
  • 1
    @azeemj. An empty string is `""`. I'm not exactly sure why the match includes index 6, perhaps someone else can elaborate on that. – Duncan Jones Jan 03 '14 at 09:58
2

You used this regex - \d* - which basically means zero or more digits. Mind the zero!

So this pattern matches any run of digits, e.g. 34, and it also matches at every other position in the string, where the matched sequence is the empty string.

So you will get 6 matches, starting at indices 0, 1, 2, 4, 5 and 6. For the match starting at index 2, the matched sequence is 34; for the remaining ones, the match is the empty string. Index 6 is the end of the input, and an empty match is still possible there.
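To see where each match begins and ends, including the one at the end of the input, a small sketch like this can help. The class name `MatchSpans` and the helper `spans` are invented for this example; only `Pattern`, `Matcher`, `start()`, `end()` and `group()` come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchSpans {
    // Hypothetical helper: records start, end and emptiness of every match.
    static List<String> spans(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            // For an empty match, start() and end() are the same index.
            lines.add("start=" + m.start() + " end=" + m.end()
                    + " empty=" + m.group().isEmpty());
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = "ab34ef";
        System.out.println("length=" + input.length());
        for (String line : spans("\\d*", input)) {
            System.out.println(line);
        }
    }
}
```

The last line reported is `start=6 end=6 empty=true`: position 6 equals the length of the string, so it is the gap after the final character rather than a character itself.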

If you want to find only digits, you might want to use this pattern: \d+
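As a quick check of that suggestion, here is a minimal sketch using the same input as the question. The class name `DigitsOnly` and the helper `findDigitRuns` are placeholders chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitsOnly {
    // Hypothetical helper: collects "start (group)" for each run of one or more digits.
    static List<String> findDigitRuns(String input) {
        // \d+ requires at least one digit, so it cannot match the empty string.
        Matcher m = Pattern.compile("\\d+").matcher(input);
        List<String> runs = new ArrayList<>();
        while (m.find()) {
            runs.add(m.start() + " (" + m.group() + ")");
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(findDigitRuns("ab34ef")); // prints [2 (34)]
    }
}
```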

Andrei Nicusan
1

\d* matches zero or more digits in the expression.

The expression ab34ef has the corresponding indices 012345.

At index 0 there are no digits, so the match is empty: start() prints 0 and group() prints nothing. The same happens at index 1, which prints 1 and nothing. At index 2 there is a real match, so it prints 2 and 34. Next it prints 4 and nothing, and so on.

Another example:

Pattern pattern = Pattern.compile("\\d\\d");
Matcher matcher = pattern.matcher("123ddc2ab23");
while(matcher.find()) {
    System.out.println("start:" + matcher.start() + " end:" + matcher.end() + " group:" + matcher.group() + ";");
}

which prints:

start:0 end:2 group:12;
start:9 end:11 group:23;

You will find more information in the tutorial

piobab