I have to rewrite part of an existing C#/.NET program in Java. I'm not that fluent in Java and am stuck on handling regular expressions, so I'd like to know whether I'm missing something or whether Java simply doesn't provide this feature.

I have data like

2011:06:05 15:50\t0.478\t0.209\t0.211\t0.211\t0.205\t-0.462\t0.203\t0.202\t0.212

The Regex pattern I'm using looks like:

^(\d{4}:\d{2}:\d{2} \d{2}:\d{2}[:\d{2}]?)\t((-?\d*(\.\d*)?)\t?){1,16}

In .NET I can access the individual values after matching using match.Groups[3].Captures[i].

In Java I haven't found anything like that. matcher.group(3) just returns an empty string.

How can I achieve a behaviour like the one I'm used to from C#?

ekad
signpainter
  • Should be the same. Compile your Pattern, match, that's it. Java regex is just normal regex. I'm tempted to say "check your input" etc. – Bohemian Jun 05 '11 at 14:18
  • The problem is that when a group is repeated, as in `(something)*` or `(something){1,16}`, Java only returns the LAST item matched by that group. So I guess in your case, group 1 is `2011:06:05 15:50` and group 2 is `0.212` and that's that. – toto2 Jun 05 '11 at 14:23
  • @toto: Exactly, that's the problem. Additionally, I can put parentheses around the repeated group, like `((something){1,16})`. Then I get the whole string that `(something){1,16}` matches, i.e. `0.478\t0.209\t0.211\t0.211\t0.205\t-0.462\t0.203\t0.202\t0.212`. But then I have to split the resulting string again. I don't think that's the right way to go... – signpainter Jun 05 '11 at 14:28
  • see my answer. It is I think the only way to go. But I'd like to be proven wrong and see someone coming up with a more elegant answer. – toto2 Jun 05 '11 at 14:31

1 Answer

As I mentioned in the comments, Java only returns the last value matched by a repeated group. So you should first use a regex to isolate the trailing part of your string that contains the values:

String strg = "0.478\t0.209\t0.211\t0.211\t0.205\t-0.462\t0.203\t0.202\t0.212";

and then just split around the tabs:

String[] values = strg.split("\\t");
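
Putting both steps together, a minimal sketch might look like the following. The names `RepeatedGroupDemo`, `LINE`, and `extractValues` are made up for illustration. The pattern is adapted from the question rather than copied: the repeated part is wrapped in one outer capturing group, the inner groups are made non-capturing, and `[:\d{2}]?` is replaced by `(?::\d{2})?` on the assumption that optional seconds were intended.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Group 1: the timestamp (seconds optional, which is an assumption).
    // Group 2: the whole run of tab-separated numbers, captured as ONE string,
    //          because Java keeps only the last iteration of a repeated group.
    private static final Pattern LINE = Pattern.compile(
        "^(\\d{4}:\\d{2}:\\d{2} \\d{2}:\\d{2}(?::\\d{2})?)\\t"
        + "((?:-?\\d*(?:\\.\\d*)?\\t?){1,16})");

    // Returns the individual values, or an empty array if the line does not match.
    static String[] extractValues(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return new String[0];
        }
        // Second step: split the captured run around the tabs.
        return m.group(2).split("\\t");
    }

    public static void main(String[] args) {
        String line = "2011:06:05 15:50\t0.478\t0.209\t0.211\t0.211\t0.205"
            + "\t-0.462\t0.203\t0.202\t0.212";
        String[] values = extractValues(line);
        System.out.println(values.length + " values: " + Arrays.toString(values));
    }
}
```

For the sample line from the question this should yield nine strings, from `0.478` through `0.212`, which can then be indexed much like `Captures[i]` in .NET.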

toto2
  • Yeah, that's my temporary solution so far. Also waiting for a better solution... ;) – signpainter Jun 05 '11 at 14:34
  • @signpainter: .NET and Perl 6 are the only regex implementations that currently support captures of repeated groups. So there is no alternative to this solution if you're using Java. – Tim Pietzcker Jun 05 '11 at 14:59