I'm trying to separate two groups from a line formatted as:

FIRST - GrouP              second.group.txt

The first group can contain any character. The second group is a dot (.) delimited string.

I'm using the following regex to separate these two groups:

([A-Z].+).*?([a-z]+\.[a-z]+)

However, it gives a wrong result:

1: FIRST - GrouP second.grou
2: p.txt

I don't understand why, because I'm using the non-greedy separator (.*?) instead of the greedy one (.*).

What am I doing wrong here?

Thanks

Guy Coder
user2492270

2 Answers

You can use this regex to match both groups:

\b([A-Z].+?)\s*\b([a-z]+(?:\.[a-z]+)+)\b

Breakup:

\b               # word boundary
([A-Z].+?)       # match [A-Z] followed by 1 or more chars (lazy)
\s*              # match 0 or more spaces
\b               # word boundary
([a-z]+          # match 1 or more of [a-z] chars
(?:\.[a-z]+)+)   # match 1 or more groups of a dot followed by 1 or more [a-z] chars
\b               # word boundary

PS: (?:...) is a non-capturing group.
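To see the difference, here is a minimal sketch in Python (an assumption, since the question does not name a language) that runs both patterns on the sample line. In the original pattern the greedy `.+` inside the first group runs to the end of the line and backtracks only far enough for the second group to match, so the lazy `.*?` between them never gets a chance to help. The variable names are illustrative only.

```python
import re

text = "FIRST - GrouP              second.group.txt"

# Pattern from the question: group 1 ends with a greedy `.+`, which grabs
# the whole line and gives back only what group 2 minimally needs.
original = re.compile(r"([A-Z].+).*?([a-z]+\.[a-z]+)")

# Pattern from this answer: group 1 is lazy, and group 2 must start at a
# word boundary and may contain several dot-separated parts.
fixed = re.compile(r"\b([A-Z].+?)\s*\b([a-z]+(?:\.[a-z]+)+)\b")

original_groups = original.search(text).groups()
fixed_groups = fixed.search(text).groups()

print(original_groups)  # second group is only 'p.txt'
print(fixed_groups)     # ('FIRST - GrouP', 'second.group.txt')
```

The word boundary before the second group matters here: without it, the lazy first group could stop inside a word such as "GrouP" as soon as lowercase letters follow.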

anubhava

This is one possible solution that should be pretty compact:

(.*?-\s*\S+)|(\S+\.?)+

https://regex101.com/r/iW8mE5/1

It looks for anything followed by a dash, zero or more whitespace characters, and then non-whitespace characters. If it doesn't find that, it looks for non-whitespace characters followed by an optional dot.
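Because this pattern is an alternation, the two parts come back as two separate matches rather than as two groups of a single match. A minimal sketch in Python (the language is an assumption, and the variable names are illustrative) that collects them:

```python
import re

text = "FIRST - GrouP              second.group.txt"
pattern = re.compile(r"(.*?-\s*\S+)|(\S+\.?)+")

# Each match is one part of the line: the first alternative catches the
# text up to and including the word after the dash, the second alternative
# catches the remaining run of non-whitespace characters.
parts = [m.group(0) for m in pattern.finditer(text)]

print(parts)  # ['FIRST - GrouP', 'second.group.txt']
```

Note that this relies on the first part containing a dash, as it does in the sample line. A first part without one would be split on whitespace instead.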

lintmouse