
I am trying to write a regex that matches the text between commas in CSV-like text.

Example text:

192.168.0.1,London,19.11.2018

Expected output:

London

In other words: find the nth occurrence of a comma and capture the text up to the next comma.

How can I match the other occurrences?

For example, `192.168.0.1` or `19.11.2018`.

I can't just split the text. I can't use any programming language, just regex.

HowToGo
  • Use [`^[^,]+,([^,]+)`](https://regex101.com/r/WyrMpB/1/), but there are a couple of duplicates, really. Alternatively, just split on the `,`. – Jan Nov 19 '18 at 12:06
  • How can I catch the 192.168.0.1 or 19.11.2018 with it? – HowToGo Nov 19 '18 at 12:07
  • I can't use JavaScript and split it – HowToGo Nov 19 '18 at 12:11
  • Not a duplicate. Read the question properly. Question that was marked as the existing duplicate is NOT THE SAME, it asks "I want the 3rd occurrence of a pattern." This question asks "I want to split on commas, and select nth match". DIFFERENT. GOD people that mark as dupe really rack me off. Just answer the f'kin question – Dan Rayson Nov 19 '18 at 12:13
  • How can we mark or downvote person marking questions as duplicates? – HowToGo Nov 19 '18 at 12:18
  • @DanRayson: See, I did provide an answer in the comments section plus added two other answers to reflect the underlying principle. Of course, you could ask the same question 1000 times but the answer remains the same - hence a duplicate. – Jan Nov 19 '18 at 12:29
  • @HowToGo: You could vote to reopen the question. Of course, you could very well downvote other questions and answers from me which would be a bit unfair but certainly possible. – Jan Nov 19 '18 at 12:30
  • Here's an answer, shocking I know: `[,]?([^,]*)[,]?` This regex will get stuff from between commas and return each as a match. Tested on https://regex101.com/ against `192.168.0.1,London,19.11.2018` It basically says "match everything that's not a comma, that may or may not be between commas". – Dan Rayson Nov 19 '18 at 12:35
  • `^(?:[^,]*\,){1}([^,]*)` this works pretty well – HowToGo Nov 19 '18 at 12:44
  • @HowToGo According to my tester tool, your version misses off the last value from your CSV. Maybe works for you, but my tester says "Nope" :) – Dan Rayson Nov 19 '18 at 12:48
  • @Jan Thanks for pointing out needless brackets. Blind leading the blind here. – Dan Rayson Nov 19 '18 at 12:49

2 Answers


The following regex, shown here in Python, should do it:

import re


def main():
    '''The Main'''
    data = '192.168.0.1,London,19.11.2018'

    print(re.match(r'^([^,]+,){0}([^,]+),?([^,]+,?)*', data).group(2))
    print(re.match(r'^([^,]+,){1}([^,]+),?([^,]+,?)*', data).group(2))
    print(re.match(r'^([^,]+,){2}([^,]+),?([^,]+,?)*', data).group(2))


if __name__ == '__main__':
    main()

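Note the number in the curly braces {}: it is the zero-based index of the field you want (0, 1 and 2 above).

The number in .group(2) at the end should always stay 2, because the wanted field is always captured by the second group.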
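If it helps, here is a minimal sketch of the same idea in a trimmed-down form. It is a variant, not the exact pattern above: it drops the trailing part (which isn't needed to capture the field), uses a non-capturing group so the wanted field ends up in group 1, and uses `*` instead of `+` so empty fields are tolerated. The helper name `nth_field_pattern` is made up for illustration.

```python
import re


def nth_field_pattern(n):
    """Build an anchored pattern whose group 1 is the n-th (zero-based) field.

    Hypothetical helper, for illustration only.
    """
    # (?:[^,]*,){n} skips n complete fields together with their commas;
    # ([^,]*) then captures the next field, which may be empty.
    return r'^(?:[^,]*,){%d}([^,]*)' % n


data = '192.168.0.1,London,19.11.2018'

for n in range(3):
    match = re.match(nth_field_pattern(n), data)
    print(n, match.group(1))
```

Only the number inside the braces changes between fields, so outside of Python you would simply type that number into the pattern by hand.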

Krishna
  • I get London twice, for 1 and 2 indexes. – Dan Rayson Nov 19 '18 at 13:23
  • Can you please recheck. I just ran it. Below is the output `192.168.0.1 London 19.11.2018` – Krishna Nov 19 '18 at 13:25
  • Perhaps it's the regex flavour I'm using, but this regex (ugly I know) did what you described, yours didn't when using Regex101.com. `(?:(?:[,]{0,1})(?:[^,]*)){0}(?:[,]{0,1})([^,]*)(?:[,]{0,1})` – Dan Rayson Nov 19 '18 at 13:27
  • Yeah! Regex flavors behave differently in different languages. That's how they are designed. Guess people get an idea from our solutions. – Krishna Nov 19 '18 at 13:28
  • After seeing your regex, it gave me some ideas, so +1'ing this one cus I kinda stole from it :P – Dan Rayson Nov 19 '18 at 13:31

To achieve what you want, you could use a regex like this;

,?([^,]*),?

What that says in Englishish is "Whether I'm between commas or not, match all characters that are not commas."

My reasoning was that your CSV values are either at the start of the line, at the end of the line, or between commas. I've also allowed for blank values between commas (see the * inside the capturing group).
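For anyone who wants to check this outside of Regex101, here is a small sketch using Python's `re.findall`, which returns the capturing group of every match. Python is only an assumption here for demonstration; other regex flavours may report matches slightly differently.

```python
import re

data = '192.168.0.1,London,19.11.2018'

# Each match consumes an optional leading comma, one field (possibly
# empty), and an optional trailing comma; findall returns group 1 of
# every match, in order.
fields = re.findall(r',?([^,]*),?', data)

print(fields)
# ['192.168.0.1', 'London', '19.11.2018', '']
```

The trailing empty string is there because every part of the pattern is optional, so Python's `re` also reports one empty match at the very end of the input. The values you care about are the entries before it.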

EDIT:

After seeing you can ONLY use regex, with no looping structures allowed, I (actually mostly @Krishna) came up with this one. It returns the nth value in a CSV line.

(?:(?:[,]{0,1})(?:[^,]*)){XXX}(?:[,]{0,1})([^,]*)(?:[,]{0,1})

Replace the XXX in {XXX} with whichever N you want, zero-based.

It's ugly, but it works. I'm sure you could shorten it down yourself ^^.

Tested on Regex101.com.
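As a quick sanity check away from Regex101, this sketch substitutes the index into the pattern above and runs it with Python's `re.search`. The names `TEMPLATE` and `nth_value` are made up for illustration, and Python is just an assumption for the demo; the regex itself is unchanged.

```python
import re

# The pattern from the answer, with XXX left as a placeholder for N.
TEMPLATE = r'(?:(?:[,]{0,1})(?:[^,]*)){XXX}(?:[,]{0,1})([^,]*)(?:[,]{0,1})'


def nth_value(text, n):
    """Return the n-th (zero-based) value, or None if nothing matches.

    Hypothetical helper, for illustration only.
    """
    pattern = TEMPLATE.replace('XXX', str(n))
    match = re.search(pattern, text)
    return match.group(1) if match else None


data = '192.168.0.1,London,19.11.2018'

for n in range(3):
    print(n, nth_value(data, n))
```

The pattern is not anchored, so this relies on the search starting at the beginning of the line; adding a leading `^` would make that explicit.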

Dan Rayson
  • It is not what I want to achieve – HowToGo Nov 19 '18 at 13:10
  • @HowToGo Is it not? This gives you a list of matches, then you can loop over them in whatever code called the Regex, selecting the nth one. ... I just realised you can ONLY use Regex. hmm... Will repost again soon – Dan Rayson Nov 19 '18 at 13:17
  • I can't loop. If I could use a programming language, I would split by commas and then just go with array[25] or something. I can't use a programming language, only regex. – HowToGo Nov 19 '18 at 13:20