2

I am attempting to write a regex to validate user input (ASP.NET, C#) that must satisfy the following conditions:

- single digits in the range 1-6
- comma separated, but list should not begin or end with a comma
- digits cannot be repeated
- digits should be in ascending order

For example:

- 1,2,3,4,5,6   - valid
- 2,5,6         - valid
- 4             - valid
- 2,5,6,        - invalid
- 3,6,5         - invalid
- 2,2,5,6       - invalid

So far I've got:

^((1,)?(2,)?(3,)?(4,)?(5,)?(6)?)$

The issue with this is that the numbers 1-5 must each be followed by a comma, which is wrong when one of them is the last (or only) number entered.

Ant20
  • 3
    That can be easily done without a regex. A regex for this will be unreadable and too long. – Wiktor Stribiżew May 13 '16 at 10:28
  • 1
    @Wiktor Stribiżew. Thanks for the response. Are you suggesting that I just perform the check in the C# code? Also, whether or not regex is the most efficient option, I would still like to know how it would be done in regex, even if it is just for learning purposes. – Ant20 May 13 '16 at 10:32
  • 1
    No, I am not going to spend some 40 minutes on a pattern no one is going to use. The point is you need to capture each digit you have and then use a conditional construct to see what matched. Or take a brute force approach and list all alternatives. Regex is not meant for such tasks. – Wiktor Stribiżew May 13 '16 at 10:35
  • Repetition checks and sorting are better done via a programming language. – CinCout May 13 '16 at 10:37
  • @CinCout But he said he *wants* this as regex for learning purposes, not because it is the correct way of doing the task – Alfie Goodacre May 13 '16 at 10:38
  • I was just making a suggestion. – CinCout May 13 '16 at 10:39
  • I'm happy to implement the check in C# code (which I am more familiar with anyway), but knowing how this would be done in regex would still be valuable for learning purposes. I'm still learning regex, so I am not exactly clear on what tasks it is best suited to. – Ant20 May 13 '16 at 11:04

4 Answers

2

You can use \b to ensure that you are at a word boundary, and ,? to allow a comma or no comma. This results in the following working, albeit quite long, pattern:

^((1)?(\b,?2)?(\b,?3)?(\b,?4)?(\b,?5)?(\b,?6)?)$
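
A small C# harness can be used to try the pattern above against the examples from the question. This is only a sketch: the class and method names (BoundaryPatternCheck, IsValid) are invented for illustration. One assumption is made explicit in the code: since every group in the pattern is optional, an empty string also matches, so the helper rejects empty input separately.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class BoundaryPatternCheck
{
    // Pattern from the answer: \b stops two digits from sitting side by side,
    // and ,? allows at most one comma before each digit.
    private static readonly Regex Pattern =
        new Regex(@"^((1)?(\b,?2)?(\b,?3)?(\b,?4)?(\b,?5)?(\b,?6)?)$");

    public static bool IsValid(string input)
    {
        // Every group in the pattern is optional, so "" would match;
        // reject empty input explicitly (an assumption about the requirements).
        return !string.IsNullOrEmpty(input) && Pattern.IsMatch(input);
    }
}
```

With this helper, BoundaryPatternCheck.IsValid("2,5,6") is true, while "2,5,6,", "3,6,5", "2,2,5,6" and "12" are all rejected.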
Alfie Goodacre
  • 1
    Nice, just change `*` to `?` to stop `2,,,,,,5,6` from passing. `^((1)?(\b,?2)?(\b,?3)?(\b,?4)?(\b,?5)?(\b,?6)?)$` – SamWhan May 13 '16 at 10:40
  • 1
    That's great, thank you! I'm still in the early stages of learning regex and have not come across the \b symbol before so I'm going to look into it a bit more. Thanks again :) – Ant20 May 13 '16 at 11:01
0

(This is a silly answer)

Given that there are six values, and each can be either present or not present, there are 2^6 = 64 possible correct values; except that I'm guessing we want to exclude the possibility of no numbers at all being present, so there are only 63 possible correct values. This regex allows them and only them:

^(6|5|5,6|4|4,6|4,5|4,5,6|3|3,6|3,5|3,5,6|3,4|3,4,6|3,4,5|3,4,5,6|2|2,6|2,5|2,5,6|2,4|2,4,6|2,4,5|2,4,5,6|2,3|2,3,6|2,3,5|2,3,5,6|2,3,4|2,3,4,6|2,3,4,5|2,3,4,5,6|1|1,6|1,5|1,5,6|1,4|1,4,6|1,4,5|1,4,5,6|1,3|1,3,6|1,3,5|1,3,5,6|1,3,4|1,3,4,6|1,3,4,5|1,3,4,5,6|1,2|1,2,6|1,2,5|1,2,5,6|1,2,4|1,2,4,6|1,2,4,5|1,2,4,5,6|1,2,3|1,2,3,6|1,2,3,5|1,2,3,5,6|1,2,3,4|1,2,3,4,6|1,2,3,4,5|1,2,3,4,5,6)$

Please don't actually use this. You will make us both look bad.
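
For what it is worth, the 63 alternatives do not have to be typed by hand. As a rough sketch (the class and method names below are invented for illustration), each number from 1 to 63 can be read as a bitmask saying which of the digits 1-6 are present. The alternatives come out in a different order from the pattern above, which should not matter because the whole alternation is anchored with ^ and $.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class SubsetPatternBuilder
{
    // Builds "^(1|2|1,2|3|...)$" by treating each mask 1..63 as a set of digits.
    public static string Build()
    {
        var alternatives = new List<string>();
        for (int mask = 1; mask < 64; mask++)
        {
            // Bit (d - 1) set means digit d is present; Range yields the
            // digits in ascending order, so each alternative is sorted.
            var digits = Enumerable.Range(1, 6)
                                   .Where(d => (mask & (1 << (d - 1))) != 0);
            alternatives.Add(string.Join(",", digits));
        }
        return "^(" + string.Join("|", alternatives) + ")$";
    }

    public static bool IsValid(string input)
    {
        return Regex.IsMatch(input, Build());
    }
}
```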

AakashM
0

Non-regex version. Simple and precise

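// Requires: using System.Collections.Generic; using System.Linq;
string str = ",1,  2,3, 4, 5, 6";
bool valid = false;
// The input is invalid if any piece fails to parse as an integer
// (for example, the empty piece produced by a leading comma).
var invalidString = str.Split(',').Any(p =>
{
    int num = 0;
    return !int.TryParse(p, out num);
});
if (!invalidString)
{
    List<int> list = str.Split(',').Select(p => int.Parse(p)).ToList();
    // Each value must be a single digit in the range 1-6.
    var inRange = list.All(n => n >= 1 && n <= 6);
    var sorted = list.SequenceEqual(list.OrderBy(p => p));
    var hasDuplicates = list.Count != list.Distinct().Count();
    valid = inRange && sorted && !hasDuplicates;
}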
M.S.
0

Strictly for learning purposes, break the problem down into pieces.

The input must be formed as a digit followed by zero or more comma-plus-digit pairs.

^\d(?:,\d)*$

There are only 6 digits and they must be in ascending order. So just list them and their intervening commas, making each one optional.

^1?,?2?,?3?,?4?,?5?,?6?$

The difficulty is that both of the above regular expressions must match at the same time. We can use a zero-width look ahead for one of them. It performs the match but does not "consume" any characters, so after it matches, the next piece of the regular expression starts at the same place as the look ahead did. A look ahead is written by wrapping an expression in (?= and ). This gives:

(?=^\d(?:,\d)*$)

Combining the two regular expressions gives the following:

(?=^\d(?:,\d)*$)^1?,?2?,?3?,?4?,?5?,?6?$
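
To see why both pieces are needed, the three patterns can be compared side by side in C#. This is a minimal sketch; the class and member names (LookaheadPatternCheck, Shape, Order, Combined, IsValid) are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class LookaheadPatternCheck
{
    // Shape only: one digit, then zero or more ",digit" pairs.
    public static readonly Regex Shape = new Regex(@"^\d(?:,\d)*$");

    // Order only: 1..6 ascending, with every digit and comma optional.
    public static readonly Regex Order = new Regex(@"^1?,?2?,?3?,?4?,?5?,?6?$");

    // Combined: the look ahead checks the shape without consuming input,
    // then the order pattern matches from the same starting position.
    public static readonly Regex Combined =
        new Regex(@"(?=^\d(?:,\d)*$)^1?,?2?,?3?,?4?,?5?,?6?$");

    public static bool IsValid(string input)
    {
        return Combined.IsMatch(input);
    }
}
```

On its own, Shape accepts "3,6,5" (well formed but out of order), and Order accepts "2," and "12" (in order but badly formed). Combined rejects all three, and it also rejects the empty string because the look ahead requires at least one digit.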
AdrianHHH
  • I didn't quite understand how the look ahead worked after your initial post but your update has made it much clearer - thanks! – Ant20 May 13 '16 at 11:36