I need to validate a RegEx pattern that is sent by the client. The pattern starts with alphanumeric characters, and the length of this starting string is defined by the number that follows it. Next comes a special character, which is always a single character. This is followed by another variable-length string of alphanumeric characters.

The closest I have come is the set of input strings and the format below.

[A-Za-z0-9]{4}-[A-Za-z0-9]{5}         - RegEx Input String

[A-Za-z0-9]{2}#[A-Za-z0-9]{6}         - RegEx Input String

[0-9]{3}#[0-9]{5}                     - RegEx Input String

[a-z]{5}#[a-z]{5}                     - RegEx Input String

[A-Z]{4}#[a-z]{4}                     - RegEx Input String

[\w]{\d{1,1}}(\S{1,1})[\w]{\d{1,1}}   - RegEx Format

Are the above patterns and format correct?

Can we validate the RegEx input string against the required RegEx format?

This is a web service which will receive an input such as [A-Za-z0-9]{4}-[A-Za-z0-9]{5}. I need two things here. First, how do I validate this input to see if it matches the format I want, and then proceed? The format is the one I mentioned above as the RegEx format.

Tiny
  • Can you give an example of a String? I didn't get the last sentence of your question. – Sujal Mandal Jan 24 '17 at 10:28
  • Is it meta regex ? – Igoris Azanovas Jan 24 '17 at 10:29
  • This is a web service which will have an input as [A-Za-z0-9]{4}-[A-Za-z0-9]{5}. I need two things here. First, how do I validate this input to see if it matches the format I want and the proceed. The format is the one I mentioned above as RegEx format in my post. – Tiny Jan 24 '17 at 10:31
  • Your meta-regex is incorrect, for example it doesn't describe literal brackets. I'm pretty sure there is an easier solution to your problem than regex validation, would you care to explain why you want to do that? – Aaron Jan 24 '17 at 10:43
  • Here's a regex that matches your input example : `\[[\w-]+\](?:\{\d(?:-\d)?\})?-\[[\w-]+\](?:\{\d(?:-\d)?\})?` Of course with only one example I'm far from sure this is actually what you need. – Aaron Jan 24 '17 at 10:50
  • [\w]{\d{1,1}}(\S{1,1})[\w]{\d{1,1}} seems inconsistent with what you specified in your question. First, in [\w]{\d{1,1}} you fix \d{1,1}, the digit that specifies your string's length, to a single digit from 0 to 9, so your string can't be longer than 9 chars. Second, your question makes no mention of the last part, \d{1,1}, at the end of your RegEx format. It would be better if you could show us multiple inputs and explain what you are trying to do. – Sujal Mandal Jan 24 '17 at 11:01
  • @Aaron - Your pattern looks good, but it seems to miss the special character at the center. Any one special character can be included. Here is the modified version of the RegEx pattern; is this OK? \[[\w-]+\](?:\{\d(?:-\d)?\})?(\S)?\[[\w-]+\](?:\{\d(?:-\d)?\})? – Tiny Jan 24 '17 at 11:52
  • @Sujata - Yes the length of the String will not be greater than 9 .. I have edited the post with diff input strings .. – Tiny Jan 24 '17 at 11:54
  • @Aaron - There are some modifications in the Input String.. Could you pls check my post which I have edited – Tiny Jan 24 '17 at 11:56
  • @User1 how about `\[[\w-]+\](?:\{\d\})?\S\[[\w-]+\](?:\{\d\})?` ? – Aaron Jan 24 '17 at 13:13
  • A few notes though : `\w` also contains `_` and `\S` much more than special chars, so `[_]A[_]` would match. You might want to use `[0-9a-zA-Z]` instead. It only handles `{n}` quantifiers, no `*`, `+` or `{m,n}`, which could however be easily added as an alternation in the optional groups – Aaron Jan 24 '17 at 13:17
  • @Aaron : Thanks Aaron .. This worked : \[[A-Za-z0-9-]+\](?:\{\d\})\S\[[A-Za-z0-9-]+\](?:\{\d\}). But any idea why does it include hyphen too. Because the input : "[0--]{3}-[A-Za-z0-9]{5}" should not work as it contains a hyphen .. – Tiny Jan 24 '17 at 13:57
  • @Aaron- you may put your answer in the answer section so that i can mark it as your answer and upvote it .. – Tiny Jan 24 '17 at 13:58
  • @User1 concerning your regex, your character classes have a trailing `-` which you should remove to solve your problem. – Aaron Jan 24 '17 at 14:03
  • @Aaron, whenever I try to remove the trailing - the pattern no longer matches: \[[A-Za-z0-9]+\](?:\{\d\})\S\[[A-Za-z0-9]+\](?:\{\d\}) – Tiny Jan 24 '17 at 14:45
  • Nevermind removing the `-`, I had initially put it here for a good reason. Here, try `\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\](?:\{\d\})?\S\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\](?:\{\d\})?`, which specifies that a character class can contain a sequence of character and character ranges – Aaron Jan 24 '17 at 14:57
  • @Aaron - Thanks a lot.. Your help is highly appreciated .. – Tiny Jan 24 '17 at 16:12
  • @User1 You're welcome, I've added an answer which explains the regex and adds a few notes. – Aaron Jan 24 '17 at 16:26

1 Answer

This regular expression should match the subset of regular expressions you're interested in:

\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\](?:\{\d\})?\S\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\](?:\{\d\})?

Let's break it down:

  • it matches a line which contains, in sequence, a character class, an optional quantifier, a separator, a second character class and its optional quantifier
  • the separator is any non-whitespace character, \S (you might want to change that to something more specific, or to something which includes some whitespace)
  • the optional quantifier is easy: it's a digit surrounded by literal curly brackets, the whole enclosed in a non-capturing group that we make optional: (?:\{\d\})?. Note that this accepts neither multi-digit lengths (you might want to change the \d to \d+ for that) nor the more specific {m,n} range quantifier.
  • a character class is a sequence of 0 or more characters or character ranges, enclosed in literal brackets.
  • a character is a letter or digit: [a-zA-Z0-9](?:-[a-zA-Z0-9])? when the non-capturing group isn't matched
  • a character range is a character followed by the literal - followed by another character: [a-zA-Z0-9](?:-[a-zA-Z0-9])? when the non-capturing group is matched
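
The breakdown above can be sketched in Java, which the thread appears to be using. This is only an illustration under assumptions: the class name `MetaRegexValidator` and the methods `isValidFormat` and `matchesValue` are hypothetical, and the value `AB12-XY345` is a made-up test string. The pattern text is first checked against the meta-regex and only then compiled and applied.

```java
import java.util.regex.Pattern;

public class MetaRegexValidator {

    // One character class: literal brackets around single characters or ranges such as A-Z.
    private static final String CHAR_CLASS =
            "\\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\\]";

    // Optional single-digit quantifier such as {4}.
    private static final String QUANTIFIER = "(?:\\{\\d\\})?";

    // Full meta-regex: class, quantifier, one non-whitespace separator, class, quantifier.
    private static final Pattern FORMAT =
            Pattern.compile(CHAR_CLASS + QUANTIFIER + "\\S" + CHAR_CLASS + QUANTIFIER);

    /** Returns true when the whole client-supplied pattern text has the expected shape. */
    public static boolean isValidFormat(String clientPattern) {
        return FORMAT.matcher(clientPattern).matches();
    }

    /** Validates the pattern text first, then uses it to check a value (hypothetical helper). */
    public static boolean matchesValue(String clientPattern, String value) {
        if (!isValidFormat(clientPattern)) {
            throw new IllegalArgumentException("Unsupported pattern: " + clientPattern);
        }
        return Pattern.matches(clientPattern, value);
    }

    public static void main(String[] args) {
        String[] samples = {
            "[A-Za-z0-9]{4}-[A-Za-z0-9]{5}",
            "[A-Za-z0-9]{2}#[A-Za-z0-9]{6}",
            "[0-9]{3}#[0-9]{5}",
            "[a-z]{5}#[a-z]{5}",
            "[A-Z]{4}#[a-z]{4}",
            "[0--]{3}-[A-Za-z0-9]{5}"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValidFormat(sample));
        }
        System.out.println(matchesValue("[A-Za-z0-9]{4}-[A-Za-z0-9]{5}", "AB12-XY345"));
    }
}
```

Because the separator is `\S`, a pattern whose separator is itself a regex metacharacter (for example `.` or `(`) would pass the format check but then behave unexpectedly, or fail to compile, in `matchesValue`. Restricting the separator to a known set avoids that.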
Aaron
  • Hello Aaron, Could you please include one more possibility where we can include only a bunch of chars at the center instead of all chars (\S) and this char length should be one always.. – Tiny Jan 25 '17 at 08:43
  • This has been done: "\\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\\](?:\\{\\d\\})?[-:,'=^!%>ç`¦$£§;_]{1}\\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\\](?:\\{\\d\\})?"; – Tiny Jan 25 '17 at 09:16
  • @User1 I think I had a typo in this answer. ```\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\](?:\{\d\})?[-:,'=^!%>ç`¦$£§;_]\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\](?:\{\d\})?``` should work fine. Check out this [regex101](https://regex101.com/r/fxNdIt/1) (I had to add `#` to your character class to make it work with the samples you provided) – Aaron Jan 25 '17 at 13:48
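
As a rough sketch of the restricted-separator variant discussed in these comments, the Java below swaps `\S` for an explicit character class. Assumptions: the class name `SeparatorFormatValidator` is hypothetical, and to keep the source file plain ASCII the separator set is only the ASCII subset of the characters listed above, plus `#`. The non-ASCII characters (ç, ¦, £, §) are deliberately left out here.

```java
import java.util.regex.Pattern;

public class SeparatorFormatValidator {

    // Same building blocks as the answer's regex.
    private static final String CHAR_CLASS =
            "\\[(?:[a-zA-Z0-9](?:-[a-zA-Z0-9])?)*\\]";
    private static final String QUANTIFIER = "(?:\\{\\d\\})?";

    // Assumed ASCII-only subset of the allowed separators; the leading '-' is a literal hyphen.
    private static final String SEPARATOR = "[-:,'=^!%>`$;_#]";

    private static final Pattern FORMAT = Pattern.compile(
            CHAR_CLASS + QUANTIFIER + SEPARATOR + CHAR_CLASS + QUANTIFIER);

    /** Returns true when the pattern text uses exactly one allowed separator between the classes. */
    public static boolean isValidFormat(String clientPattern) {
        return FORMAT.matcher(clientPattern).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidFormat("[0-9]{3}#[0-9]{5}"));
        System.out.println(isValidFormat("[0-9]{3}.[0-9]{5}"));
    }
}
```

A character class already matches exactly one character, so the `{1}` from the earlier version is not needed.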