Java Regex with "Joker" characters

Question

I am trying to write a regex that validates an input field. What I call "joker" chars are '?' and '*'. Here is my Java regex:

"^$|[^\\*\\s]{2,}|[^\\*\\s]{2,}[\\*\\?]|[^\\*\\s]{2,}[\\?]{1,}[^\\s\\*]*[\\*]{0,1}"

What I'm trying to match is:

Minimum 2 alpha-numeric characters (other than '?' and '*')
The '*' can only appear once, and only at the end of the string
The '?' can appear multiple times
No whitespace at all

So for example :

abcd = OK

?bcd = OK

ab?? = OK

ab* = OK

ab?* = OK

??cd = OK

*ab = NOT OK

??? = NOT OK

ab cd = NOT OK

 abcd = NOT OK (space at the beginning)

I've made the regex a bit complicated and now I'm lost. Can you help me?

You can use a website such as https://regexr.com/ to test your regex and see what's wrong — Hearner, Aug 10 '18 at 09:37
Try `^(?:\?*[a-zA-Z\d]){2}[^\s*]*\*?$`. See live demo here https://regex101.com/r/XgqAej/1 — revo, Aug 10 '18 at 09:49
Sidenote: those "joker" chars are actually called [*wildcards*](https://en.wikipedia.org/wiki/Wildcard_character) — Lino, Aug 10 '18 at 09:54

Sweeper · Accepted Answer · 2018-08-10T10:34:22.997

4

^(?:\?*[a-zA-Z\d]\?*){2,}\*?$

Explanation:

The regex asserts that this pattern must appear twice or more:

\?*[a-zA-Z\d]\?*

which requires one character from the class [a-zA-Z\d], with any number of question marks (zero or more) on either side of it.

Then the regex matches \*?, which means zero or one asterisk character, at the end of the string.
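To check the pattern against the examples from the question, a minimal Java sketch could look like the following. The class name WildcardValidator and the method isValid are made up for illustration and are not part of the answer; note that every backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class WildcardValidator {
    // The pattern from the answer, with backslashes doubled for Java.
    private static final Pattern VALID =
            Pattern.compile("^(?:\\?*[a-zA-Z\\d]\\?*){2,}\\*?$");

    // matches() tests the whole input, so ^ and $ are redundant but harmless.
    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The examples listed in the question.
        String[] samples = {"abcd", "?bcd", "ab??", "ab*", "ab?*", "??cd",
                            "*ab", "???", "ab cd", " abcd"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

With this pattern the first six samples should be accepted and the last four rejected, which is what the question asks for.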

Demo

Here is an alternative regex that is faster, as revo suggested in the comments:

^(?:\?*[a-zA-Z\d]){2}[a-zA-Z\d?]*\*?$
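As a quick sanity check that the alternative accepts and rejects the same inputs as the first pattern, here is a small sketch. The class and method names are hypothetical, and it only compares results on a handful of strings; it does not measure speed.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class; names are illustrative only.
final class PatternComparison {
    // First pattern from the answer: the whole group repeats 2 or more times.
    private static final Pattern REPEATED_GROUP =
            Pattern.compile("^(?:\\?*[a-zA-Z\\d]\\?*){2,}\\*?$");
    // Alternative: the group runs exactly twice, then a plain character class.
    private static final Pattern SHORT_GROUP =
            Pattern.compile("^(?:\\?*[a-zA-Z\\d]){2}[a-zA-Z\\d?]*\\*?$");

    static boolean matchesFirst(String input) {
        return REPEATED_GROUP.matcher(input).matches();
    }

    static boolean matchesAlternative(String input) {
        return SHORT_GROUP.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"abcd", "?bcd", "ab??", "ab*", "ab?*", "??cd",
                            "a?b", "*ab", "???", "ab cd", " abcd", "a?"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" first=" + matchesFirst(s)
                    + " alternative=" + matchesAlternative(s));
        }
    }
}
```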

Demo

edited Aug 10 '18 at 10:34

answered Aug 10 '18 at 09:51

Sweeper


This is the right answer but I'd argue that there is no need to go through the whole group after second alphanumeric character is found. A bit modified version from my comment above: `^(?:\?*[a-zA-Z\d]){2}[a-zA-Z\d?]*\*?$` – revo Aug 10 '18 at 10:03
I made a mistake in my question regarding whitespace: I want whitespace to be allowed, except directly before or after a "wildcard" character. Like so: ab cd = OK | a? cd = NOT OK | ab ?cd = NOT OK (same with '*') – Matt Zdj Aug 14 '18 at 08:23
Your edit invalidates most of the answers here. This behaviour is not encouraged. I have rolled back the edit. You can post a new question. @MattZdj – Sweeper Aug 14 '18 at 09:08
OK, I'll post a new question – Matt Zdj Aug 14 '18 at 09:11

Nikolas Charalambidis · Answer 2 · 2018-08-10T09:58:44.553

0

Here you go:

^\?*\w{2,}\?*\*?(?<!\s)$

It is both described and demonstrated at Regex101.

^ is the start of the String
\?* indicates any number of initial ? characters (must be escaped)
\w{2,} at least 2 word characters (letters, digits, or underscore)
\?* continues with any number of ? characters
\*? and optionally one last * character
(?<!\s) asserts that the character just before the end is not a \s whitespace character (using a negative look-behind)
$ is the end of the String
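A small sketch to try this pattern in Java follows; the class and method names are hypothetical. Because \w{2,} needs two adjacent word characters, an input such as a?b is rejected, as the comments on this answer point out. Since \w also matches the underscore in Java, a_b is accepted.

```java
import java.util.regex.Pattern;

// Hypothetical class for trying out the pattern from this answer.
final class WordCharPatternCheck {
    private static final Pattern PATTERN =
            Pattern.compile("^\\?*\\w{2,}\\?*\\*?(?<!\\s)$");

    static boolean isValid(String input) {
        return PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "a?b" has no two adjacent word characters, so \w{2,} cannot match.
        // "a_b" matches because \w includes the underscore.
        String[] samples = {"abcd", "??cd", "ab?*", "a?b", "a_b", "ab cd"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```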

edited Aug 10 '18 at 09:58

answered Aug 10 '18 at 09:54

Nikolas Charalambidis


Op clarified that the two chars do not have to be adjacent: a?b would be ok as well. And there can only be 0..1 "*", and only at the end. – Malte Hartwig Aug 10 '18 at 09:57
The negative lookbehind is redundant. Also it doesn't match `a?b` – revo Aug 10 '18 at 10:06

Pshemo · Answer 3 · 2018-08-10T11:52:47.457

Another way to solve this problem is the look-ahead mechanism (?=subregex). It is zero-length (it resets the regex cursor to the position it had before executing subregex), so it lets the regex engine run multiple tests on the same text via the construct

(?=condition1)  
(?=condition2)
(?=...)
conditionN

Note: the last condition (conditionN) is not placed in (?=...), so that the regex engine can move the cursor past the tested part (to "consume" it) and go on to test whatever comes after it. For that to work, conditionN must match precisely the section we want to "consume" (the earlier conditions don't have that limitation; they can match substrings of any length, for example just the first few characters).

So now we need to think about what our conditions are.

We want to match only alphanumeric characters, ? and *, where * can appear (optionally) only at the end. We can write this as ^[a-zA-Z0-9?]*[*]?$. This also covers the no-whitespace requirement, because whitespace is not among the accepted characters.
The second requirement is to have "Minimum 2 alpha-numeric characters". It can be written as .*?[a-zA-Z0-9].*?[a-zA-Z0-9] or (?:.*?[a-zA-Z0-9]){2,} (if we like shorter regexes). Since that condition doesn't test the whole text but only some part of it, we can place it in a look-ahead.

The above conditions seem to cover everything we wanted, so we can combine them into a regex which can look like:

^(?=(?:.*?[a-zA-Z0-9]){2,})[a-zA-Z0-9?]*[*]?$
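Here is a minimal Java sketch of the combined look-ahead regex, run against the examples from the question. The class name LookaheadValidator and the method isValid are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical class illustrating the look-ahead approach.
final class LookaheadValidator {
    // (?=...) checks "at least 2 alphanumerics" without consuming input;
    // the rest consumes the allowed characters and an optional trailing *.
    private static final Pattern VALID =
            Pattern.compile("^(?=(?:.*?[a-zA-Z0-9]){2,})[a-zA-Z0-9?]*[*]?$");

    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"abcd", "?bcd", "ab??", "ab*", "ab?*", "??cd",
                            "*ab", "???", "ab cd", " abcd"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

For "ab cd" the look-ahead alone succeeds, since . also matches a space. The input is rejected only by the consuming part, which does not allow whitespace.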
