Regex to match opening parentheses not preceded by "]"

Question

I have pieces of text where normal markdown and a custom markdown extension are mixed together. This works quite well.

[My Link Title](http://example.com)

(extension: value attribute: value)

However, I have one problem: to apply styling while the text is being edited, I need a way to match the opening bracket of the extension snippet without matching the opening bracket of the markdown link.

In other words: I need a regular expression (that works in JavaScript) to match an opening bracket (and only the bracket) when it is

followed by [a-z0-9]+: and
not preceded by a ] character.

My current regular expression for that (which works well to match the extension tags opening brackets but unfortunately includes the markdown link opening brackets, too) looks like this: /\((?=[a-z0-9]+:)/i.

I have seen people use positive lookaheads with a negation at the beginning of the regular expression, like this: /(?=[^\]])\((?=[a-z0-9]+:)/i, to check for this in PHP. Unfortunately, this doesn't seem to work in JavaScript.
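A small check of why the leading lookahead has no effect in JavaScript: a lookahead inspects the characters that follow the current position, so (?=[^\]]) tests the bracket itself, which is never a ] character. The sketch below uses the link from the example above and only shows that both patterns match at the same place; the variable names are made up for illustration.

```javascript
// Pattern from the question and the variant with a leading lookahead.
const original = /\((?=[a-z0-9]+:)/i;
const withLeadingLookahead = /(?=[^\]])\((?=[a-z0-9]+:)/i;

const link = "[My Link Title](http://example.com)";

// The leading lookahead looks at "(" itself, which is never "]",
// so it cannot reject the bracket of a markdown link.
console.log(original.exec(link).index);             // 15
console.log(withLeadingLookahead.exec(link).index); // 15
```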

Update

Thanks for your tips!

The problem I'm having is that I'm creating a "Simple Mode" syntax mode for CodeMirror to apply the highlighting. This allows you to specify a regex and a token that will be applied to the matched characters, but doesn't allow any further operation on the matches. You could, however, write a full syntax mode where you can do these kinds of operations, but I'm not capable of that :-s

After all, I went with another solution. I just created two regular expressions:

Match all opening extension brackets with a preceding character other than "]":
/[^\]]\((?=[a-z0-9]+:)/i
Match all opening extension brackets without any preceding character:
/^\((?=[a-z0-9]+:)/i

Even though it isn't the cleanest possible way it seems to work quite well for now.
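A quick way to see how the two patterns behave on the sample snippets; the variable names are illustrative only. Note that the first pattern consumes the preceding character together with the bracket, so the match is two characters long.

```javascript
// The two patterns from the update above.
const afterOtherChar = /[^\]]\((?=[a-z0-9]+:)/i; // "(" preceded by any character except "]"
const atStart = /^\((?=[a-z0-9]+:)/i;            // "(" at the very start of the input

const link = "[My Link Title](http://example.com)";
const extension = "(extension: value attribute: value)";
const inline = "text (extension: value)";

// The markdown link is rejected by both patterns.
console.log(afterOtherChar.test(link)); // false
console.log(atStart.test(link));        // false

// An extension tag at the start of the input is caught by the second pattern.
console.log(atStart.test(extension));   // true

// Inside a line, the first pattern matches the space plus the bracket.
console.log(JSON.stringify(afterOtherChar.exec(inline)[0])); // " ("
```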

You are asking for a negative look-behind, which is unavailable in JS. Also, there is no known workaround for a case when you need both look-ahead and look-behind (like string reversing). You will have to match more than just a bracket and use capturing groups. Something like `/(^|[^\]])\((?=[a-z0-9]+)/i`. – Wiktor Stribiżew, Apr 29 '15 at 08:25
@stribizhev - That is generally true, but in this case you can probably reverse the string and use `\b\((?!\])`, which is pretty simple. – Kobi, Apr 29 '15 at 09:05
@Kobi: True, the only `]` is a non-word character. Great! So, there would be no solution if the look-ahead and look-behind were of variable width. – Wiktor Stribiżew, Apr 29 '15 at 09:22
@Kobi **Thanks for your hints!** Please see my updated question for the solution I used for now. – DieserJonas, Apr 29 '15 at 11:36

score 4 · Accepted Answer · edited May 23 '17 at 12:06

Using a skip and match trick:

\[[^\]]+\]\([^\)]+\)|(\(\b)

\[[^\]]+\]\([^\)]+\) - match []() links (you can also write \[.*?\]\(.*?\) if this is too confusing), OR -
(\(\b) - match and capture an opening parenthesis that is directly before an alphanumeric character.

Working example: https://regex101.com/r/tY9sS4/1

You would have to inspect the results and process only the matches where group $1 captured something, ignoring the other matches.
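A minimal sketch of that filtering step in JavaScript, assuming the text is available as a plain string. The helper name findExtensionBrackets is hypothetical and not part of any library.

```javascript
// Alternative 1 consumes whole [title](url) links without capturing anything;
// alternative 2 captures a "(" that is directly followed by a word character.
const skipAndMatch = /\[[^\]]+\]\([^\)]+\)|(\(\b)/g;

// Hypothetical helper: returns the index of every "(" captured by group 1.
function findExtensionBrackets(text) {
  const positions = [];
  for (const match of text.matchAll(skipAndMatch)) {
    // Link matches leave group 1 undefined, so they are skipped here.
    if (match[1] !== undefined) {
      positions.push(match.index);
    }
  }
  return positions;
}

const sample = "[My Link Title](http://example.com) (extension: value)";
console.log(findExtensionBrackets(sample)); // [ 36 ]
```

The link is swallowed by the first alternative, so its bracket is never offered to the second one; only the extension bracket at index 36 is reported.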

answered Apr 29 '15 at 08:40

Kobi

Nice solution! The only problem I'm having is that I'm creating a "Simple Mode" syntax mode for CodeMirror for the highlighting. This allows you to specify a regex and a token that will be applied to the matched characters, and doesn't allow any further operation on the matches. You could, however, write a full syntax mode where you can do these kinds of operations, but I'm not capable of that :-s – DieserJonas Apr 29 '15 at 11:24
@DieserJonas - Oh, that's a shame! You did say "match an opening bracket (and only the bracket)". You might have a hard time with the JavaScript regex engine, it is pretty weak... Thanks! – Kobi Apr 29 '15 at 11:29
@DieserJonas - Also, I don't know CodeMirror, but can you match `[...](...)` in one syntax rule before you are looking for `(`? Usually these tools have priorities. – Kobi Apr 29 '15 at 11:31
I just edited my original question with the solution I'm using for now. The problem is that this "Simple Mode" is basically a state machine that switches between states and applies token types based on regexes. This is kind of limited. – DieserJonas Apr 29 '15 at 11:44
If I would be able to write a full syntax mode, I would be able to use your solution quite well, I think. Unfortunately, that's a bit above my current skill level. I will probably revisit this issue some time in the future and make a nicer solution based on your answer. – DieserJonas Apr 29 '15 at 11:48
