I'm having a hard time with VS Code's Oniguruma regex parsing for TextMate grammars. Apparently you can't match a newline inside a lookahead. Oniguruma itself supports this, so it's probably not enabled in VS Code's version of Oniguruma.

I need to match the beginning of a string if, and only if, after element there is desiredAttr1="desiredValue1" or desiredAttr2="desiredValue2":

<element attribute="value" desiredAttr1="desiredValue1" desiredAttr2="desiredValue2">

So far so good, but these attributes can be in any order, and there can be newlines between them. E.g.:

<!-- Should match -->
<element
   attribute="value"
   desiredAttr1="desiredValue1"
   desiredAttr2="desiredValue2"
>

<!-- Should match -->
<element
   attribute="value"
   desiredAttr2="desiredValue2"
>

<!-- Should match -->
<element attribute="value" desiredAttr1="desiredValue1">

<!-- Should match -->
<element desiredAttr2="desiredValue2" attribute="value">

<!-- Should NOT match -->
<element
   attribute="value"
   notDesiredAttr1="desiredValue1"
   notDesiredAttr2="desiredValue2"
>

This is what I have so far (and it works on Rubular):

/(^[\t]+)?(?=<(?i:element)\b(?!-)[\s\w\W]*(?:((desiredAttr1="desiredValue1")|(desiredAttr2="desiredValue2"))))/

Note: I also tried replacing \s with [:space:] and [^/].

This is what I need to match:

<span style="background: red;">&nbsp;</span><code>&#60;element<br/>
&nbsp;&nbsp;attribute="value"<br/>
&nbsp;&nbsp;desiredAttr1="desiredValue1"<br/>
&nbsp;&nbsp;desiredAttr2="desiredValue2"<br/>
&#62;</code>

Is there any other alternative I could use? Thanks in advance.

ghaschel

1 Answer

Assuming there are no angle brackets in between, you could use:

^[\p{Zs}\t]*(?=<element\b[^<>\r]*\bdesiredAttr([12])="desiredValue\1"[^<>\r]*>)

The pattern matches:

  • ^ Start of string
  • [\p{Zs}\t]* Match optional spaces or tabs
  • (?= Positive lookahead
    • <element\b Match element followed by a word boundary
    • [^<>\r]* Optionally repeat matching any char except < > or \r
    • \bdesiredAttr([12])= match desiredAttr and capture either 1 or 2 in group 1
    • "desiredValue\1" match "desiredValue\1" where \1 is a backreference to the captured digit in group 1 (to match the same digit)
    • [^<>\r]* Optionally repeat matching any char except < > or \r
    • > Match literally
  • ) Close the lookahead
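
To check the breakdown above against the samples from the question, here is a minimal sketch in Python. It is only an approximation of what Oniguruma does: Python's `re` module has no `\p{Zs}`, so `[ \t]` stands in for `[\p{Zs}\t]` (assuming indentation uses only plain spaces and tabs), and `re.MULTILINE` is an added assumption so that `^` can anchor at any line start. The sample lists and their names are illustrative, not part of the original answer.

```python
import re

# Assumption: [ \t] replaces [\p{Zs}\t], which Python's `re` does not support.
PATTERN = re.compile(
    r'^[ \t]*'                                # optional leading spaces or tabs
    r'(?=<element\b[^<>\r]*'                  # lookahead: tag name, then anything except < > \r
    r'\bdesiredAttr([12])="desiredValue\1"'   # wanted attribute; \1 repeats the captured digit
    r'[^<>\r]*>)',                            # remainder of the tag up to the closing >
    re.MULTILINE,
)

# Samples copied from the question (hypothetical variable names).
SHOULD_MATCH = [
    '<element\n   attribute="value"\n   desiredAttr1="desiredValue1"\n'
    '   desiredAttr2="desiredValue2"\n>',
    '<element\n   attribute="value"\n   desiredAttr2="desiredValue2"\n>',
    '<element attribute="value" desiredAttr1="desiredValue1">',
    '<element desiredAttr2="desiredValue2" attribute="value">',
]
SHOULD_NOT_MATCH = [
    '<element\n   attribute="value"\n   notDesiredAttr1="desiredValue1"\n'
    '   notDesiredAttr2="desiredValue2"\n>',
]

for sample in SHOULD_MATCH + SHOULD_NOT_MATCH:
    match = PATTERN.search(sample)
    # The match itself is only the (possibly empty) indentation before the tag.
    print(repr(sample[:30]), '->', None if match is None else match.span())
```

Because `[^<>\r]` does not exclude `\n`, the lookahead can cross line breaks when the whole tag is handed to the regex as one string. The backreference also means mixed digits such as desiredAttr1="desiredValue2" are rejected.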

See a regex demo.

The fourth bird
  • This would match the string itself; I need to match the beginning of the string, as highlighted in the code snippet... – ghaschel Sep 28 '22 at 21:32
  • @ghaschel It is not clear for me what you want to match, do you want ``? – The fourth bird Sep 28 '22 at 21:53
  • The beginning of the string itself. I think you'll understand better if you click on Run code snippet at the end of my question. There is a character marked in red. – ghaschel Sep 28 '22 at 21:56
  • @ghaschel Like this including newlines etc..? https://regex101.com/r/gNR6Wu/1 – The fourth bird Sep 28 '22 at 21:56
  • Or like this https://regex101.com/r/hMF5Rw/1 `^[\p{Zs}\t]*(?=<element\b[^<>\r]*\bdesiredAttr([12])="desiredValue\1"[^<>\r]*>)` – The fourth bird Sep 28 '22 at 21:58
  • Just like that, yes. But not using lookahead, as apparently it is not supported. Is there any alternative? – ghaschel Sep 28 '22 at 21:58
  • @ghaschel did you uncheck the button with the 3 lines "Find in selection" – The fourth bird Sep 28 '22 at 22:01
  • @ghaschel There is also an option with 2 capture groups, and if you are replacing for example, you can use those groups in the replacement. See https://regex101.com/r/973PFN/1 and https://imgur.com/CfgnEqb – The fourth bird Sep 28 '22 at 22:09
  • I am using it for developing a TextMate grammar. But I think I might have run into a limitation and I'll have to try to do things in a different way. Thanks a lot for your time either way – ghaschel Sep 28 '22 at 22:26
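
The limitation mentioned in the last comment is most likely that TextMate grammars in VS Code are tokenized one line at a time, so a regex (lookahead or not) only ever sees the current line. The sketch below imitates that in Python by running the answer's pattern once on the whole text and once on each line separately. The helper names are hypothetical, and `[ \t]` replaces `[\p{Zs}\t]` because Python's `re` module does not support `\p{Zs}`.

```python
import re

# Python-compatible stand-in for the answer's pattern ([ \t] instead of [\p{Zs}\t]).
PATTERN = re.compile(
    r'^[ \t]*(?=<element\b[^<>\r]*\bdesiredAttr([12])="desiredValue\1"[^<>\r]*>)',
    re.MULTILINE,
)

MULTI_LINE_TAG = '<element\n   attribute="value"\n   desiredAttr2="desiredValue2"\n>'
SINGLE_LINE_TAG = '<element attribute="value" desiredAttr1="desiredValue1">'


def matches_whole_text(text: str) -> bool:
    """Apply the regex to the full string, as regex101 or Rubular would."""
    return PATTERN.search(text) is not None


def matches_line_by_line(text: str) -> bool:
    """Apply the regex to each line on its own, imitating a line-based tokenizer."""
    return any(PATTERN.search(line) is not None for line in text.splitlines())


print(matches_whole_text(MULTI_LINE_TAG))     # True
print(matches_line_by_line(MULTI_LINE_TAG))   # False: no single line holds the whole tag
print(matches_line_by_line(SINGLE_LINE_TAG))  # True
```

Under that assumption, the multi-line samples cannot be recognised by any single match pattern. A begin/end rule can span lines, but its begin pattern still sees only the first line.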