
I have this regex:

/(((\w+)|(\.\w+)|(\#\w+)|\*)(\[(.+(=".+"|\*".+"|\^".+"|))\])?(::|:)?)+(?=[ \S]*\{)/gm

I am trying to use it to match CSS selectors. Consider this pseudo-code CSS input:

.main {
  property: value;
}

.one, .two a[href$=".com"] {
  .subclass {
    property: value;
  }
}

.test:before, .test:after:active {}

The pattern above will return the following matches:

['.main', '.one', '.two', 'a[href$=".com"]', '.subclass', '.test:before', '.test:after:active']

I am trying to modify the pattern so that pseudo-selectors are not matched. All the other matches should still be valid, but .test:before should match just .test, and .test:after:active should also match just .test. I can't think of a way to do this without either a negative look-behind or a way to not match if the first character is a :.

I'm implementing this in Node, and I don't want to lock my script to Node > 9.2.0 just to use negative look-behinds in my regex.

Any ideas would be greatly appreciated!

Anthony Raymond
topherlicious
  • What's wrong with just using `(\.\w+)`? – CAustin Nov 17 '17 at 18:49
  • @CAustin that only matches class selectors. There are tons of other non-pseudo-selector patterns that must be accounted for. – Patrick Roberts Nov 17 '17 at 18:51
  • Could you provide an updated input example that includes the other things you're trying to match? – CAustin Nov 17 '17 at 18:52
  • @CAustin One CSS pattern is the attribute selector, which I have added to my example. – topherlicious Nov 17 '17 at 19:03
  • Regex is going to be a shaky solution at best for something like this. I would suggest looking into a parser such as JSCSSP http://glazman.org/JSCSSP/ – CAustin Nov 17 '17 at 19:29
  • Thanks for the suggestion! I think I'd still like to accomplish this with RegEx, however. – topherlicious Nov 17 '17 at 20:53
  • Answer: it isn't possible with a single regex. As an aside, before trying to do crazy things, you should first try to simplify your pattern: factor out the common parts, drop the useless groups, and use capture groups only when necessary. Writing something like `(::|:)?` or `(\w+)|(\.\w+)|(\#\w+)` grrrrrr! – Casimir et Hippolyte Nov 19 '17 at 21:00
  • As a suggestion, why not use the regex you have, then remove the pseudo selectors afterwards? To the best of my knowledge CSS only has a limited number of pseudo classes/elements. Again, not using only regex, and not quite as pretty of a one liner, but still, might be easier and not require a whole parser or anything. – zlmonroe Dec 18 '17 at 08:04
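
The last suggestion above (keep the existing regex, then strip the pseudo part afterwards) could be sketched roughly as follows. This is only an illustration, not something posted in the thread: `sampleCss` is the question's example input, and `stripPseudo` is a hypothetical helper that cuts each match at the first colon found outside `[...]`, so a colon inside an attribute value is left alone.

```javascript
// The question's sample input, joined into one string.
const sampleCss = [
  '.main {',
  '  property: value;',
  '}',
  '',
  '.one, .two a[href$=".com"] {',
  '  .subclass {',
  '    property: value;',
  '  }',
  '}',
  '',
  '.test:before, .test:after:active {}',
].join('\n');

// The pattern from the question, unchanged.
const selectorPattern = /(((\w+)|(\.\w+)|(\#\w+)|\*)(\[(.+(=".+"|\*".+"|\^".+"|))\])?(::|:)?)+(?=[ \S]*\{)/gm;

// Hypothetical helper: cut a matched selector at the first colon that is
// not inside an attribute selector's [...] brackets.
function stripPseudo(selector) {
  let depth = 0;
  for (let i = 0; i < selector.length; i++) {
    const ch = selector[i];
    if (ch === '[') depth++;
    else if (ch === ']') depth--;
    else if (ch === ':' && depth === 0) return selector.slice(0, i);
  }
  return selector;
}

const selectors = (sampleCss.match(selectorPattern) || []).map(stripPseudo);
console.log(selectors);
```

On the sample input this should log `.test` twice in place of `.test:before` and `.test:after:active`, with the other matches unchanged. No look-behind is involved, so it does not depend on a recent Node version.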

1 Answer


(?!.*:)(?:[.#]\w+|\w+\[.*\])

You could use something like this?

This uses a negative lookahead, (?!.*:), so that nothing is matched at a position that has a colon later on the same line, together with a simplified version of your selector matcher.
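
To check how this behaves, the pattern can be run against the question's sample input in Node. The snippet below is a minimal sketch; the variable names are illustrative, and the `g` flag is an assumption added so that every match is returned.

```javascript
// The question's sample input, joined into one string.
const css = [
  '.main {',
  '  property: value;',
  '}',
  '',
  '.one, .two a[href$=".com"] {',
  '  .subclass {',
  '    property: value;',
  '  }',
  '}',
  '',
  '.test:before, .test:after:active {}',
].join('\n');

// The pattern from this answer; the g flag is added to collect all matches.
const answerPattern = /(?!.*:)(?:[.#]\w+|\w+\[.*\])/g;

// (?!.*:) fails at every position that still has a colon ahead of it on the
// same line ("." does not cross newlines), so lines with pseudo-selectors
// produce no matches at all.
const answerMatches = css.match(answerPattern) || [];
console.log(answerMatches);
```

One thing to be aware of: on this input the lookahead rejects the whole `.test:before, .test:after:active` line, so `.test` is not returned at all rather than being trimmed out of `.test:before`. If the bare `.test` is needed, the matches would have to be post-processed or the pattern adjusted.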

See it working here

KyleFairns