As part of a security-related project written in Node.js, I'm looking at some of the work done by the team behind PHPIDS, specifically their filter list, which is composed of a large number of regular expressions that match a variety of different attack payloads.
I want to make it clear that I am of course fully aware that this project hasn't been maintained for almost eight years now, but I still definitely see how these filters could play a valuable role in a larger detection system.
With that out of the way, I have been struggling to find a good way to "convert" some of these PCRE-specific expressions to a format that is compatible with the standard JavaScript implementation.
So far I've tried using different tools, such as regex101, pcre-to-regexp and babel-plugin-transform-modern-regexp, but they all choke on the same features: "negative lookbehinds" and "group conditionals".
From what I understand, many features that have been lacking in the JS implementation are on their way, which is great, but there's basically no word on these two specifically (as far as I can find).
My hope is that for someone who actually understands the inner workings of these features, rewriting these could be fairly straightforward, maybe using a combination of significantly less complex expressions and/or some extra processing before/after these are run, to act as a sort of "polyfill".
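To make the "polyfill" idea a bit more concrete, here is a rough sketch of the kind of thing I have in mind. It is only my guess at an approach, not something I know to be correct for the whole filter list. The helper `findWithoutLookbehind`, the two-keyword `keywords` pattern and the toy `(<)?\w+(?(1)>|;)` conditional are all made up for illustration. I'm also assuming the PHPIDS rules are applied case-insensitively, and I've un-doubled the backslashes from the JSON form.

```javascript
// Hypothetical helper (not part of PHPIDS or any library): emulates leading
// negative lookbehinds by running the core pattern without them and then
// rejecting any match whose preceding text ends with a forbidden pattern.
function findWithoutLookbehind(input, core, forbiddenBefore) {
  // A global copy of the pattern lets us resume the search via lastIndex.
  const flags = core.flags.includes('g') ? core.flags : core.flags + 'g';
  const re = new RegExp(core.source, flags);
  let m;
  while ((m = re.exec(input)) !== null) {
    const before = input.slice(0, m.index);
    // Each forbidden pattern is anchored with $, so it only inspects the
    // characters immediately before the candidate match.
    if (!forbiddenBefore.some((f) => f.test(before))) {
      return m;
    }
    // Rejected: retry one character later so no candidate is skipped.
    re.lastIndex = m.index + 1;
  }
  return null;
}

// (?<![a-z]\s)(?<![a-z\/_@\-|]) rewritten as "must not end with" checks.
// Assumption: case-insensitive matching, hence the i flag.
const notAfter = [/[a-z]\s$/i, /[a-z\/_@\-|]$/i];

// Heavily shortened stand-in for the long keyword alternation.
const keywords = /setattribute|appendchild/i;

// Toy conditional: PCRE (<)?\w+(?(1)>|;) expanded into one alternative per
// outcome of the optional group: "group matched" first, then "group absent".
const conditionalExpanded = /(<)\w+>|\w+;/;

console.log(findWithoutLookbehind('foo.setattribute(x)', keywords, notAfter));
console.log(findWithoutLookbehind('mysetattribute(x)', keywords, notAfter));
console.log(conditionalExpanded.exec('<tag;'));
```

In the real pattern the lookbehinds sit after the optional first group, so the "before" text would have to be taken from the end of that group rather than from the start of the match. That is exactly the part I'm unsure how to get right.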
I'm attaching a link to one of these patterns on RegExr, because of its incredibly helpful autogenerated explanation of the pattern and all of its parts, and I'm including the full expression here as well.
RegExr: Pattern with PCRE features
([^*:\\s\\w,.\\\/?+-]\\s*)?(?<![a-z]\\s)(?<![a-z\\\/_@\\-\\|])(\\s*return\\s*)?(?:create(?:element|attribute|textnode)|[a-z]+events?|setattribute|getelement\\w+|appendchild|createrange|createcontextualfragment|removenode|parentnode|decodeuricomponent|\\wettimeout|(?:ms)?setimmediate|option|useragent)(?(1)[^\\w%\"]|(?:\\s*[^@\\s\\w%\",.+\\-]))
It can't be impossible to achieve the same thing as this nearly decade-old expression in JavaScript, can it?