In answering a Splunk question on SO, the following sample text was given:
msg: abc.asia - [2021-08-23T00:27:08.152+0000] "GET /facts?factType=COMMERCIAL&sourceSystem=ADMIN&sourceOwner=ABC&filters=%257B%2522stringMatchFilters%2522:%255B%257B%2522key%2522:%2522BFEESCE((json_data-%253E%253E'isNotSearchable')::boolean,%2520false)%2522,%2522value%2522:%2522false%2522,%2522operator%2522:%2522EQ%2522%257D%255D,%2522multiStringMatchFilters%2522:%255B%257B%2522key%2522:%2522json_data-%253E%253E'id'%2522,%2522values%2522:%255B%25224970111%2522%255D%257D%255D,%2522containmentFilters%2522:%255B%255D,%2522nestedMultiStringMatchFilter%2522:%255B%255D,%2522nestedStringMatchFilters%2522:%255B%255D%257D&sorts=%257B%2522sortOrders%2522:%255B%257B%2522key%2522:%2522id%2522,%2522order%2522:%2522DESC%2522%257D%255D%257D&pagination=null
The person wanted to extract everything in the "filters" portion of the URL if "factType" was "COMMERCIAL".
The following all-in-one regex pulls it out neatly (presuming the URL parameters are always in the same order, i.e. factType coming before filters):
factType=(?<facttype>\w+).+filters=(?<filters>[^\&]+)
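For anyone who wants to try the combined pattern outside regex101, here is a minimal Python sketch. It assumes an abbreviated stand-in for the sample event (the `sample` string is shortened for readability, not the full log line), and it spells the named groups as `(?P<name>...)` because that is the form Python's `re` module accepts. It does not reproduce regex101's step counts.

```python
import re

# Abbreviated stand-in for the sample event; the real filters value is much longer.
sample = (
    'msg: abc.asia - [2021-08-23T00:27:08.152+0000] "GET /facts'
    "?factType=COMMERCIAL&sourceSystem=ADMIN&sourceOwner=ABC"
    "&filters=%257B%2522containmentFilters%2522:%255B%255D%257D"
    "&sorts=%257B%257D&pagination=null"
)

# Same pattern as the all-in-one regex, using Python's named-group syntax.
combined = re.compile(r"factType=(?P<facttype>\w+).+filters=(?P<filters>[^&]+)")

match = combined.search(sample)
if match:
    print(match.group("facttype"))
    print(match.group("filters"))
```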
According to regex101, the all-in-one regex finds its expected matches in 670 steps.
But if I break it up into
factType=(?<facttype>\w+)
followed by
filters=(?<filters>[^\&]+)
regex101 reports the matches being found in 26 and 16 steps, respectively.
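The two-pattern version can be sketched in Python as follows. The helper name `extract_commercial_filters` and the shortened sample events are hypothetical, made up for illustration. Note that two independent searches no longer enforce that factType appears before filters, which the all-in-one regex does.

```python
import re
from typing import Optional

# The two smaller patterns, using Python's (?P<name>...) named-group syntax.
FACTTYPE_RE = re.compile(r"factType=(?P<facttype>\w+)")
FILTERS_RE = re.compile(r"filters=(?P<filters>[^&]+)")


def extract_commercial_filters(event: str) -> Optional[str]:
    """Return the raw filters value when factType is COMMERCIAL, else None."""
    facttype = FACTTYPE_RE.search(event)
    if facttype is None or facttype.group("facttype") != "COMMERCIAL":
        return None
    filters = FILTERS_RE.search(event)
    return filters.group("filters") if filters else None


# Hypothetical, shortened events for illustration only.
commercial = "GET /facts?factType=COMMERCIAL&sourceOwner=ABC&filters=%257B%257D&pagination=null"
other = "GET /facts?factType=PRIVATE&sourceOwner=ABC&filters=%257B%257D&pagination=null"

print(extract_commercial_filters(commercial))
print(extract_commercial_filters(other))
```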
What is it about breaking the regex into two that makes matching so much more efficient (~15x fewer steps)?