I am working on a tool that allows users to enter a regular expression for a find&replace, and then my tool will execute that find&replace and return the changed text. However, I recently ran into a situation where the find&replace simply froze, so I decided it would probably be best to somehow detect issues with regular expression matching, and abort after a certain amount of time has passed.

I've checked around, and from this answer I learned that the problem I'm experiencing is called 'catastrophic backtracking'. That's useful to know, because it lets me build a minimal working example of where it goes wrong. It doesn't solve my problem, though: changing the regular expression isn't an option, because I have no control over the user's regex input (and there's no way I can write a regex parser advanced enough to rule out patterns like this).

So in an attempt to solve this, I tried using promises, as suggested in this answer. I've made the 'catastrophic' match string in this example just long enough for the effect to hang my tab for a few seconds, without completely crashing the tab. Different computer specs may see different results though, I'm not sure.

Just one heads-up: Executing this code might freeze your current tab. PLEASE make sure you do not have a partial answer written when executing this code, as it might cause loss of work.

var PTest = function () {
  return new Promise(function (resolve, reject) {
    setTimeout(function () {
      reject();
    }, 100);
    "xxxxxxxxxxxxxxxxxxxxxxxxx".match(/(x+x+)+y/);
    resolve();
  });
};
var myfunc = PTest();
myfunc.then(function () {
  console.log("Promise Resolved");
}).catch(function () {
  console.log("Promise Rejected");
});

On my computer, this causes the tab to freeze for about 4 seconds before showing "Promise Resolved" in the console.

My question now is: is it at all possible to "abort" the execution of a script like this if it takes too long (in the example: over 0.1 seconds)? I'd rather kill the regex find&replace than completely crash the tool, causing loss of work for the user.

Joeytje50
  • I've never implemented these but [async functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function) could be of interest. – MonkeyZeus Feb 11 '21 at 14:15
  • The example they give on that page is effectively the same as my example, with the only exception that the bottom 6 lines are wrapped in an async function. That could indeed have made a difference, but even modifying their example to execute the regex match within their Promise doesn't make the script reject after the specified 100ms. – Joeytje50 Feb 11 '21 at 15:25

1 Answer

I recommend using a Web Worker, since it runs on its own thread, separate from the page's main thread: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API

A Web Worker lives in its own script file, which you load from your main JavaScript like this:

var worker = new Worker('/path/to/run-regex.js');

The following is untested code, but should get you going.

Your run-regex.js does the (potentially long running) regex match:

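function regexMatch(str, regexStr, callback) {
  try {
    // new RegExp throws a SyntaxError if the user's pattern is invalid
    let regex = new RegExp(regexStr);
    let result = str.match(regex);
    callback(result, '');
  } catch (e) {
    callback(null, e.message);
  }
}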

onmessage = function(e) {
  let data = e.data;
  switch (data.cmd) {
    case 'match':
      regexMatch(data.str, data.regex, function(result, err) {
        postMessage({ cmd: data.cmd, result: result, err: err });
      });
      break;
    case 'replace':
      //regexMatch(data.str, data.regex, data.replace, function(result, err) {
      //  postMessage({ cmd: data.cmd, result: result, err: err });
      //});
      break;
    default:
      // report the error before leaving the switch
      postMessage({ cmd: data.cmd, err: 'Unknown command: ' + data.cmd });
      break;
  }
}
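
The 'replace' case in the handler above is left commented out. As a rough sketch of a helper it could call, assuming the message also carries hypothetical `flags` and `replace` fields (neither is part of the code above), something like this might work:

```javascript
// Hypothetical companion to regexMatch for the 'replace' command.
// `flags` (e.g. 'g') and `replacement` are assumed extra fields on the message.
function regexReplace(str, regexStr, flags, replacement, callback) {
  try {
    // new RegExp throws a SyntaxError on an invalid pattern or invalid flags
    const regex = new RegExp(regexStr, flags);
    callback(str.replace(regex, replacement), '');
  } catch (e) {
    callback(null, e.message);
  }
}
```

It follows the same callback(result, err) convention as regexMatch, so the 'replace' case could post its result back in the same shape.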

In your own script, load the Web Worker, and add an event listener:

if(window.Worker) {
  const myWorker = new Worker('/path/to/run-regex.js');

  myWorker.onmessage = function(e) {
    let data = e.data;
    if(data.err) {
      // handle error
    } else {
      // handle match result using data.result;
    }
  }

  function regexMatch(str, regex) {
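    // Send regex.source, not regex.toString(): toString() includes the
    // surrounding slashes, which new RegExp() in the worker would match literally.
    let data = { cmd: 'match', str: str, regex: regex.source };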
    myWorker.postMessage(data);
  }

  regexMatch('xxxxxxxxxxxxxxxxxxxxxxxxx', /(x+x+)+y/);

} else {
  console.log('Your browser does not support web workers.');
}

With this, your main JavaScript thread is not blocked while the worker is working.
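
Because the main thread stays free, a timer set there can still fire while the regex is running, and `Worker.terminate()` stops the worker immediately. The following is a minimal sketch of that idea, not tested in a browser. The helper name `runWithTimeout` and the `createWorker` factory parameter are assumptions made for illustration, and it uses one worker per request to keep things simple:

```javascript
// Hypothetical helper: send one message to a fresh worker and give up after timeoutMs.
// createWorker is an assumed factory, e.g.
//   function () { return new Worker('/path/to/run-regex.js'); }
function runWithTimeout(createWorker, message, timeoutMs) {
  return new Promise(function (resolve, reject) {
    const worker = createWorker();

    // This timer runs on the main thread, so it fires even if the worker is stuck.
    const timer = setTimeout(function () {
      worker.terminate(); // stops the worker, including a regex that is still backtracking
      reject(new Error('Regex timed out after ' + timeoutMs + ' ms'));
    }, timeoutMs);

    worker.onmessage = function (e) {
      clearTimeout(timer);
      worker.terminate(); // one worker per request, so clean up after the reply
      if (e.data.err) {
        reject(new Error(e.data.err));
      } else {
        resolve(e.data.result);
      }
    };

    worker.onerror = function (e) {
      clearTimeout(timer);
      worker.terminate();
      reject(new Error(e.message));
    };

    worker.postMessage(message);
  });
}
```

Called with a message such as { cmd: 'match', str: 'xxxxxxxxxxxxxxxxxxxxxxxxx', regex: '(x+x+)+y' } and a timeout of 100, the returned promise should reject once the timeout passes instead of waiting for the match to finish. A terminated worker cannot be reused, so the next request needs a new one.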

In case of a long running worker, you may add code to either:

Peter Thoeny