I've been working with a C# regular expression that is used heavily as part of a custom templating system in a web application. The expression is complex, and I have seen real performance gains from using the RegexOptions.Compiled option. However, the initial cost of compilation is irritating during development, especially during iterative unit testing (this general tradeoff is mentioned here).
One solution I'm currently trying is lazy regex compilation. The idea is to get the best of both worlds: start with an interpreted (non-compiled) Regex that is available immediately, build the compiled version on a background thread, and swap it in once it is ready.
My question is: is there any reason why this might be a bad idea, performance-wise or otherwise? I ask because I'm not sure whether moving the cost of things like JIT compilation and assembly loading onto another thread really works (although it appears to, judging from my benchmarks). Here's the code:
using System;
using System.Text.RegularExpressions;
using System.Threading;

public class LazyCompiledRegex
{
    // starts out as the interpreted regex; replaced by the compiled one when it is ready
    private volatile Regex _regex;

    public LazyCompiledRegex(string pattern, RegexOptions options)
    {
        if (options.HasFlag(RegexOptions.Compiled)) { throw new ArgumentException("Compiled should not be specified!"); }
        this._regex = new Regex(pattern, options);
        ThreadPool.QueueUserWorkItem(_ =>
        {
            var compiled = new Regex(pattern, options | RegexOptions.Compiled);
            // obviously, the count will never be null. However the point here is just to force an evaluation
            // of the compiled regex so that the cost of loading and jitting the assembly is incurred here rather
            // than on the thread doing real work
            if (Equals(null, compiled.Matches("some random string").Count)) { throw new Exception("Should never get here"); }
            Interlocked.Exchange(ref this._regex, compiled);
        });
    }

    public Regex Value { get { return this._regex; } }
}