We are working on an extension to google-code-prettify, which does the syntax coloring for source code on a webpage. We have a very long list of Mathematica keywords (approx. 4000), and while the performance is still very good, I wonder whether I can speed things up.
The regular expression for our keyword list looks like this:
var keywords = 'AbelianGroup|Abort|AbortKernels|AbortProtect|Above|Abs|Absolute|\
AbsoluteCurrentValue|AbsoluteDashing|AbsoluteFileName|AbsoluteOptions|\
AbsolutePointSize|AbsoluteThickness|AbsoluteTime|AbsoluteTiming|AccountingForm';
new RegExp('^(?:' + keywords + ')\\b')
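To make clear how the pattern is meant to behave, here is a minimal sketch with a shortened keyword list. The names `sampleKeywords`, `keywordPattern` and `leadingKeyword` are made up for this illustration and are not part of google-code-prettify.

```javascript
// Shortened, hypothetical keyword list; the real one has about 4000 entries.
var sampleKeywords = 'AbelianGroup|Abort|AbortKernels|AbortProtect|Above|Abs|Absolute';

// Same construction as above: anchored at the start, closed by a word boundary.
var keywordPattern = new RegExp('^(?:' + sampleKeywords + ')\\b');

// Hypothetical helper: returns the keyword at the start of `text`, or null.
function leadingKeyword(text) {
  var match = keywordPattern.exec(text);
  return match ? match[0] : null;
}

console.log(leadingKeyword('Abs[x]'));         // 'Abs'
console.log(leadingKeyword('AbortKernels[]')); // 'AbortKernels'
console.log(leadingKeyword('Absolutely'));     // null
```

In the second call the engine first matches `Abort`, fails on `\b` because `K` follows, and then moves on to `AbortKernels`. With 4000 alternatives sharing long prefixes, I assume this kind of backtracking is where most of the time goes, but I have not measured it.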
Can such an or-ed regex be made faster when it is compiled? Would it make sense to compile it in the first place, since google-code-prettify is JavaScript running on the server? I don't know whether this script is loaded afresh every time a web page is loaded. If it is, compiling it may not be worth the overhead.