
I have a straightforward aggregator/minimizer/cacher I've written in node.js. It works quite well now.

I am, however, wondering if there is any way to improve my minimizing regex calls. Some comments are not stripped from the CSS entirely, and I notice a few other hiccups here and there.

Also, considering my abilities with regex, I might be able to do the same in half the calls. :)

Any suggestions will be greatly appreciated.

Thanks.

function minimizeData( _content ) {
    var content = _content;
    content = content.replace( /(\/\*.*\*\/)|(\n|\r)+|\t*/g, '' );
    content = content.replace( /\s{2,}/g, ' ' );
    content = content.replace( /(\s)*:(\s)*/g, ':' );
    content = content.replace( /(\s)+\./g, ' .' );
    content = content.replace( /(\s|\n|\r)*\{(\s|\n|\r)*/g, '{' );
    content = content.replace( /(\s|\n|\r)*\}(\s|\n|\r)*/g, '}' );
    content = content.replace( /;(\s)+/g, ';' );
    content = content.replace( /,(\s)+/g, ',' );
    content = content.replace( /(\s)+!/g, '!' );
    return content;
}
Spot

1 Answer

function minimizeData( _content ) {
    var content = _content;
    content = content.replace( /\/\*(?:(?!\*\/)[\s\S])*\*\/|[\r\n\t]+/g, '' );
    // now all comments, newlines and tabs have been removed
    content = content.replace( / {2,}/g, ' ' );
    // now there are no more than single adjacent spaces left
    // now unnecessary: content = content.replace( /(\s)+\./g, ' .' );
    content = content.replace( / ([{:}]) /g, '$1' );
    content = content.replace( /([;,]) /g, '$1' );
    content = content.replace( / !/g, '!' );
    return content;
}

This should be a bit clearer and avoids repetition. After the first replace, spaces are the only whitespace left; after the second replace, no run of spaces is longer than one. This lets the remaining replaces match a single literal space instead of `\s+`.
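To see how each replace builds on the previous one, here is a minimal sketch that applies the same five patterns one at a time and keeps the intermediate strings. The names `stages`, `trace` and `input` are made up for this illustration and are not part of the function above; the sample CSS is also just an assumption chosen to exercise every stage.

```javascript
// The five replacements from minimizeData, applied one at a time so each
// intermediate result can be inspected. `stages` and `trace` are
// illustrative names only.
const stages = [
    [/\/\*(?:(?!\*\/)[\s\S])*\*\/|[\r\n\t]+/g, ''], // comments, newlines, tabs
    [/ {2,}/g, ' '],                                 // runs of spaces -> one space
    [/ ([{:}]) /g, '$1'],                            // a space on BOTH sides of { : }
    [/([;,]) /g, '$1'],                              // a space after ; or ,
    [/ !/g, '!'],                                    // a space before !
];

function trace(css) {
    const results = [];
    let current = css;
    for (const [pattern, replacement] of stages) {
        current = current.replace(pattern, replacement);
        results.push(current);
    }
    return results;
}

// Hypothetical sample input, indented with spaces.
const input =
    'a {\n    color : red; /* note */\n    font-family : Arial, sans-serif !important;\n}\n';

trace(input).forEach((s, i) => console.log(i + 1, JSON.stringify(s)));
// 1 "a {    color : red;     font-family : Arial, sans-serif !important;}"
// 2 "a { color : red; font-family : Arial, sans-serif !important;}"
// 3 "a{color:red; font-family:Arial, sans-serif !important;}"
// 4 "a{color:red;font-family:Arial,sans-serif !important;}"
// 5 "a{color:red;font-family:Arial,sans-serif!important;}"
```

Note that stage 3 only fires when `{`, `:` or `}` has a space on both sides, so input such as `color: red` keeps its space after the colon.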

To explain the comment-removing regex (shown here as a pure verbose regex without delimiters):

/\*       # Match /*
(?:       # Match (any number of times)...
 (?!\*/)  # ... as long as we're not right before a */:
 [\s\S]   # any character (whitespace or non-whitespace).
)*        # (End of repeated non-capturing group)
\*/       # Match */
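As a quick check of the breakdown above, the following sketch compares this pattern with the comment-matching part of the pattern from the question (`/\*.*\*/`). The variable names and sample strings are assumptions made for the illustration.

```javascript
// Pattern from the answer, written with JavaScript delimiters and the g flag.
const tempered = /\/\*(?:(?!\*\/)[\s\S])*\*\//g;
// Comment-matching part of the pattern from the question, for comparison.
const greedy = /\/\*.*\*\//g;

// Hypothetical sample inputs.
const oneLine = 'a{}/* x */b{}/* y */c{}';
const multiLine = 'a{}/* x\n   y */b{}';

console.log(oneLine.replace(tempered, ''));   // a{}b{}c{}
console.log(oneLine.replace(greedy, ''));     // a{}c{}  (b{} is swallowed by the greedy .*)
console.log(multiLine.replace(tempered, '')); // a{}b{}
console.log(multiLine.replace(greedy, '') === multiLine); // true ('.' does not match a newline)
```

The lookahead stops the match at the first `*/`, and `[\s\S]` lets it cross line breaks, which are the two cases where the greedy `.*` version goes wrong.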
Tim Pietzcker
  • Thank you, this has improved it quite a bit. However, some comments are still not getting parsed. A paste of one is /* Interaction Cues----------------------------------*/ Thoughts? – Spot Dec 09 '10 at 21:42
  • position:absolute;opacity:0;filter:Alpha(Opacity=0);}/* Interaction Cues----------------------------------*/.ui-state-disabled{cursor:default!important;} – Spot Dec 09 '10 at 21:43
  • It's probably the `|[\r\n\t]*` in the first regex. It matches the nothing between the `}` and `/`, then the regex engine bumps ahead one place as it's supposed to do after a zero-length match, so it's trying to match a comment starting *after* the slash. That `|[\r\n\t]*` shouldn't be there anyway; I'd get rid of it and change the second regex to `/\s+/g`. – Alan Moore Dec 09 '10 at 22:28
  • @Alan Moore: I'm changing it to `[\r\n\t]+` - the `*` was really stupid, but replacing all `\s+` with nothing or a single space will change the outcome of the operation. – Tim Pietzcker Dec 10 '10 at 07:00
  • What about css selectors like `div[style="width: 6px; height: 10px; border: 0;"]` ? – 350D Jan 12 '16 at 02:09
  • @350D I added this line `content = content.replace( /([{:}]) /g, '$1' );` after `content = content.replace( / ([{:}]) /g, '$1' );` – AEQ Oct 12 '17 at 15:14
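
The attribute-selector concern raised in the comments can be checked directly. The sketch below copies `minimizeData` from the answer unchanged and feeds it a selector with a quoted attribute value; the sample rule and the variable names `rule` and `minified` are assumptions for the illustration. Because the regexes know nothing about quoted strings, the spaces after `;` inside the quotes are removed as well, so the minified selector may no longer match an element whose `style` attribute has the original text.

```javascript
// minimizeData as given in the answer, reproduced so this example is self-contained.
function minimizeData( _content ) {
    var content = _content;
    content = content.replace( /\/\*(?:(?!\*\/)[\s\S])*\*\/|[\r\n\t]+/g, '' );
    content = content.replace( / {2,}/g, ' ' );
    content = content.replace( / ([{:}]) /g, '$1' );
    content = content.replace( /([;,]) /g, '$1' );
    content = content.replace( / !/g, '!' );
    return content;
}

// Hypothetical rule using the selector from the comment above.
var rule = 'div[style="width: 6px; height: 10px; border: 0;"]{color:red}';
var minified = minimizeData( rule );

console.log( minified );
// div[style="width: 6px;height: 10px;border: 0;"]{color:red}
console.log( minified === rule ); // false: the quoted attribute value was altered
```

Handling this properly would need the minimizer to skip over quoted strings, which is beyond what a chain of independent replaces can express easily.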