11

I have a pattern of JS promise usage that I want to detect in source code by looking for several keywords.

For example, if the code contains a line like:

var deferred = Q.defer();

and the same file also contains the corresponding calls:

deferred.reject(err);
deferred.resolve();
return deferred.promise;

The complete code

EXAMPLE 1

function writeError(errMessage) {
    var deferred = Q.defer();
    fs.writeFile("errors.log", errMessage, function (err) {
        if (err) {
            deferred.reject(err);
        } else {
            deferred.resolve();
        }
    });
    return deferred.promise;
}

Given a large code file (as a string), I want to find out whether the file contains this pattern.

Another example

var d = Q.defer(); /* or $q.defer */

and the file also contains the corresponding calls:

d.resolve(val);
d.reject(err); 
return d.promise;

Complete EXAMPLE 2

function getStuffDone(param) {           
    var d = Q.defer(); /* or $q.defer */ 

    Promise(function(resolve, reject) {
        // or = new $.Deferred() etc.        
        myPromiseFn(param+1)                 
        .then(function(val) { /* or .done */ 
            d.resolve(val);                  
        }).catch(function(err) { /* .fail */ 
            d.reject(err);                   
        });                                  
        return d.promise; /* or promise() */ 

}                  

Are there open-source tools that can do this kind of analysis (you provide a pattern and they find it)?

There are some more complex patterns involving childProcess, but for now this is OK :)

eAbi
  • `console.log('std: ' + data);` is it getting printed? – thefourtheye Jan 20 '16 at 11:24
  • Maybe you should spend a little time explaining what the code is supposed to do and what all the variables are that are referenced but never defined (e.g. `post`, `envO`, `child_process` etc). Your code does not exactly strike me as clean and self-explanatory. – Tomalak Jan 20 '16 at 15:38
  • Well, you resolve the promise on the first `data` event. Any further calls to resolve (like the one on `close`) will be ignored - you can resolve a promise only once. – Tomalak Jan 20 '16 at 17:07
  • Generally speaking, if a certain event does never occur but you base a promise on that event, then this promise will never be resolved. *(Promise or not makes no difference, if a certain event never occurs then no callbacks are invoked, period.)* So what you're saying is that your `child` does not emit events? – Tomalak Jan 20 '16 at 17:40
  • No, if your `child` emits no events whatsoever then there really is no way to overcome this. What would you expect to happen in such a case anyway? – Tomalak Jan 20 '16 at 18:07
  • The child process does not run forever, right? So you will get an event when it ends. That event will either indicate an error or success. And that's when you resolve or reject the promise. Sounds easy enough to me. – Tomalak Jan 20 '16 at 18:34

2 Answers

7

The following regular expression may look a bit scary but has been built from simple concepts and allows a little more leeway than you mentioned - e.g. extra whitespace, different variable names, omission of var etc. Seems to work for both examples - please see if it meets your needs.

([^\s\r\n]+)\s*=\s*(?:Q|\$q)\.defer\s*\(\s*\)\s*;(?:\r?\n|.)*(?:\s|\r?\n)(?:\1\.reject\(\w+\)\s*;(?:\r?\n|.)*(?:\s|\r?\n)\1\.resolve\(\s*\w*\)\s*;|\1\.resolve\(\s*\w*\)\s*;(?:\r?\n|.)*(?:\s|\r?\n)\1\.reject\(\w+\)\s*;)(?:\r?\n|.)*(?:\s|\r?\n)return\s+(?:\1\.)?promise\s*;

Regular expression visualization

Debuggex Demo
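
As a rough usage sketch (not part of the original answer), the expression above can be written as a JavaScript regex literal and applied to a code string with `exec` or `test`. The names `deferredPattern` and `sampleCode` are made up for illustration; the sample is Example 1 from the question.

```javascript
// The regular expression from the answer, written as a JavaScript literal.
// The name deferredPattern is an illustrative choice, not from the original.
var deferredPattern = /([^\s\r\n]+)\s*=\s*(?:Q|\$q)\.defer\s*\(\s*\)\s*;(?:\r?\n|.)*(?:\s|\r?\n)(?:\1\.reject\(\w+\)\s*;(?:\r?\n|.)*(?:\s|\r?\n)\1\.resolve\(\s*\w*\)\s*;|\1\.resolve\(\s*\w*\)\s*;(?:\r?\n|.)*(?:\s|\r?\n)\1\.reject\(\w+\)\s*;)(?:\r?\n|.)*(?:\s|\r?\n)return\s+(?:\1\.)?promise\s*;/;

// Example 1 from the question, as a single string.
var sampleCode = [
  'function writeError(errMessage) {',
  '    var deferred = Q.defer();',
  '    fs.writeFile("errors.log", errMessage, function (err) {',
  '        if (err) {',
  '            deferred.reject(err);',
  '        } else {',
  '            deferred.resolve();',
  '        }',
  '    });',
  '    return deferred.promise;',
  '}'
].join("\n");

var found = deferredPattern.exec(sampleCode);
if (found) {
  // Capture group 1 holds the name of the deferred variable.
  console.log("Pattern found for variable: " + found[1]);
} else {
  console.log("Pattern not found");
}
```

Because the expression relies on greedy "match anything" groups, it may become slow on very large inputs, so it is probably worth timing it on realistic files before relying on it.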

Steve Chambers
  • 37,270
  • 24
  • 156
  • 208
  • Thanks a lot! +1. A few short questions :) 1. Why did you decide to go with the regular expression approach? Is it the best bet for this kind of issue? 2. Can you please elaborate on the diagram? 3. Is there a tool that can help build such a complicated regular expression? –  Feb 17 '16 at 09:14
  • Thanks! 1. Mainly for the challenge rather than it being the "best" way. TBH it may be simpler to search for the components individually, but this would involve storing variables rather than using the back references (`\1`), so I went for a single regex to avoid mixing regex and JavaScript. 2. The diagram was generated by http://debuggex.com to show what it's doing - easier to read than the regex itself as you don't have to keep track of brackets, work out which characters are literals, control characters etc. 3. Have seen other tools but think some practice/experience is really what's needed. – Steve Chambers Feb 17 '16 at 09:26
  • Thanks :) 1. Why do I see the following message? Result: Does not match starting at the black triangle slider 2. Do you know what this kind of diagram is called? –  Feb 17 '16 at 09:31
  • 1. To borrow from [someone else's comment](http://stackoverflow.com/questions/28768307/javascript-using-match-to-find-two-words-and-return-all-text-between-them-in#comment-45817104): *"On Debuggex, the message means it doesn't match from the very beginning of your text (which is correct). The orange/yellow area is the actual match meaning that the regex is correct."* 2. Don't think it's got a name (AFAIK it's not a "standard" diagram, just an invention conjured up by the website - but the "embed" link calls the image "Regular expression visualization" if that helps?) – Steve Chambers Feb 17 '16 at 09:35
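
The alternative mentioned in the comments above, searching for the components individually and storing the variable name instead of relying on the `\1` back reference, might look roughly like the sketch below. The function `findDeferredNames` and the sample string are hypothetical and not taken from the answer. Unlike the single regex, this version does not check the order of the calls or whether they share a function scope.

```javascript
// Hypothetical helper (not from the answer): lists deferred variable names that
// are created via Q.defer() or $q.defer() and are also rejected, resolved, and
// have their promise returned somewhere in the same code string.
function findDeferredNames(code) {
  var creation = /([A-Za-z_$][\w$]*)\s*=\s*(?:Q|\$q)\.defer\s*\(\s*\)/g;
  var names = [];
  var match;

  while ((match = creation.exec(code)) !== null) {
    // Store the variable name instead of using a back reference; "$" is the
    // only character allowed by the creation regex that needs escaping.
    var name = match[1].replace(/\$/g, "\\$&");

    // Require that the name is not the tail of a longer identifier or property.
    var notIdentChar = "(?:^|[^\\w$.])";

    var hasReject = new RegExp(notIdentChar + name + "\\.reject\\s*\\(").test(code);
    var hasResolve = new RegExp(notIdentChar + name + "\\.resolve\\s*\\(").test(code);
    var hasReturn = new RegExp("return\\s+" + name + "\\.promise\\b").test(code);

    if (hasReject && hasResolve && hasReturn && names.indexOf(match[1]) === -1) {
      names.push(match[1]);
    }
  }
  return names;
}

// Illustrative input only; work() is a made-up function name inside the string.
var exampleCode =
  "var d = Q.defer();\n" +
  "work(function (err, val) {\n" +
  "  if (err) { d.reject(err); } else { d.resolve(val); }\n" +
  "});\n" +
  "return d.promise;";

console.log(findDeferredNames(exampleCode)); // [ 'd' ]
```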
3

UPDATE: I made one correction to the code, i.e. changed set[2] to set[set.length - 1] to accommodate query sets of any size. I then applied the exact same algorithm to your two examples.

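The solution I provide follows some rules that I think are reasonable for the type of search you are proposing. Assume you are looking for four lines, A, B, C and D, written here as ABCD. In this notation case is ignored, so ABCD, abcd and aBcD all count as the same set. (The regexes in the code below are created without the `i` flag, so add it if you want the actual matching to be case insensitive.)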

  • Multiple match sets can be found in a single file, i.e. it will find two sets in ABCDabcd.
  • Regexes are used for individual lines, meaning that variations can be included. (One consequence of this is that it won't matter if you have a comment at the end of a matching line in your code.)
  • The patterns sought must always be on different lines, e.g. A and B can't be on the same line.
  • The matched set must be complete, e.g. it will not find ABC or ABD.
  • The matched set must be uninterrupted, i.e. it will not find anything in ABCaD. (Importantly, this also means that it will not find anything in overlapping sets, e.g. ABCaDbcd. You could argue that this is too limiting. However, in this example, which should be found, ABCD or abcd? The answer is arbitrary, and arbitrariness is difficult to code. Moreover, based on the examples you showed, such overlapping would not typically be expected, so this edge case seems unlikely, making this limitation reasonable.)
  • The matched set must be internally non-repeating, e.g. it will not find ABbCD. However, with AaBCD, it will find a set, i.e. it will find aBCD.
  • Embedded sets are allowed, but only the internal one will be found, e.g. with ABabcdCD, only abcd will be found.

The code snippet below shows an example search. It does not demonstrate all of the edge cases. However, it does show the overall functionality.

var queryRegexStrs = [
  "I( really)? (like|adore) strawberry",
  "I( really)? (like|adore) chocolate",
  "I( really)? (like|adore) vanilla"
];

var codeStr =
  "....\n" +
  "Most people would say 'I like vanilla'\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "Amir's taste profile:\n" +
  "....\n" +
  "I like strawberry\n" +
  "....\n" +
  "....\n" +
  "I told Billy that I really adore chocolate a lot\n" +
  "....\n" +
  "I like vanilla most of the time\n" +
  "....\n" +
  "Let me emphasize that I like strawberry\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "Juanita's taste profile:\n" +
  "....\n" +
  "I really adore strawberry\n" +
  "I like vanilla\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "Rachel's taste profile:\n" +
  "I adore strawberry\n" +
  "....\n" +
  "Sometimes I like chocolate, I guess\n" +
  "....\n" +
  "I adore vanilla\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "....\n" +
  "";

// allow for different types of end-of-line characters or character sequences
var endOfLineStr = "\n";

var matchSets = search(queryRegexStrs, codeStr, endOfLineStr);





function search(queryRegexStrs, codeStr, endOfLineStr) {

  // break the large code string into an array of line strings
  var codeLines = codeStr.split(endOfLineStr);

  // remember the number of lines being sought
  var numQueryLines = queryRegexStrs.length;

  // convert the input regex strings into actual regex's in a parallel array
  var queryRegexs = queryRegexStrs.map(function(queryRegexStr) {
    return new RegExp(queryRegexStr);
  });

  // search the array for each query line
  //   to find complete, uninterrupted, non-repeating sets of matches

  // make an array to hold potentially multiple match sets from the same file
  var matchSets = [];

  // prepare to try finding the next match set
  var currMatchSet;

  // keep track of which query line number is currently being sought
  var idxOfCurrQuery = 0;

  // whenever looking for a match set is (re-)initialized,
  //   start looking again for the first query,
  //   and forget any previous individual query matches that have been found
  var resetCurrQuery = function() {
    idxOfCurrQuery = 0;
    currMatchSet = [];
  };

  // check each line of code...
  codeLines.forEach(function(codeLine, codeLineNum, codeLines) {

    // ...against each query line
    queryRegexs.forEach(function(regex, regexNum, regexs) {

      // check if this line of code is a match with this query line
      var matchFound = regex.test(codeLine);

      // if so, remember which query line it matched
      if (matchFound) {

        // if this code line matches the first query line,
        //   then reset the current query and continue
        if (regexNum === 0) {
          resetCurrQuery();
        }

        // if this most recent individual match is the one expected next, proceed
        if (regexNum === idxOfCurrQuery) {

          // temporarily remember the line number of this most recent individual match
          currMatchSet.push(codeLineNum);

          // prepare to find the next query in the sequence
          idxOfCurrQuery += 1;

          // if a whole query set has just been found, then permanently remember
          //   the corresponding code line numbers, and reset the search
          if (idxOfCurrQuery === numQueryLines) {
            matchSets.push(currMatchSet);
            resetCurrQuery();
          }

          // if this most recent match is NOT the one expected next in the sequence,
          //   then start over in terms of starting to look again for the first query
        } else {
          resetCurrQuery();
        }
      }
    });
  });

  return matchSets;

}




// report the results
document.write("<b>The code lines being sought:</b>");
document.write("<pre>" + JSON.stringify(queryRegexStrs, null, 2) + "</pre>");
document.write("<b>The code being searched:</b>");
document.write(
  "<pre><ol start='0'><li>" +
  codeStr.replace(new RegExp("\n", "g"), "</li><li>") +
  "</li></ol></pre>"
);
document.write("<b>The code line numbers of query 'hits', grouped by query set:</b>");
document.write("<pre>" + JSON.stringify(matchSets) + "</pre>");
document.write("<b>One possible formatted output:</b>");

var str = "<p>(Note that line numbers are 0-based...easily changed to 1-based if desired)</p>";
str += "<pre>";
matchSets.forEach(function(set, setNum, arr) {
  str += "Matching code block #" + (setNum + 1) + ": lines " + set[0] + "-" + set[set.length - 1] + "<br />";
});
str += "</pre>";
document.write(str);
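
To check the rules listed earlier against the algorithm outside a browser, here is a condensed restatement of the `search` logic, run on inputs where each character stands for one line. The names `searchLines`, `letterQueries` and `run` are made up for this sketch, and the character classes such as `[Aa]` are what make upper and lower case equivalent here.

```javascript
// Condensed restatement of the search() logic above (same steps, fewer comments).
function searchLines(queryRegexStrs, codeStr, endOfLineStr) {
  var regexes = queryRegexStrs.map(function (s) { return new RegExp(s); });
  var matchSets = [];
  var current = [];
  var nextIdx = 0;
  function reset() { nextIdx = 0; current = []; }

  codeStr.split(endOfLineStr).forEach(function (line, lineNum) {
    regexes.forEach(function (regex, regexNum) {
      if (!regex.test(line)) { return; }
      // A match of the first query always restarts the current set.
      if (regexNum === 0) { reset(); }
      if (regexNum === nextIdx) {
        current.push(lineNum);
        nextIdx += 1;
        if (nextIdx === regexes.length) {
          matchSets.push(current);
          reset();
        }
      } else {
        // An out-of-sequence match discards the partial set.
        reset();
      }
    });
  });
  return matchSets;
}

// Each query matches a line consisting of one letter, in either case.
var letterQueries = ["^[Aa]$", "^[Bb]$", "^[Cc]$", "^[Dd]$"];

// Turns "ABCD" into four lines and returns the match sets as JSON.
function run(letters) {
  return JSON.stringify(searchLines(letterQueries, letters.split("").join("\n"), "\n"));
}

console.log(run("ABCDabcd")); // [[0,1,2,3],[4,5,6,7]]  two sets in one file
console.log(run("ABD"));      // []                     incomplete set
console.log(run("ABCaD"));    // []                     interrupted set
console.log(run("ABbCD"));    // []                     repeated line inside a set
console.log(run("AaBCD"));    // [[1,2,3,4]]            restarts at the second "a"
console.log(run("ABabcdCD")); // [[2,3,4,5]]            only the embedded set
```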

Here is the exact same algorithm, just using your original example 1. Note a couple of things. First, anything that needs escaping in the regex strings actually needs double-escaping, e.g. to find a literal opening parenthesis you need to include "\\(", not just "\(". Second, the regexes may seem a little complex. A lot of that is just matching the literal periods and parentheses. More importantly, the ability to use complex regexes is part of the power (read "flexibility") of this entire approach. For instance, the examples you provided required some alternation, where "a|b" means "find a OR b".

var queryRegexStrs = [
  "var deferred = Q\\.defer\\(\\);",
  "deferred\\.reject\\(err\\);",
  "deferred\\.resolve\\(\\);",
  "return deferred\\.promise;"
];

var codeStr =
  'function writeError(errMessage) {'                           + "\n" +
  '    var deferred = Q.defer();'                               + "\n" +
  '    fs.writeFile("errors.log", errMessage, function (err) {' + "\n" +
  '        if (err) {'                                          + "\n" +
  '            deferred.reject(err);'                           + "\n" +
  '        } else {'                                            + "\n" +
  '            deferred.resolve();'                             + "\n" +
  '        }'                                                   + "\n" +
  '    });'                                                     + "\n" +
  '    return deferred.promise;'                                + "\n" +
  '}'                                                           + "\n" +
  '';

// allow for different types of end-of-line characters or character sequences
var endOfLineStr = "\n";

var matchSets = search(queryRegexStrs, codeStr, endOfLineStr);





function search(queryRegexStrs, codeStr, endOfLineStr) {

  // break the large code string into an array of line strings
  var codeLines = codeStr.split(endOfLineStr);

  // remember the number of lines being sought
  var numQueryLines = queryRegexStrs.length;

  // convert the input regex strings into actual regex's in a parallel array
  var queryRegexs = queryRegexStrs.map(function(queryRegexStr) {
    return new RegExp(queryRegexStr);
  });

  // search the array for each query line
  //   to find complete, uninterrupted, non-repeating sets of matches

  // make an array to hold potentially multiple match sets from the same file
  var matchSets = [];

  // prepare to try finding the next match set
  var currMatchSet;

  // keep track of which query line number is currently being sought
  var idxOfCurrQuery = 0;

  // whenever looking for a match set is (re-)initialized,
  //   start looking again for the first query,
  //   and forget any previous individual query matches that have been found
  var resetCurrQuery = function() {
    idxOfCurrQuery = 0;
    currMatchSet = [];
  };

  // check each line of code...
  codeLines.forEach(function(codeLine, codeLineNum, codeLines) {

    // ...against each query line
    queryRegexs.forEach(function(regex, regexNum, regexs) {

      // check if this line of code is a match with this query line
      var matchFound = regex.test(codeLine);

      // if so, remember which query line it matched
      if (matchFound) {

        // if this code line matches the first query line,
        //   then reset the current query and continue
        if (regexNum === 0) {
          resetCurrQuery();
        }

        // if this most recent individual match is the one expected next, proceed
        if (regexNum === idxOfCurrQuery) {

          // temporarily remember the line number of this most recent individual match
          currMatchSet.push(codeLineNum);

          // prepare to find the next query in the sequence
          idxOfCurrQuery += 1;

          // if a whole query set has just been found, then permanently remember
          //   the corresponding code line numbers, and reset the search
          if (idxOfCurrQuery === numQueryLines) {
            matchSets.push(currMatchSet);
            resetCurrQuery();
          }

          // if this most recent match is NOT the one expected next in the sequence,
          //   then start over in terms of starting to look again for the first query
        } else {
          resetCurrQuery();
        }
      }
    });
  });

  return matchSets;

}




// report the results
document.write("<b>The code lines being sought:</b>");
document.write("<pre>" + JSON.stringify(queryRegexStrs, null, 2) + "</pre>");
document.write("<b>The code being searched:</b>");
document.write(
  "<pre><ol start='0'><li>" +
  codeStr.replace(new RegExp("\n", "g"), "</li><li>") +
  "</li></ol></pre>"
);
document.write("<b>The code line numbers of query 'hits', grouped by query set:</b>");
document.write("<pre>" + JSON.stringify(matchSets) + "</pre>");
document.write("<b>One possible formatted output:</b>");

var str = "<p>(Note that line numbers are 0-based...easily changed to 1-based if desired)</p>";
str += "<pre>";
matchSets.forEach(function(set, setNum, arr) {
  str += "Matching code block #" + (setNum + 1) + ": lines " + set[0] + "-" + set[set.length - 1] + "<br />";
});
str += "</pre>";
document.write(str);
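
The double-escaping requirement mentioned above can be seen by comparing a regex built from a string with the equivalent regex literal. This is a small illustrative sketch; the variable names are invented for it.

```javascript
// Built from a string: every regex backslash must itself be escaped.
var fromString = new RegExp("deferred\\.reject\\(err\\);");

// The same pattern as a regex literal needs only single backslashes.
var fromLiteral = /deferred\.reject\(err\);/;

console.log(fromString.source === fromLiteral.source);      // true
console.log(fromString.test("    deferred.reject(err);"));  // true

// With single backslashes in a string, "\(" collapses to "(" before the
// RegExp constructor sees it, so the parentheses become a capturing group
// and "." matches any character. No error is thrown; the pattern just
// means something different.
var singleEscaped = new RegExp("deferred\.reject\(err\);");

console.log(singleEscaped.test("deferred.reject(err);"));   // false
console.log(singleEscaped.test("deferred.rejecterr;"));     // true
```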

Here is the exact same algorithm, just using your original example 2:

var queryRegexStrs = [
  "var d = (Q\\.defer\\(\\)|\\$q\\.defer);",
  "d\\.resolve\\(val\\);",
  "d\\.reject\\(err\\);",
  "return d\\.promise(\\(\\))?;"
];

var codeStr =
  "...."                                         + "\n" +
  "...."                                         + "\n" +
  "...."                                         + "\n" +
  "function getStuffDone(param) {"               + "\n" +
  "    var d = Q.defer();"                       + "\n" +
  ""                                             + "\n" +
  "    Promise(function(resolve, reject) {"      + "\n" +
  "        // or = new $.Deferred() etc."        + "\n" +
  "        myPromiseFn(param+1)"                 + "\n" +
  "        .then(function(val) { /* or .done */" + "\n" +
  "            d.resolve(val);"                  + "\n" +
  "        }).catch(function(err) { /* .fail */" + "\n" +
  "            d.reject(err);"                   + "\n" +
  "        });"                                  + "\n" +
  "        return d.promise;"                    + "\n" +
  ""                                             + "\n" +
  "}"                                            + "\n" +
  "...."                                         + "\n" +
  "...."                                         + "\n" +
  "...."                                         + "\n" +
  "function getStuffDone(param) {"               + "\n" +
  "    var d = $q.defer;"                        + "\n" +
  ""                                             + "\n" +
  "    Promise(function(resolve, reject) {"      + "\n" +
  "        // or = new $.Deferred() etc."        + "\n" +
  "        myPromiseFn(param+1)"                 + "\n" +
  "        .then(function(val) { /* or .done */" + "\n" +
  "            d.resolve(val);"                  + "\n" +
  "        }).catch(function(err) { /* .fail */" + "\n" +
  "            d.reject(err);"                   + "\n" +
  "        });"                                  + "\n" +
  "        return d.promise();"                  + "\n" +
  ""                                             + "\n" +
  "}"                                            + "\n" +
  "...."                                         + "\n" +
  "...."                                         + "\n" +
  "...."                                         + "\n" +
  "";

// allow for different types of end-of-line characters or character sequences
var endOfLineStr = "\n";

var matchSets = search(queryRegexStrs, codeStr, endOfLineStr);





function search(queryRegexStrs, codeStr, endOfLineStr) {

  // break the large code string into an array of line strings
  var codeLines = codeStr.split(endOfLineStr);

  // remember the number of lines being sought
  var numQueryLines = queryRegexStrs.length;

  // convert the input regex strings into actual regex's in a parallel array
  var queryRegexs = queryRegexStrs.map(function(queryRegexStr) {
    return new RegExp(queryRegexStr);
  });

  // search the array for each query line
  //   to find complete, uninterrupted, non-repeating sets of matches

  // make an array to hold potentially multiple match sets from the same file
  var matchSets = [];

  // prepare to try finding the next match set
  var currMatchSet;

  // keep track of which query line number is currently being sought
  var idxOfCurrQuery = 0;

  // whenever looking for a match set is (re-)initialized,
  //   start looking again for the first query,
  //   and forget any previous individual query matches that have been found
  var resetCurrQuery = function() {
    idxOfCurrQuery = 0;
    currMatchSet = [];
  };

  // check each line of code...
  codeLines.forEach(function(codeLine, codeLineNum, codeLines) {

    // ...against each query line
    queryRegexs.forEach(function(regex, regexNum, regexs) {

      // check if this line of code is a match with this query line
      var matchFound = regex.test(codeLine);

      // if so, remember which query line it matched
      if (matchFound) {

        // if this code line matches the first query line,
        //   then reset the current query and continue
        if (regexNum === 0) {
          resetCurrQuery();
        }

        // if this most recent individual match is the one expected next, proceed
        if (regexNum === idxOfCurrQuery) {

          // temporarily remember the line number of this most recent individual match
          currMatchSet.push(codeLineNum);

          // prepare to find the next query in the sequence
          idxOfCurrQuery += 1;

          // if a whole query set has just been found, then permanently remember
          //   the corresponding code line numbers, and reset the search
          if (idxOfCurrQuery === numQueryLines) {
            matchSets.push(currMatchSet);
            resetCurrQuery();
          }

          // if this most recent match is NOT the one expected next in the sequence,
          //   then start over in terms of starting to look again for the first query
        } else {
          resetCurrQuery();
        }
      }
    });
  });

  return matchSets;

}




// report the results
document.write("<b>The code lines being sought:</b>");
document.write("<pre>" + JSON.stringify(queryRegexStrs, null, 2) + "</pre>");
document.write("<b>The code being searched:</b>");
document.write(
  "<pre><ol start='0'><li>" +
  codeStr.replace(new RegExp("\n", "g"), "</li><li>") +
  "</li></ol></pre>"
);
document.write("<b>The code line numbers of query 'hits', grouped by query set:</b>");
document.write("<pre>" + JSON.stringify(matchSets) + "</pre>");
document.write("<b>One possible formatted output:</b>");

var str = "<p>(Note that line numbers are 0-based...easily changed to 1-based if desired)</p>";
str += "<pre>";
matchSets.forEach(function(set, setNum, arr) {
  str += "Matching code block #" + (setNum + 1) + ": lines " + set[0] + "-" + set[set.length - 1] + "<br />";
});
str += "</pre>";
document.write(str);
Andrew Willems
  • Thanks Andrew, looks interesting, voted up! Can you please provide this example in the context of the question, I mean using the promise code as the example? Thanks in advance! –  Feb 21 '16 at 11:33
  • Thanks Andrew. Can the + "\n" that you add to each line of codeStr somehow be removed? –  Feb 23 '16 at 10:39
  • How can I run it in jsFiddle? –  Feb 23 '16 at 10:43
  • (1) The `"\n"` is already being removed or, more accurately, is already being ignored, i.e. does not need to be part of the query regex strings (although you can include `"$"` as an end-of-string matcher in the regex if you need to). I actually include `var endOfLineStr = "\n";` so that you can change which end-of-line character is used in your particular context if you need to. I set off the `+ "\n"` in the code simply for visual clarity. (2) Here is a [jsfiddle](https://jsfiddle.net/andr3ww/fnack40p/) created by literally just copying the above code. – Andrew Willems Feb 23 '16 at 13:14
  • Note: Using regexes on code can be miserable, because there are a lot of special characters that require double-escaping. To avoid this, you can use a regex hack. e.g. To find "d.reject(err);" you need "d\\.reject\\(err\\);" which is ugly. Instead, replace all special characters with "." which represents any (non-newline) character. In this example, use "d.reject.err.;". This _will_ find other variants, but if you're sure, for example, that your code does not include "d&rejectQerr~" etc., then this hack might make your regexes easier both to write and to read. – Andrew Willems Feb 24 '16 at 23:38
  • Thank you very much Andrew, you are super smart! You gave me points I hadn't thought of before... Two last questions before I mark it as solved: 1. Can I use this code as a JS module, giving it just the pattern (queryRegexStrs) and the code (codeStr), so that it returns all the output you showed? 2. Can I reuse it for other JS patterns? Thank you very much, sir!!! –  Mar 01 '16 at 11:48
  • The code is re-usable & is at least self-contained in the sense that it is written as a single function that requires clear input & provides clear output. Run it as follows: `var myMatchSets = search(myQueryRegexStrs, myCodeStr, myEndOfLineStr);`. You would need to format the input parameters as shown in the examples. The output would be an array of arrays as shown in the example, and it would follow this pattern: `[ [, , ...], [, , ...], ...]`. How you format that is up to you, again as shown in the example. – Andrew Willems Mar 01 '16 at 15:03
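
As an illustration of the "." hack from the comments above, and of one possible alternative, the sketch below compares the hack with a small escaping helper. The helper `escapeForRegex` is hypothetical and not part of the answer; it simply escapes regex metacharacters so that a literal line of code can be used as a query string.

```javascript
// The "." hack: easy to write, but it also matches unrelated text.
var dotHack = new RegExp("d.reject.err.;");
console.log(dotHack.test("d.reject(err);"));  // true
console.log(dotHack.test("d&rejectQerr~;"));  // true (a false positive)

// Hypothetical helper (not from the answer): escapes regex metacharacters.
function escapeForRegex(literalText) {
  return literalText.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var escaped = escapeForRegex("d.reject(err);");
console.log(escaped);                                     // d\.reject\(err\);
console.log(new RegExp(escaped).test("d.reject(err);"));  // true
console.log(new RegExp(escaped).test("d&rejectQerr~;"));  // false
```

The escaped string could be passed as one entry of `queryRegexStrs`, at the cost of losing variations such as alternation for that line.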