I have written a function that collects a list of hyperlink anchors through web scraping.

I want to push all of these anchors onto an array of objects, which will later be serialized to a JSON string.

Both `Api.GetCourseSubmenuUrl` and `Api.FilterSubmenuContentList` return promises.

However, the following code keeps running without waiting for the array to be filled inside the cheerio `.each()` callback. Why does this happen?

Please note that the `.each()` method in cheerio is synchronous.

My code uses the request, cheerio, and Bluebird (`BPromise`) packages.

Code:

Connection.prototype.FillCourseWithSubmenuContent = function(course){
    var self = this; //This class
    var submenuItems = [];
    return new BPromise(function(resolve, reject){
      return Api.GetCourseSubmenuUrl(ApiConnection.authToken).then(function(response){
        return request.get({url: self.url + response.url + course.id, followRedirect: false, jar: cookiejar}, function(err,httpResponse,body){
          if(err){
            reject(err);
          }
          var cheerio = require('cheerio');
          var dashboardhtml = cheerio.load(body, {
                  normalizeWhitespace: true,
                  decodeEntities: true
              }
          );
          //Find all the links on the page
          dashboardhtml('a').each(function(i, elem) {
              console.log("Object:");
              console.log({"text":dashboardhtml(elem).text(), "url":dashboardhtml(elem).attr('href')});
              submenuItems.push({"text":dashboardhtml(elem).text().trim(), "url":dashboardhtml(elem).attr('href')});
          });
          resolve();
        });
      }).then(function(){
        console.log(submenuItems);
        return Api.FilterSubmenuContentList(ApiConnection.authToken, submenuItems);
      });
    }).catch(function(error){
      return reject(error);
    });
};
Dragon54
  • Avoid the [`Promise` constructor antipattern](http://stackoverflow.com/q/23803743/1048572?What-is-the-promise-construction-antipattern-and-how-to-avoid-it)! – Bergi Mar 29 '17 at 10:32
  • That `reject` function you're calling is not even in scope. – Bergi Mar 29 '17 at 10:43
  • @Bergi I don't really grasp how the antipattern applies to my code? Where does it go wrong? Is it the usage of a promise within a promise? You're talking about the first or second reject? – Dragon54 Mar 29 '17 at 10:46
  • Yes, the usage of promises within the `Promise` callback. And the second `reject`. – Bergi Mar 29 '17 at 16:21

1 Answer

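The problem was fixed by avoiding the Promise constructor antipattern, as pointed out by @Bergi. In the original code the `then` callback returned the result of `request.get`, which is not a promise, so the next `then` ran before the request callback had filled the array.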
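
As a minimal sketch of that timing difference, assuming a hypothetical `fakeGet` in place of `request.get` and native promises in place of Bluebird: a `then` callback that returns a plain value lets the chain continue immediately, while one that returns a promise makes the chain wait.

```javascript
// Hypothetical stand-in for request.get: callback-based, returns a non-promise value.
function fakeGet(callback) {
  setTimeout(function () { callback(null, 'body'); }, 10);
  return {}; // not a thenable, so a promise chain cannot wait on it
}

// Mirrors the question: the then() callback returns a non-promise.
function broken() {
  var items = [];
  return Promise.resolve()
    .then(function () {
      return fakeGet(function (err, body) { items.push(body); });
    })
    .then(function () {
      return items.slice(); // runs before the callback has fired
    });
}

// Mirrors the fix: the then() callback returns a promise that is
// resolved from inside the request callback.
function fixed() {
  var items = [];
  return Promise.resolve()
    .then(function () {
      return new Promise(function (resolve, reject) {
        fakeGet(function (err, body) {
          if (err) { return reject(err); }
          items.push(body);
          resolve();
        });
      });
    })
    .then(function () {
      return items.slice(); // runs after the callback has fired
    });
}
```

Here `broken()` resolves with an empty array, whereas `fixed()` resolves with `['body']`.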

Since the Request library doesn't have promise support, I still had to wrap it inside a (Bluebird) promise.

Please note that it is also possible to promisify libraries, which makes life a lot easier. But for the demonstration of the solution I went the promise wrapping route.
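
For comparison, a rough sketch of the promisify route. It uses Node's built-in `util.promisify` so that it runs without extra packages (Bluebird's own `promisify` would be the closer match to the code here), and `fetchBody` is a hypothetical callback-style function, not part of the Request library.

```javascript
var util = require('util');

// Hypothetical callback-style function standing in for a library call.
// It follows the Node convention: callback(err, result).
function fetchBody(url, callback) {
  setTimeout(function () {
    if (!url) { return callback(new Error('missing url')); }
    callback(null, '<a href="' + url + '">link</a>');
  }, 10);
}

// promisify returns a function that yields a promise instead of taking a callback.
var fetchBodyAsync = util.promisify(fetchBody);

// The promisified function slots straight into a then() chain.
function fetchBodyLength(url) {
  return fetchBodyAsync(url).then(function (body) {
    return body.length;
  });
}
```

Note that `request.get` passes two values to its callback (`httpResponse` and `body`), so a real promisified version would need to account for that.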

The solution:

Connection.prototype.FillCourseWithSubmenuContent = function(course){
    var self = this; //This class
    var submenuItems = [];
    return Api.GetCourseSubmenuUrl(ApiConnection.authToken).then(function(response){
      console.log(self.url + response.url + course.id);
      return new BPromise(function(resolve, reject){
        request.get({url: self.url + response.url + course.id, followRedirect: false, jar: cookiejar}, function(err,httpResponse,body){
          if(err){
            return reject(err); // stop here instead of parsing the body of a failed request
          }
          var cheerio = require('cheerio');
          var dashboardhtml = cheerio.load(body, {
                  normalizeWhitespace: true,
                  decodeEntities: true
              }
          );
          //Find all the links on the page
          dashboardhtml('a').each(function(i, elem) {
              // console.log("Object");
              // console.log({"text":dashboardhtml(elem).text(), "url":dashboardhtml(elem).attr('href')});
              submenuItems.push({"text":dashboardhtml(elem).text().trim(), "url":dashboardhtml(elem).attr('href')});
          });
          return resolve();
        });
      });
    }).then(function(){
      console.log(submenuItems);
      return Api.FilterSubmenuContentList(ApiConnection.authToken, submenuItems);
    });
};
Dragon54
  • +1, this is what I meant. I would have even gone as far as doing `resolve(body)`, and putting the `cheerio` stuff into a `then` handler where exceptions will automatically be caught – Bergi Mar 29 '17 at 16:22
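
The suggestion in the comment above could look roughly like the following sketch. It assumes a hypothetical `fakeGet` in place of `request.get`, native promises in place of Bluebird, and a simple regex-based `extractLinks` as a stand-in for the cheerio parsing, so that the example stays self-contained.

```javascript
// Hypothetical stand-in for request.get with the same callback shape.
function fakeGet(url, callback) {
  setTimeout(function () {
    callback(null, {statusCode: 200}, '<a href="/a"> A </a><a href="/b">B</a>');
  }, 10);
}

// Wrap only the callback API and resolve with the raw body.
function getBody(url) {
  return new Promise(function (resolve, reject) {
    fakeGet(url, function (err, httpResponse, body) {
      if (err) { return reject(err); }
      resolve(body);
    });
  });
}

// Stand-in for the cheerio step (illustration only, not a real HTML parser).
function extractLinks(body) {
  if (typeof body !== 'string') { throw new TypeError('body must be a string'); }
  var links = [];
  var pattern = /<a href="([^"]*)">([^<]*)<\/a>/g;
  var match;
  while ((match = pattern.exec(body)) !== null) {
    links.push({text: match[2].trim(), url: match[1]});
  }
  return links;
}

// Parsing happens in a then() handler, so an exception thrown by
// extractLinks becomes a rejection that a later catch() can handle.
function getLinks(url) {
  return getBody(url).then(extractLinks);
}
```

With this split, the array no longer has to be shared through an outer variable: the parsed links are the resolved value of `getLinks`.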