
I'm trying to scrape a page that uses infinite scroll, using PhantomJS, CasperJS, and SpookyJS. The script is supposed to keep clicking the "more" button and collecting the new links from the results until it is stopped manually. However, it uses more and more memory until it crashes. I wrote the following script; is there a way to optimise it so it doesn't use as much memory?

function pressMore(previousLinksLength) {
  this.click('#projects > div.container-flex.px2 > div > a');
  this.wait(1000, function() {
    links = this.evaluate(function() {
      var projectPreview = document.querySelectorAll('.project-thumbnail a');
      return Array.prototype.map.call(projectPreview, function(e) {
        return e.getAttribute('href');
      });
    });
    this.emit('sendScrapedLinks', links.slice(previousLinksLength));
    // repeat the scrape function
    pressMore.call(this, links.length);
  });
}
// SpookyJS starts here
spooky.start(scrapingUrl);

// press the "more" button
spooky.then(pressMore);

spooky.run();
Bunker

1 Answer


I've also run into this problem on infinite-scrolling sites. I could never find a way around the memory leaks.

In short, what I ended up doing was using scrollTo. Essentially, I would run the app for a while, log the last scrolled-to position, and then restart the app using the logged value to keep memory from getting too high. It's a pain because on many sites you have to scroll sequentially through a series of positions to load more and more content, and finding those positions so you can divide up your last scrolled-to position can be challenging. A rough sketch of the restart approach is below.
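A minimal CasperJS-only sketch of that idea (the question uses SpookyJS, but the mechanism is the same). The URL, selector, scroll step, iteration count, and the state.json file name are all placeholders, not values from the question:

// restart-scroll.js -- sketch: scroll for a while, log the position, exit, let a wrapper restart us.
// scrapingUrl, the selector, STEP, MAX_STEPS_PER_RUN and state.json are placeholders.
var casper = require('casper').create();
var fs = require('fs');                      // PhantomJS fs module

var scrapingUrl = 'http://example.com/projects';
var STATE_FILE = 'state.json';
var STEP = 2000;                             // pixels scrolled per step
var MAX_STEPS_PER_RUN = 50;                  // exit after this many scrolls

// resume from the position logged by the previous run, if any
var lastY = 0;
if (fs.exists(STATE_FILE)) {
  lastY = JSON.parse(fs.read(STATE_FILE)).lastY || 0;
}

casper.start(scrapingUrl);

casper.then(function() {
  this.scrollTo(0, lastY);                   // jump back to where the previous run stopped
});

casper.repeat(MAX_STEPS_PER_RUN, function() {
  this.wait(1000, function() {
    lastY += STEP;
    this.scrollTo(0, lastY);                 // trigger the next chunk of infinite scroll
    var links = this.evaluate(function() {   // same link extraction as in the question
      var projectPreview = document.querySelectorAll('.project-thumbnail a');
      return Array.prototype.map.call(projectPreview, function(e) {
        return e.getAttribute('href');
      });
    });
    this.echo(links.join('\n'));             // deduplication is left out of this sketch
  });
});

casper.then(function() {
  // log the last scrolled-to position so the next run can resume from it
  fs.write(STATE_FILE, JSON.stringify({ lastY: lastY }), 'w');
});

casper.run(function() {
  this.exit();
});

You would then re-run this script from a shell loop or a supervisor; each run reads lastY from state.json, so no single PhantomJS process lives long enough to exhaust memory.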

Chris Hawkes
  • How does this help? Just because you know the last scroll position before a crash doesn't mean that you get further on a second attempt. – Artjom B. Sep 09 '14 at 16:28