I'm trying to handle infinite scrolling on Twitter, but the page doesn't seem to load its dynamic content even though I'm scrolling to the bottom.

I'm doing a quick test to see if content even loads with the following code:

casper.open('https://twitter.com/<account>', function() {
  this.evaluate(function() {
    window.scrollTo(0, document.body.scrollHeight);
  });
  this.capture('twitter-screenshot.png');
});

The screenshot looks as if the entire page has been rendered rather than just the viewport, even though I've set casper.options.viewportSize = { width: 1400, height: 600 };

I've checked the number of followers in an actual browser, and the number in CasperJS is exactly the same as the initial number loaded before you scroll to the bottom. So either:

  1. It's not triggering the dynamic load on scroll because the entire viewport is loaded, so there is no scroll.
  2. The scroll is not triggering properly.
  3. Something I'm completely missing

I've read that the screenshot feature captures the entire DOM rather than what the current viewport shows. I've gotten to this point in horseman/zombie/casper/phantom and have never gotten the scroll to work properly. Any hints would be great.

Edit 1: Using Vaviloff's code on pure phantomJS, I'm still seeing the same errors. This is the terminal output:

Writing twitter-1.png...
[1] top = 10064
Writing twitter-2.png...
[2] top = 10064
Writing twitter-3.png...
[3] top = 10064
Writing twitter-4.png...
[4] top = 10064
Writing twitter-5.png...
[5] top = 10064

I've noticed that only twitter-1.png is written to my filesystem and it is incredibly long in height. My viewport height is set to 900 before any page.open() is invoked.

I should add that I've tested on Windows 10 and OSX Yosemite with phantomJS 2.1.1 installed via npm.

Edit 2: Looks like there is some issue because I've logged into my test account first.

Edit 3: If you log into Twitter, it runs additional scripts that PhantomJS isn't compatible with, and it throws this error: TypeError: undefined is not a constructor (evaluating 't.canPlayType(e)'). This kills all JS on the page. Not sure how to get around this.

PGT
  • Yes, the screenshots will be long because PhantomJS renders the page's full height; this is by design. The screenshots will vary in height because different tweets have different heights. **Edit 1** Please add a `page.onError` callback to check for errors. **Edit 3** is probably an issue for another question. – Vaviloff Feb 22 '17 at 08:33

1 Answer

PhantomJS has native scroll emulation: http://phantomjs.org/api/webpage/property/scroll-position.html

A sample from a very fine book on PhantomJS scripting that opens a Twitter page and scrolls it five screens down:

var webpage = require('webpage').create();
webpage.viewportSize = { width: 1280, height: 800 };
webpage.scrollPosition = { top: 0, left: 0 };
webpage.open('https://twitter.com/founddrama', function(status) {
  if (status === 'fail') {
    console.error('webpage did not open successfully');
    phantom.exit(1);
    return; // code after phantom.exit() can still run, so skip the scroll loop explicitly
  }
  var i = 0,
      top,
      queryFn = function() {
        return document.body.scrollHeight;
      };
  setInterval(function() {
    var filename = 'twitter-' + (++i) + '.png';
    console.log('Writing ' + filename + '...');
    webpage.render(filename);
    top = webpage.evaluate(queryFn);
    console.log('[' + i + '] top = ' + top);
    webpage.scrollPosition = { top: top + 1, left: 0 };

    if (i >= 5) {
      phantom.exit();
    }

  }, 3000);
});
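
The Edit 1 output in the question shows `top` staying at 10064 on every pass, which suggests that no new tweets were appended between scrolls. A minimal sketch of how the loop could detect that and stop early, instead of always running five passes, might look like the following. The helper name `createScrollTracker` and its `maxStalls` parameter are hypothetical and not part of PhantomJS; the wiring at the bottom uses only `webpage.evaluate`, `webpage.scrollPosition` and `phantom.exit`, and is guarded so the helper can also be loaded outside PhantomJS.

```javascript
// Hypothetical helper (not a PhantomJS API): remembers the last document
// height and counts consecutive passes where the height did not grow.
function createScrollTracker(maxStalls) {
  var lastHeight = -1;
  var stalls = 0;
  return {
    // Feed in the latest document.body.scrollHeight.
    // Returns true while it is still worth scrolling again.
    update: function (height) {
      if (height > lastHeight) {
        lastHeight = height;
        stalls = 0;
        return true;
      }
      stalls += 1;
      return stalls < maxStalls;
    },
    stalls: function () {
      return stalls;
    }
  };
}

// PhantomJS wiring; skipped when the file is loaded in another runtime.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.viewportSize = { width: 1280, height: 800 };
  var tracker = createScrollTracker(2); // assumption: give up after 2 stalled passes

  page.open('https://twitter.com/founddrama', function (status) {
    if (status !== 'success') {
      console.error('webpage did not open successfully');
      phantom.exit(1);
      return;
    }
    var timer = setInterval(function () {
      var height = page.evaluate(function () {
        return document.body.scrollHeight;
      });
      console.log('height = ' + height);
      if (!tracker.update(height)) {
        console.log('height stopped growing; giving up');
        clearInterval(timer);
        phantom.exit();
        return;
      }
      // Jump to the current bottom so the page can request more content.
      page.scrollPosition = { top: height, left: 0 };
    }, 3000);
  });
}
```

With the heights from Edit 1, the tracker would allow one retry after the first 10064 and then stop on the third pass, which makes a stalled page obvious without waiting for all five screenshots.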

Added

Callbacks for debugging your script are invaluable, especially page.onError:

webpage.onConsoleMessage = function (msg) {
    console.log(msg);
};   

webpage.onError = function (msg, trace) {
    var msgStack = ['ERROR: ' + msg];
    if (trace && trace.length) {
      msgStack.push('TRACE:');
      trace.forEach(function(t) {
        msgStack.push(' -> ' + t.file + ': ' + t.line + (t.function ? ' (in function "' + t.function +'")' : ''));
      });
    }
    console.log(msgStack.join('\n')); // newline keeps the trace readable in a terminal
};
Vaviloff
  • Thanks. I believe I tried this and it didn't seem to work. But let me try again with your code. Also, wouldn't the `webpage.render` take the screenshot of the entire DOM? So each screenshot will look the same. I've noticed this on all my screenshots that the `viewportSize` didn't do anything. /cc @Vaviloff – PGT Feb 21 '17 at 20:41
  • Checked yesterday before posting, worked fine, scroll functioning. PhantomJS version 2.1.1. – Vaviloff Feb 22 '17 at 02:28
  • Using your code, something is weird, I'll update my description above with results because formatting is easier to see. – PGT Feb 22 '17 at 06:41