I'm trying to scrape a web page with CasperJS. The page checks whether the browser is IE 6 or 7.

Passing a userAgent with CasperJS doesn't seem to satisfy its condition. The userAgent passed is: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1). The following is the check the page makes to determine the browser:

agt = navigator.userAgent.toLowerCase();
browserType = navigator.appName;

if( ((browserType.indexOf("xplorer") != -1) 
    && (agt.indexOf("msie 6.") != -1))
    ||  ((browserType.indexOf("xplorer") != -1) 
    && (agt.indexOf("msie 7.") != -1)) )
{

}
else
{
    alert("This "+ browserType + " Version is not supported by this application. Please use Internet Explorer  6.x or Internet Explorer 7.x.");
    window.close();
}

The following is the debug info from CasperJS:

[info] [remote] [alert] This Netscape Version is not supported by this application. Please use Internet Explorer 6.x or Internet Explorer 7.x.

[warning] [phantom] Loading resource failed with status=fail (HTTP 200): http://

Any pointers on setting the window.navigator object before the page redirects?

Artjom B.
Prashanth
  • Also keep an eye on the TrifleJS project ( http://triflejs.org/ ). It is not working with CasperJS yet (which is why I am not posting this as an answer!), but that is one of their goals. (It can emulate IE7, so might be your best choice, if you are not tied to CasperJS.) – Darren Cook Dec 18 '14 at 08:04

1 Answer

The navigator properties are read-only, so you cannot assign to them directly, and PhantomJS doesn't provide a way to override them.
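To see why the user agent alone is not enough, here is a minimal sketch of the page's check rewritten as a standalone function. The name isSupportedBrowser is hypothetical and not part of the page; the logic is meant to mirror the condition quoted in the question. The alert in the debug output reports "Netscape", which suggests that navigator.appName was left untouched even though the user agent string was changed.

```javascript
// Hypothetical helper that mirrors the page's browser check.
// It is an illustration only and does not exist on the page.
function isSupportedBrowser(appName, userAgent) {
    var agt = userAgent.toLowerCase();
    // Both conditions must hold: appName contains "xplorer"
    // AND the user agent mentions MSIE 6.x or MSIE 7.x.
    var isExplorer = appName.indexOf("xplorer") !== -1;
    var isIe6or7 = agt.indexOf("msie 6.") !== -1 || agt.indexOf("msie 7.") !== -1;
    return isExplorer && isIe6or7;
}

var ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";

// Spoofed user agent, but appName is still the reported "Netscape".
console.log(isSupportedBrowser("Netscape", ua));          // false
// Same user agent with the appName used in this answer.
console.log(isSupportedBrowser("Internet Explorer", ua)); // true
```

So the spoofed user agent already satisfies half of the condition; only appName still needs to change.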

The solution is to make a proxy of the navigator object. The old navigator stays in the background, but it is replaced with a new one that behaves the same, but with an appName of "Internet Explorer". This whole bootstrapping process can be triggered from the page.initialized callback.

casper.on('page.initialized', function(){
    this.evaluate(function(){
        (function(oldNav){
            var newNav = {};
            [].forEach.call(Object.getOwnPropertyNames(navigator), function(prop){
                if (prop === 'appName') {
                    Object.defineProperty(newNav, prop, {
                        enumerable: false,
                        configurable: false,
                        writable: false,
                        value: 'Internet Explorer'
                    });
                } else {
                    Object.defineProperty(newNav, prop, {
                        enumerable: false,
                        configurable: false,
                        get: function(){
                            return oldNav[prop];
                        }
                    });
                }
            });
            window.navigator = newNav;
        })(window.navigator);
    });
});

The same goes for vanilla PhantomJS with the page.onInitialized event handler.
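A minimal sketch of the vanilla PhantomJS variant might look like the following. The helper name installNavigatorProxy and the URL http://example.com/ are placeholders of mine, and the sketch assumes a PhantomJS version in which page.evaluate accepts extra arguments. The PhantomJS-specific part is guarded so the helper can also be loaded outside PhantomJS.

```javascript
// Runs inside the page context. It must be self-contained because
// PhantomJS serializes the function before evaluating it in the page.
// The name installNavigatorProxy is hypothetical.
function installNavigatorProxy(appName) {
    var oldNav = window.navigator;
    var newNav = {};
    Object.getOwnPropertyNames(oldNav).forEach(function (prop) {
        if (prop === 'appName') {
            return; // defined separately below
        }
        // Forward every other property to the original navigator.
        Object.defineProperty(newNav, prop, {
            get: function () {
                return oldNav[prop];
            }
        });
    });
    // Read-only appName with the spoofed value.
    Object.defineProperty(newNav, 'appName', { value: appName });
    window.navigator = newNav;
    return newNav.appName;
}

// Only wire things up when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.settings.userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)';
    // Fires after the page object is created but before its scripts run.
    page.onInitialized = function () {
        page.evaluate(installNavigatorProxy, 'Internet Explorer');
    };
    // Placeholder URL; replace it with the page being scraped.
    page.open('http://example.com/', function (status) {
        console.log('Load status: ' + status);
        phantom.exit();
    });
}
```

Passing the appName as an argument keeps the evaluated function free of closures, which would not survive the serialization into the page context.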

Working around the browser detection doesn't guarantee that the page works or looks good in PhantomJS. There is a reason some pages are "optimized" for IE: most of the time they rely on proprietary features that are not available in other browsers.

Artjom B.