I have been trying for a long time to parse and run JavaScript-driven web pages using Python (2.4). Unfortunately I cannot use Qt or WebKit, so most Python-based headless browsers are ruled out. I recently found WWW::Scripter for Perl (I am on Perl 5.8.8), which also appears to provide a scripting engine for JavaScript. I have also installed the JavaScript plugin it needs.

use strict;
use warnings;
use WWW::Scripter;

my $w = WWW::Scripter->new;
$w->use_plugin('JavaScript');  # packaged separately as WWW::Scripter::Plugin::JavaScript
$w->get('some javascript website');  # placeholder for the page URL
print $w->content;

It prints a huge number of errors and eventually terminates, and the output is nowhere close to what I expect. I tried this on 3-4 sites with the same result. By expected output I mean the source as shown by Inspect Element in Google Chrome. Any idea what I am doing wrong with WWW::Scripter? Secondly, is there a quick alternative way to get a JavaScript engine running to parse websites in Python 2.4 or Perl (or even Ruby), the constraint being that I can't use Qt? I hope I have described my problem without causing too much confusion.

EDIT: First few lines of errors:

Day too big - 52263 > 24855
Sec too small - 52263 < 74752
Sec too big - 52263 > 11647
Day too big - 52263 > 24855
Sec too small - 52263 < 74752
Sec too big - 52263 > 11647
<></> at /usr/lib/perl5/site_perl/5.8.8/HTML/DOM/Element.pm line 320.
 at /usr/lib/perl5/site_perl/5.8.8/HTML/DOM/Element.pm line 320.
        HTML::DOM::Element::getAttribute('HTML::DOM::Element::Input=HASH(0xcc309f0)', 'checked') called at /usr/lib/perl5/site_perl/5.8.8/HTML/DOM/Element.pm line 379
        HTML::DOM::Element::_attr('HTML::DOM::Element::Input=HASH(0xcc309f0)', 'checked') called at /usr/lib/perl5/site_perl/5.8.8/HTML/DOM/Element/Form.pm line 965
        HTML::DOM::Element::Input::defaultChecked('HTML::DOM::Element::Input=HASH(0xcc309f0)') called at /usr/lib/perl5/site_perl/5.8.8/HTML/DOM/Element/Form.pm line 975
        HTML::DOM::Element::Input::checked('HTML::DOM::Element::Input=HASH(0xcc309f0)') called at /usr/lib/perl5/site_perl/5.8.8/JE.pm line 1719
        JE::__ANON__('JE::Object::Proxy=REF(0xcb53f44)', undef) called at /usr/lib/perl5/site_perl/5.8.8/JE/Object.pm line 385
        JE::Object::prop('JE::Object::Proxy=REF(0xcb53f44)', 'checked') called at /usr/lib/perl5/site_perl/5.8.8/JE/LValue.pm line 91
        JE::LValue::get('JE::LValue=ARRAY(0xcc4eac8)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1197
        JE::Code::Expression::eval('JE::Code::Expression=ARRAY(0xc5fa87c)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1377
        JE::Code::Expression::_eval_term('JE::Code::Expression=ARRAY(0xc5fa87c)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1150
        JE::Code::Expression::eval('JE::Code::Expression=ARRAY(0xc5fa78c)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 349
        JE::Code::Statement::eval('JE::Code::Statement=ARRAY(0xc5e50e8)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 186
        eval {...} called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 157
        JE::Code::execute('JE::Code=HASH(0xcb4a5c0)', 'WWW::Scripter::Plugin::JavaScript::JE=REF(0xa7c76e8)', 'JE::Scope=ARRAY(0xcb4fc7c)', 2) called at /usr/lib/perl5/site_perl/5.8.8/JE/Object/Function.pm line 486
        JE::Object::Function::apply('JE::Object::Function=REF(0xcb4aaac)', 'WWW::Scripter::Plugin::JavaScript::JE=REF(0xa7c76e8)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Object/Function.pm line 351
        JE::Object::Function::call('JE::Object::Function=REF(0xcb4aaac)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1287
        JE::Code::Expression::eval('JE::Code::Expression=ARRAY(0xc607808)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1377
        JE::Code::Expression::_eval_term('JE::Code::Expression=ARRAY(0xc607808)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1182
        JE::Code::Expression::eval('JE::Code::Expression=ARRAY(0xc5a0798)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1377
        JE::Code::Expression::_eval_term('JE::Code::Expression=ARRAY(0xc5a0798)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 1150
        JE::Code::Expression::eval('JE::Code::Expression=ARRAY(0xc5a0600)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 349
        JE::Code::Statement::eval('JE::Code::Statement=ARRAY(0xc3abbc0)') called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 186
        eval {...} called at /usr/lib/perl5/site_perl/5.8.8/JE/Code.pm line 157

Thanks

Ajay Nair
  • What errors are you getting? And why in the world are you using Perl 5.8, which is more than ten years old? – friedo Feb 27 '13 at 17:35
  • Thanks for replying. I am working on a server machine, so upgrading requires a lot of requests :( I have already initiated them, but until they go through I have to make do with what I have. I am pasting the first few lines of the errors I got when I ran it on the Reuters website. – Ajay Nair Feb 27 '13 at 17:44

1 Answer

In case it is of any use: PhantomJS is a single-file headless WebKit browser (no installation needed). I have used it several times: I just put the exe next to perl and put it to work. The next version (1.9), expected around March/April, should handle stdin, which will simplify piping.

A code example of Perl and PhantomJS interacting via temp files is in this answer.
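Since the question targets Python, here is a minimal sketch of the same temp-file approach from Python. It is not the code from the linked answer. It assumes a `phantomjs` binary is reachable on the PATH (or next to the script), and that a fixed two-second wait is enough for the page's scripts to finish. The names `write_script`, `build_command` and `render` are made up for illustration.

```python
import os
import subprocess
import tempfile

# PhantomJS script: load the URL given as the first argument, let the
# page's own JavaScript run, then print the resulting DOM as HTML.
PHANTOM_SCRIPT = """
var page = require('webpage').create();
var system = require('system');
page.open(system.args[1], function (status) {
    if (status !== 'success') {
        console.log('FAILED to load ' + system.args[1]);
        phantom.exit(1);
        return;
    }
    // Assumption: 2 seconds is long enough for the page's scripts.
    window.setTimeout(function () {
        console.log(page.content);
        phantom.exit(0);
    }, 2000);
});
"""


def write_script(text=PHANTOM_SCRIPT):
    """Write the PhantomJS script to a temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".js")
    handle = os.fdopen(fd, "w")
    try:
        handle.write(text)
    finally:
        handle.close()
    return path


def build_command(url, script_path, phantomjs="phantomjs"):
    """Build the argument list: phantomjs <script> <url>."""
    return [phantomjs, script_path, url]


def render(url, phantomjs="phantomjs"):
    """Run PhantomJS on url and return whatever it printed to stdout."""
    script_path = write_script()
    try:
        proc = subprocess.Popen(build_command(url, script_path, phantomjs),
                                stdout=subprocess.PIPE)
        output = proc.communicate()[0]  # bytes on Python 3, str on Python 2
    finally:
        os.remove(script_path)  # always clean up the temp file
    return output
```

Calling `html = render("http://example.com")` would then return the rendered markup, provided PhantomJS is installed. The fixed delay is the crudest possible way to wait; pages that load content later would need a longer wait or a polling check inside the PhantomJS script.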

FtLie