
I am using Perl with the WWW::Mechanize module to submit a form to a webpage and save the result to a file. I know how to submit forms and save the data, but I can't save data after this six-second redirection.

After the form is submitted, the page is redirected to a page that says

Results should appear in this window in approximately 6 seconds...

and then it is redirected again to the page with the result I want. My script can follow the first redirection, but not the second, and there is no link that says something like "click here if not redirected".

Here is my script

use WWW::Mechanize;

my $mech = WWW::Mechanize->new(autocheck => 1);

$mech->get( "http://tempest.wellesley.edu/~btjaden/TargetRNA2/index.html");

$result = $mech->submit_form(
    form_number =>  1,
    fields      =>  {
        text    => 'Escherichia coli str. K-12 substr. MG1655',
        sequence    => '>RyhB' . "\n" .
                        'GCGATCAGGAAGACCCTCGCGGAGAACCTGAAAGCACGACATTGCTCACATTGCTTCCAGTATTACTTAGCCAGCCGGGTGCTGGCTTTT',
    }    
);
$mech->save_content(result);
Peter Mortensen
  • It is impolite of you to present code in such a mess and ask for help with it. It wouldn't hurt you to indent it properly so that it was at least readable. As it stands it's a mess. – Borodin Feb 13 '15 at 22:17
  • The *Wellesley College* site doesn't appear to like the sequence ID `>RyhB` in the RNA sequence field. It would help if you posted the *actual* code that you're having problems with. – Borodin Feb 13 '15 at 23:21

2 Answers


You need to extract the redirect URL from the interim page and request it yourself. Try this:

use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 1 );

$mech->get( "http://tempest.wellesley.edu/~btjaden/TargetRNA2/index.html");

$result = $mech->submit_form(
    form_number => 1, 
    fields      => 
    {
        text        => 'Escherichia coli str. K-12 substr. MG1655', 
        sequence    => '>RyhB GCGATCAGGAAGACCCTCGCGGAGAACCTGAAAGCACGACATTGCTCACATTGCTTCCAGTATTACTTAGCCAGCCGGGTGCTGGCTTTT',
    }
);

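# The interim page carries the target as URL=targetRNA2.cgi?... in its HTML;
# capture that relative URL (note the escaped "?" and the non-greedy match)
my $content = $mech->content;
my $url1 = 'http://tempest.wellesley.edu/~btjaden/cgi-bin/';
my ($url2) = $content =~ /URL=(targetRNA2\.cgi\?.+?)">/
    or die "No redirect URL found in the interim page\n";

# Request the result page manually and save it to a file named "result"
$mech->get($url1.$url2);

$mech->save_content('result');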
Andrey
  • Please *always* `use strict` and `use warnings` at the start of *every* Perl program. There is very little point in declaring variables with `my` without `use strict` in place. – Borodin Feb 13 '15 at 23:30

WWW::Mechanize and meta refresh

Does the "6 seconds" page contain something like the line below? (You can use the save_content method of WWW::Mechanize to save the page to a file and check.)

<meta http-equiv="refresh" content="5; url=http://example.com/">

If yes:

Take a look at the source of WWW::Mechanize::Plugin::FollowMetaRedirect.
It shows how WWW::Mechanize can follow a meta refresh redirect.
It will quite likely solve your problem.
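
If you would rather not install the plugin, here is a minimal sketch of the same idea, assuming the interim page really does use a meta refresh tag like the one above. The helper name `extract_meta_refresh` is hypothetical (it is not part of WWW::Mechanize), and the regexes are a rough approximation rather than a full HTML parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): return the delay and
# target URL from the first <meta http-equiv="refresh"> tag in $html,
# or an empty list if no such tag is found.
sub extract_meta_refresh {
    my ($html) = @_;
    return unless defined $html;

    # Find the first <meta ...> tag whose http-equiv attribute is "refresh".
    my ($tag) = $html =~ m{(<meta\b[^>]*\bhttp-equiv\s*=\s*["']?refresh\b[^>]*>)}i
        or return;

    # The content attribute looks like "6; url=target.cgi?id=123".
    my ($delay, $url) =
        $tag =~ m{\bcontent\s*=\s*["']\s*(\d+)\s*;\s*url\s*=\s*["']?([^"'>]+)}i
        or return;

    $url =~ s/\s+\z//;    # drop trailing whitespace inside the attribute
    return ($delay, $url);
}

# Stand-alone demonstration on the sample tag from this answer.
my ($delay, $url) = extract_meta_refresh(
    '<meta http-equiv="refresh" content="5; url=http://example.com/">'
);
print "wait $delay s, then fetch $url\n" if defined $url;

# Assumed usage with a WWW::Mechanize object that holds the interim page:
#   my ($delay, $url) = extract_meta_refresh($mech->content);
#   if (defined $url) {
#       sleep $delay;        # the result may not exist before the delay is over
#       $mech->get($url);    # a relative URL is resolved against the current page
#       $mech->save_content('result');
#   }
```

If the page redirects with JavaScript instead of a meta tag, this returns an empty list and you will need a different approach.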

AnFi
  • This is rather like a link-only answer. Might want to add some of the details from these links into your post. – AKHolland Feb 13 '15 at 22:28
  • It is not cost effective without knowing which "delayed redirection" method the page uses, e.g. I have seen redirections based on JavaScript only. – AnFi Feb 13 '15 at 23:42