Hi, is cURL the best way to POST a login form, keep the session cookie, and then navigate to another page for scraping? I'm using the code below and I can't get it to work.

include('simple_html_dom.php');

    $data = array(
     '__EVENTTARGET' => '',
     '__EVENTARGUMENT' => '',
     '__VIEWSTATE' => '%2FwEPDwUKLTcyODA2ODEwMGRk',
     'Myname' => 'justdev12345',
     'Mypassword' => '12345',
     'idLogin' => 'Login',
     'tmplang2' => '6',
     'fm' => '',
     'jc' => '',
     'LoginRef' => ''
     );

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://secure.site.com/mainframe.aspx");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt($ch, CURLOPT_COOKIEJAR, "sdc_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "sdc_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIESESSION, true);

$output = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

$output = new simple_html_dom();   
$output = file_get_html('http://profiles.site.com/profile_900.aspx?AccountID=ShopCartUpdate');
print $output;
acctman

1 Answer

Your code is fine up until the end, where you throw away the cURL response and load the URL again with simple-html-dom. If you want to use the cURL response, parse the string you already have:

$html = str_get_html($output);      // parse the string cURL already returned
$title = $html->find('title', 0);   // first <title> element, or null if none

Otherwise, if you only need file_get_html, you should probably remove all the cURL code.

pguardiario
  • My problem is that I need to log in at one URL and then navigate to a second profile URL to scrape. How do I do that while keeping the session created by cURL? – acctman Jan 23 '13 at 07:24
  • You would need to make both requests with curl. file_get_html will ignore your curl cookie jar. – pguardiario Jan 23 '13 at 08:07
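
A minimal sketch of the "both requests with curl" approach from the comment above, for illustration only. The helper names `new_session` and `fetch_with_session` are hypothetical (they are not part of cURL or simple-html-dom), and the sketch assumes the site keeps the login in cookies that are valid for both hosts. Reusing one handle with the cookie engine enabled lets the second request send the cookies set by the first.

```php
<?php
// Hypothetical helper: create one cURL handle whose cookie engine is enabled.
// Cookies are kept in memory for the life of the handle and written to
// $cookieFile when the handle is closed.
function new_session(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects after login
        CURLOPT_COOKIEJAR      => $cookieFile, // where cookies are saved
        CURLOPT_COOKIEFILE     => $cookieFile, // where cookies are read from
    ]);
    return $ch;
}

// Hypothetical helper: run one request on the shared handle.
// With $postFields it sends a URL-encoded POST, otherwise a plain GET.
function fetch_with_session($ch, string $url, ?array $postFields = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back to GET
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Usage (not executed here; URLs and $data are the ones from the question):
//
//   include('simple_html_dom.php');
//   $ch = new_session('sdc_cookies.txt');
//   fetch_with_session($ch, 'https://secure.site.com/mainframe.aspx', $data);
//   $profile = fetch_with_session(
//       $ch,
//       'http://profiles.site.com/profile_900.aspx?AccountID=ShopCartUpdate'
//   );
//   curl_close($ch);
//   $html = str_get_html($profile);   // parse the string, do not refetch
```

One thing to check if the login still fails: `http_build_query` URL-encodes each value, so a `__VIEWSTATE` that was copied in already encoded form (it starts with `%2F` in the question) would probably need `urldecode()` first. That is an assumption about where the value came from, not something the question confirms.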