I am trying to scrape a shopping cart that uses cookies to select the currency. When I load the site in Chrome and inspect it with Cookie Inspector for Chrome, it shows the cookies in the image.
When I load the same link with cURL, the cookie jar contains the following:
.example.com TRUE / FALSE 1462357306 SSNC CCSUBMIT-N
.example.com TRUE / FALSE 1462357306 SSOE PSORT-Y::CWR-on
.example.com TRUE / FALSE 1464947780 SSLB 1
.example.com TRUE / FALSE 1493891506 SSID_C CACeuh1GAAAAAAAYxilXjl6BJhjGKVcBAAAAAABEVFFXGMYpVwANyBJPAAP1PQoAGMYpVwEAF04AA6sdCgAYxilXAQAOUAAD7V4KABjGKVcBACNQAAFUYgoAGMYpVwEAbk8AAQBICgAYxilXAQA
.example.com TRUE / FALSE 0 SSSC_C 333.G6280768962372394638.1|19991.662955:20242.671221:20334.673792:20494.679661:20515.680532
.example.com TRUE / FALSE 1493891506 SSRT_C MsYpVwIBAw
.example.com TRUE / FALSE 0 JSESSIONID CDZHXpGSHymLMz4v!-751026475
.example.com TRUE / FALSE 3609839127 mapp 0
.example.com TRUE / FALSE 3609839153 dpi 2097201|2|release20160420v10t155721155722
.example.com TRUE / FALSE 3609839153 lpi 2114737|2|release20160420v10t155721155722
.example.com TRUE / FALSE 0 TS0119d048 01efad4706976f70b8f767b422999889abdfa7e7a9a300a247ca3f6dec4997a3ea8a5c9dbe800783f83027f6f389b2fc4134a3806b1de11ca96bf39add105698b8c22f1d300d568ea4395ae6adf29723d2f482180be92caa38977c2da954baebe461814696e5ca8be3f2f7087360909df7e5694ec8f5965475bfd2591cc6c843a2b4aac4752758d5cb2659b390c7632b7047ffdfe2
www.example.com FALSE / FALSE 0 TS01472329 01efad4706512021fdee50b1b891941c232f4ef7f5bf2d184606446c9ebf492848a3eab610
.example.com TRUE / FALSE 3609839153 uui 800.606.6969%20/%20212.444.6615|
.example.com TRUE / FALSE 0 ci NS=Y|CM_MMC=|
.example.com TRUE / FALSE 0 TS01c1e793 01efad47067448a038c37bf93bcdabbce3f89810c9711adfcf2561c8b38484b01c4523479562e5435383034ba6b231a0e3428234fab56386e2af0810f02b7abcf5f2d79d6e
.example.com TRUE / FALSE 3609839153 sessionKey CDZHXpGSHymLMz4v!-751026475!1462355506133
.example.com TRUE / FALSE 3609839127 cookieID 89789790961462355480485
.example.com TRUE / FALSE 0 dlc NS=Y|CM_MMC=|EMLH=|
This output is clearly missing the cookies highlighted in the image. I also tried deleting all cookies, disabling JavaScript, and reloading the page in the browser, and those two cookies are still present, so they are not created by JavaScript.
The code I used:
$URL = "http://www.example.com/";
//ini_set('user_agent', 'Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.0 FirePHP/0.5 ');
//$context = stream_context_create (array ('http' => array ('timeout' => 60)));
$this->ch = curl_init();
$curlHeaders = array(
'Host: www.example.com',
'Connection: keep-alive',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Upgrade-Insecure-Requests: 1',
'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36',
'Accept-Encoding: gzip, deflate, sdch',
'Accept-Language: en-US,en;q=0.8',
'Cookie: _gat=1'
);
$cookie = 'cookies.txt';
// visit the homepage to set the cookie properly
//$ch = curl_init();
$agent= 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13';
curl_setopt($this->ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($this->ch, CURLOPT_VERBOSE, true);
curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($this->ch, CURLOPT_HEADER, false);
curl_setopt($this->ch, CURLOPT_HTTPGET, true);
curl_setopt($this->ch, CURLOPT_USERAGENT, $agent);
curl_setopt($this->ch, CURLOPT_HTTPHEADER, $curlHeaders);
curl_setopt($this->ch, CURLOPT_URL, $URL);
curl_setopt($this->ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($this->ch, CURLOPT_COOKIESESSION, true);
curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, true);
ob_start(); // prevent any output
curl_exec ($this->ch); // execute the curl command
ob_end_clean(); // stop preventing output
//URL that loads when I change the currency from USD to AUD
$ausURL = "http://www.example.com/bnh/controller/home?O=RootPage.jsp&A=SetCurrency&Q=&saveCUR=Y&code=AUD";
curl_setopt($this->ch, CURLOPT_URL, $ausURL);
$url="www.example.com/productPage/";
curl_exec ($this->ch);
curl_setopt($this->ch, CURLOPT_ENCODING, "gzip");
curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($this->ch, CURLOPT_REFERER, "http://www.example.com/bnh/controller/home?O=RootPage.jsp&A=SetCurrency&Q=&saveCUR=Y&code=AUD");
curl_setopt($this->ch, CURLOPT_URL,$url);
curl_setopt($this->ch, CURLOPT_COOKIEFILE, $cookie);
$buffer = curl_exec($this->ch);
$fh = fopen($this->myFile,'w') or die("can't open file");
fwrite($fh, $buffer." -----------------buffer--------------------");
//fclose($fh);
return $buffer;
The product page fetched through cURL still shows USD pricing.
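If the missing cookies are set by the server and not by JavaScript, they should appear as Set-Cookie headers in one of the responses. One way to check is to have cURL return the response headers and list the cookies each request sets. The following is a minimal sketch, not a fix: the helper names `extractSetCookies` and `fetchSetCookies` are hypothetical, and it assumes the server sends the cookies on a plain GET without redirects.

```php
<?php
/**
 * Collect cookie name => value pairs from the Set-Cookie lines
 * of a raw HTTP response header block.
 */
function extractSetCookies(string $rawHeaders): array
{
    $cookies = [];
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        // Header names are case-insensitive, so match without regard to case.
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue;
        }
        // Keep only the "name=value" pair that precedes the first attribute.
        $pair = trim(explode(';', substr($line, strlen('Set-Cookie:')), 2)[0]);
        if (strpos($pair, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[$name] = $value;
    }
    return $cookies;
}

/**
 * Fetch a URL and return the cookies the server tried to set.
 * Hypothetical helper; it is defined here but not called.
 */
function fetchSetCookies(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true); // include response headers in the output
    $response = curl_exec($ch);
    if ($response === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException($error);
    }
    // CURLINFO_HEADER_SIZE gives the byte length of the header block.
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);
    return extractSetCookies(substr($response, 0, $headerSize));
}
```

Calling `print_r(fetchSetCookies($ausURL));` for the currency-switch URL and comparing the result with the cookies shown in the browser would show whether the server sends the highlighted cookies to cURL at all.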