I'm trying to load the stats data generated by Bing Webmaster Tools. I build URLs for the data I want and try to load them. Since file_get_contents() did not work with https in my setup, I've tried both a cURL-based function and fopen().

Is this even possible, or does Bing somehow block this data stream from being accessed remotely? I know Google has a login process, but I have found no such thing for Bing. Instead, I've set a certificate with cURL, turned on allow_url_fopen, and enabled SSL. Var dumps and prints give me nothing except for the following messages:

when using fopen(): resource(3) of type (stream) Resource id #3

when using getBingData(): bool(false)

Here is my function. Much of this was pieced together from tutorials on SO and elsewhere. I apologize in advance for any huge errors or omissions.

function getBingData($url) {
    // Default to https when the caller passes a URL without a scheme.
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'https://' . $url;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);

    curl_setopt($ch, CURLOPT_HEADER, true);
    // HTTP_USER_AGENT is not set when the script runs from the command line.
    $agent = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : 'Mozilla/5.0';
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);

    // goes to Bing login page if set to false
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);

    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // An empty string accepts every encoding this cURL build supports.
    curl_setopt($ch, CURLOPT_ENCODING, "");

    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); // 0, 1, and 2 make no difference
    // CURLOPT_CAINFO takes a single file; a second call would replace the first,
    // so only the certificate that was effectively in use is kept here.
    curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/certificates/wmstat.bing.com.cer");

    $result = curl_exec($ch);
    $info = curl_getinfo($ch);
    curl_close($ch);
    // Anything other than HTTP 200 (including a failed transfer) returns false.
    return ($info['http_code'] != 200) ? false : $result;
}

I've also tried sending my Bing Webmaster login and password through cURL, but it made no difference. Is there something I need to do with cookies? Is there a login process for Bing? Is there a better method of getting web data from https URLs? Or does everything from Bing just have to be dumped into a file for other uses?

Many thanks in advance!

P.S. I'm using the output given by https://wmstat.bing.com/webmaster/data.ashx?wmkt=en-CA&wlang=en-CA&type=sitelinks&url=CLIENTURLGOESHERE&out=plain, which I know can be set to file (CSV format) or saved from the browser. However, I need all or various parts of this loaded dynamically for SEO analysis and possibly dumped to a database. If I can get the contents of these generated pages directly instead of saving them to files and then reading those, it will save a lot of time and effort.

John
  • "Is there a login process for Bing?" When I click your example link, I get one, so signs point to yes. – ceejayoz Jun 20 '12 at 17:49
  • Yes, there is. I am running the script while logged in through the same browser. I've also dumped the certificate and pointed to it via curl. Though, that seems to make no difference at this point, which leads me to think there is a cookie dependency (or the people at Bing don't like their data extracted in this manner). – John Jun 20 '12 at 18:06
  • Uh, being logged in on your browser has **nothing** to do with PHP. They are completely and utterly separate. Hell, even Safari vs. Firefox use totally different cookies. You'd need to script the login process, handle the cookies within cURL for future requests, etc. – ceejayoz Jun 20 '12 at 18:09
  • That's where my problem stems from. Bing doesn't seem to provide any support at all for scripted logins, and I haven't found any equivalent to, say, Google's webmaster tools login. That, and with my mediocre PHP skills, I don't know how to create a similar process. – John Jun 20 '12 at 18:12
  • Of course they don't provide support for scripted logins. They want you to use [the API](http://www.bing.com/webmaster/documentation/api/index.html). – ceejayoz Jun 20 '12 at 18:17
  • And of course that provides me with no leads since that's mostly in C#. You think I haven't gone through their site? You've given me little more than what looks to me as arrogant commentary and general process description. I realize I'm new to this forum, but if you can't give anything besides cynical "help" (or at least links to pages that will), then please don't. Thanks. – John Jun 20 '12 at 18:22
  • Read more closely. There's nothing preventing you from hitting their API via PHP. It's a simple JSON over HTTP API. http://www.bing.com/webmaster/documentation/api/html/T_Microsoft_Bing_Webmaster_Api_IWebmasterApi.htm https://ssl.bing.com/webmaster/api.svc/json/METHOD_NAME?apikey=API_KEY&param1=VALUE&param2=VALUE&...&paramN=VALUE – ceejayoz Jun 20 '12 at 18:31
  • Thanks much! I haven't used JSON often, but that looks like a good start. – John Jun 20 '12 at 18:40
  • Glad my "arrogant commentary" and "cynical 'help'" could be of assistance. Heh. – ceejayoz Jun 20 '12 at 18:47
  • Touché! I'm still having trouble though. I've had no major issues authenticating via cookie or certificate. I think it's something else in the way I'm trying to capture the data/stream. When I use the newer API on the ssl.bing subdomain and pass the API key along, I can automate the process of dumping the CSV files for any type of stats/query report for any site being managed in a list. Yet even if I set the output to "plain", I can't load that report directly via fopen, stream_get_contents, or cURL. It always returns empty. – John Jun 21 '12 at 22:58

1 Answer

It works if you set only these two options for cURL:

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

Good luck.
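
Setting both options to 0 turns off certificate validation, so another route worth trying is the JSON API URL pattern quoted in the comments, with verification left on. The following is only a rough sketch under assumptions: buildBingApiUrl() and fetchJson() are hypothetical helper names, METHOD_NAME, API_KEY and param1 are the placeholders from that comment rather than real values, and it assumes the system CA bundle can validate ssl.bing.com.

```php
<?php
// Sketch only: the URL pattern comes from the comment thread;
// both function names below are hypothetical helpers.

// Build https://ssl.bing.com/webmaster/api.svc/json/METHOD?apikey=...&param=...
function buildBingApiUrl(string $method, string $apiKey, array $params = []): string
{
    $query = http_build_query(array_merge(['apikey' => $apiKey], $params), '', '&');
    return 'https://ssl.bing.com/webmaster/api.svc/json/' . rawurlencode($method) . '?' . $query;
}

// Fetch a URL and decode the JSON body; returns null on any failure.
function fetchJson(string $url): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 30,
        CURLOPT_SSL_VERIFYPEER => true, // keep certificate checks on
        CURLOPT_SSL_VERIFYHOST => 2,
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error  = curl_error($ch);

    if ($body === false || $status !== 200) {
        // Log the reason instead of silently returning false.
        error_log("Request failed (HTTP $status): $error");
        return null;
    }
    $data = json_decode($body, true);
    return is_array($data) ? $data : null;
}
```

With the placeholders from the comment, buildBingApiUrl('METHOD_NAME', 'API_KEY', ['param1' => 'VALUE']) produces https://ssl.bing.com/webmaster/api.svc/json/METHOD_NAME?apikey=API_KEY&param1=VALUE, which can then be passed to fetchJson(). Logging curl_error() on failure should at least show why a request comes back empty.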