
I am trying to scrape this YouTube chart with PHP, using cURL and a DOM parser:

https://charts.youtube.com/charts/TrendingVideos/at?hl=en-GB

But when I try it with this simple code:

include_once 'includes/db.inc.php';
include_once 'includes/simple_html_dom.php';
include_once 'includes/curl_init.php';

$html = curl_get('https://charts.youtube.com/charts/TrendingVideos/at?hl=en-GB');
$dom = new simple_html_dom();
$dom->load($html);
echo $dom;

The result I get is empty. (I can't show it any other way than with a screenshot.)

My scrape result
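One way to check whether the response is truly empty, or is only an HTML shell that fills itself in with JavaScript, is to summarise what came back. The following is a minimal sketch under that assumption: `describe_html()` is a hypothetical helper that is not part of the original code, and the sample markup is made up for illustration rather than taken from the chart page.

```php
<?php
// Hypothetical helper (not part of the original code): summarises fetched
// HTML so a truly empty response can be told apart from a JavaScript shell.
function describe_html(string $html): array
{
    // Drop <script> and <style> elements together with their contents.
    $withoutCode = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html) ?? $html;
    // Remove the remaining tags and collapse whitespace to get the visible text.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($withoutCode)) ?? '');

    return [
        'bytes'        => strlen($html),                      // size of the raw response
        'script_tags'  => preg_match_all('/<script\b/i', $html), // number of <script> elements
        'visible_text' => $text,                              // text a non-JS client would see
    ];
}

// Made-up markup standing in for a response body that renders client-side.
$sample = '<html><head><script src="app.js"></script></head>'
        . '<body><div id="root"></div><script>boot();</script></body></html>';

$info = describe_html($sample);
printf(
    "%d bytes, %d script tags, visible text length %d\n",
    $info['bytes'],
    $info['script_tags'],
    strlen($info['visible_text'])
);
```

A response with a non-zero byte count, several script tags and almost no visible text would suggest that the content is rendered in the browser, which cURL alone does not do.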

I would like to know: is it possible to scrape this page this way?

My curl_init.php code:

function curl_get($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    // Browser-like User-Agent string.
    $config['useragent'] = 'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0';
    curl_setopt($ch, CURLOPT_USERAGENT, $config['useragent']);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Note: this turns off TLS certificate verification.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Expect:'));
    $data = curl_exec($ch);
    if (curl_errno($ch)) {
        echo 'Curl error: ' . curl_error($ch);
        return false; // make the failure explicit to the caller
    }
    return $data;
}

That is the curl_get function.

I parse the HTML with simple_html_dom:

 * @author S.C. Chen <me578022@gmail.com>
 * @author John Schlick
 * @author Rus Carroll
 * @version 1.5 ($Rev: 196 $)
 * @package PlaceLocalInclude
 * @subpackage simple_html_dom
  • `curl_get()` is not a standard PHP function. What does yours look like? Also, which `simple_html_dom` plugin are you using? There are a few, and they're slightly different. – Obsidian Age Jul 05 '18 at 00:22
  • That page you're trying to load looks like it uses "late loading" of content. If `cURL` doesn't wait for the page to load its content via JavaScript, then you're not going to get what you want. – Mr Glass Jul 05 '18 at 00:33
  • @Mr Can you recommend any other way to scrape a website of that sort? It doesn't have to be php. – Harijs Vahrusevs Jul 05 '18 at 00:35
  • I have no experience with it. But, you can read the answer to this post: https://stackoverflow.com/questions/14625915/is-there-a-way-to-let-curl-wait-until-the-pages-dynamic-updates-are-done – Mr Glass Jul 05 '18 at 00:38
  • Scraping directly from a regular web page is not the correct way to do this. Instead use the [YouTube Data API](https://developers.google.com/youtube/v3/docs/). – Mike Jul 05 '18 at 00:52
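
The API route suggested in the last comment could look roughly like the sketch below. It assumes the `videos.list` endpoint of the YouTube Data API v3 with `chart=mostPopular`, which may not match the Trending chart page exactly. The helper names `build_most_popular_url()` and `extract_videos()`, the `YOUR_API_KEY` placeholder and the sample JSON are illustrative assumptions, not taken from the question.

```php
<?php
// Assumed endpoint: videos.list of the YouTube Data API v3.
const YT_VIDEOS_ENDPOINT = 'https://www.googleapis.com/youtube/v3/videos';

// Hypothetical helper: builds the request URL for the most popular videos
// in one region (for example 'AT' for Austria).
function build_most_popular_url(string $apiKey, string $regionCode, int $maxResults = 10): string
{
    $query = http_build_query([
        'part'       => 'snippet',
        'chart'      => 'mostPopular',
        'regionCode' => $regionCode,
        'maxResults' => $maxResults,
        'key'        => $apiKey,
    ], '', '&');

    return YT_VIDEOS_ENDPOINT . '?' . $query;
}

// Hypothetical helper: pulls video ids and titles out of a JSON response body.
function extract_videos(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['items']) || !is_array($decoded['items'])) {
        return []; // not the expected response shape
    }

    $videos = [];
    foreach ($decoded['items'] as $item) {
        $videos[] = [
            'id'    => $item['id'] ?? null,
            'title' => $item['snippet']['title'] ?? null,
        ];
    }
    return $videos;
}

// 'YOUR_API_KEY' is a placeholder; a real key comes from the Google API console.
$url = build_most_popular_url('YOUR_API_KEY', 'AT', 5);

// Made-up response body in the shape videos.list returns, used instead of a live call.
$sampleJson = '{"items":[{"id":"abc123","snippet":{"title":"First video"}},'
            . '{"id":"def456","snippet":{"title":"Second video"}}]}';

foreach (extract_videos($sampleJson) as $video) {
    echo $video['id'] . ': ' . $video['title'] . "\n";
}
```

In real use, the body returned by fetching `$url` (for instance with the `curl_get()` function from the question) would be passed to `extract_videos()` in place of the sample string.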
