I want to scrape this website in PHP using cURL.
I use similar web-scraping scripts in PHP, and they work well.
Nevertheless, I get the following error:
Fatal error: Uncaught ValueError: DOMDocument::loadHTML(): Argument #1 ($source) must not be empty
Stack trace:
#0 [...](29): DOMDocument->loadHTML()
#1 {main}
  thrown in [...] on line 29
The error message references line 29, i.e. $doc->loadHTML($html); which is the final line of the following code:
<?php
ini_set('display_errors', '1');
ini_set('display_startup_errors', '1');
error_reporting(E_ALL);
$ch = curl_init();
// Set the cURL options
curl_setopt($ch, CURLOPT_URL, "https://link.springer.com/book/10.1007/978-3-031-10453-4");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
// Execute the cURL request and fetch the HTML source code
$html = curl_exec($ch);
if (curl_error($ch)) {
    $output = "\n" . curl_error($ch);
    echo $output;
    die();
}
// Close the cURL handle
curl_close($ch);
// Create a new DOMDocument object
$doc = new DOMDocument();
// Load the HTML source code
$doc->loadHTML($html);
I do not think the website is blocking me, since this is my very first request to it.
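To check that assumption and see what the server actually returned, the response could be inspected before it is passed to loadHTML(). The following is only a sketch, not a confirmed fix: diagnoseResponse() and fetchWithDiagnosis() are hypothetical helper names that are not part of my script, and the union type assumes PHP 8. It relies on the fact that curl_error() stays empty when the transfer itself succeeds, even if the server answers with a redirect or an error status and an empty body.

```php
<?php
// Hypothetical helper (not in the original script): explains why a cURL
// response may be unusable as input for DOMDocument::loadHTML().
function diagnoseResponse(string|false $body, int $status, string $error): string
{
    if ($body === false) {
        // curl_exec() returns false when the transfer itself failed.
        return 'transfer failed: ' . ($error !== '' ? $error : 'unknown cURL error');
    }
    if ($status >= 300 && $status < 400) {
        // Redirects are not followed by default, so the body is often empty.
        return "HTTP $status redirect; body may be empty unless CURLOPT_FOLLOWLOCATION is enabled";
    }
    if ($status >= 400) {
        return "HTTP $status error response";
    }
    if (trim($body) === '') {
        return "HTTP $status with an empty body";
    }
    return 'ok';
}

// Hypothetical wrapper: fetches a URL and returns [body, diagnosis].
function fetchWithDiagnosis(string $url): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error = curl_error($ch);
    // In PHP 8+ the handle is freed automatically when it goes out of scope.
    return [$body, diagnoseResponse($body, $status, $error)];
}
```

Calling fetchWithDiagnosis() with the URL above and running loadHTML() only when the diagnosis is 'ok' would avoid the ValueError and print the status instead, which should show whether the empty string comes from a redirect, an error status, or something else.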
What could be the issue behind the error?