I've written a script to process HTML files from URLs. However, because my cheap hosting provider limits script runtime to 30 seconds, I've had to alter the script to store the HTML as txt files and run it from a local WAMP server.
I am trying to load each file, extract what I need, then move on to the next file.
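To make the goal concrete, here is a minimal sketch of the intended workflow, not the actual script. The `pages/` directory and the `a` selector are hypothetical placeholders, and it assumes `simple_html_dom.php` sits next to the script.

```php
<?php
// Sketch only: 'pages/' and the 'a' selector are hypothetical placeholders.
include 'simple_html_dom.php';

// glob() can return false on error, so fall back to an empty list.
foreach (glob('pages/*.txt') ?: [] as $path) {
    // Depending on the library version, file_get_html() may return
    // false when the file cannot be loaded.
    $html = file_get_html($path);
    if ($html === false) {
        echo "Could not parse $path\n";
        continue;
    }

    // Extract what is needed; here, the text of every link.
    foreach ($html->find('a') as $link) {
        echo trim($link->plaintext), "\n";
    }

    // Release the parsed DOM before moving on to the next file.
    $html->clear();
    unset($html);
}
```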
With URLs as the source, file_get_html was doing the job perfectly (I could ->find the required elements).
With a txt file as the source, file_get_html is returning a blank object.
Based on some advice in the post below, I swapped file_get_html for file_get_contents, which returned a single large string containing the contents of the text file.
"First, make sure that file_get_contents can get data. If it can, file_get_html will be able to load data to simplehtml Dom"
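The check that the quoted advice describes could be sketched roughly as follows. It uses only built-in PHP functions; the function name `read_saved_page` is a hypothetical helper introduced for illustration, not part of simple_html_dom.

```php
<?php
// Diagnostic sketch: confirm a saved page can actually be read before
// handing it to an HTML parser. read_saved_page() is a hypothetical name.
function read_saved_page(string $path): string
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Cannot read file: $path");
    }

    $contents = file_get_contents($path);
    if ($contents === false || trim($contents) === '') {
        throw new RuntimeException("File is empty or unreadable: $path");
    }

    return $contents;
}
```

Calling `read_saved_page('file.txt')` and printing `strlen()` of the result shows whether the markup was loaded and how large it is.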
If file_get_contents returns a string, which it does, how would I "load data to simplehtml Dom"?
File not getting read using file_get_html
I then tried to convert the string into an object with str_get_html; however, this didn't work either.
include('simple_html_dom.php');
$html = file_get_html('file.txt');
var_dump($html);
Returns: object(simple_html_dom)[1] but with no other contents or arrays.
include('simple_html_dom.php');
$html = file_get_contents('file.txt');
var_dump($html);
Returns: string <!DOCTYPE html PUBLIC.....
Questions:
Can anyone give me any advice? What's the best way to load a text file containing HTML markup into an object so that I can use the find method on its contents? I want to avoid loading the file into an array of strings and using regex to process the contents.
Are there any considerations I need to make if using a local WAMP server?