$html = file_get_html("http://www.vegasinsider.com/mlb/odds/las-vegas/?s=316");

echo $html; 

$html is returned as a bunch of strange symbols that include vۺ�(

I thought that using:

header('Content-Type: text/html; charset=utf-8');

would help, but it didn't. Any suggestions?

Lance

3 Answers


Try this:

$url = 'http://www.vegasinsider.com/mlb/odds/las-vegas/?s=316';
$html = str_get_html(utf8_encode(file_get_contents($url)));

echo $html;
mpyw

Try this one:

$encoded = htmlentities(utf8_encode(file_get_html('yoururl')));
echo $encoded;

It will convert the special characters to HTML entities.

Please see the doc here.

Val
  • Doesn't echo anything. It's just a plain white page. – Lance May 17 '13 at 07:34
  • var_dump returns string(0) – Lance May 17 '13 at 07:39
  • Please try my edit, I added `utf8_encode` function inside `htmlentities`. – Val May 17 '13 at 07:44
  • Corrupted page content? In that case you really should "force" all pages to be UTF-8. – Val May 17 '13 at 07:58
  • How can I force the page to be UTF-8? – Lance May 17 '13 at 07:59
  • Look [here](http://stackoverflow.com/questions/7809931/how-to-force-utf-8-encoding-in-browser) for an example. Encode your database as UTF-8, and send `header('Content-type: text/html; charset=utf-8');` in all your pages. – Val May 17 '13 at 08:04
  • I have that header at the top of my page. I mentioned that in the question. – Lance May 17 '13 at 08:06
  • And I said in **ALL** your pages: both the pages you're requesting with `file_get_html` and the page which calls that function. – Val May 17 '13 at 08:10
  • Well, if the page I'm requesting doesn't have its charset set to UTF-8, does that mean I should just try a different URL? – Lance May 17 '13 at 08:13
  • Nope, but it's always a pain to work with several different encodings. By the way, take a look at the WordPress source code: some months ago I found code there that converts all characters (accents, special characters, etc.), but I don't have time to track it down now. That could maybe solve your problem! – Val May 17 '13 at 08:24

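file_get_contents is screwy sometimes: symbols like these usually mean the response body is gzip-compressed and is being returned without decompression. Change simple_html_dom.php to use gzopen instead. In file_get_html(), replace the file_get_contents call with: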

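//$contents = file_get_contents($url, $use_include_path, $context, $offset);

// Read the page through zlib so a gzip-compressed response is decompressed
$fp = gzopen($url, 'r');
if ($fp === false) {
    // Could not open the URL; nothing to parse
    return false;
}

$contents = '';

// gzread() returns an empty string at end of stream, which ends the loop
while ($chunk = gzread($fp, 256000))
{
    $contents .= $chunk;
}

gzclose($fp);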
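
If you would rather not edit simple_html_dom.php, the same idea can be applied outside the library. This is only a minimal sketch, assuming the strange symbols really are a gzip-compressed body and that the zlib extension is loaded. The helper name decode_if_gzipped is hypothetical, not part of simple_html_dom. It checks for the two gzip magic bytes and decompresses with gzdecode(), leaving anything else untouched:

```php
<?php
// Hypothetical helper (not part of simple_html_dom): returns the body
// decompressed if it starts with the gzip magic bytes, otherwise unchanged.
function decode_if_gzipped(string $body): string
{
    // Every gzip stream begins with the two bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body;
    }

    // gzdecode() returns false on corrupt input; @ suppresses its warning.
    $decoded = @gzdecode($body);

    // Fall back to the raw bytes if decompression failed.
    return $decoded === false ? $body : $decoded;
}

// Offline demonstration with a locally compressed sample string.
$sample    = '<html><body>odds</body></html>';
$roundTrip = decode_if_gzipped(gzencode($sample)); // compressed input
$untouched = decode_if_gzipped($sample);           // plain input
```

With the real page, this would be used as `$html = str_get_html(decode_if_gzipped(file_get_contents($url)));`, so the parser only ever sees decompressed markup.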
Guillermo Gutiérrez