I'm trying to get the first 1,000 characters from an uploaded text file. I'm doing:
if ($file->simpletype == "document") {
    // Get the first 1000 bytes of the upload (offset 0 = start of file;
    // a negative offset counts from the end of the file on PHP 7.1+)
    $snippet = file_get_contents($_FILES['upload']['tmp_name'], false, null, 0, 1000);
    file_put_contents('/var/www/my_logs/log.log', $snippet);
    $file->snippet = $snippet;
}
This works fine for a .txt file, and I can open and read the log.log file with gedit. However, for .doc, .docx, .odt and .pdf files, file_get_contents()
returns gibberish such as: PK\00\00\00\
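The leading PK looks like the ZIP signature, which fits, since .docx and .odt files are ZIP containers of XML rather than plain text. Below is a rough sketch of what pulling the text out of a .docx might look like. It assumes the zip and mbstring extensions are loaded, docx_snippet() is a hypothetical helper name, and it does not cover .doc or .pdf:

```php
<?php
// Hypothetical helper: returns up to $limit characters of plain text from a
// .docx file, or null if the file is not a readable .docx archive.
// Assumes the zip and mbstring extensions are available.
function docx_snippet(string $path, int $limit = 1000): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a ZIP container
    }

    // In a .docx, the main body text lives in word/document.xml
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return null; // a ZIP, but not laid out like a .docx
    }

    // Add a space at each paragraph end so words don't run together,
    // then drop the markup and decode XML entities
    $text = strip_tags(str_replace('</w:p>', ' </w:p>', $xml));
    $text = html_entity_decode($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return mb_substr(trim($text), 0, $limit, 'UTF-8');
}
```

It would be called as $file->snippet = docx_snippet($_FILES['upload']['tmp_name']);. An .odt file keeps its text in content.xml instead, so the entry name and the paragraph tag would need to change for that format.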
I have tried another solution I found on Stack Overflow:
function file_get_contents_utf8() {
    // Offset 0 reads from the start of the file
    $content = file_get_contents($_FILES['upload']['tmp_name'], false, null, 0, 1000);
    return mb_convert_encoding($content, 'UTF-8',
        mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
But I get the same results. Any ideas? Thanks!