I am trying to detect Word documents, but detection fails for some of them. I can open the problematic documents just fine in Word, and I can also extract them, since they are basically just zip files.
I have also tried adding msooxml to /etc/magic on my Ubuntu 18.04 server and restarting nginx and PHP-FPM, with no luck. I use PHP 7.4.4.
- Is there a way to check whether msooxml is being used?
- Do you have any suggestions to make it work?
Here is my code:
$file = 'broken.docx';
$mime = mime_content_type($file);
echo '<p>'.$mime.'</p>';
$finfo = new finfo(FILEINFO_MIME, 'msooxml');
echo '<p>'.$finfo->file($file).'</p>';
Result:
application/octet-stream
application/octet-stream; charset=binary
Expected result:
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/vnd.openxmlformats-officedocument.wordprocessingml.document; charset=binary
The msooxml file was taken from: https://github.com/file/file/blob/master/magic/Magdir/msooxml
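Since the problematic documents unpack fine as zip files, one thing that could be compared between a working and a failing document is the order of the archive entries. Below is a minimal sketch of how that could be listed, assuming the PHP zip extension is installed. `listZipEntries` is a hypothetical helper name, not part of my original code, and I have not confirmed that entry order is what the msooxml rules depend on.

```php
<?php
// Hypothetical helper: returns the names of the first $limit entries
// of a zip-based file (such as a .docx), in archive order.
function listZipEntries(string $path, int $limit = 5): array
{
    $zip = new ZipArchive();

    // open() returns true on success, or an integer error code on failure.
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a zip archive");
    }

    $names = [];
    $count = min($limit, $zip->numFiles);
    for ($i = 0; $i < $count; $i++) {
        // getNameIndex() returns the entry name stored at archive index $i.
        $names[] = $zip->getNameIndex($i);
    }

    $zip->close();
    return $names;
}
```

Calling `print_r(listZipEntries('broken.docx'));` and doing the same for a document that is detected correctly would show whether the two archives start with different entries.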