What is the best way to get the MIME type of a file? Before any answers are given, here is a list of a few things to consider.

  1. I can't rely on the type reported by the upload form for accuracy.
  2. Fileinfo has been inaccurate in my testing on PHP 5.3.
  3. mime_content_type has been just as inaccurate in testing as Fileinfo.
  4. This goes beyond just image types, so getImageSize() is not a viable option.

Could this also be handled by Apache, the server, or PEAR, rather than relying only on PHP functions?

Devin Dixon
  • Can you describe the inaccuracy of mime_content_type/fileinfo? (i.e. how do you feel it's inaccurate; it doesn't recognize specific types or just doesn't have as comprehensive a list as you'd like?) – Brad Christie Sep 21 '11 at 15:14
  • You could use `exec("file -i ..")` alternatively, but it utilizes the same mime.magic list as Fileinfo and mime_content_type. So it depends on what kind of problems you encountered. – mario Sep 21 '11 at 15:15
  • By inaccurate I mean, for example, that I upload a PDF and it comes back as application/unknown. – Devin Dixon Sep 21 '11 at 15:17
  • @Devin: update your 'magic' table, then. fileinfo on my servers has no trouble reporting 'application/pdf' on every PDF I've thrown at them. – Marc B Sep 21 '11 at 15:21

1 Answer

Sad to say, there is no perfect way to do this in any programming language.

The two most common ways to get the MIME type are:

  1. Determine it from the file extension. So something.jpg is a JPEG file and something.doc is a Word document.
  2. Determine it from the magic string in the file. So if the first 2 bytes of the file are 0xFF 0xD8, it's a JPEG file, and a legacy Office document (.doc, .xls, .ppt) begins with 0xD0 0xCF 0x11 0xE0.
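
As a rough illustration of the first way, here is a minimal sketch that assumes PHP 7.1 or later. The `$mimeByExtension` table and the `mimeFromExtension()` helper are hypothetical names made up for this example, and the table is deliberately tiny.

```php
<?php
// Hypothetical, deliberately small extension table; a real one would be far larger.
$mimeByExtension = [
    'jpg'  => 'image/jpeg',
    'jpeg' => 'image/jpeg',
    'png'  => 'image/png',
    'gif'  => 'image/gif',
    'pdf'  => 'application/pdf',
    'doc'  => 'application/msword',
];

// Hypothetical helper: guess the MIME type from the file name alone.
// Returns null when the extension is missing or not in the table.
function mimeFromExtension(string $filename, array $table): ?string
{
    // pathinfo() yields an empty string when there is no extension.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return $table[$ext] ?? null;
}
```

Calling `mimeFromExtension('photo.JPG', $mimeByExtension)` gives `'image/jpeg'`. The file itself is never opened, so the result is only as trustworthy as the name.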

Both ways have pros and cons.

I can upload an exe file but give it the extension ".jpg" to defeat the first way of determining the MIME type.

And for both ways, I basically need a large database to search so that I can tell which MIME type the file belongs to.

However, if you are only interested in determining the MIME type for a few types of files (maybe just jpg, png, gif, etc.), then the best way (imho) would be way number 2. Just keep a database or array of all the magic strings, and test the file against that.
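
A minimal sketch of that array-of-magic-strings idea, assuming PHP 7.1 or later. The `$magicSignatures` table and the `mimeFromMagic()` helper are hypothetical names for illustration, and the table covers only a handful of formats.

```php
<?php
// Hypothetical signature table: leading bytes of the file => MIME type.
// Illustrative only; it is nowhere near a complete list.
$magicSignatures = [
    "\xFF\xD8\xFF"      => 'image/jpeg',
    "\x89PNG\r\n\x1A\n" => 'image/png',
    "GIF87a"            => 'image/gif',
    "GIF89a"            => 'image/gif',
    "%PDF-"             => 'application/pdf',
];

// Hypothetical helper: compare the start of the file against each signature.
// Returns null when the file cannot be read or nothing matches.
function mimeFromMagic(string $path, array $signatures): ?string
{
    // 8 bytes covers the longest signature in the table above (PNG).
    $head = file_get_contents($path, false, null, 0, 8);
    if ($head === false) {
        return null;
    }
    foreach ($signatures as $magic => $mime) {
        // strncmp() returns 0 only when the first strlen($magic) bytes match.
        if (strncmp($head, $magic, strlen($magic)) === 0) {
            return $mime;
        }
    }
    return null;
}
```

With this, an exe renamed to ".jpg" comes back as `null` instead of `image/jpeg`, because its first bytes ("MZ") match none of the signatures. The file name is never consulted.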

The magic strings are easy to find; just Google them.

iWantSimpleLife