11

Why is it that for some MP3 files, calling mime_content_type($mp3_file_path) returns application/octet-stream?

This is my code:

if (!empty($_FILES)) {
    $tempFile = $_FILES['Filedata']['tmp_name'];
    // MIME types that different systems report for MP3 audio
    $mp3_mimes = array('audio/mpeg', 'audio/x-mpeg', 'audio/mp3', 'audio/x-mp3', 'audio/mpeg3', 'audio/x-mpeg3', 'audio/mpg', 'audio/x-mpg', 'audio/x-mpegaudio');
    // getimagesize() returns false for non-images, so check before reading 'mime'
    $image = getimagesize($tempFile);

    if (in_array(mime_content_type($tempFile), $mp3_mimes)) {
        echo json_encode("mp3");
    } elseif ($image !== false && $image['mime'] == 'image/jpeg') {
        echo json_encode("jpg");
    } else {
        echo json_encode("error");
    }
}

EDIT: I've found a nice class here:

http://www.zedwood.com/article/127/php-calculate-duration-of-mp3

Amy
robertdd

3 Answers

14

MP3 files are a strange beast when it comes to identifying them. You can have MP3 audio stored inside a .wav container. There can be an ID3v2 header at the start of the file. You can embed MP3 data within essentially any file.
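To illustrate the ID3v2 case, here is a minimal sketch of measuring a leading ID3v2 tag so that a scanner could skip past it. The function name `id3v2TagLength` is hypothetical and not part of any library. It assumes the standard 10-byte ID3v2 header layout ("ID3", two version bytes, one flags byte, four "syncsafe" size bytes holding 7 bits each).

```php
<?php
// Hypothetical helper: returns the total byte length of a leading ID3v2 tag,
// or 0 if $head does not start with a well-formed ID3v2 header.
function id3v2TagLength(string $head): int
{
    // Header is 10 bytes: "ID3", 2 version bytes, 1 flags byte, 4 size bytes.
    if (strlen($head) < 10 || substr($head, 0, 3) !== 'ID3') {
        return 0;
    }
    $flags = ord($head[5]);
    $size = 0;
    for ($i = 6; $i < 10; $i++) {
        $byte = ord($head[$i]);
        if ($byte & 0x80) {
            return 0; // the high bit must be clear in a syncsafe size byte
        }
        $size = ($size << 7) | $byte;
    }
    // 10 header bytes + tag body, plus a 10-byte footer if flag 0x10 is set.
    return 10 + $size + (($flags & 0x10) ? 10 : 0);
}
```

A caller would read the first 10 bytes of the upload, pass them in, and start looking for audio data at the returned offset.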

The only way to detect them reliably is to parse slowly through the file and try to find something that looks like an MP3 frame. A frame is the smallest unit of valid MP3 data possible, and represents about 0.026 seconds of audio at 44.1 kHz (1152 samples per frame for MPEG-1 Layer III). The size of a frame varies with bitrate and sampling rate, so you can't just grab the bitrate and sample rate of the first frame and assume all the other frames will be the same size: a VBR MP3 must be parsed in its entirety to calculate the total playing time.

All of this boils down to the fact that identifying an MP3 with PHP's fileinfo and the like isn't reliable, as the actual MP3 data can start ANYWHERE in a file. fileinfo only looks at the first kilobyte or two of data, so if it says a file is not an MP3, it may well be wrong because the data started slightly farther in.
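As a rough sketch of the frame-scanning idea, the code below searches the first part of a file for two consecutive MPEG audio Layer III frame headers. The names `mp3FrameLength` and `looksLikeMp3` and the 1 MB scan limit are assumptions made for illustration. Requiring a second valid header right where the first frame should end is a heuristic to reduce false positives, not a guarantee.

```php
<?php
// Hypothetical helper: returns the frame length in bytes if the 4 bytes in $h
// look like an MPEG audio Layer III frame header, or 0 otherwise.
function mp3FrameLength(string $h): int
{
    if (strlen($h) < 4) {
        return 0;
    }
    $b1 = ord($h[1]);
    $b2 = ord($h[2]);
    // The 11-bit frame sync: all bits of byte 0 and the top 3 bits of byte 1.
    if (ord($h[0]) !== 0xFF || ($b1 & 0xE0) !== 0xE0) {
        return 0;
    }
    $version = ($b1 >> 3) & 0x03;   // 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5, 1 = reserved
    $layer = ($b1 >> 1) & 0x03;     // 1 = Layer III
    $bitrateIdx = $b2 >> 4;         // 0 = free format, 15 = invalid
    $rateIdx = ($b2 >> 2) & 0x03;   // 3 = reserved
    $padding = ($b2 >> 1) & 0x01;
    if ($version === 1 || $layer !== 1 || $bitrateIdx === 0
        || $bitrateIdx === 15 || $rateIdx === 3) {
        return 0;
    }
    // Layer III bitrate tables in kbit/s, indexed by the 4-bit bitrate index.
    $kbps = $version === 3
        ? [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
        : [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160];
    $rates = [
        3 => [44100, 48000, 32000],
        2 => [22050, 24000, 16000],
        0 => [11025, 12000, 8000],
    ];
    $coef = $version === 3 ? 144 : 72;
    return intdiv($coef * $kbps[$bitrateIdx] * 1000, $rates[$version][$rateIdx]) + $padding;
}

// Hypothetical helper: scans up to $maxScan bytes for two back-to-back frames.
function looksLikeMp3(string $path, int $maxScan = 1048576): bool
{
    $data = file_get_contents($path, false, null, 0, $maxScan);
    if ($data === false) {
        return false;
    }
    $end = strlen($data) - 4;
    for ($i = 0; $i <= $end; $i++) {
        if ($data[$i] !== "\xFF") {
            continue;
        }
        $len = mp3FrameLength(substr($data, $i, 4));
        // Accept only if another valid header sits where this frame should end.
        if ($len > 0 && mp3FrameLength((string) substr($data, $i + $len, 4)) > 0) {
            return true;
        }
    }
    return false;
}
```

In the question's upload handler, something like `looksLikeMp3($tempFile)` could serve as a second check when mime_content_type() returns application/octet-stream. MP3 data that starts beyond the scan limit would still be missed.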

Marc B
  • 3
    so what's the solution? – Moshe Shaham Mar 31 '16 at 08:09
  • 1
    Add an exception in your code/logic that, when it sees a `.mp3` extension in the filename, skips mime type detection and assumes the type is `audio/mp3`. Of course, this is a tradeoff that depends on the use case: if you want 100% reliability at the cost of slower detection, you might want to go with the "parse slowly through the file and try to find something that looks like an MP3 frame" approach mentioned in the answer. – Dzhuneyt Mar 12 '18 at 12:06
3

application/octet-stream is probably mime_content_type's fallback type when it fails to recognize a file.
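If the fallback simply means "not recognized", an upload handler might treat it as a third outcome instead of as a real type. A minimal sketch, where the function name `classifyDetectedType` and the three result strings are made up for illustration:

```php
<?php
// Hypothetical helper: maps a detected MIME type to 'accept', 'unknown', or 'reject'.
function classifyDetectedType(string $mime, array $allowed): string
{
    // The generic fallback says nothing about the content, so neither accept
    // nor reject on it alone; the caller should run a further check.
    if ($mime === 'application/octet-stream') {
        return 'unknown';
    }
    return in_array($mime, $allowed, true) ? 'accept' : 'reject';
}
```

With the question's code this could be called as `classifyDetectedType(mime_content_type($tempFile), $mp3_mimes)`, and an 'unknown' result would trigger a closer look at the file contents.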

The MP3 in that case is either not a real MP3 file, or - more likely - the file is a real MP3 file, but does not contain the "magic bytes" the PHP function uses to recognize the format - maybe because it's a different sub-format or has a variable bitrate or whatever.

You could try whether getid3 gives you better results. I've never worked with it but it looks like a pretty healthy library to get lots of information out of multimedia files.

If you have access to PHP's configuration, you may also be able to change the mime.magic file PHP uses, although I have no idea whether a better file exists that is able to detect your MP3s. (The mime.magic file is the file containing all the byte sequences that mime_content_type uses to recognize certain file types.)

Pekka
  • 2
    I had a chance to dig into getid3 and if I remember correctly it turned out that it figures out file formats by extensions only. – jayarjo Jun 24 '10 at 16:54
0

Fleep is the answer to this question. Allowing application/octet-stream is dangerous, since .exe and other dangerous files can be reported with that MIME type.

See this answer https://stackoverflow.com/a/52570299/14482130

garchompstomp
  • Would it be a problem allowing application/octet-stream if you use is_executable to disallow executables before allowing certain types? – Niels Pein Feb 15 '21 at 15:59
  • I don't know your specific situation, so I can't say for sure. All I know is that allowing application/octet-stream without further checks is potentially a huge hazard since you really have no idea what the file is. It could literally be just about anything. To my eye it completely defeats the purpose of mimetype checking in the first place if you decide to allow application/octet-stream. I don't know how is_executable works, so I can't vouch for it. I found that fleep could find that it was an mp3 file, without relying on extension, despite the mimetype showing up as application/octet-stream. – garchompstomp Feb 15 '21 at 21:30