
I'm working on an audio project in Java that requires me to determine audio file types from their data (as opposed to the file extension), and I've hit a wall with MP3s. As far as I understand, MP3 files are broken into frames, where each frame has a 4-byte header that begins with eleven 1 bits for the frame sync, followed by an assortment of other data. Right now my code can accurately identify a WAVE file, but when I read my test MP3 file's bytes I can't find a 11111111 byte (the first 8 of the 11 frame sync bits) anywhere.

try {
        FileInputStream fis = new FileInputStream(f);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buff = new byte[11];
        byte[] byteData;
        int bytes = 0;

        while ((bytes = fis.read(buff)) != -1) {
            baos.write(buff, 0, buff.length);
        }

        byteData = baos.toByteArray();

        fis.close();
        baos.close();

        if ((int)byteData[0] == 255) {
            type = "MP3";
        } else if (("" + (char)byteData[8] + (char)byteData[9] + 
                (char)byteData[10] + (char)byteData[11]) == "WAVE") {
            type = "WAVE";
        }

    }
Chris

1 Answer


You will probably find that the first three bytes of the MP3 file are (in hexadecimal):

49 44 33

This is the 'magic number' for an MP3 with ID3v2 tags, at least according to Wikipedia.

EDIT

OK, so I have looked at my system, and the MP3s I have start with the magic number (in decimal):

73 68 51

which in ASCII is 'ID3'. These are the same three bytes as the hexadecimal 49 44 33 above.
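The two listings (49 44 33 and 73 68 51) describe the same bytes in different number bases, which a few lines of Java can confirm. This is only a minimal sketch; the class name `MagicNumberDemo` and its members are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
final class MagicNumberDemo {
    // The signature written in hexadecimal: 49 44 33
    static final byte[] ID3_HEX = {0x49, 0x44, 0x33};
    // The same signature written in decimal: 73 68 51
    static final byte[] ID3_DEC = {73, 68, 51};

    // Decodes the bytes as US-ASCII text.
    static String asAscii(byte[] b) {
        return new String(b, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(asAscii(ID3_HEX));                 // ID3
        System.out.println(asAscii(ID3_DEC));                 // ID3
        System.out.println(Arrays.equals(ID3_HEX, ID3_DEC));  // true
    }
}
```

Reading the hexadecimal digits as decimal numbers instead gives the characters '1', ',' and '!', which is not a signature you would expect at the start of an MP3.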

Note that you have some problems with your byte manipulation. When you test byte values against int values, you need to make sure the conversion is done correctly. The test:

byte x = ....;
if (x == 255) {...}

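will never be true for any value of 'x', because a Java byte is signed and has the range -128 to +127. When it is compared with an int, it is widened with its sign preserved, so the bit pattern 11111111 compares as -1, not 255.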

To make this test work you need to do:

if ((x & 0xff) == 255) { .... }
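To see the difference, here is a small sketch that can be run as-is. The class and method names (`ByteMaskDemo`, `unsigned`) are invented for this example; `Byte.toUnsignedInt` is a standard library method available since Java 8.

```java
// Hypothetical demo class showing why a byte must be masked before comparing with 255.
final class ByteMaskDemo {
    // Returns the unsigned value (0..255) of a signed Java byte.
    static int unsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte x = (byte) 0xff;                       // bit pattern 11111111
        System.out.println(x);                      // -1, because byte is signed
        System.out.println(x == 255);               // false: x is widened to the int -1
        System.out.println(unsigned(x) == 255);     // true
        System.out.println(Byte.toUnsignedInt(x));  // 255

        byte y = (byte) 0xfb;                       // bit pattern 11111011
        System.out.println(y);                      // -5
        System.out.println(unsigned(y));            // 251
    }
}
```

So a debugger or print statement showing -1 for the first byte of a file means the byte really is 0xFF.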

I have modified your method to test things on my system, and have tried it on a WAV file and a few MP3s. This is the code I have:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public static final String getValid(File f) throws IOException {
    byte[] buff = new byte[12];
    int bytes = 0, pos = 0;

    // try-with-resources closes the stream even if read() throws
    try (FileInputStream fis = new FileInputStream(f)) {
        // read() may return fewer bytes than requested, so loop until
        // the buffer is full or the end of the file is reached
        while (pos < buff.length && (bytes = fis.read(buff, pos, buff.length - pos)) > 0) {
            pos += bytes;
        }
    }

    // this is your test.... which should bitmask the value too.
    // The frame sync is 11 bits: all of byte 0 plus the top 3 bits of byte 1.
    if ((buff[0] & 0xff) == 0xff && (buff[1] & 0xe0) == 0xe0) {
        return "MP3 " + f;
    }
    // My testing indicates this is the MP3 magic number (decimal 73 68 51).
    // Wikipedia lists the same bytes in hex as 49 44 33, so no separate
    // test is needed for those values (as decimals they would spell '1,!').
    if (   'I' == (char)buff[0]
        && 'D' == (char)buff[1]
        && '3' == (char)buff[2]) {
        return "MP3 ID3 Magic " + f;
    }
    if (   'W' == (char)buff[8]
        && 'A' == (char)buff[9]
        && 'V' == (char)buff[10]
        && 'E' == (char)buff[11]) {
        return "WAVE " + f;
    }

    return "unknown " + f;

}
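If you want to go one step further than the magic number, the two checks can be combined: skip a leading ID3v2 tag if there is one, then look for the frame sync right after it. The following is a rough sketch under some assumptions, not a full validator. It assumes the whole file (or at least everything up to the first frame) is already in a byte array, that the first audio frame directly follows the tag, and it relies on the ID3v2 header layout (10 bytes, with the tag size stored as four 7-bit "synchsafe" bytes at offsets 6 to 9 and a footer flag in bit 0x10 of byte 5). All names here (`Mp3SniffSketch`, `hasFrameSync`, `id3v2Length`, `looksLikeMp3`) are hypothetical.

```java
// Hypothetical helper class; a heuristic sketch rather than a complete MP3 validator.
final class Mp3SniffSketch {
    // True if the two bytes at off begin with the 11 set bits of a frame sync.
    static boolean hasFrameSync(byte[] b, int off) {
        return off >= 0 && off + 1 < b.length
            && (b[off] & 0xff) == 0xff
            && (b[off + 1] & 0xe0) == 0xe0;
    }

    // Returns the total length of a leading ID3v2 tag in bytes, or 0 if there is none.
    static int id3v2Length(byte[] b) {
        if (b.length < 10 || b[0] != 'I' || b[1] != 'D' || b[2] != '3') {
            return 0;
        }
        // Synchsafe size: only the low 7 bits of each of the four bytes are used.
        int size = ((b[6] & 0x7f) << 21) | ((b[7] & 0x7f) << 14)
                 | ((b[8] & 0x7f) << 7)  |  (b[9] & 0x7f);
        int footer = (b[5] & 0x10) != 0 ? 10 : 0; // optional 10-byte footer
        return 10 + size + footer;                 // 10 = header length
    }

    // Heuristic: a frame sync at the start, or directly after an ID3v2 tag.
    static boolean looksLikeMp3(byte[] data) {
        return hasFrameSync(data, id3v2Length(data));
    }

    public static void main(String[] args) {
        byte[] bare = {(byte) 0xff, (byte) 0xfb, (byte) 0xb2, 0};
        System.out.println(looksLikeMp3(bare)); // true
    }
}
```

A sync pattern on its own can also show up by chance in other binary data, and some files have junk between the tag and the first frame, so treat a positive result as "probably MP3" rather than proof.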
rolfl
  • Right now the first bytes I'm reading are `-1 -5 -78` which somehow doesn't terminate the while loop. Could there be an error in the way I'm storing the information? – Chris Nov 02 '13 at 15:26
  • Yes, you are doing things wrong, I'm going to edit my answer.... but first, is your intention to read only the first 11 bytes? – rolfl Nov 02 '13 at 15:36
  • Yes, the full contents of the file are read elsewhere. I realized that the currently returned bytes are correct, as `-1 -5` contain the initial 11 bits I'm looking for. I'm still not entirely sure if this is the best way to approach MP3 validation however. – Chris Nov 02 '13 at 15:38
  • Edited my answer. Note two things - no need for ByteArrayOutputStream and also the byte manipulation. – rolfl Nov 02 '13 at 15:59
  • `49 44 33` comes from the “Hex signature” column of [the linked Wikipedia article](https://en.wikipedia.org/wiki/List_of_file_signatures) (equivalent to decimal `73 68 51` and US-ASCII `ID3`). – lumato - Reinstate Monica Feb 07 '19 at 11:17