1

This might be a broad question, but I would like to see answers and discuss this with SO users.

So far, my understanding is that an audio file (WAV) has a sample rate, which could be 44000 or 48000 (these are the two I have seen most often). From that we can determine that a single second of the file (second 00:00:01) contains exactly 44000 integer values, which means we have an int[]. So if an audio file is 5 seconds long, it has 5 * 44000 integers (or 5 samples).

So my question is: how can we calculate the difference (or similarity) in content between two time spans, for example Audio1.wav and Audio2.wav at 00:00:01, with the same sample rate?

Artem Koshelev
Rosmarine Popcorn
  • -1. I don't understand your question at all. I'm guessing your Audio1.wav and Audio2.wav have different sample rates? What exactly do you want to calculate? The difference in number of samples over 1 second? Also, the common sample rates are 44100 samples/sec and 48000 samples/sec. – mtrw Oct 26 '11 at 07:23
  • @mtrw I want to calculate the difference in content, or the similarity, between 2 time spans – Rosmarine Popcorn Oct 26 '11 at 07:25
  • What do you mean by "Difference in Content"? – Paul R Oct 26 '11 at 08:01
  • What do you mean by difference? Like if Audio1 is 1 second of a sine wave at 100 Hz, and Audio2 is a cosine at 100 Hz, what would you expect the answer to be? Or if Audio1 is 1 second of happy birthday, and Audio2 is 1 second of happy birthday sung by a different singer, what do you expect the answer to be? You're going to have to give more detail. – mtrw Oct 26 '11 at 08:02
  • @mtrw I'm a newbie at this point, but as I mentioned I need to get the similarity, and yes, I need to find, for example, Happy Birthday only from the same singer. I don't need to assign a meaning, I just need similarity. – Rosmarine Popcorn Oct 26 '11 at 08:30
  • "Similarity" is a matter of application. Without an explanation of your application (or of what "similarity" means for you), it's impossible to understand what you are asking. "Similarity" can be "is it the same singer", "is it the same tempo", "do both files have piano playing", or "how many samples are less than epsilon apart". Different meanings, different answers. – Itamar Katz Oct 26 '11 at 11:59

2 Answers

1

There are a couple of assumptions in your reasoning: 1. The file contains raw, uncompressed (PCM-encoded) data. 2. There is only one channel (mono).
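Both assumptions can be checked from the file header before doing any arithmetic on sample counts. The following is only a minimal sketch using Python's standard `wave` module, which reads uncompressed PCM WAV files and raises `wave.Error` on formats it does not understand. The function names `describe_wav` and `make_test_wav` are made up for this illustration, and the in-memory silent file stands in for a real recording.

```python
import io
import wave


def describe_wav(source):
    """Return basic format facts for a PCM WAV file (path or file-like object)."""
    # wave.open raises wave.Error if the data is not a WAV it can parse.
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample, per channel
        rate = wav.getframerate()          # frames per second
        frames = wav.getnframes()          # one frame = one sample per channel
    return {
        "channels": channels,
        "sample_width_bytes": sample_width,
        "sample_rate": rate,
        "frames": frames,
        "duration_seconds": frames / rate,
        "total_samples": frames * channels,
    }


def make_test_wav(seconds=1, rate=44100, channels=1):
    """Build a silent 16-bit PCM WAV in memory, purely for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds * channels)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(describe_wav(make_test_wav(seconds=5, rate=44100, channels=1)))
```

For a mono file the number of integers equals the number of frames (seconds times sample rate); for a stereo file there are twice as many values, interleaved left/right.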

It's better to start by reading some format descriptions and sample implementations, then search for audio comparison algorithms (1, 2, 3).

Linked Q: Compare two spectogram to find the offset where they match algorithm

Community
Artem Koshelev
1

One way to do this would be to resample one signal (for example from 44100 Hz to 48000 Hz) so that both signals have the same sample rate, and then compute their cross-correlation. The shape of the cross-correlation can serve as a measure of similarity: you could look at the height of the peak, or at the ratio of the energy in the peak to the total energy.

Note however that when the signal repeats itself, you will get multiple cross-correlation peaks.
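As a rough illustration of the idea, here is a pure-Python sketch. It assumes both signals are already decoded into plain lists of mono samples. The names `resample_linear`, `cross_correlation` and `similarity` are invented for this example. The resampler uses simple linear interpolation with no anti-aliasing filter, and the double loop is O(n·m), so for real audio you would likely want a proper resampler and an FFT-based correlation from a numerical library.

```python
import math


def resample_linear(samples, src_rate, dst_rate):
    """Crude linear-interpolation resampler (illustration only, no filtering)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def cross_correlation(a, b):
    """Return {lag: sum(a[i + lag] * b[i])} for every lag where a and b overlap."""
    result = {}
    for lag in range(-(len(b) - 1), len(a)):
        total = 0.0
        for i, value in enumerate(b):
            j = i + lag
            if 0 <= j < len(a):
                total += a[j] * value
        result[lag] = total
    return result


def similarity(a, b):
    """Return (score, lag): the peak height normalized by the signal energies.

    The score lies in [-1, 1]; 1.0 means b matches a segment of a exactly
    (up to a positive gain) at the returned lag.
    """
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if energy == 0:
        return 0.0, 0
    corr = cross_correlation(a, b)
    best_lag = max(corr, key=corr.get)
    return corr[best_lag] / energy, best_lag


if __name__ == "__main__":
    a = [0, 0, 1, 2, 3, 0, 0, 0]
    b = [1, 2, 3]
    print(similarity(a, b))
```

If one signal is periodic, several lags will have nearly the same score, so in that case the single best lag returned here is not meaningful on its own.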

Han