I'm working on an XNA script in which I want to read data from the microphone every couple of frames and estimate its pitch. I took input based almost exactly on this page (http://msdn.microsoft.com/en-us/library/ff827802.aspx).

Now I've got a buffer full of bytes. What does it represent? I reset everything and look at my buffer every 10th frame, so it appears to be a giant array holding 9 chunks of 1764 bytes each, captured at different points in time (15876 bytes in total). I'm assuming it's the time domain of sound pressure, because I can't find any information on the format of microphone input. Does anybody know how this works? I have a friend who has an FFT up and running, but we're trying to learn as much as we can about the data I'm collecting before we attempt to plug it in.

Warman Steve
  • I'm curious why you're basing your update interval on frames, and not on elapsed time. – FreeAsInBeer Sep 09 '14 at 16:45
  • I eventually want the pitch of the player's voice to be the way they control their character, so I would like it to update consistently with the framerate. For now, I just want to get the pitch detection up and running. – Warman Steve Sep 09 '14 at 16:47
  • It's been a while since I've worked with XNA, but I believe the preferred method is to perform logical updates within the `Update` method based upon the elapsed `GameTime` object. Forgive me if they've changed this since I last worked with XNA. – FreeAsInBeer Sep 09 '14 at 16:49
  • I'll keep that in mind, but for now I need to figure out what format the audio buffer stores data in. – Warman Steve Sep 09 '14 at 16:54

1 Answer

The samples are little-endian 16-bit linear PCM, so the buffer is indeed time-domain data: each pair of bytes is one signed amplitude value. Convert each pair of bytes into a signed short as

short sample = (short)(buffer[i] | buffer[i+1] << 8);
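
To apply that conversion to the whole buffer before handing it to an FFT, something like the following might work. It is only a sketch: `PcmDecoder`, `ToSamples`, `ToFloats`, and the `byteCount` parameter are names made up for illustration, and scaling to floats is an assumption about what the FFT expects.

```csharp
using System;

// Hypothetical helper, not part of XNA.
public static class PcmDecoder
{
    // Decodes little-endian 16-bit PCM bytes into signed samples.
    // byteCount is the number of valid bytes in buffer; a trailing
    // odd byte, if any, is ignored.
    public static short[] ToSamples(byte[] buffer, int byteCount)
    {
        if (buffer == null) throw new ArgumentNullException("buffer");
        if (byteCount < 0 || byteCount > buffer.Length)
            throw new ArgumentOutOfRangeException("byteCount");

        short[] samples = new short[byteCount / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            // Low byte first, then the high byte shifted into bits 8-15.
            // unchecked keeps the wrap to negative values even if the
            // project is compiled with overflow checking enabled.
            samples[i] = unchecked((short)(buffer[2 * i] | (buffer[2 * i + 1] << 8)));
        }
        return samples;
    }

    // Scales samples to floats in [-1, 1), a common FFT input range
    // (assumed here; adjust to whatever the FFT code expects).
    public static float[] ToFloats(short[] samples)
    {
        if (samples == null) throw new ArgumentNullException("samples");

        float[] result = new float[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            result[i] = samples[i] / 32768f;
        }
        return result;
    }
}
```

With this, each 1764-byte chunk from the question becomes 882 samples. If the microphone is capturing mono at 44,100 Hz (an assumption; check `Microphone.SampleRate`), that is 20 ms of audio per chunk.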
p10ben