Why does adding a renderer to my DirectShow Filter Graph smooth out audio input to the graph?

Question

I have a DirectShow filter graph in my Delphi 6 application built with the DSPACK component library. The structure of the graph is as follows:

Custom push source audio filter
Sample Grabber
Tee Filter (but only when I turn on both the WAV File Writer and Renderer)
Renderer (preferred PC output device)
WAV File Writer

The Tee Filter is added to the graph only if I have both the Renderer and the WAV File Writer filters turned on. Otherwise I connect only the filter that is turned on directly to the Sample Grabber.

The audio is delivered by an RTSP audio server, reached over WiFi, that streams audio in real time. If I don't turn on the WAV File Writer, the audio coming out of my headphones has the typical pumping and occasional clicking sounds associated with an unbuffered audio stream. Strangely enough, as soon as I turn on the WAV File Writer filter, the audio becomes smooth as glass.

I have the source code for the WAV File Writer and it basically handles the tasks of outputting the proper WAV file header when needed and writing the audio buffers as necessary, not much more than that. So I find it strange that turning it on smooths the incoming audio stream, especially since it is not upstream of the Renderer (filter) but instead is a peer filter hanging off the end of the Tee Filter alongside the Renderer.

Can anyone tell me what might be happening to make the audio delivery smooth out when I turn on the File Writer filter? Does the Tee Filter do any inherent buffering? I want to duplicate the same mechanism so I can have smooth audio when the File Writer is not turned on. I'm trying to avoid adding my own buffering because I don't want to add any more delay to the real-time audio stream than I have to.

Perhaps the real question is why exactly the audio was choppy in the first place. Audio renderer stats could give you a clue. – Roman R., Jan 13 '12 at 18:41

score 2 · Accepted Answer · answered Jan 13 '12 at 23:02

If you have a live source and can listen to it and to the delivered audio at the same time, you may be able to tell whether adding the File Writer introduces a delay that could account for the difference. Alternatively, the size or number of buffers negotiated in DecideBufferSize may have changed.
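To see why a change in the negotiated buffers could matter, it may help to work out how much audio an allocator can hold. The sketch below is plain arithmetic, not DirectShow code: `buffer_bytes` and `buffer_count` stand in for the `cbBuffer` and `cBuffers` fields of `ALLOCATOR_PROPERTIES`, and the 8 kHz, 16-bit mono format and the buffer figures are assumptions chosen only for illustration.

```python
def buffered_ms(buffer_bytes, buffer_count, sample_rate, channels, bits_per_sample):
    """Milliseconds of PCM audio that buffers of this size and count can hold."""
    bytes_per_second = sample_rate * channels * (bits_per_sample // 8)
    return 1000.0 * buffer_bytes * buffer_count / bytes_per_second


# Hypothetical negotiation results for 8 kHz, 16-bit mono PCM.
small = buffered_ms(buffer_bytes=1600, buffer_count=2,
                    sample_rate=8000, channels=1, bits_per_sample=16)
large = buffered_ms(buffer_bytes=1600, buffer_count=8,
                    sample_rate=8000, channels=1, bits_per_sample=16)

print(small, large)
```

If adding a filter changed the negotiation from the first case to the second, the graph would have four times as much audio queued, which could mask delivery jitter at the cost of extra delay.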

I would suggest introducing explicit buffering in your push filter, for example by adding an offset to the media sample time stamps. Any inherent buffering in the Tee filter may not be reliable, and variations in delivery time are inevitable.
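A minimal sketch of the time-stamp offset idea, written as a simulation rather than as filter code. It assumes 20 ms packets and made-up arrival times; `stamp` and `late_samples` are hypothetical helpers, and the only DirectShow fact relied on is that `REFERENCE_TIME` counts 100 ns units. A sample counts as late when it arrives after its scheduled start time.

```python
UNITS_PER_MS = 10_000  # DirectShow REFERENCE_TIME counts 100 ns units


def stamp(sample_index, sample_ms, offset_ms):
    """Start/stop times for one sample, shifted later by a fixed offset."""
    start = (sample_index * sample_ms + offset_ms) * UNITS_PER_MS
    return start, start + sample_ms * UNITS_PER_MS


def late_samples(arrival_ms, sample_ms, offset_ms):
    """Indices of samples that arrive after their scheduled start time."""
    late = []
    for i, arrived in enumerate(arrival_ms):
        start, _ = stamp(i, sample_ms, offset_ms)
        if arrived * UNITS_PER_MS > start:
            late.append(i)
    return late


# Hypothetical arrival times (ms) for 20 ms packets with network jitter.
arrivals = [0, 35, 41, 95, 96, 100]

print(late_samples(arrivals, sample_ms=20, offset_ms=0))
print(late_samples(arrivals, sample_ms=20, offset_ms=40))
```

With no offset, four of the six samples miss their start time. A 40 ms offset absorbs all of the jitter in this example, at the price of 40 ms of added delay, so the offset is a direct trade between smoothness and latency.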

A more sophisticated approach, if you need to run with minimal or no buffering, could be to stretch/compress the audio while preserving the pitch.

Dmitry Shkuropatsky


Can you elaborate on your intriguing "stretch/compress" idea? I understand the concept, and I've seen it done in audio editing programs. But I'm not sure how to implement it (what sequence of steps) with a real-time stream without introducing as much delay as I would save by doing it. Also, if you know of a good open source implementation of the time-shrinking algorithm, please let me know. – Robert Oschler Jan 13 '12 at 23:51

I saw this approach mentioned in connection with one of the major media players (I do not remember exactly which), so I can only guess how it is implemented. Its applicability may be limited by the following requirements and constraints: a small buffer time (the standard is at least 3-5 seconds), a reliable, low-latency network, and at most 5-6% variation in playback speed. The [SoundTouch](http://www.surina.net/soundtouch) library can be used to stretch audio. It has little processing overhead and can run in real time. – Dmitry Shkuropatsky Jan 18 '12 at 04:12
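One way the speed variation might be chosen is sketched below. This is a guess at a control rule, not a description of any particular player: `playback_rate`, the `gain` value, and the buffer figures are all hypothetical, and the 5% cap is an assumption taken from the "at most 5-6%" constraint mentioned above. The resulting multiplier would then be handed to a time-stretching library such as SoundTouch as its tempo, so that pitch is preserved.

```python
MAX_DEVIATION = 0.05  # assumed cap on speed change (5%)


def playback_rate(buffered_ms, target_ms, gain=0.001):
    """Tempo multiplier: above 1 drains an overfull buffer, below 1 lets it refill."""
    rate = 1.0 + gain * (buffered_ms - target_ms)
    # Clamp so the speed change stays within the assumed audible limit.
    return max(1.0 - MAX_DEVIATION, min(1.0 + MAX_DEVIATION, rate))


# Hypothetical buffer levels (ms) against a 100 ms target.
for level in (0, 80, 100, 120, 500):
    print(level, playback_rate(level, target_ms=100))
```

Because the rate is proportional to the buffer error and clamped, small drifts are corrected gently, and a large backlog is worked off at the capped rate rather than with an abrupt jump.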
