Codecs, Wideband & Stereo: A Conversation At AMOOCON

September 1, 2009
Michael Graves

Then John Todd broaches a subject that I find very interesting:

“…I keep wondering why stereo is never on anyone’s lips when we talk about these next generation codecs?”

A number of related issues are mentioned by various parties:

“…stereo is required from the transmitting site…”

“…you’re just thinking about it from a voice perspective…”

“…you don’t have to send the full stereo data. You can send enough to let the client synthesize it.”

“…if you’re going to go to the effort of creating stereo you should really do actual stereo.”

This starts to get under the covers of sending directionally encoded audio. The matter of “stereo” is a leap in logic that might not be well considered. Stereo from the perspective of music is one thing, but what’s truly required is some form of surround sound. The fact that we use the equivalent of two audio streams for the encoded channel is purely incidental.

All surround sound systems make some effort to be “stereo compatible” so that they will present an acceptable, if less than optimal, result when conveyed through only two speakers.

At 13:40 into the recording someone asks:

“…do we hear phase? Do we need the temporal difference?”

Yes, absolutely! Our hearing is a differential mechanism. Both phase and delay are temporal issues relative to wavelength and distance.

“…if you have a hifi that allows you to invert the phase on one of the channels you can absolutely hear that. But I don’t know if you can hear say, one of the channels being 5ms ahead or behind the other one.”

Time is the key factor here. If we use a digital delay device to insert delay into one channel of a stereo signal, the effect of the delay will be wildly different at various settings. When the delay is long, for example 300 ms, we hear it as a directional cue between the two channels.
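To make the delay experiment concrete, here is a minimal Python sketch of what such a delay device does to one channel, alongside the polarity inversion mentioned in the quote above. The function names and the 48 kHz sample rate are assumptions chosen for illustration; they do not come from the recording or from any particular product.

```python
def delay_to_samples(delay_ms, sample_rate=48000):
    """Convert a delay in milliseconds to a whole number of samples.

    The 48 kHz default is an assumed sample rate, for illustration only.
    """
    return round(delay_ms * sample_rate / 1000)


def delay_channel(channel, delay_ms, sample_rate=48000):
    """Delay one channel by prepending silence, keeping the original length."""
    n = delay_to_samples(delay_ms, sample_rate)
    if n == 0:
        return list(channel)
    # Pad the front with zeros, then trim so both channels stay aligned in length.
    return ([0.0] * n + list(channel))[:len(channel)]


def invert_phase(channel):
    """Flip the polarity of every sample (the hi-fi 'phase invert' switch)."""
    return [-s for s in channel]


if __name__ == "__main__":
    left = [1.0, 2.0, 3.0, 4.0]
    # A 0.5 ms delay at a toy 4 kHz sample rate is two samples.
    right = delay_channel(left, 0.5, sample_rate=4000)
    print(right)
    print(delay_to_samples(5), delay_to_samples(300))
```

At the assumed 48 kHz rate, the 5 ms offset from the question is 240 samples and the 300 ms example is 14,400 samples, so both are easy to reproduce and listen to.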

As the delay is shortened its effect narrows. When the delay is around the same duration as the period of the sounds in our target range, say 50 Hz-12 kHz, it becomes less pronounced. If the delay is variable, we hear it as the “phasing” or “flanging” effects commonly used on guitars in recording studios.
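The relationship between delay, frequency and phase can be worked out with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 343 m/s speed of sound is an assumed round figure for air at room temperature.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def period_ms(frequency_hz):
    """Duration of one cycle of a tone, in milliseconds."""
    return 1000.0 / frequency_hz


def wavelength_m(frequency_hz):
    """Physical length of one cycle in air, in metres (uses the assumed speed)."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def phase_shift_degrees(frequency_hz, delay_ms):
    """Phase offset that a fixed delay produces at a given frequency."""
    cycles = frequency_hz * delay_ms / 1000.0
    return (cycles * 360.0) % 360.0


if __name__ == "__main__":
    for f in (50, 100, 200, 12000):
        print(f, period_ms(f), wavelength_m(f), phase_shift_degrees(f, 5))
```

Across 50 Hz to 12 kHz the period runs from 20 ms down to roughly 0.08 ms. A fixed 5 ms delay is half a cycle at 100 Hz but a whole cycle at 200 Hz, so the same delay means a different phase offset at every frequency. Sweeping the delay moves those offsets around, which is the motion heard in a flanger.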

Just as extended frequency response makes a call sound more natural, the timing of sounds between channels is important to conveying the natural directional perspective.


This Post Has One Comment

Tim Panton says:

September 1, 2009 at 10:29 am

The first speaker is Kevin Fleming from Digium. Also there is Zoa from http://www.zoiper.com and Diana from yate.null.ro.
