HDVoice & Asterisk: Hearing The Siren’s Song – The Finale

Michael Graves

17 years ago

Having read & listened this far into this series, you should now have some grasp of how narrowband (G.711) compares to wideband (G.722/G.722.1) and even super-wideband (G.722.1C) audio for telephony applications. The differences in many cases are quite pronounced, even startling. What you hear in the examples are just the most obvious properties of the encoding, the sampling rate and, by implication, the available audio bandwidth. It’s worth understanding a bit more about how the role of the codec has evolved over time. This will help you frame how the Siren codecs fit into the Asterisk realm.

Back in the early days of the transition to digital in the PSTN the entire network architecture was TDM-based. This is the technological foundation that gives us the codec most commonly deployed, namely G.711. On a TDM network when a call is placed a pathway is created that provides absolutely assured bandwidth to the call. Since the network is such a known quantity the historical role of the codec was very simple. True to the roots of the term “codec” it merely encodes (or decodes) the raw digitized audio to/from a digital stream in a standard format and at a constant bit-rate. The codec is blissfully unaware of its operating environment.
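To make the “constant bit-rate” point concrete, here is a small sketch of the arithmetic behind G.711. The 8 kHz sampling rate and 8 bits per sample are the standard G.711 figures. The 20 ms packet interval and the function names are my own assumptions for illustration, not something taken from the examples in this series.

```python
# Sketch of the fixed-rate arithmetic behind G.711.
# Function names and the 20 ms frame size are illustrative assumptions.

G711_SAMPLE_RATE_HZ = 8000   # narrowband sampling rate
G711_BITS_PER_SAMPLE = 8     # one companded byte per sample


def constant_bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit-rate of a codec that emits a fixed number of bits per sample."""
    return sample_rate_hz * bits_per_sample


def payload_bytes(bitrate_bps: int, frame_ms: int) -> int:
    """Audio payload size, in bytes, for one frame of the given duration."""
    return bitrate_bps * frame_ms // (8 * 1000)


g711_bps = constant_bitrate_bps(G711_SAMPLE_RATE_HZ, G711_BITS_PER_SAMPLE)
print(g711_bps)                     # 64000 bits per second
print(payload_bytes(g711_bps, 20))  # 160 bytes of audio per 20 ms frame
```

The numbers never change from one frame to the next, which is exactly why a codec like this needs to know nothing about the network carrying it.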

There are all kinds of other processes that may be performed on the audio streams. There are source audio issues of gain control, echo cancellation and background noise suppression, then network issues like jitter buffering and packet loss concealment, just to name a few. All of these additional processes, when they occurred, were done outside of the codec. They were traditionally not part of the codec itself.

Flash forward about forty years to 2008 and the introduction of Skype’s SILK codec, the newest development on the codec landscape. Things are now very different. Over time the role of the codec has grown to include several of the processes that were once handled as pre- or post-processing. Whereas bandwidth was once a constant, the new-found dominance of IP networks means that bandwidth is always a variable.

The current state-of-the-art in codecs, SILK, is adaptive. That is, it’s aware of its operating environment. It changes its operation to best leverage the network conditions of the moment. It can alter its sampling frequency as well as qualitative factors in the compression process, ensuring that it gets the best possible call quality out of the available bandwidth. SILK can sound worse than a cell phone or almost like a CD. It does all of this dynamically in response to instantaneous network conditions.
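A toy sketch may help show what “adaptive” means in practice. This is emphatically not SILK’s actual algorithm, which I have not seen. The four sampling rates are the ones SILK supports, but the bandwidth thresholds and the function name are hypothetical values I made up purely to illustrate the idea of picking a mode from current network conditions.

```python
# Toy illustration of an adaptive codec choosing its sampling rate.
# NOT SILK's real logic: the thresholds below are invented for illustration.

SILK_SAMPLE_RATES_HZ = (8000, 12000, 16000, 24000)

# Hypothetical minimum available bandwidth (kbit/s) for each sampling rate.
HYPOTHETICAL_MIN_KBPS = {8000: 0, 12000: 12, 16000: 18, 24000: 30}


def pick_sample_rate(available_kbps: float) -> int:
    """Return the highest sampling rate whose threshold the network meets."""
    chosen = SILK_SAMPLE_RATES_HZ[0]
    for rate in SILK_SAMPLE_RATES_HZ:
        if available_kbps >= HYPOTHETICAL_MIN_KBPS[rate]:
            chosen = rate
    return chosen


# A congested link falls back to narrowband; a clean one gets the top rate.
for kbps in (10, 20, 64):
    print(kbps, "kbit/s ->", pick_sample_rate(kbps), "Hz")
```

A real adaptive codec re-evaluates this kind of decision continuously during the call, and adjusts far more than the sampling rate alone.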

In preparing the library of audio examples for this series I found that Tim Panton was at the same time developing his Astricon demo of Skype-For-Asterisk. I spoke with Tim about the possibility of adding SILK into the mix. We decided that it wasn’t going to be either fair or practical. Since all the samples were created using hardware/software on my local network, Skype/SILK would simply dial everything up to optimal. The isolated network would not present it with any problems. It would be difficult to build a network that would be a genuine test of SILK under varying network conditions. In some ways only the wild internet is a suitable test network for such a codec.

All this simply illustrates the fact that codecs have evolved their roles over time. The transition from TDM to IP networks, improvements in CPU power, the falling cost of memory…these are all factors supporting the development of increasingly complex codecs.

Let’s return our gaze to the focus of my presentation at Astricon: Siren 7 and Siren 14. The Siren codecs are part of Polycom’s stable of intellectual property. Now middle-aged in the realm of technologies, they have served Polycom well over the years. However, there is a natural life cycle to technologies, and with the passing of time their value as intellectual property is diminishing.

This being the case, Polycom took the very forward-looking decision to release them under a royalty-free license. Their hope is that if access to this technology is both convenient and cheap, its use will become widespread. This might not represent a revenue stream to Polycom, but it would enhance the value of all of the products that support the Siren codecs.

For their part Digium are a little late to the table with wideband IP telephony, but better late than never. Initially supporting only G.722 and SPEEX, Asterisk was able to play in the HDVoice space, but wasn’t especially flexible. By implementing Siren 7 in Asterisk, Digium is able to incrementally enhance Asterisk with respect to HDVoice. It improves Asterisk’s flexibility in some circumstances and with specific hardware.

The current implementation of Siren 14 in Asterisk is obviously less than ideal since everything is downsampled to a 16 kHz sample rate. This eliminates the primary advantage offered by Siren 14. But in time, as Asterisk grows to handle higher sample rates, the availability of Siren 14 will provide incremental benefits when dealing with certain hardware. It’s not hard to envision an Asterisk system being used as a gateway to bridge a Skype call using SILK to a Polycom conference phone running Siren 14. Both ends of the call would be enjoying excellent call quality, with Asterisk handling the transcoding.
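Why downsampling costs Siren 14 its advantage comes down to the Nyquist limit: a digital stream can carry audio frequencies only up to half its sample rate. The sketch below works that through. The 32 kHz sample rate and 14 kHz audio bandwidth are the published figures for Siren 14 / G.722.1C; the function names are my own, chosen for illustration.

```python
# Nyquist check: can a given audio bandwidth survive at a given sample rate?
# Function names are illustrative assumptions.

SIREN14_SAMPLE_RATE_HZ = 32000
SIREN14_AUDIO_BANDWIDTH_HZ = 14000
ASTERISK_INTERNAL_RATE_HZ = 16000  # the 16 kHz rate mentioned above


def max_audio_bandwidth_hz(sample_rate_hz: int) -> int:
    """Highest audio frequency representable at this sample rate."""
    return sample_rate_hz // 2


def bandwidth_survives(audio_bandwidth_hz: int, sample_rate_hz: int) -> bool:
    """True if the full audio bandwidth fits under the Nyquist limit."""
    return audio_bandwidth_hz <= max_audio_bandwidth_hz(sample_rate_hz)


print(max_audio_bandwidth_hz(ASTERISK_INTERNAL_RATE_HZ))  # 8000
print(bandwidth_survives(SIREN14_AUDIO_BANDWIDTH_HZ, SIREN14_SAMPLE_RATE_HZ))
print(bandwidth_survives(SIREN14_AUDIO_BANDWIDTH_HZ, ASTERISK_INTERNAL_RATE_HZ))
```

At 16 kHz nothing above 8 kHz can be represented, so the upper part of Siren 14’s 14 kHz range is lost and the call is effectively wideband rather than super-wideband.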

In the opening to the Astricon presentation on HDVoice Tim Yankee made a point of highlighting the ubiquity of the aged G.722 wideband codec. He pointed out that every major player in telecom hardware and software is moving forward with HDVoice, and the most common codec amongst all their efforts remains G.722. It is at present THE standard codec in wideband telephony for wired networks. Period.

So, if you must implement only one HDVoice codec then it simply must be G.722. But if you have some flexibility, or are addressing specific applications where lower bit-rates are key, you may be able to use one of the Siren codecs in your Asterisk installation. It’s easy and doesn’t cost you anything.

In the end I’ve decided that codecs are much like arrows in a bow hunter’s quiver. You probably need more than one to do the job. And if you’re hunting for different sorts of game you’ll need different types of arrowheads as well. There is no single arrow that will handle every situation. So it is with codecs. If you use Asterisk and own Polycom hardware then the availability of the Siren codecs gives you some added flexibility in using HDVoice. That just can’t be a bad thing, but it’s not a revolution either.