HDVoice: A Visual Analogy

Here’s something of a challenge: find a visual way to represent the information density of HDVoice vs. a narrowband PSTN call…and try to make it something that everyone can relate to. This is part of my recent attempt at such a display.

The human voice can create sound energy in the range of 80 Hz to 14 kHz. In contrast, the PSTN conveys a much more limited pass-band, typically 300 Hz – 3.4 kHz. That means that the PSTN fails to convey more than 70% of the frequency range in which a voice can produce energy.
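As a rough check on that figure, here is a minimal sketch that compares the two bands using only the band edges quoted above. It assumes the comparison is made on a linear frequency scale and treats every hertz as equally important, which is a simplification; the names `VOICE_BAND`, `PSTN_BAND`, and `fraction_conveyed` are illustrative only.

```python
# Band edges in Hz, taken from the figures quoted in the text.
VOICE_BAND = (80.0, 14_000.0)
PSTN_BAND = (300.0, 3_400.0)


def band_width(band):
    """Width of a (low, high) frequency band in Hz."""
    low, high = band
    return high - low


def fraction_conveyed(channel, source):
    """Share of the source band that lies inside the channel band.

    Assumption: a linear Hz scale, with no perceptual weighting.
    """
    overlap_low = max(channel[0], source[0])
    overlap_high = min(channel[1], source[1])
    overlap = max(0.0, overlap_high - overlap_low)
    return overlap / band_width(source)


conveyed = fraction_conveyed(PSTN_BAND, VOICE_BAND)
print(f"PSTN conveys about {conveyed:.0%} of the voice band")
print(f"PSTN drops about {1 - conveyed:.0%} of the voice band")
```

Under these assumptions the PSTN passes roughly 22% of the band and drops roughly 78%, which is consistent with "more than 70%".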

To find a visual parallel to this, consider displaying a picture of some text. If the original speech is 100% of the information, then the following image is the equivalent of HDVoice.

[Image: Kennedy speech text, sharp]

The text is from President John F. Kennedy’s address to Congress seeking funding for NASA’s lunar exploration program. The image started out at its original size; I then reduced it to 80% of its original dimensions.

By extension then, the PSTN version of the same information would be reduced to only 30% of its original pixel count, which looks like the following:

[Image: Kennedy speech text, blurry]
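For anyone who wants to reproduce the reduction to 30% of the pixel count, here is a minimal sketch of the arithmetic. Because pixel count scales with the square of the linear dimensions, each side is multiplied by the square root of the target fraction. The 600 x 400 starting size is hypothetical, as is the helper name `scale_to_pixel_fraction`; the actual slide images may have been prepared differently.

```python
import math


def scale_to_pixel_fraction(width, height, fraction):
    """Return (width, height) that keep roughly `fraction` of the pixels.

    Pixel count grows with the square of the side length, so each
    side is scaled by sqrt(fraction).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    linear_scale = math.sqrt(fraction)
    new_width = max(1, round(width * linear_scale))
    new_height = max(1, round(height * linear_scale))
    return new_width, new_height


# Hypothetical original image size, chosen only for illustration.
original = (600, 400)
new_w, new_h = scale_to_pixel_fraction(*original, 0.30)
kept = (new_w * new_h) / (original[0] * original[1])
print(f"{original[0]}x{original[1]} -> {new_w}x{new_h} ({kept:.1%} of the pixels)")
```

The resulting dimensions can then be passed to any image editor or resizing tool.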

The contrast in detail makes for a striking change in slides when shown in a PowerPoint presentation.

The point I’m trying to make is that HDVoice is very simply easier on the brain. The presence of greater detail in the audio stream reduces cognitive workload, making it easier to understand what’s being said. This is why conference calls in HDVoice (ZipDX!) don’t leave us feeling as drained as similar calls hosted on traditional, narrowband conference services.

This was part of the presentation that I gave at the CloudComm Summit 4 during ITExpo this week. Given a little time, I think that I’ll record some narration to go along with the slides and offer the whole presentation hereabouts.

  • vlad

    Interesting point, though I can’t say that I find G.722 a huge breakthrough in quality compared to ulaw. “Somewhat better” – yes, but far from a WOW point.

    • What end-points have you tried? I use G.722-based telephony almost daily, and find it markedly better than ulaw. There’s something of a problem when people try end-points that feature a G.722 codec implementation but pair it with questionable build quality. Without better-quality transducers and mechanical construction you won’t get the full benefit of the better codec.

      FWIW, I do appreciate the further quality improvement provided by even better codecs, like G.719, Siren 22, CELT, etc. However, there’s much less support for such codecs in common hardware and software, making them less common in daily use.