Graves On SOHO Technology

Technology & The Art Of The Podcast

Last week longstanding VoIP blogger and fellow Canuck Alec Saunders penned a nice post on the Calliflower Blog offering a collection of guidance for podcasters called “10 Podcaster Tips!” It’s a good read…not long…you should go read it now…then come back here. I’ll wait.

Over the past few years I’ve listened to a number of Alec’s Squawkbox podcasts, even attended a handful live & in-person. I respect and admire the man.

Taken in the context of the Calliflower conference service, Alec’s post provides some sound, well-considered advice. Even so, I see merit in recasting it in a broader context and revisiting some of his points.

By “broader context” I mean specifically considering how someone could record a better sounding podcast by bypassing the legacy public switched telephone network (PSTN).

Have you ever listened to a radio call-in show and thought, “Wow, the host sounds great, but the caller can barely be understood?” Clearly no matter how ubiquitous, a traditional phone call is not an ideal way to record an interview. It can be even worse in the case of a conference call with multiple participants.

What I find simply amazing is how few people are aware that wideband telephony (aka HDVoice) even exists. Adopting HDVoice could be the single most dramatic improvement you make to the audio quality of your podcast, short of gathering everyone in a studio.

I do understand that some people prefer to focus on the topical material of their podcast, and not concern themselves with the underlying technology of its creation. However, if you have even a passing interest in improving the quality of your recorded audio it’s worth exploring the state of voice-over-IP (VoIP) and more specifically HDVoice.

Some will surely say that what I’m about to suggest is not worth the trouble. In my mind this is a matter of principle. While I’ve made a career in the technology sector I originally come from an arts background. From those early days I still harbor the belief that there can be no art without first achieving mastery of a craft.

A large part of mastering a craft is achieving intimacy with one’s tools. Only with mastery of the tools do they become transparent to the creative process, allowing you to create something truly artful.

If you’re a podcaster then amongst your tools you should come to understand a diversity of means for handling audio, including various aspects of telephony and conference services. It’s worth your time to dig a little deeper in understanding these things in order to produce consistently superior sounding podcasts.

If you’re not interested in producing superior sounding podcasts, well…thanks for coming out, pick up your coat at the door, please go on about your business.

An Example: The VoIP Users Conference

To be very clear about my background in this area, I’m not really a podcaster…I just play one now and then on the internet. While I am a regular contributor to the weekly VoIP Users Conference I only host calls periodically when VUC founder Randy (aka Zeeek) is not available. I’ve also conducted some interviews with Randy acting as host.

In November 2008 we did a call with David Frankel, CEO of ZipDX as our guest. The topic was audio conferencing and wideband telephony.

David was nice enough to donate a number of licenses for the wideband-capable Eyebeam SIP soft phone from Counterpath so that some callers could join the call in wideband using his service. This was one of those times when I conducted the interview, which you can still download here.

That call was recorded on the HDVoice bridge and the podcast certainly sounds very good. You can clearly make out which participants called in via traditional means and which called in from wideband phones. The difference is rather startling.

As of this writing there have been 241 weekly VUC calls. The conference was founded in March 2007 using Talkshoe as the conference service. Their service, then vibrant, was truly built around the needs of podcasters wanting to develop an online community.

Since January 2009, roughly the last 70 calls, we have used the ZipDX wideband conference service in tandem with Talkshoe. We actually cross-connect the two conference bridges. The fundamental difference between Talkshoe and ZipDX is the audio quality. Of course, the quality of the recorded podcasts reflects that of the live call.

Since we’ve been using both conference bridges we have seen the audience shift dramatically. At first there were 20-30 people on Talkshoe and only 6-8 on ZipDX. That has changed such that in recent months we have seen 50 callers (!) via ZipDX and only 2-3 on Talkshoe!

It’s very clear to us that people prefer better audio. In fact, they prefer it so much that they are willing to purchase new hardware to realize the benefits of HDVoice. Many VUC participants have purchased G.722-capable desk phones. Some use soft phones but have purchased high-quality headsets or speakerphone devices. There are many ways to enjoy HDVoice.

All of this experience is in marked contrast to Alec’s initial point:

Be careful using VoIP products, like Skype. These can also have unpredictable results sometimes superior to a landline, and sometimes grossly inferior.

While I completely understand his advising “be careful” I also know that using a VoIP service is the only way to achieve truly superior audio quality!

In truth, it’s not difficult to take advantage of wideband telephony. There are various issues to be considered, some technical and others practical. Over the past year I’ve written a series of How-To posts about this called, “Making Use of Wideband Voice Right Now!” Collectively this series describes how you might start using HDVoice by way of free or nearly-free services.

Incidentally, ZipDX is not the only company offering wideband conferencing, just my personal favorite. Other options include:

Also in his first point Alec recommends:

Avoid cordless handsets. Cordless handsets often have a noticeable background hum.

I don’t believe in condemning an entire class of devices in this manner. In fact, I’ve had great success using the Gigaset SIP/DECT systems. Being both cordless and VoIP-based, they probably shouldn’t work at all…at least based upon the recommendations as given.

There are about a zillion different kinds of phones on the market, including cordless systems. You get exactly what you pay for, so don’t buy the cheapest thing that you can find.

If you really want to go cheap then use a wideband-capable freeware soft phone like PhonerLite, but buy a good headset. I like the Plantronics .Audio 615m USB headset which sells for under $40.

What About Skype?

I can see the attraction of using Skype. The Skype client software is free and nearly ubiquitous. However, I never use Skype to join a conference call.

If you’re making a Skype call to a normal phone number then Skype is going to use G.729a to pass the call to someone for PSTN termination. G.729a is the industry standard low-bitrate codec.

G.729a is generally regarded as capable of offering good audio quality, with a best case Mean Opinion Score (MOS) of 3.9. This should be compared to a land-line call using G.711, which has a best case MOS of 4.3. A Skype call to a normal conference bridge won’t be as good as a land-line call, even under best case conditions.

Simply put, none of the traditional codecs of the legacy public telephone network (G.711, G.729a, G.723, G.726, AMR) can hold a candle to a wideband codec! And there are so many wideband codecs to choose from, including: G.722, G.722.1, iSAC, AMR-WB, Speex, CELT & SILK!

As a practical matter the only wideband codec of concern today is the great-grand-daddy of them all, G.722. It’s the one codec that offers great audio quality and is implemented in a wide range of software and hardware.
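To put rough numbers on that comparison, here is a small sketch in Python. The MOS figures are the best case values I quoted above; the sampling rates and bit-rates are the commonly published nominal figures for each codec. The Codec class and its field names are simply my own invention for illustration, and the "ceiling" it reports is the theoretical limit of half the sampling rate, not the codec's actual passband.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Codec:
    name: str
    sample_rate_hz: int
    bitrate_kbps: int
    best_case_mos: Optional[float]  # None where this post quotes no figure

    @property
    def is_wideband(self) -> bool:
        # Wideband codecs sample at 16 kHz or more; the PSTN uses 8 kHz.
        return self.sample_rate_hz >= 16000

    @property
    def ceiling_hz(self) -> int:
        # Theoretical limit: no audio above half the sampling rate survives.
        return self.sample_rate_hz // 2


CODECS = [
    Codec("G.711", 8000, 64, 4.3),
    Codec("G.729a", 8000, 8, 3.9),
    Codec("G.722", 16000, 64, None),
]


def describe(codec: Codec) -> str:
    band = "wideband" if codec.is_wideband else "narrowband"
    return (f"{codec.name}: {band}, ceiling {codec.ceiling_hz} Hz, "
            f"{codec.bitrate_kbps} kbit/s")


for codec in CODECS:
    print(describe(codec))
```

Notice that G.722 uses the same 64 kbit/s as G.711 yet samples twice as often. That is a big part of why it is so widely implemented…it fits in the same pipe as a traditional call.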

You’d think that Skype would be a logical choice for podcasters since it implements SILK, the very latest wideband codec. In practice it really depends upon exactly what you need to do.

If you’re using Skype at both ends of a one-to-one interview then it can provide excellent audio quality, much better than a normal phone call. Amongst the podcasts that I have listened to, Dan York’s BlueBox Podcast on VoIP Security was often done using Skype in this manner.

Recent releases of Skype allow multi-way as well as one-to-one calling. If all participants are using Skype and a good headset then you should be able to enjoy a superior quality call.

However, if you’re using Skype to call a traditional conference bridge via a normal phone number you will not realize any improvement in audio quality.

Remember, the goal is to achieve the best audio quality possible…not just sound like a good phone call.

3. If possible, use a conference calling service that allows you to record the call from the conference bridge, rather than from one of the handsets. By recording the call from the bridge, you minimize the drop-off in volume that occurs as phone calls traverse multiple networks. In addition, if you record from the bridge, no additional equipment is required to make the recording.

It’s true that using the conference service to record the call is very convenient, and the quality is often very good. However, there can be advantages to making a local recording as well.

The conference bridge is typically going to make a recording that is compressed into MP3 format at some nominal bit-rate. That’s OK, as you likely want to distribute in that format. But if you wish to do some audio post-production, as Alec suggests, you might be better off starting with a higher-quality, uncompressed recording.

In my case, I often want to go into the recording and edit sections for content or perhaps replace my intro or extro if I think that I could have done better. I achieve a better quality final recording by starting out with an uncompressed WAV recording made locally.

I take one of three approaches to this:

1. The Polycom SoundPoint IP650 & IP670 desktop SIP phones can record a call in uncompressed WAV format to a USB stick when equipped with the optional Polycom Productivity Suite. That option costs under $10 per phone. If the call is a wideband call then you are assured that the captured call quality is as good as it can possibly be.

This is my preferred approach since the Polycom phone is essentially an appliance. It just works. It leaves my PC free to do anything else that I might need during the call.

2. Many soft phones, including my favorite Eyebeam, can record a call to an uncompressed WAV file. I tend to use this less since it means that I’m relying upon my PC which I may need to use for other tasks during the call.

3. V-Emotion software can be used in conjunction with a soft phone to record the call audio. While I had to buy this software it has the advantage of allowing you to keep your voice split from the rest of the call. It saves a stereo track with your voice on one channel and the other participants on the second channel. This can be invaluable for editing sections where, in a heated debate, the host and the guest are “stepping on” each other’s audio.

V-Emotion can also be used to play back pre-recorded audio files into a call. This is handy for injecting a musical opening & close into a live call, or adding game-show-like sound effects.
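If your editor doesn’t make it easy to work on the two channels of such a stereo recording separately, a few lines of Python can pull them apart. This is only a sketch using the standard wave module; it assumes an uncompressed PCM WAV file, and the function and file names are hypothetical. Which channel holds your own voice is something you would need to check by ear.

```python
import wave


def split_stereo(stereo_path, left_path, right_path):
    """Write each channel of a stereo PCM WAV to its own mono WAV file."""
    with wave.open(stereo_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo (2-channel) recording")
        width = src.getsampwidth()      # bytes per sample, e.g. 2 for 16-bit
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Stereo PCM is interleaved: left sample, right sample, left, right...
    frame_size = 2 * width
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + width]
        right += frames[i + width:i + frame_size]

    # Each output keeps the original sample width and sampling rate.
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))
```

Something like split_stereo("call.wav", "host.wav", "guests.wav") would then give you two mono files that can be edited, or leveled, independently.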

In reality, I have a belt-and-suspenders approach. That is, I record locally but I also keep the MP3 recording from the conference bridge to use as a backup. I can work from it if something happens to make my local recording unusable.

Returning to another of Alec’s points:

5. Use audio processing software to clean up the recordings afterward. Do not simply publish the raw audio file. It’s easy to improve the audio file with just a few minutes of work. I recommend using the open source package, Audacity. It’s excellent, and the price is right.

This is so very true! You simply must post-process your recordings to achieve the best recorded result in the downloadable file.

While he recommends Audacity, I would add another tool that every podcaster should use. It’s a piece of free software offered by The Conversations Network called The Levelator. The Levelator was specifically written to automatically adjust the audio levels in podcasts.

The Levelator requires an uncompressed WAV file as a source, and automatically generates a new copy of the file with a new filename as it completes its task.

Since the program requires an uncompressed source file it makes some sense to start with an uncompressed local recording rather than convert the MP3 from the conference bridge back to WAV format. Why suffer the extra generation of compression quality loss?
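Before handing a local recording to The Levelator it can be worth confirming that the file really is what you think it is. Here is a minimal sketch using Python’s standard wave module, which only reads uncompressed PCM WAV files and typically raises wave.Error on anything else. The function names are my own, and the 16 kHz threshold is simply the usual sampling rate for wideband audio; a high sampling rate doesn’t prove the call itself was wideband, but a file sampled at 8 kHz certainly wasn’t.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, bits_per_sample, seconds)."""
    # The wave module only understands uncompressed PCM WAV input.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8
        seconds = wav.getnframes() / rate
    return rate, channels, bits, seconds


def looks_wideband(path):
    """True if the file was sampled at 16 kHz or more (8 kHz is narrowband)."""
    rate, _, _, _ = describe_wav(path)
    return rate >= 16000
```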

There are literally no settings to the software. Just drag the source file and drop it on The Levelator window…then wait until it’s done.

If it’s true that seeing is believing then here is some visual proof regarding The Levelator. The following pair of screenshots show one minute of audio from a recent VUC call. The audio is loaded into my preferred editor, Cool Edit Pro 2, to show a waveform display.

Before Processing with The Levelator

After Processing with The Levelator

The Levelator is truly amazing. It’s far better than normalization alone. In just minutes it adjusts the level for each instance of every participant speaking in a fashion that would take hours of manual editing.
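To see why a single gain setting falls short, consider a toy example. This is emphatically not how The Levelator works internally; the sample values and function names are made up purely to illustrate the difference between one gain for the whole file and a separate gain for each speaker.

```python
def peak(samples):
    """Largest absolute sample value in a list of samples."""
    return max(abs(s) for s in samples)


def normalize(samples, target=1.0):
    """Peak normalization: one gain applied to the whole recording."""
    gain = target / peak(samples)
    return [s * gain for s in samples]


def level_segments(segments, target=1.0):
    """Crude leveling: a separate gain for each speaker's segment."""
    return [normalize(segment, target) for segment in segments]


host = [0.5, -0.5, 0.5, -0.5]          # loud, close to the microphone
guest = [0.125, -0.125, 0.125, -0.125]  # quiet caller

normalized = normalize(host + guest)
print(peak(normalized[:4]), peak(normalized[4:]))  # prints: 1.0 0.25

leveled = level_segments([host, guest])
print(peak(leveled[0]), peak(leveled[1]))          # prints: 1.0 1.0
```

After normalization the host is still four times louder than the guest. Only the per-speaker adjustment brings them to the same level, and doing that by hand for every exchange in an hour-long call is exactly the tedium that an automated tool spares you.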

Since it acts upon and creates only uncompressed WAV files, using The Levelator requires a little extra manual work to perform a final encoding to MP3 for upload. I feel that this is a truly minor amount of extra effort given the amount of time saved by the automated level adjustment.

The all-volunteer development team behind The Levelator have even posted a nice description of how it works for anyone who is interested in the details.
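That final encoding step is easy enough to script. The sketch below assumes that you have the free ffmpeg tool installed with its LAME-based MP3 encoder; that is my assumption, not something The Levelator requires, and any MP3 encoder will do. The file names and the 64 kbit/s default are placeholders to adjust to taste.

```python
import shutil
import subprocess


def mp3_command(wav_path, mp3_path, bitrate_kbps=64):
    """Build an ffmpeg command line that encodes a WAV file to MP3."""
    return [
        "ffmpeg",
        "-i", wav_path,              # the leveled, uncompressed source
        "-codec:a", "libmp3lame",    # ffmpeg's LAME-based MP3 encoder
        "-b:a", f"{bitrate_kbps}k",  # audio bit-rate
        mp3_path,
    ]


def encode_to_mp3(wav_path, mp3_path, bitrate_kbps=64):
    """Run the encode, failing loudly if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on the PATH")
    subprocess.run(mp3_command(wav_path, mp3_path, bitrate_kbps), check=True)
```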

I hope that I’ve opened your eyes with respect to how a podcaster might achieve superior quality audio recordings by selecting HDVoice as a tool for their production arsenal. The key thing to do is simple…

…demand better audio quality than a traditional phone call!

It’s not much trouble, and it truly is worth the effort.

P.S. – I must commend Alec on the graphics used on the Calliflower Blog. I don’t know if they are commissioned originals or stock bought-in from somewhere, but they’re simple and very elegant. Well done!
