
Immersive Audio: Sound All Around You

The playback end of an Ambisonic system also needs to be carefully considered. Playback is normally via an array of loudspeakers. Yes, an array. Usually more than two, often 6-8 identical speakers.

The physical layout of the speakers is variable. Since the source microphone encodes the entirety of the soundfield at the source site, the signal to each speaker can be tailored to reflect the speakers' positions at the other end. That is, if a given speaker is not in an ideal location, the signal being fed to that channel can be tweaked (phase & delay) to compensate. This is not unlike the directional mixing described earlier when considering post-production signal processing. We can derive the appropriate signal for any given speaker based upon its actual location relative to the orientation of the soundfield.
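As a rough illustration of deriving a feed from a speaker's actual position, here is a minimal sketch that aims a "virtual microphone" at the speaker. It assumes the traditional first-order B-format convention (W carries a 1/sqrt(2) gain, X points front, Y left, Z up), and the function names `encode_b_format` and `speaker_feed` are made up for this example. It only covers the direction-dependent gains; compensating for unequal speaker distances with delay is not shown.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode a mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: W carries a 1/sqrt(2) gain, X points front,
    Y points left, Z points up; azimuth is counter-clockwise from front.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z

def speaker_feed(b_format, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """Derive one speaker feed by aiming a virtual microphone at the speaker.

    pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight.
    """
    w, x, y, z = b_format
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Component of the soundfield arriving from the speaker's direction
    directional = (x * math.cos(az) * math.cos(el)
                   + y * math.sin(az) * math.cos(el)
                   + z * math.sin(el))
    return pattern * math.sqrt(2.0) * w + (1.0 - pattern) * directional

# A talker 30 degrees to the left of front
signal = encode_b_format(1.0, azimuth_deg=30.0)
# The speaker was meant to sit at 45 degrees but actually sits at 60 degrees
nominal = speaker_feed(signal, 45.0)
actual = speaker_feed(signal, 60.0)
```

The same four signals feed both calculations; only the angles passed to `speaker_feed` change, which is why a misplaced speaker can be accommodated at the playback end without touching the transmitted signal.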

Also, given the scope of information encoded in the source signals, any number of channels can be derived from the 4 basic signals. If you have a very large room like a theater, you can extract as many distinct speaker feeds as you need from the base signals. Each speaker receives a unique signal that makes its location fit in the entirety of the playback environment.

One of the major benefits of Ambisonics is that it supports fully periphonic reproduction. The term “periphonic” means including height information. The common surround systems used in films are called “planar surround” systems, as they only deal with sound in the horizontal plane (left-right and front-back). An Ambisonic system can convey the sound of airplanes overhead, not that that’s a factor in a typical conference call. However, knowing that a speaking party has stood up could be useful.


(Photo above) A portable Ambisonic playback rig used to demonstrate truly effective periphonic surround sound. The eight small speakers form the corners of a cube. This system was demonstrated by Hugh Pyle at OpenDork in Boston, September 2009.
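To make the cube arrangement concrete, the sketch below builds a basic decode matrix for eight speakers at the corners of a cube centred on the listener. It is an assumption-laden illustration, not a description of the rig in the photo: it uses the traditional first-order B-format convention (W with a 1/sqrt(2) gain, X front, Y left, Z up) and a simple cardioid virtual-microphone decode, and the names `CUBE_CORNERS`, `decode_row` and `decode` are invented here.

```python
import math

# Corners of a cube centred on the listener: (x front, y left, z up).
# The first four entries are the upper corners, the last four the lower.
CUBE_CORNERS = [(x, y, z) for z in (1, -1) for x in (1, -1) for y in (1, -1)]

def decode_row(corner, pattern=0.5):
    """Return the four gains (for W, X, Y, Z) feeding one speaker."""
    x, y, z = corner
    norm = math.sqrt(x * x + y * y + z * z)
    # Unit vector from the listener toward the speaker
    ux, uy, uz = x / norm, y / norm, z / norm
    return (pattern * math.sqrt(2.0),
            (1.0 - pattern) * ux,
            (1.0 - pattern) * uy,
            (1.0 - pattern) * uz)

DECODE_MATRIX = [decode_row(c) for c in CUBE_CORNERS]

def decode(b_format):
    """Turn one (W, X, Y, Z) sample into eight speaker feeds."""
    return [sum(g * s for g, s in zip(row, b_format)) for row in DECODE_MATRIX]

# A source directly overhead: W = 1/sqrt(2), X = Y = 0, Z = 1
overhead = (1.0 / math.sqrt(2.0), 0.0, 0.0, 1.0)
feeds = decode(overhead)
top, bottom = feeds[:4], feeds[4:]
```

For the overhead source the four upper speakers receive a stronger feed than the four lower ones, which is the height cue a planar layout cannot reproduce.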

For our purposes in conferencing it may be less than desirable to transmit all four channels of the basic Ambisonic signal set. We simply may not want to use that much bandwidth. Thankfully, Ambisonics provides a two-channel signal format, referred to as UHJ encoding, that we can use. UHJ encoding was intended to allow a full Ambisonic mix to be conveyed on the more common two-channel (stereo) media such as FM radio, LPs, CDs, and video & audio cassettes.

In fact, a UHJ-encoded Ambisonic recording can be played back on ordinary stereo equipment, with no decoder at all, and it will sound great! The dimensional effects are lost because many of the directional cues are folded back into the plane of the stereo speakers, but the recording…or in our case the conference call…will still sound like high-quality stereo. Simple stereo positional effects will be sustained.
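For the curious, the two-channel UHJ encode can be sketched with the commonly published coefficients. This is a simplified illustration under stated assumptions: the inputs are complex phasors for a single frequency, so multiplying by `1j` stands in for the wideband 90-degree phase shifter a real encoder would need, and the helper `encode_horizontal` (with its B-format convention of W at 1/sqrt(2) gain, X front, Y left) is invented for the example.

```python
import math

def uhj_encode(w, x, y):
    """Encode horizontal B-format phasors (W, X, Y) into UHJ left/right.

    Inputs are complex phasors for one frequency, so multiplying by 1j
    stands in for the +90 degree phase shift. Z (height) is not used.
    """
    s = 0.9396926 * w + 0.1855740 * x                         # sum signal
    d = 1j * (-0.3420201 * w + 0.5098604 * x) + 0.6554516 * y  # difference signal
    left = (s + d) / 2.0
    right = (s - d) / 2.0
    return left, right

def encode_horizontal(azimuth_deg):
    """B-format phasors for a unit source on the horizontal plane."""
    az = math.radians(azimuth_deg)
    return 1.0 / math.sqrt(2.0), math.cos(az), math.sin(az)

# A source straight ahead, and one hard left
front_l, front_r = uhj_encode(*encode_horizontal(0.0))
side_l, side_r = uhj_encode(*encode_horizontal(90.0))
```

A frontal source lands at equal level in both channels, while a source to the left is much louder in the left channel, which is why the result still behaves like conventional stereo on an ordinary two-speaker system.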

The greatest service that I can provide is to point those still interested in this area to a list of references. There are literally hundreds of papers and articles on Ambisonics dating back to the early 1970s. Here’s a short list of resources that will no doubt lead to countless hours of reading:

This Post Has 4 Comments
    1. Indeed, there are similarities. The recent IBC show in Amsterdam was highlighted by very real movement towards 3D-HDTV. As that business becomes mature there will be economies of scale that can be leveraged in the video conferencing arena.

      The big difference between 3D sound and vision is that there is a lot more prior art in the realm of dimensional audio. Whether through the various failed quad formats from the 60s & 70s or through Ambisonics, which was the one truly sensible technological approach, dimensional audio can be implemented today. Further, given the current capabilities of software, CPUs and DSPs, it doesn’t need to be costly.

  1. Thanks for your article and for focusing on ambisonics. I have a hard time understanding why you would argue for making a UHJ stereo from the 4ch recording. If you lose the ability to play back spherical audio on the receiving end, then what is the point?
    Kind regards,
    Jens Toyberg

    1. For the purposes of business conference applications like tele-presence, height may have limited value. Planar surround may be enough to adequately convey the goings-on in a board room.

      On this basis, some may prefer to limit the bandwidth requirements by passing only two channels. Existing video conference/tele-presence systems already do this. Thus passing UHJ encoded audio is an improvement that can be realized without additional bandwidth burden. Some approaches may elect to pass B format audio, and so deliver reproduction with height where bandwidth constraints are not a concern.

