Augmented reality and binaural sound

11/18/2023

Imperial College London, London SW7 2AZ, United Kingdom

Binaural rendering of Ambisonics signals is a common way to reproduce spatial audio content. Processing Ambisonics signals at low spatial orders is desirable in order to reduce complexity, although it may degrade the perceived quality, in part due to the mismatch that occurs when a low-order Ambisonics signal is paired with a spatially dense head-related transfer function (HRTF). To alleviate this issue, the HRTF may be preprocessed so that its spatial order is reduced. Several preprocessing methods have been proposed, but they have not yet been thoroughly compared. In this study, nine HRTF preprocessing methods were used to render anechoic binaural signals from Ambisonics representations of orders 1 to 44, and these were compared through perceptual hearing models in terms of localisation performance, externalisation and speech reception. This assessment was supported by numerical analyses of HRTF interpolation errors, interaural differences, perceptually-relevant spectral differences, and loudness stability. Models predicted that the binaural renderings’ accuracy increased with spatial order, as expected. A notable effect of the preprocessing method was observed: whereas all methods performed similarly at the highest spatial orders, some were considerably better at lower orders. A newly proposed method, BiMagLS, displayed the best performance overall and is recommended for the rendering of bilateral Ambisonics signals. The results, which were in line with previous literature, indirectly validate the perceptual models’ ability to predict listeners’ responses in a consistent and explicable manner.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

1.1 Binaural rendering and Ambisonics

Binaural rendering presents auditory scenes through headphones while preserving spatial cues, so the listener perceives the simulated sound sources at precise locations outside their head. Traditionally, this is achieved by convolving an anechoic audio signal with a head-related impulse response (HRIR). Typically, HRIRs are measured or simulated for a set of directions on a specific listener in anechoic conditions. Convolving signals with HRIRs is a convenient method to simulate a limited number of sound sources in an anechoic environment, but it cannot easily be used to accurately render reverberation or “scene-based” spatial audio formats, e.g. those recorded with spherical microphone arrays. Furthermore, implementing rotations, which allow listeners to turn their head while the sources stay fixed relative to the surrounding space, can be relatively inconvenient when using HRIRs. For such applications and features, it is common to employ Ambisonics instead.

Ambisonics, first introduced by Gerzon, is an audio signal processing framework for conveniently recording, representing, post-processing and reproducing spatial audio. Although it was initially intended for loudspeaker playback, Ambisonics has recently found a niche in binaural (i.e. headphone-based) audio reproduction, mostly due to an increased interest in virtual reality (VR) and augmented reality (AR). For instance, the framework has recently found use in VR-focused acoustic simulation engines by Facebook (formerly Oculus) and Google. In essence, Ambisonics “encodes” a three-dimensional sound field by projecting it onto a hypothetical sphere surrounding the listener. Under this representation, the signal can be conveniently manipulated through a mathematical framework known as spherical harmonics (SH); an excellent introduction to its use in acoustics is given in Rafaely’s book. When a sound field is encoded into the Ambisonics domain, it is assigned an inherent spatial order, also known as truncation order, which dictates its spatial resolution. As a general rule, lower orders offer coarser spatial resolution, leading to increased width or “blurriness” of rendered sound sources, while higher orders offer finer resolution, leading to narrower and better-localised sources. The spatial order of an Ambisonics signal is often constrained by the application, e.g. commercial microphone arrays typically operate at order 4 or lower, while real-time acoustic simulations benefit from working with low orders, as this reduces computational costs. For binaural playback, an Ambisonics signal must be “decoded” to two channels (left and right ears) by pairing it with a head-related transfer function (HRTF), which is how we refer to an HRIR dataset when expressed in the frequency domain.
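The encode-then-decode chain described in the introduction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the rendering method evaluated in the study. The function names are hypothetical, the encoder assumes the ACN channel order with SN3D normalisation (the text does not specify a convention), and the "HRIRs" used in the usage lines are toy one-tap filters that mimic left- and right-facing cardioid microphones rather than measured data. The channel count of (N + 1)^2 for spatial order N is a standard property of the spherical harmonics expansion.

```python
import numpy as np


def ambisonic_channel_count(order: int) -> int:
    """Number of spherical-harmonic channels for a given spatial order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def encode_first_order(signal, azimuth_rad, elevation_rad):
    """Encode a mono signal as a plane wave into first-order Ambisonics.

    Assumed convention (not stated in the text): ACN channel order
    (W, Y, Z, X) with SN3D normalisation; azimuth is measured
    counter-clockwise from the front, elevation upwards from the
    horizontal plane.
    """
    signal = np.asarray(signal, dtype=float)
    gains = np.array([
        1.0,                                          # W (omnidirectional)
        np.sin(azimuth_rad) * np.cos(elevation_rad),  # Y (left-right)
        np.sin(elevation_rad),                        # Z (up-down)
        np.cos(azimuth_rad) * np.cos(elevation_rad),  # X (front-back)
    ])
    return gains[:, None] * signal[None, :]


def decode_binaural(ambi, sh_hrirs_left, sh_hrirs_right):
    """Decode Ambisonics to two ears.

    Each Ambisonics channel is convolved with the matching
    spherical-harmonic-domain HRIR, and the results are summed per ear.
    The HRIR arrays must have shape (channels, taps) and use the same
    channel order and normalisation as the Ambisonics signal.
    """
    ambi = np.asarray(ambi, dtype=float)
    sh_hrirs_left = np.asarray(sh_hrirs_left, dtype=float)
    sh_hrirs_right = np.asarray(sh_hrirs_right, dtype=float)
    if not (len(ambi) == len(sh_hrirs_left) == len(sh_hrirs_right)):
        raise ValueError("channel counts of signal and HRIRs must match")
    left = sum(np.convolve(ch, h) for ch, h in zip(ambi, sh_hrirs_left))
    right = sum(np.convolve(ch, h) for ch, h in zip(ambi, sh_hrirs_right))
    return np.stack([left, right])


# Usage: a short mono signal placed directly to the listener's left.
mono = np.array([1.0, 0.5, 0.25])
ambi = encode_first_order(mono, azimuth_rad=np.pi / 2, elevation_rad=0.0)

# Toy stand-ins for SH-domain HRIRs (assumption, not measured data):
# one-tap filters forming 0.5 * (W + Y) and 0.5 * (W - Y).
toy_left = np.array([[0.5], [0.5], [0.0], [0.0]])
toy_right = np.array([[0.5], [-0.5], [0.0], [0.0]])
binaural = decode_binaural(ambi, toy_left, toy_right)
```

With these toy filters the source on the left reaches only the left output, which makes the channel summation easy to check by hand. A real decoder would replace them with an HRTF dataset expanded into spherical harmonics at the same order as the signal, which is where the order mismatch discussed in the abstract arises.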
