How close to reality does our virtual acoustic simulation need to get? Should we aim for a perfect illusion no matter what it costs, or can we trust that an engaging story helps the listener 'forgive' deficiencies in the virtual audio simulation? In cinema, for example, we can clearly see that the characters are just two-dimensional projections on a screen, yet we are still able to immerse ourselves in their world.
And even if we wanted to create as believable an acoustic illusion as possible, we know that it would be the sum of multiple factors and cues (binaural rendering and HRTFs, head-movement tracking, reverberation, environmental context, sound propagation, sonic familiarity, distance cues, visual cues, etc.). If – and when – we don't get everything perfect, we can try to improve as many of those elements as we can.
Here, we discuss this topic together with the current virtual audio solution used in the platform prototype of the Full-AAR project. We also describe our ongoing collaboration with the Aalto University Acoustics Lab on developing a custom virtual acoustic system.
This page is still a work in progress.
Our current virtual audio solution is dearVR on Unity. The reason for choosing dearVR was its good externalisation effect compared to many alternatives.
One drawback of using dearVR is that it ties the platform to Unity (there are no plugins for Unreal, Wwise or FMOD). Further, it is very difficult to create congruence between the virtual room-acoustic simulation and the real room(s): for the late reverb, the user can only select a room model from a list of factory presets, which sound good in themselves but are unlikely to match the real-world room in question. There is also no propagation simulation, i.e., the plugin does not calculate the sound waves' path from source to listener, taking into account the diffractions and reflections caused by walls and corners. Some early reflections are estimated from a real-time analysis of the distance between the listener and the surrounding surfaces, but that solution is rudimentary and the effect is mild.
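To give an idea of what that kind of distance-based estimate involves, here is a minimal sketch (in Python, and not dearVR's actual algorithm): each detected surface contributes one delayed, attenuated copy of the direct sound, with the delay derived from a crude image-source-style path length. The absorption value and the 1/r gain law are assumptions for illustration.

```python
# A minimal sketch (not dearVR's algorithm) of first-order early reflections
# estimated from the listener's distances to nearby surfaces.

SPEED_OF_SOUND = 343.0  # m/s, at roughly 20 C

def early_reflections(direct_distance_m, surface_distances_m, absorption=0.3):
    """Return (delay_s, gain) pairs, one reflection per surface.

    direct_distance_m   -- source-to-listener distance along the direct path
    surface_distances_m -- listener-to-surface distances (e.g. from raycasts)
    absorption          -- assumed average absorption coefficient of the surfaces
    """
    reflections = []
    for d_surface in surface_distances_m:
        # Crude image-source approximation: the reflected path is roughly the
        # direct path plus the detour to the surface and back.
        path = direct_distance_m + 2.0 * d_surface
        delay = path / SPEED_OF_SOUND
        # 1/r distance attenuation relative to the direct sound, reduced
        # further by the energy lost at the reflecting surface.
        gain = (direct_distance_m / path) * (1.0 - absorption)
        reflections.append((delay, gain))
    return reflections

# Example: source 3 m away, surfaces detected at 1.5 m, 2 m and 4 m.
for delay, gain in early_reflections(3.0, [1.5, 2.0, 4.0]):
    print(f"delay {delay * 1000:5.1f} ms, gain {gain:.2f}")
```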
The limitations of the acoustics simulation probably wouldn't be a problem in many gaming and VR applications where the environment is artificial. In augmented reality, however, the virtual reverb should match the real one quite closely, as should the other acoustic properties. If more of the late-reverb parameters were exposed to the user, one could perhaps tune them to match the (measured) room characteristics, but that is not possible. The lack of sound-path simulation could be compensated for by scripting a custom system that moves and filters the source sound to roughly approximate sound-wave propagation, but that takes some effort (a sketch of such a system is given below).
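As a rough idea of what such a scripted compensation could look like, here is a hypothetical sketch: when the direct line of sight is blocked, the apparent source is moved to the nearest opening and the longer path is translated into extra delay, attenuation and low-pass filtering. All names, positions and thresholds here are invented for illustration, not taken from our implementation.

```python
# Hypothetical occlusion compensation: route an occluded source via an opening.

import math

SPEED_OF_SOUND = 343.0  # m/s

def propagate(source, listener, opening, line_of_sight_clear):
    """Return (render_position, extra_delay_s, gain, lowpass_hz) for one source.

    source, listener, opening -- (x, y, z) positions in metres
    line_of_sight_clear       -- result of an occlusion test (e.g. a raycast)
    """
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    direct = dist(source, listener)
    if line_of_sight_clear:
        # Unoccluded: render the source where it is, full bandwidth.
        return source, 0.0, 1.0, 20000.0

    # Occluded: the sound "arrives" via the opening, so render it there.
    detour = dist(source, opening) + dist(opening, listener)
    extra_delay = (detour - direct) / SPEED_OF_SOUND
    gain = direct / detour           # rough 1/r loss over the longer path
    lowpass_hz = 2000.0              # crude stand-in for diffraction losses
    return opening, extra_delay, gain, lowpass_hz

# Example: source behind a wall, doorway at (2, 0, 1).
pos, delay, gain, lp = propagate((4, 0, 3), (0, 0, 0), (2, 0, 1), False)
print(pos, f"{delay * 1000:.1f} ms", f"gain {gain:.2f}", f"{lp:.0f} Hz")
```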
Within the project, we have examined some other virtual audio solutions as well. The most comprehensive approach would be Wwise and its Spatial Audio package: it handles sound propagation with diffraction, understands coupled-room geometries using manually placed 'portals', and spatialises reverbs. The Spatial Audio module integrates with a separate convolution reverb plugin and an early-reflection plugin.
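As a conceptual illustration of the diffraction part (not Wwise's implementation), geometry-based propagation systems typically express how far the sound has to bend around a portal edge as a diffraction angle and then map that angle to a level drop and a low-pass cutoff. The mapping below is an invented example of such a curve.

```python
# Invented example: map a diffraction angle at a portal edge to gain and cutoff.

def diffraction_filter(angle_deg, max_angle_deg=90.0):
    """angle_deg = 0 means straight through the portal; larger = more bending."""
    t = min(max(angle_deg / max_angle_deg, 0.0), 1.0)   # normalise to 0..1
    gain_db = -12.0 * t                                  # up to a 12 dB drop
    gain = 10.0 ** (gain_db / 20.0)
    # Sweep the cutoff from full bandwidth down to ~1 kHz as the angle grows.
    lowpass_hz = 20000.0 * (1.0 - t) + 1000.0 * t
    return gain, lowpass_hz

for angle in (0, 30, 60, 90):
    gain, lp = diffraction_filter(angle)
    print(f"{angle:3d} deg -> gain {gain:.2f}, low-pass {lp:6.0f} Hz")
```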
While testing the Wwise suite, one of our problems was producing a convincing externalisation effect. That may have been due to the Auro-3D spatialiser plugin. There is (at least) one other spatialiser available for Wwise, by Atmoky, but we didn't have a chance to test it together with the Spatial Audio package.
The aforementioned Wwise plugins are also quite expensive, which is an important issue for us since we are aiming at accessibility for content creators as well, not only for the audience.
Microsoft Project Acoustics (PA) uses wave-based physics to simulate how sound behaves in complex scenes. The acoustic model of the venue is pre-baked, reducing the CPU hit at runtime. [WIP: Our experiences]
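As a simplified illustration of the pre-baked approach (not Project Acoustics' actual data format or API), the sketch below pre-computes acoustic parameters for pairs of grid cells offline and only looks them up at runtime, so the heavy wave simulation never runs during playback. All parameter names and values are made up.

```python
# Offline "bake": parameters per (source_cell, listener_cell). In a real bake
# these would come from a wave solver; the values here are invented.
baked = {
    ((0, 0), (1, 0)): {"occlusion_db": -2.0, "wet_level_db": -10.0, "decay_s": 0.6},
    ((0, 0), (3, 2)): {"occlusion_db": -9.0, "wet_level_db": -6.0,  "decay_s": 1.1},
}

CELL_SIZE = 1.0  # metres per grid cell

def to_cell(position):
    """Quantise a 2-D position (x, z) to the bake grid."""
    return (int(position[0] // CELL_SIZE), int(position[1] // CELL_SIZE))

def query(source_pos, listener_pos):
    """Runtime lookup: a cheap dictionary access instead of a wave simulation."""
    key = (to_cell(source_pos), to_cell(listener_pos))
    # Fall back to neutral parameters if this pair was never baked.
    return baked.get(key, {"occlusion_db": 0.0, "wet_level_db": -12.0, "decay_s": 0.8})

print(query((0.4, 0.7), (3.2, 2.9)))   # hits the second baked entry
print(query((0.4, 0.7), (5.0, 5.0)))   # not baked -> neutral defaults
```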