Full-AAR prototype

Currently, no comprehensive technical platforms are available for indoor AAR with 6DoF. While some proprietary solutions exist, they are not accessible to the public. To conduct our experiments, we developed our own prototype platform supporting multiple users in a multi-room indoor space.

In our current prototype setup, the users' positions are tracked by an array of stereo cameras running body tracking algorithms. IMUs attached to the users' headphones track their head orientation. The interactive narrative content runs on a game engine equipped with a spatial audio plugin for virtual acoustics and binaural rendering. The audio is then transmitted back to the headphones by a wireless digital IEM system.
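As a rough illustration of how the two tracking sources combine, the following minimal sketch computes the distance and direction of a virtual sound source relative to a listener whose position comes from the cameras and whose head yaw comes from the IMU. It is a simplified 2D (horizontal plane, yaw only) assumption made for illustration. In the prototype this geometry is handled by the game engine and the spatial audio plugin, and the names `ListenerPose` and `source_relative_to_listener` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ListenerPose:
    # Position in metres in a room-fixed frame (assumed to come from camera tracking).
    x: float
    y: float
    # Head yaw in radians, counter-clockwise from the +x axis (assumed to come from the IMU).
    yaw: float


def source_relative_to_listener(pose, src_x, src_y):
    """Return (distance_m, azimuth_rad) of a virtual source as heard by the listener.

    Azimuth is 0 straight ahead and positive towards the listener's left,
    wrapped to the range [-pi, pi].
    """
    dx = src_x - pose.x
    dy = src_y - pose.y
    distance = math.hypot(dx, dy)
    # Direction to the source in the room frame, then rotated into the head frame.
    world_angle = math.atan2(dy, dx)
    relative = world_angle - pose.yaw
    azimuth = math.atan2(math.sin(relative), math.cos(relative))
    return distance, azimuth
```

For example, a listener at the origin facing along +x hears a source 2 m away along +y at an azimuth of π/2, that is, directly to the left. If the listener turns to face +y, the same source moves to azimuth 0.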

A local area network (LAN) connects the components, although the Dante devices use their own isolated LAN for reliability.

Master computer

In our solution, one master computer (a MacBook Pro with an M3 processor) handles scene interaction and virtual audio processing for all simultaneous users (currently two). This keeps system latency relatively low while leaving enough processor capacity for adequate virtual audio processing. Additionally, the users do not need to carry a mobile device, so they wear only a pair of custom-made headphones.

The computer is running

A Mac was chosen because of its lower audio latency compared to Windows, at least when running a Unity build in our tests. Using a Linux- or Windows-based system is not ruled out, but compatibility with software and plugins must first be checked and sufficiently low audio latency confirmed.

Our recent switch from an Intel Mac to one with an Apple silicon processor caused some compatibility issues with software and plugins, and some of these remain.

Game engine

A Unity project, which is the central piece of software in our setup, containing

Dante virtual audio driver

Feeds audio to wireless IEM transmitters via local area network (LAN).

Using Dante to feed audio to the IEM transmitters over the existing LAN keeps the setup simple, since no physical audio interface or extra cable runs are needed. When scaling up the system to more users and transmitters, Dante should make configuration and setup very convenient, although that remains to be verified.

The Dante virtual audio driver introduces some latency (4 to 10 ms), which could be eliminated by using a hardware Dante interface. However, we have encountered serious compatibility issues with RME audio interfaces and the dearVR plugin on Apple silicon. We are therefore proceeding cautiously to avoid spending money on hardware that might not work.
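To put the 4 to 10 ms figure in perspective, the short sketch below converts a latency in milliseconds to the equivalent number of audio samples. The 48 kHz sample rate is an assumption made for the example, not a documented setting of our system, and the function name is hypothetical.

```python
def latency_ms_to_samples(latency_ms, sample_rate_hz=48_000):
    """Convert a latency in milliseconds to a whole number of audio samples.

    sample_rate_hz defaults to 48 kHz, which is an assumed value.
    """
    return round(latency_ms * sample_rate_hz / 1000)


# The two ends of the latency range quoted above, at the assumed sample rate.
low = latency_ms_to_samples(4)    # 192 samples
high = latency_ms_to_samples(10)  # 480 samples
```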


Headphones

One of the prototypes of the headphone rig

Open-back headphones (currently Sennheiser HD 650) equipped with

To compensate for the drift typical of IMUs, the system uses the ZED cameras to continuously observe the orientation of ArUco markers on the headphones and recalibrates the IMU readings when needed.
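One common way to implement this kind of correction is to keep a slowly adapting offset between the IMU yaw and the marker-derived yaw. The sketch below illustrates that idea under stated assumptions: it handles yaw only, the `gain` value is arbitrary, and the class name `YawDriftCorrector` is hypothetical. It is not a description of the actual calibration logic in the prototype.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the range [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


class YawDriftCorrector:
    """Tracks an offset that pulls drifting IMU yaw towards marker-based yaw."""

    def __init__(self, gain=0.1):
        # Fraction of the observed error applied per marker observation (assumed value).
        self.gain = gain
        self.offset = 0.0

    def corrected(self, imu_yaw):
        """Return the IMU yaw with the current drift offset applied."""
        return wrap_angle(imu_yaw + self.offset)

    def update_from_marker(self, imu_yaw, marker_yaw):
        """Nudge the offset whenever a camera sees the marker orientation."""
        error = wrap_angle(marker_yaw - self.corrected(imu_yaw))
        self.offset = wrap_angle(self.offset + self.gain * error)
```

Between marker observations only `corrected()` is called, so head tracking continues at the IMU rate while the cameras remove the accumulated drift gradually. A small gain avoids audible jumps in the rendered scene when the marker is seen again after a long gap.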

The headphone setup is still at a prototype stage. Planned changes for upcoming versions:

  • lower-impedance headphones to better match the IEM receiver
  • better component placement and attachment for optimal weight distribution
  • neat cabling
  • casing

In addition, tests will be conducted in which the HD 650 headphones are replaced with a pair of 'Mushrooms', acoustically transparent 3D-printable headphones by Alexander Mülleder 1).


ZED camera system

Camera tracking computers

NVIDIA Jetson Xavier computers with Ubuntu interpreting camera signal from Stereolabs ZED stereo/depth cameras with

The computers send the body joint coordinates of all visible persons, with IDs based on ArUco markers, to the MQTT server. Some of the computers are connected to the LAN via Wi-Fi for easier installation.
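The sketch below shows what such a tracking message could look like when serialized as JSON for MQTT. The topic layout, field names, and joint names are all hypothetical and chosen only for illustration. The actual message format used by the tracking computers is not documented here. The function only builds the topic and payload, and publishing them would be left to an MQTT client library.

```python
import json


def build_body_message(camera_id, bodies):
    """Build a (topic, payload) pair describing the persons seen by one camera.

    bodies is a list of dicts with a "marker_id" (the ArUco-based person ID)
    and "joints", a mapping from joint name to an (x, y, z) position in metres.
    All names here are illustrative assumptions.
    """
    topic = f"tracking/{camera_id}/bodies"
    payload = json.dumps(
        {
            "camera": camera_id,
            "bodies": [
                {
                    "id": body["marker_id"],
                    "joints": {name: list(pos) for name, pos in body["joints"].items()},
                }
                for body in bodies
            ],
        }
    )
    return topic, payload


# Example: one person with marker ID 7, seen by a camera labeled "cam1".
topic, payload = build_body_message(
    "cam1",
    [{"marker_id": 7, "joints": {"head": (1.2, 0.4, 1.7)}}],
)
```

Keying each body by its marker-based ID rather than by a per-camera index lets the receiving side merge observations of the same person from several cameras.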

ZED stereo cameras

Multiple ZED stereo cameras from Stereolabs


Pozyx Gateway

We use the Pozyx UWB system for tracking the orientation of some individual narrative objects, such as a picture frame and a door.

We first tried to use the Pozyx UWB system for location tracking, but we never obtained reliable readings in our venue, possibly due to electromagnetic reflections. However, we kept Pozyx as part of the system because the Pozyx tags work well as orientation trackers, and we may still use the location tracking features in some limited area.


Primary LAN

Connects most of the devices together, either by cable or wirelessly, including

Dante LAN


1)
Mülleder, Alexander, et al. 'Do-it-yourself headphones and development platform for augmented-reality audio'. AES 2023: International Conference on Spatial and Immersive Audio, August 2023, http://www.aes.org/e-lib/browse.cfm?elib=22188