A guest article by Daniela Rieger
“Immerse yourself in sound all around you. As real as if you were there, at a live concert or with the artist recording in a studio. With 360 Reality Audio, music has never been so immersive and so real” (Sony, 2020).
Besides Dolby Atmos Music, which is currently the hot topic, there is another immersive, object-based technology for music production: 360 Reality Audio by Sony, also known as “Sony 360RA”. An overview of various music titles in Atmos Music and 360 RA can be found here.
Whether its competitor Dolby Atmos Music also delivers on its promises has already been examined in this blog post: What can Dolby Atmos Music really do in terms of sound?
Here, however, we will not be talking about Dolby, but about the technical background of the 360 Reality Audio technology. Let’s take a look at what other 3D audio products Sony has besides the PlayStation 5.
360 Reality Audio is a commercial object-based technology for playing immersive audio content, which was also unveiled in late 2019. Unlike Dolby Atmos Music, 360 Reality Audio is produced and transmitted in a purely object-based manner. 360 RA is based on the MPEG-H codec from Fraunhofer IIS – the Fraunhofer institute that also developed the mp3 codec.
The terms “3D audio” or “360 degree sound” refer to a listening experience in which sound can be perceived from all directions (= three-dimensional) around the listener, including above and below. Let’s not get hung up on terminology here, because in the end the numbers and words describe the same thing.
Three-dimensional reproduction can enhance the listener’s emotional and spatial “immersion” in the soundscape. The sound appears as spatial as we humans know it from our natural environment. 3D audio can be heard through various playback systems: so-called 3D speaker setups, 3D soundbars, or binaurally via headphones.
Playing music over headphones has steadily gained in importance in the age of mobile devices. In addition to conventional stereo sound, another playback method exists that makes even cleverer use of the two audio channels: Binaural audio, which refers to the two-channel sound that reaches the listener’s left and right ears.
Unlike stereo playback through headphones, binaural signals are filtered with a combination of time, level and spectral cues to mimic the localisation mechanisms of the human auditory system.
The psychoacoustic effects that occur in natural hearing are simulated in binaural rendering using digital signal processing techniques. This is intended to create realistic auditory events by precisely controlling the sound pressure signals that reach the eardrum.
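In its simplest form, this signal processing boils down to convolving the source signal with a head-related impulse response (HRIR) pair for one direction. The sketch below is a minimal illustration using NumPy with toy impulse responses; real HRIRs come from measured datasets and differ per direction and per listener.

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Render a mono source binaurally by convolving it with a
    head-related impulse response (HRIR) pair for one direction."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    # Pad both ears to equal length before stacking into a stereo pair
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right])

# Toy 440 Hz tone at 48 kHz and toy HRIRs (placeholders, not measured data):
mono = np.sin(2 * np.pi * 440 * np.arange(4800) / 48000)
hrir_l = np.array([1.0, 0.5])        # louder, earlier at the left ear
hrir_r = np.array([0.0, 0.0, 0.6])   # quieter, delayed at the right ear
stereo = binauralize(mono, hrir_l, hrir_r)
```

The interaural time and level differences encoded in the two impulse responses are what make the tone appear to come from the left rather than from inside the head.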
Thus, the reproduction of 3D audio through headphones does not depend on the headphones themselves, but on the technology used to mix the content. If a song has been produced in such a way that it offers 3D audio over headphones, it can be heard with any headphones.
So, to answer the question: yes, any pair of headphones can play 3D audio via a compatible music streaming service; the only requirement is working left and right channels. But of course it’s a bit more fun with higher-quality equipment!
Now, to avoid having to create a separate mix for every possible playback system, the principle of “universal delivery” exists for 3D audio. It makes convincing playback possible for a variety of output formats, each adapted to its playback system – for example, by means of object-based audio transmitted via the MPEG-H 3D Audio codec.
In contrast to channel-based audio, where finished loudspeaker signals are stored and transmitted as the final production format, object-based audio describes the audio content through audio objects.
An audio object can be considered a virtual sound source consisting of an audio signal with associated metadata such as positions or volume. Audio objects can be placed in the room independently of the loudspeaker positions.
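In code, such an object might be sketched as a signal plus metadata fields. The structure below is purely illustrative – the field names are invented for clarity and are not part of any Sony or MPEG-H specification.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class AudioObject:
    """A virtual sound source: an audio signal plus rendering metadata.
    Positions are given as azimuth/elevation in degrees, independent of
    any loudspeaker layout (field names are illustrative only)."""
    signal: np.ndarray      # the audio samples of this source
    azimuth: float          # horizontal direction, degrees
    elevation: float        # vertical direction, degrees
    gain: float = 1.0       # per-object volume

# One second of silence placed 30 degrees above the front of the listener:
vocal = AudioObject(signal=np.zeros(48000), azimuth=0.0, elevation=30.0)
```

The key point is that the position lives in the metadata, not in the audio itself, so the same object can later be rendered to any speaker layout or to headphones.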
To play back object-based productions, a so-called audio renderer is needed. Audio rendering is the process of generating loudspeaker or headphone signals for the end consumer’s respective playback system (loudspeaker setup, soundbar, headphones, mobile device, etc.), calculating and reproducing the objects appropriately within that environment.
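As a deliberately minimal sketch of this idea, the renderer below targets plain stereo and assumes objects described only by a signal and an azimuth, placing each one with equal-power panning. A real object renderer supports arbitrary 3D layouts, elevation and headphone binauralisation.

```python
import numpy as np

def render_to_stereo(objects):
    """Minimal object renderer: place each (signal, azimuth) object
    between two speakers with equal-power panning derived from its
    azimuth metadata. Illustrative only, not a production renderer."""
    length = max(len(sig) for sig, _ in objects)
    out = np.zeros((2, length))
    for signal, azimuth in objects:
        # Map azimuth (-90 deg = hard left ... +90 deg = hard right)
        # to a pan angle between 0 and pi/2
        theta = (np.clip(azimuth, -90, 90) + 90) / 180 * np.pi / 2
        g_left, g_right = np.cos(theta), np.sin(theta)  # gL^2 + gR^2 = 1
        out[0, :len(signal)] += g_left * signal
        out[1, :len(signal)] += g_right * signal
    return out

# Two objects, described only by signal + position metadata:
objs = [(np.ones(100), -90.0),   # hard left
        (np.ones(100), +90.0)]   # hard right
mix = render_to_stereo(objs)
```

Because the panning gains are computed at playback time from the metadata, the same object list could just as easily be rendered to a 7.1.4 layout or a binaural headphone feed.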
Here we have already looked at how so-called Next Generation Audio (NGA) is used in practice.
360 RA is available via the music streaming services Deezer HiFi (360 by Deezer mobile app), Tidal HiFi (mobile app) and Nugs.net HiFi (mobile app) for headphone playback, as well as via Amazon Music HD for playback on the Amazon Echo Studio speaker. Sony is working with several music companies to deliver this service. As a result of these collaborations, albums and tracks already released in stereo or 5.1 surround are being re-released in the 360 RA format, alongside content newly produced for it.
Production of audio content in 360 Reality Audio is possible with the 360 Reality Audio Creative Suite (360 RACS). This suite is a tool for object-based mixing that allows audio objects to be positioned and moved in three-dimensional space.
The 360 Reality Audio Creative Suite (360 RACS) makes it possible to produce music in an immersive, spatial sound field by using the tool as a plug-in in DAWs. Currently supported are Avid Pro Tools and Ableton Live; Sony states that the plug-in can also be used in other common DAWs, but official support is not yet available for these.
360 Reality Audio is purely object-based, so it works with static and dynamic objects. Dynamic objects can move in space. In addition, 360 RACS makes it possible to group different objects; a total of up to 128 objects are supported.
The 360 Reality Audio Creative Suite includes binauralisation for headphone playback and allows users to load their own HRTF sets, so that the final sound can be previewed in different playback scenarios.
The final export is Sony’s own “.sam” format, which contains metadata and audio. As far as is currently known, the further processing and encoding of these exports is done in consultation with Sony.
360 Reality Audio is based on the MPEG-H 3D Audio Standard (MPEG-H). MPEG-H is a next generation audio codec and supports NGA features such as immersive audio (channel, object, and scene-based transmission and combinations thereof). It also allows interactivity and personalisation, unlike Dolby Atmos. In addition, MPEG-H offers flexible and universal delivery of the produced content.
Among other things, the MPEG-H 3D audio standard was developed specifically for integration into streaming applications. 360 Reality Audio supports the transmission of up to 24 discrete/independent objects, but does not use the full functionality of MPEG-H, such as interactivity or channels.
Audio is described in MPEG-H as a combination of audio components and associated metadata, distinguishing between static and dynamic metadata: Static metadata remains constant and describes, for example, information about the type of audio content. Dynamic metadata, on the other hand, changes over time (e.g. in the position information).
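A hypothetical illustration of this distinction follows; the field names and structure are invented for clarity and do not reflect the actual MPEG-H bitstream syntax.

```python
# Static metadata stays constant for the whole item:
static_metadata = {
    "content_type": "music",
    "object_label": "lead vocal",
}

# Dynamic metadata changes over time, e.g. position keyframes of
# (time in seconds, azimuth, elevation) - here the object sweeps
# from the front to the upper left over four seconds:
dynamic_position = [
    (0.0,   0.0,  0.0),
    (2.0, -45.0, 15.0),
    (4.0, -90.0, 30.0),
]

def position_at(t, keyframes):
    """Linearly interpolate the object position at time t."""
    for (t0, a0, e0), (t1, a1, e1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return (a0 + w * (a1 - a0), e0 + w * (e1 - e0))
    return keyframes[-1][1:]

print(position_at(1.0, dynamic_position))  # -> (-22.5, 7.5)
```

The renderer on the consumer’s device evaluates this kind of time-varying metadata continuously, which is what allows dynamic objects to move through the sound field during playback.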
A clear difference between Dolby Atmos Music and 360 Reality Audio is particularly evident in the encoding. With the latter, the MPEG-H codec consistently transmits the combination of audio and associated metadata all the way to the user’s device, where it is rendered for the respective playback system.
This technology thus delivers a fundamental benefit of object-based audio: a single mix is created in production, transferred in one distribution file, and played back on many different playback systems at the consumer’s end.
Thus, a closer look shows that only 360 Reality Audio can be described as an entirely object-based technology. Here, both the production and playback processes are purely object-based.
With Dolby Atmos Music, on the other hand, a mixture of channels and objects is already used during production, and the object-based process is abandoned during encoding, which produces two different files. When Dolby Atmos Music is played back over headphones via some distribution paths, one cannot speak of object-based playback: because the AC-4 IMS file is a pre-binauralised two-channel file, no characteristics of object-based audio reach the end consumer.
For example, personalisation by analysing the shape of the ear is not possible there, whereas 360 Reality Audio enables it through its object-based transmission method: since binauralisation only takes place in the playback app, personalised settings can still be made, for example using the Sony Headphones Connect app.
No matter which format – in the end it still takes the right content to inspire people. Interested in producing 360 Reality Audio? Further questions, ideas or suggestions? Click here to go to the contact page!