What will 3D audio and raytraced audio really mean for PS5 and Xbox Series X?

What will 3D audio and raytraced audio really mean for PS5 and Xbox Series X?

Gaming


Sennheiser Gaming – a brand with 115 years of audio technology expertise and craftsmanship – is rebranding to EPOS. To celebrate the move, we sent over some questions to get the lowdown on what next-gen consoles will mean for video game audio.

You can read the full interview below, answered by Andreas ​Jessen, senior director of global product management and marketing.

VG247: How will the PS5’s 3D audio change how you’re approaching the next gen of headsets?

Jessen: It’s important to note that at the time of writing, we do not have full insights as to what the PS5’s 3D audio actually entails, but our assumption is that it is some implementation of HRTF (head-related transfer function) algorithms either running on a 7.1 signal or perhaps even an object-based audio signal.

For some years now, we’ve had our own HRTF algorithm in the market for PC gamers, so making hardware for a binaural signal is nothing new to us. What’s important to understand about HRTF is that it’s a phenomenon that describes the manipulation of audio signals, to give the brain the audio cues it needs to place an audio source in a given room. There are a ton of cues that can be implemented to do this, but the two generally accepted as most important are “level difference” and “time difference”. This means that in order for your brain to place an audio source e.g. 30 degrees to the right of you, the brain uses the time and level difference of that audio reaching your left and right ear.

If we as a hardware provider know that the gamer is plugged into a system that utilizes technology like this, one of the most important things we can focus on is balance and manufacturing discipline. So, what does that mean? A speaker system is actually a very analogue system, and consists of mechanical parts – a suspension, a diaphragm, a magnet, a voice coil etc. There are never two speakers completely identical. If you are not careful with manufacturing, you could risk having speakers with up to 5dB difference in some parts of the frequency response. This would give you a difference in level before you artificially add the level difference – meaning you wouldn’t be able to accurately pinpoint the location of an approaching enemy.

So, in short, to develop headsets that can take full advantage of 3D audio, we are making sure that every single headset is tested in level difference between L&R before leaving the factory line, and keeping very tight tolerances for the mechanical assembly of the product.

What kind of a difference will 3D audio make for users?

Ultimately it should do two things depending on what type of gamer you are. Firstly, it should add to the immersion, meaning it should be easier to imagine being somewhere else if the audio is more lifelike.

Secondly, you can look at 3D audio as another layer of information in the audio signal, so if you are a competitive gamer and get used to surround sound, you will be able to more accurately pinpoint an audio source. A disclaimer here is that if you have played CS:GO for ten years and always in stereo, you have trained your brain to a stereo signal and you won’t instantly be able to process this added information layer. With training, you can get there and benefit from the added audio layer to enhance your game.

What is raytracing audio and what difference will that make?

I think most gamers have now somehow seen the effects of raytracing on light sources, and it can be quite impressive when it’s done right.

Surround sound audio has for a long time relied on a predefined set of speakers that could play audio. This probably comes from the movie industry that had to record all audio sources and play them in some defined speaker setup for movie theatres – this is how we got the 7.1 and 5.1 standards.

However, a game is not a recorded audio, it is rendered with metadata. In theory, instead of mixing everything down to a 7.1 format, you could just put a virtual speaker in every single location of an audio source. It is my personal belief that benefit versus computational power is not worth it. This would be the equivalent of raytracing audio – that every single audio object in a game is a speaker, instead of downmixing it in the game engine to a standard 7.1 signal. You could imagine a 360.1 speaker setup.

Will both Xbox Series X and PS5 be capable of taking full advantage of your next-gen headsets?

That is the plan. However, there are still some open points on what exactly will be possible on the next-gen consoles. For example, will they allow multiple audio sources to come from the USB interface? Will they have a Toslink port, will they allow for download of other HRTF implementations than their own?

So yes …the headsets will work, but we do not know how much more magic we can add just yet. But we are preparing parallel projects to take advantage of the new platform as much as possible as soon as we know more.

EPOS branded gaming product launch is set for October 2020.

Watch on YouTube