
Audio Dynamic Range Calibration System

April 16, 2016 - Papers

The Audio Dynamic Range Calibration System is an attempt to improve the way players use the dynamic range option in video games. Such options have flourished in recent years, especially in AAA games, but they are not always understood or used properly, as audio dynamic range is an obscure concept for those who are not familiar with it. Some studios, like BioWare, have found solutions to help players use the dynamic range option in the right way. Still, a lot of work remains to ensure that most players have the dynamic range that best suits their sound systems, their environments and themselves. Moreover, as a given person most likely plays multiple games on the same platform, the dynamic range value should ideally be shared across games so the player does not have to tweak the option in every single game they play.

 

Download the program

(Windows 64-bit)

The need for a dynamic range option

If all gamers were audio people, playing on high-end sound systems in perfect listening environments, we would not see this option in any game and the dynamic range would be most of the time very wide, like it is in cinemas. Everyone would enjoy great audio experiences and our hearing and our brain wouldn’t suffer from the sound being too loud or too compressed (in dynamics).

Yet, most gamers do not have a high-end sound system, nor a great environment. Also, the player being an audiophile or not will have a big impact on the sound volume they will play their games at and how wide they will like the dynamic range to be.

For example, if a person plays on their laptop speakers, having a very wide dynamic range would be very inconvenient:
– because of the limited output gain of their sound system, the volume of the game would be globally too low
– the width of the dynamic range would make the quiet sounds too quiet to be audible

On the contrary, too narrow a dynamic range would be very frustrating for a player with a high-end sound system and a calm environment.

Consequently, in the last five years or so, we have seen several games with a dynamic range option, which is a great thing compared to not having any settings. Sometimes, the dynamic range option is “hidden”, included in other audio settings, like “Output”, or “Speaker Type”.

Dynamic range option examples in recent games

The problem of having the player choose

So, more games every year have a dynamic range option. Isn’t the problem solved, then?
Not exactly, as most players do not know much about audio and have no idea how to choose the best option. Most of them just keep the default value (or do not even open the audio settings menu), some choose the highest dynamic range value as they would use the highest screen resolution or anti-aliasing option (especially gamers used to PC games) and some others look on the internet and ask the community for THE best dynamic range setting. You can see a few examples here, here, and there.

As we can expect, and as we can see from what players write on the forums, most of them have sound systems with low to medium dynamic ranges and usually low output gain (laptop speakers, or headphones plugged into an integrated sound card, have a low output gain even at maximum volume, and games whose integrated loudness level is about -23 LUFS will sound too quiet). Also, some of them do not think about turning the volume up, even when they can. In these conditions, most players choose low dynamic range options and think the game sounds better this way: indeed, the global volume is then louder and they can finally hear the quiet sounds they couldn’t hear before (compared to the medium or high default dynamic range setting).

On the contrary, players with a medium to high dynamic range sound system and a higher output gain that allows them to turn up the volume may quite possibly choose one of the lowest dynamic range options. It can be because they read on the internet that the lowest dynamic range setting is THE best value, or simply because they test the different options in a bad way. In several games, the lower the dynamic range, the louder the game is; this is most likely to cope with the low output gain of some low-end sound systems (integrated sound cards, laptop speakers, etc.), but it also has a negative effect: if the player compares all the different options without adapting the volume of their sound system, the lowest dynamic range setting will naturally sound “better”, simply because it’s louder.

So, if we leave the choice of the dynamic range to the player without any sort of guidance, there are many reasons for them not to choose the option which best suits them, their sound system and their environment.

Current solutions

Some games have proposed ideas to prevent the players from making bad choices with the dynamic range option. As far as I know, there are two ways of doing it:

1 – Having the dynamic range included in an “Audio Output” or “Speaker Type” option

Even if the player does not know anything about dynamic range, they may select the right one just by selecting their audio setup properly. It’s probably the best way to be sure that as many players as possible select the correct value, with as little explanation as possible. However, I can see four problems with this method:

  1. The sound fidelity of different speakers inside the same “Audio Output” category varies a lot; should crappy stereo speakers be in the same dynamic range category as high-end monitoring stereo speakers?
  2. As I said previously, some sound systems have a low output gain and, for example, having headphones plugged into an integrated sound card won’t allow the player to use a high dynamic range option, even if the headphones are great and the environment calm.
  3. The player’s environment has a big impact on how much they can hear quiet sounds: with the same sound system, playing in a quiet room on a silent PC, or in the living room with the windows open on a busy street, changes everything.
  4. What I call the audio culture has a substantial impact on how well the player will tolerate high dynamic content. People who rarely go to the cinema (and think the volume is too loud there when they do) and are used to listening to their TV or radio with the volume just loud enough to barely hear the news will not enjoy high dynamic content as sound designers do, for example. Should we force them to choose the high dynamic option when their sound system and their environment allow it? Probably not; they would just turn down the volume after they hear the first loud sound in the game and, as a result, wouldn’t hear any quiet sounds for the rest of the game.

 

2 – Explaining briefly to the player what the dynamic range is

The game developer BioWare uses a different approach. With as little text as possible, they explain what the audio dynamic range is and which option to select depending on the player’s sound system and their environment.

I actually like this way of doing it, since it does not “hide” the option under another one, as in the previous method, with the issues that it could cause; instead, the player can select the audio output and the dynamic range independently. For example, a player who has fancy surround speakers in a noisy environment could then use a more appropriate dynamic range option than the highest one. It also has an informative aspect which I appreciate. By doing this, developers raise awareness of audio dynamics among gamers and make them understand that there is no universal best value and that it depends mostly on their sound system and their environment.

Although it’s probably the best way we have seen of letting the player choose the dynamic range settings, it also has issues:

  1. The low output gain of certain sound systems remains a problem here: if the game’s integrated loudness level is close to -23 LUFS (as it should be by default), the volume will be too quiet for certain configurations (integrated sound cards) to choose high dynamic range options, even if, for example, the player has great headphones.
  2. As before, the players can’t instantaneously compare the different options to be sure they are choosing the right one for them. Even by understanding that it depends on their sound systems and their environment, they still have to basically guess which one is the best. The most motivated will compare the different options by going back to the game every time they change the value, but we can’t expect many people to do that as it is quite laborious.
  3. Also, what if they just ignore the instructive text about dynamic range and just select the highest option as they do with graphics settings? And what if they never go into the audio settings menu at all?

 

Audio Dynamic Range Calibration System

 

So, after thinking about the different issues caused by the current ways of choosing the dynamic range in games, I tried to imagine how it could be better and I created this prototype, which I called Audio Dynamic Range Calibration System, or ADRCS. It’s basically a fake game, with an intro scene, a menu scene, a game scene and three other scenes for three different ways of calibrating the dynamic range. If this were actually a game, one of the calibration scenes (probably the Full Calibration one) would be included in the audio settings menu of the game.

Normalised in accordance with EBU R 128

Lately, audio normalisation has gained importance in the field of game audio; Wwise and FMOD Studio now support loudness and true peak metering. I won’t talk about that in detail here, as there are already great articles about the subject on the internet. For example, Stephen Schappler, sound designer at NetherRealm, wrote a very informative article about loudness in video games, called Listening For Loudness In Video Games.

ADRCS and its content are in accordance with EBU R 128, which means that:

EBU R 128’s logo

First game session audio setup

Have you seen these gamma correction screens being more and more frequent in recent games when you start them for the first time? As many people do not set up the brightness of their screen, it’s very good to have a short calibration process the first time the players launch a game. If it was not a mandatory step the first time you start the game, most of the players wouldn’t go to the graphics menu to change it.

Gamma calibration in GTA V

 

Why don’t we do the same for audio? Wouldn’t it be great to have a first-time calibration process for the player to choose their best dynamic range? If it’s not automatically detected by the system, just before the dynamic range calibration, we could also ask the player for their audio output.

Accordingly, the first time you start ADRCS on your computer, even before the main menu, you get the two screens of the full dynamic range calibration.

Dynamic range parameters

While designing the prototype and thinking about the best way to do this, I had to list all the parameters that had a noticeable impact, in my opinion, on what was the best dynamic range for a given set of parameters (the combination of the player, their sound system and their sound environment).

Sound system

Sound fidelity

To begin with, crappy laptop speakers won’t allow the player to choose as high a dynamic range value as high-end speakers would.

Type

In a noisy environment, being isolated by headphones allows the player to choose a higher dynamic range than if they used speakers, since the isolation makes the quietest sounds less easily masked by the surrounding noise. On the contrary, in a quiet environment, the player will benefit more from good loudspeakers than from good headphones, because the tolerance to loud sounds is much higher with loudspeakers; the player can then turn up the volume a bit more with speakers than with headphones.

In a similar way, we can also distinguish open headphones from closed ones as the latter isolate much more from the surroundings but are also more aggressive for the ear drums.

For all these reasons, I do not really like it when games automatically assign a dynamic range value depending only on which “speaker type” the player chooses.

Maximum output gain

As I said previously, on some sound systems, the maximum output gain is actually too low to choose anything other than a low dynamic range, even if the player has great headphones. This is also one of the reasons why we shouldn’t just take the speaker type into consideration.

Environment

Quietness of the environment

As mentioned earlier, some games take this into consideration: they mention it in the short text that describes what the player should select. The quieter the environment, the higher the dynamic range can be.

Potential need to limit the volume

Sometimes, it’s not that the environment is noisy, but that the game can be noisy for the player’s surroundings (sleeping baby, neighbours, etc.) if the player uses speakers. Some games already take this into account (the “Night” option of Dragon Age: Inquisition or the “Midnight” option of The Last Of Us), as it is something the player should be aware of when they choose the dynamic range.

Player

Hearing

Choosing one of the dynamic range options is personal, not only because not everyone has perfect hearing, but also because people have different sensitivities to certain sounds. Two people of the same age with perfect hearing can perceive some sounds quite differently, and one can be more tolerant of certain loud sounds than the other. This might also be related to audio culture, which is the next parameter.

Audio culture

As I mentioned previously, we do not all have the same tolerance to high dynamic content. Most people are used to very low dynamic content (TV ads, TV news, radio, loudness war in music, etc.) and many people adjust the volume so the dialogue is just audible. However, we shouldn’t force people to turn the volume up and listen to high dynamic range content; it’s better to inform them that a high dynamic range, when the conditions allow it, is the best way to listen to the content.

A good way to inform players about dynamic range, to get them to choose audio options properly and to encourage them to play on good sound systems, is also to tell them that high dynamic content has noticeable benefits on listening fatigue: Soundscapes: What sound does to games and brains.

Calibration scenes

For testing purposes, there are three different dynamic range calibrations in this prototype. Of course, only one of them would make its way to an actual game.

Full Calibration

For now, this is the ideal calibration process in my opinion. It’s a bit long (two screens), but I can’t see how to get the player to choose their best dynamic range value properly with a shorter calibration process. If you have any idea how to shorten it, please share your thoughts in the comment section. I also made sure that none of the non-critical text is shown unless the player hovers their mouse over some parts of the critical text (or selects them with their controller).

ADRCS

Loudest sound

The loudest sound here is a really simple sound done in Wwise, synthesised in real-time. Any very loud sound would probably be fine, although, keep in mind that the nature of the sound may change the loudness perceived by the player. Indeed, an aggressive sound, like a gunshot or an explosion, will probably make the player set the volume of their sound system a bit lower than what they would do with a peaceful sound, even if the two sounds have the exact same loudness level. In any case, the best option is most likely that the loudest sound of the calibration is actually the loudest sound of the game, whatever this sound is, to be sure that the loudest sound of the calibration is perceived as loud, by the player, as the loudest sound of the game.

Be aware that this is just a guess; I am not sure how much the results would differ depending on the nature of the loudest sound of the calibration. It would be interesting to run some user testing to confirm the hypothesis and to see how much it changes the results.

Volume boost: -15 LUFS (+8 LU)

This is not considered at all in many games: what if the player’s sound system has a quiet output and -23 LUFS is too quiet for them?
Some recommend an alternative integrated loudness level target in this case, like -18, -16 or -15 LUFS, but there is no clear universal convention yet, especially in games.

http://www.tcelectronic.com/media/2040040/mobile-test-paper-2013.pdf
http://www.tcelectronic.com/media/1120841/lund-slides.pdf

For the prototype, I have chosen -15 LUFS, which is then 8 LU louder than without the volume boost. It gives enough gain so quiet sound systems do not have to be only used with low dynamic range options.
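As a quick check of the arithmetic, here is a minimal Python sketch of what an 8 LU boost means as a linear gain. It relies only on the fact that a relative gain of 1 LU corresponds to 1 dB; the function and constant names are made up for this illustration and do not come from the prototype.

```python
def lu_to_linear_gain(gain_lu: float) -> float:
    """Convert a relative gain in LU (1 LU = 1 dB) to a linear amplitude factor."""
    return 10.0 ** (gain_lu / 20.0)


DEFAULT_TARGET_LUFS = -23.0  # target without the volume boost
BOOSTED_TARGET_LUFS = -15.0  # target chosen for the prototype's volume boost

# Difference between the two targets, in LU
boost_lu = BOOSTED_TARGET_LUFS - DEFAULT_TARGET_LUFS

# Amplitude factor a static gain stage would apply for that boost
boost_linear = lu_to_linear_gain(boost_lu)
```

In other words, the boost multiplies the signal amplitude by roughly 2.5, which is also why peaks become a concern once it is enabled.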

calibration_full_06

Quietest sound

For the second step of the Full Calibration, what kind of quiet sound should we play?
As with the loudest sound, we probably want to use the quietest sound of the game here. However, sound masking can play an important role, and we should probably take it into consideration. Your noisy (by “noisy”, I mean “like pink noise”) ambiance sound might not be exactly the quietest sound of the game, but you might want to use it in the calibration, at a lower volume, as the quietest sound, since it is easily masked by background noise around the player’s sound system, such as air conditioning, traffic or computer fans.

For obvious reasons, we also want a sound with a constant volume, so pink noise is actually a great choice for that too.

Dynamic range slider

Here, the player actually chooses the dynamic range that suits them the best. Everything before this slider was just to make sure the player is ready to choose the dynamic range value properly.

The value goes from 20 LU to 80 LU. This is a bit extreme but I wanted a nice range for this prototype and would be interested to see how high or how low people will go. During my tests, with different sound systems, different environments and different people, I got results from 30 LU to 70 LU.

Below the dynamic range slider, depending on the current dynamic range value, the player gets examples of corresponding sound systems and environments.

calibration_full_08

In my opinion, this is the best way for the moment to get the player to choose the best option depending on their audio culture, their hearing, their sound system and their sound environment.

Express Calibration

As the Full Calibration is quite long, I tried to design an alternative calibration process that would be about half as long as the full one (only one screen instead of two).

Text

I tried to keep the text to the strict minimum here:

Ambiance test sound

I tried a different approach here with the test sound, which is a simulation of a game scene. The different sounds included in this sound event are routed to different buses (loudest, loud, medium, quiet, quietest) and the bus volumes react as the player tweaks the dynamic range slider.

It makes this calibration shorter than the full one, as there is only one test sound instead of two. It is also probably more user-friendly than the sounds of the full calibration, and this kind of test sound makes the calibration less boring for the player (they would probably prefer to listen to sounds from the game than to pink noise). However, it has the disadvantage of making the calibration less precise since, among other things, the quietest sound of any game is most likely less neutral and less consistent than pink noise.
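To make the idea of bus volumes following the slider more concrete, here is a small Python sketch. It is only an illustration under a strong assumption: the five buses are spread evenly between 0 LU (loudest) and minus the chosen dynamic range (quietest). The real curves in the prototype are authored in Wwise and depend on the content, and the function name is hypothetical.

```python
from typing import Dict

# Bus names as listed in the text, from loudest to quietest
BUSES = ["loudest", "loud", "medium", "quiet", "quietest"]


def bus_offsets(dynamic_range_lu: float) -> Dict[str, float]:
    """Return a volume offset in LU for each bus.

    Assumption for this sketch: offsets are evenly spaced, with the
    loudest bus at 0 LU and the quietest at -dynamic_range_lu.
    """
    if not 20.0 <= dynamic_range_lu <= 80.0:
        raise ValueError("the prototype's slider goes from 20 LU to 80 LU")
    step = dynamic_range_lu / (len(BUSES) - 1)
    return {name: -index * step for index, name in enumerate(BUSES)}


offsets = bus_offsets(40.0)
```

With a 40 LU setting, the medium bus sits 20 LU below the loudest one and the quietest bus 40 LU below it; moving the slider stretches or squeezes that ladder.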

Advanced Calibration

This scene combines the features of the two previous calibration types and adds some other data for people who are interested in the prototype.

calibration_advanced_2

 

Statistics

Some of the data I gathered while testing with friends are available in this scene. I discuss these data further below.

Bus levels

This is an interesting view if you want to take a look at how the dynamic range slider affects the bus volumes in Wwise.

Integrated loudness level target

The value shown there highlights the effect of the Volume Boost feature on the loudness level. It’s not a measurement of what is actually happening in-game, but it is pretty close to the integrated loudness level you get when you play the ambiance test sound.

Maximum true-peak level

Likewise, this is not a measurement, but a target for the prototype.

Game

There is not much to say about this scene. It’s simply a fake “game” scene that allows the user to play the ambiance sound.

game_02

Statistics

I have begun collecting data from playtesting I have done with the prototype so far and from the Statistics screen that shows up when users quit the program.

statistics

With these data, I would like to check several things:

One of the things I would like to analyse is, for example, how much the results differ depending on the audio culture of the testers.

A cross-game system?

The player will most likely play more than one game on their gaming platform, whatever it is, and while doing so, their sound system, their environment and all the other parameters that affect their best dynamic range value shouldn’t vary much. Then, why should they go through a new dynamic range calibration process every time they launch a new game? Shouldn’t this value be shared across the different games they play on the same platform?

Ideally, this kind of calibration should be done, in my opinion, by the operating system, and the data collected during the calibration process should be easily accessible to any program running on this system. It would then be the player’s responsibility to set up their system properly, and game developers would simply use the value to mix their games accordingly. Possibly, when launched for the first time on a given machine, games could just remind the player that the audio calibration is important and give a link to the system’s calibration process.

The Xbox One actually runs a calibration the first time it is started with Kinect plugged into it, and part of this process calibrates Kinect’s microphone so that it understands the player’s voice commands as well as possible. We are therefore probably not far from having audio calibrations in games and in game systems that adapt the audio of our games to better suit each player and their environment. Moreover, what if gaming platforms such as Steam had an audio calibration tool whose data could be used in games? What if this audio calibration tool had achievements depending on how high the player’s dynamic range is, to encourage them to play games at a decent volume and to improve their environment and their sound system? What if players could share their results on Facebook? OK, I am being a bit silly now, sorry.

A common unit

If different games used the same calibration system, the unit used to share the dynamic range value should be easy to implement in any game audio pipeline and preferably be in accordance with international standards.

LRA?

The Loudness Range (LRA), promoted by international normalisation standards such as EBU R 128, is great for measuring the distribution of loudness within a game. However, it wouldn’t make much sense for ADRCS to output a loudness range simply from the volume difference between the quietest and loudest sounds, as they are two different things; two games having the exact same dynamic range could give two completely different loudness range values.

dB?

If the quietest and the loudest sound from the calibration process were exactly the same signal played at different volumes, then we could use the decibel as a unit for the dynamic range, as at the same volume, both sounds would have the same loudness.

LU?

The current prototype of ADRCS, as described before, does not use the same source at all for the two sounds of the full calibration. Thus, the unit I use is the Loudness Unit (LU), and I had to make sure, using the Loudness Meter in Wwise, that if the player used a dynamic range of 0 LU (which is impossible in the prototype as I set the minimum value to 20 LU, but is technically possible), both sounds had the same loudness level.

The cross-game feature of ADRCS

The ADRCS prototype, like probably every personal project I am going to release on Windows from now on, has two ways of saving the player’s data across different game sessions.

Windows registry

Unity automatically creates some variables in the registry when the player starts a Unity game for the first time (screen resolution, fullscreen mode, graphics quality, etc.). I added the variable “notFirstLaunch” to these values so the game remembers whether it has already been used on the same machine. The registry variable is created and its value set to “1” the first time the player gets to the menu of the game.

regedit

XML file stored in AppData

Using the Windows registry to store values shared across different games is not a good idea, as reading and writing data outside the application’s own directory is hard and inconvenient (for security reasons). Instead, for this purpose, I use a configuration file stored in the “AppData” folder (usually located in “C:\Users\*userName*\AppData”), like many games and applications do.

Since I have only two variables to store here, I could have simply used a TXT or INI file, but I wanted to learn how to read and write XML files from Unity, as it will probably be useful for future projects.

xml

Now, we are going to see how these values are used by the ADRCS prototype.

First case: the XML file does not exist or is invalid (whether the registry key exists or not does not matter in this case)

If the XML file does not exist or is invalid, we simply assume that it’s the first time the user plays a game using ADRCS on this machine, so we invite them to run the calibration:

first_time_01

Second case: the XML file is valid, but there is no registry key for this game yet

This case means this game has never been launched on this machine, but at least one game using ADRCS has been, and the calibration data have been stored properly.

first_time_02

Third case: the XML file is valid and the registry says that the game has already been played on this machine

If the calibration data are valid and it’s not the first time the game is used on this machine, we simply skip this and go directly to the game menu.
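The three cases above boil down to a small decision function. The Python sketch below is only an illustration of that logic; the names are hypothetical, and what exactly the prototype shows in the second case is not spelled out in the text, so the label used for it here is an assumption.

```python
from enum import Enum


class StartupAction(Enum):
    # First case: no usable calibration data on this machine
    RUN_CALIBRATION = "invite the player to run the calibration"
    # Second case: data from another ADRCS game exists (screen content assumed)
    FIRST_LAUNCH_WITH_EXISTING_DATA = "first launch, calibration data already stored"
    # Third case: nothing to do
    GO_TO_MENU = "go directly to the game menu"


def startup_action(xml_is_valid: bool, game_already_launched: bool) -> StartupAction:
    """Pick what to do at startup.

    xml_is_valid: the shared XML file exists and could be read.
    game_already_launched: the per-game registry flag ("notFirstLaunch") is set.
    """
    if not xml_is_valid:
        # The registry flag does not matter when there is no valid data
        return StartupAction.RUN_CALIBRATION
    if not game_already_launched:
        return StartupAction.FIRST_LAUNCH_WITH_EXISTING_DATA
    return StartupAction.GO_TO_MENU
```

Checking the XML file first keeps the registry flag irrelevant whenever the calibration data are missing, which matches the first case.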

Use ADRCS in your game

The path of the XML file is “…\AppData\Roaming\ADRCS\player_preferences.xml”, and any program running on the same machine can read and edit it (or even create it if it doesn’t exist yet). Feel free to use ADRCS in your game if you think it’s worth it, as I do. If you do so, please contact me, as I would be quite interested to know that you plan to use it!
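For readers who want to try this from another program, here is a hedged Python sketch of reading and writing such a file. The actual layout of player_preferences.xml is not described in this article, so the element names (playerPreferences, dynamicRange, volumeBoost) and the choice of the two stored values are assumptions made for the example; check the real file before relying on them. The caller supplies the path of the file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Optional, Tuple


def save_preferences(path: Path, dynamic_range_lu: float, volume_boost: bool) -> None:
    """Write the two values to an XML file (element names are hypothetical)."""
    root = ET.Element("playerPreferences")
    ET.SubElement(root, "dynamicRange").text = str(dynamic_range_lu)
    ET.SubElement(root, "volumeBoost").text = "true" if volume_boost else "false"
    # Create the ADRCS folder if no other game has created it yet
    path.parent.mkdir(parents=True, exist_ok=True)
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)


def load_preferences(path: Path) -> Optional[Tuple[float, bool]]:
    """Return (dynamic_range_lu, volume_boost), or None if the file is missing or invalid."""
    try:
        root = ET.parse(str(path)).getroot()
        dynamic_range = float(root.findtext("dynamicRange"))
        boost_text = root.findtext("volumeBoost")
    except (OSError, ET.ParseError, TypeError, ValueError):
        return None
    if boost_text not in ("true", "false"):
        return None
    return dynamic_range, boost_text == "true"
```

Returning None for both a missing and a malformed file mirrors the first case described earlier, where either situation leads to a new calibration.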

The Wwise project side

After showing the first version of this article to some friends and colleagues, or just discussing it with them, some of them wanted to know more about how this system would be implemented in Wwise. I was quite surprised, because it’s not really the most interesting part of the problem to me, which is much more about using user research to improve the way gamers play and listen to games, but as it has been mentioned several times, here is a short part about it.

Using volume curves driven by RTPCs on each bus

The Master-Mixer Hierarchy is set up this way:

adrcs_wwise_bus_hierarchy

When the dynamic range parameter increases:

 

On this image, I put all the four curves from the buses L1, L2, Q1 and Q2 on the same graph so we can compare them:

adrcs_wwise_buses

There is another volume curve on the parent bus to compensate for the loudness changes and make sure the loudness of the whole project stays consistent whatever the dynamic range value is:

adrcs_wwise_loudness_correction

All these curves depend of course on the content of the project.
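The reasoning behind the compensation curve can be sketched numerically. The Python snippet below is a rough model, not what the Wwise project does: it assumes every bus carries an uncorrelated signal of equal level before its offset is applied, so that levels add as powers. In a real project the curve has to be tuned by ear and with a loudness meter, and the function names are hypothetical.

```python
import math
from typing import Sequence


def summed_level_lu(bus_offsets_lu: Sequence[float]) -> float:
    """Level of the mix relative to a single bus at 0 LU (power sum)."""
    total_power = sum(10.0 ** (offset / 10.0) for offset in bus_offsets_lu)
    return 10.0 * math.log10(total_power)


def compensation_lu(bus_offsets_lu: Sequence[float],
                    reference_offsets_lu: Sequence[float]) -> float:
    """Gain to apply on the parent bus so the mix matches the reference setting."""
    return summed_level_lu(reference_offsets_lu) - summed_level_lu(bus_offsets_lu)


# Narrowing the range raises the quiet bus, so the parent bus must come down a bit
correction = compensation_lu([0.0, -10.0], [0.0, -20.0])
```

Under this model the correction is small, because the loudest bus dominates the sum; real content, where quiet buses may play far more often than loud ones, can behave quite differently.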

Finally, here is the Wwise Gain effect which applies the 8 LU boost to the master:

adrcs_wwise_volume_boost_effect

The effect is bypassed unless the player enables the Volume Boost in the prototype:

adrcs_wwise_volume_boost

Using HDR

It could also be possible to use ADRCS with an HDR bus, the HDR threshold of which would be affected by the dynamic range value.

Known issues and alternative ideas

While testing ADRCS and discussing with friends and colleagues, I noted different drawbacks the prototype had and possible alternative ideas.

The player’s loudness tolerance changes throughout one game session

The same sound, at the exact same volume, will sound louder if it’s played at the very beginning of the game session than if it is played at the end of a loud action sequence. Does this mean that the actual loudest sound of the game (which will probably be in the middle of a loud scene) should be louder than the loudest sound of the calibration? I do not like this idea, since it would trick the player into thinking that the loudest sound of the dynamic range calibration is the loudest sound of the game, when it’s not.

I do not actually have any solution to this problem, but hopefully the player’s loudness tolerance does not change that much throughout the game session. In fact, it would be quite easy to run the dynamic range calibration with the same sound system, the same environment and the same player just before and just after the player is exposed to a loud game session. I might run some tests at some point.

Difficulties to keep the target loudness level depending on the dynamic range and the volume boost options

I haven’t talked about the Wwise part much, but it hasn’t been so easy to get a consistent loudness level across the dynamic range values chosen by the player. With the volume boost disabled (target loudness level = -23 LUFS), the loudness level stays about the same whatever the dynamic range is, but with the volume boost enabled (target loudness level = -15 LUFS), it is trickier, especially because the loud sounds sometimes peak over 0 dBTP and a peak limiter is required on the Master. That said, I have used only real-time synthesised sounds, and some of the assets have their own dynamic range a bit all over the place.
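The peak problem follows directly from the arithmetic: a static gain of N dB raises the true-peak level by the same N dB. Here is a minimal Python sketch of that check, assuming an 8 LU boost and a 0 dBTP ceiling as in the text above; the function names are made up for the example.

```python
def peak_after_gain(peak_dbtp: float, gain_lu: float) -> float:
    """True-peak level after a static gain (1 LU of gain = 1 dB)."""
    return peak_dbtp + gain_lu


def needs_limiter(peak_dbtp: float, gain_lu: float = 8.0,
                  ceiling_dbtp: float = 0.0) -> bool:
    """True if the boosted peak would exceed the ceiling."""
    return peak_after_gain(peak_dbtp, gain_lu) > ceiling_dbtp
```

For instance, a sound peaking at -6 dBTP before the boost ends up at +2 dBTP after it, so it has to be caught by the limiter, while a sound peaking at -10 dBTP still fits.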

Respecting these target volumes on a proper game project with some loudness conventions for each category of sounds should most likely not be a problem.

No in-game volume slider

Many games have a master volume slider, or several sliders for SFX, VO and music. It wouldn’t make sense here, as we want to be as close as possible to the target loudness level, in accordance with EBU R 128.

LRA is not shown

I would have liked to show a real-time LRA (loudness range) value in the prototype, when playing the “ambiance test sound”, but we can’t simply send the values from the Loudness Meter to RTPCs in Wwise as we can do with the Meter. There is actually a function in Wwise’s SDK for this, but it would require some research on my part, so I haven’t taken the time to set this up.

Tweak the global volume with dialogue instead of the loudest sound of the game

Some people told me it could be a good idea to try using some reference dialogue lines as test sounds for the first part of the calibration (tweaking the player’s sound system volume), as we are all more used to listening to human voices than to any other type of sound and we have some idea of how loud dialogue is supposed to be; thus, it would make the calibration process more accessible, especially for people who do not have much audio culture.

However, I really dislike the idea, as many people would probably set the volume of their sound system too low, as they do when they watch TV or listen to the radio: they just want the dialogue to be understandable, but as quiet as possible.

Two quiet sounds playing in sequence

While discussing with me about the prototype, Clément Duquesne had an interesting idea about an alternative way of finding the best dynamic range value.

Playing one continuous sound that the player turns down until they can’t hear it anymore has a significant drawback. Below a certain point, the quiet sound is so quiet that you do not really know if you actually hear it or if you just imagine it. While testing myself, I often had to run blind tests to be sure I was not making up the sound (and many times, I was…).

Instead, why don’t we play two sounds in sequence, the first one being the quietest and the second one being the same sound, but 6 LU louder, like in this example:

This should be much less error-prone, as the problem of imagining the quietest sound goes away: if you go too far, the sound which is 6 LU louder than the quietest sound will also be difficult to hear and you will know that you have gone too far.

However, this technique also has a disadvantage: we do not know precisely which dynamic range value to store after such a calibration, as it is somewhere between the quietest sound and the one 6 LU louder. Instead of being 6 LU louder, the second sound could be only 4 or 3 LU louder, but then it would be more difficult for the player to tweak the dynamic range value properly so that they do not hear the quietest sound but do hear the 3 LU louder one.
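The uncertainty can be written down explicitly. The Python sketch below assumes the player stops when the quietest sound is inaudible and the louder one is still audible; the usable dynamic range then lies between the two. Which value to store is a design choice that the prototype does not settle; the conservative lower bound and the midpoint shown here are only two possible options, and the names are hypothetical.

```python
from typing import Tuple


def audible_range_bounds(slider_lu: float, step_lu: float = 6.0) -> Tuple[float, float]:
    """Bounds of the usable dynamic range after a two-sound calibration.

    slider_lu: distance from the loudest sound to the quietest (inaudible) sound.
    step_lu: how much louder the second (still audible) sound is.
    """
    if step_lu <= 0.0:
        raise ValueError("step_lu must be positive")
    return slider_lu - step_lu, slider_lu


def value_to_store(slider_lu: float, step_lu: float = 6.0,
                   conservative: bool = True) -> float:
    """Pick a single value: the known-audible bound, or the midpoint."""
    lower, upper = audible_range_bounds(slider_lu, step_lu)
    return lower if conservative else (lower + upper) / 2.0
```

With a smaller step the interval shrinks, which is exactly the trade-off described above: less uncertainty about the stored value, but a harder task for the player.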

 


 

I am not expecting a problem like this to be solved with just one idea or one prototype, but as we developers pay more attention to dynamic range calibration in games, and as we inform players via games (with short and simple explanations) of the importance of dynamics (and, by extension, of the importance of audio in games in general), we should progressively achieve our desired outcome, which is that every player gets the best audio experience possible.

Thanks
