With more and more music being delivered by music streaming services like Spotify and iTunes Radio, what these services do about the loudness war matters more and more.
In the first part of this series of articles, we looked at the Loudness FAQ from the Loudness Petition Group which expands on the issues and covers some of the misconceptions and urban myths that have already grown up around this whole subject.
In this article, we are going to take a look at the research that Eelco Grimm has just completed for HKU Muziek en Technologie, University Of The Arts in Utrecht in the Netherlands, in cooperation with the music streaming service TIDAL, and presented at the 142nd AES convention in Berlin. Over to you Eelco...
How Have We Got Here?
Since the jukebox days of the 50’s, artists and record labels have pushed mastering engineers to make their songs louder than any others. With the advent of fast digital peak limiters in the mid 90’s, this pressure led to a dramatic loss of dynamics in pop music, otherwise known as the “Loudness War”. Even artists who feel no urge to compete in loudness are pulled into this race. For more than a decade sound engineers, musicians and music lovers have discussed how to end it. No solution was possible while CDs and downloads were the main source of music consumption, because there is no central control over their level. But now that music streaming has taken over as the major source of music consumption, the loudness war could be ended if these services chose to turn on metadata-driven loudness normalisation by default. Although loudness is subjective, it can now be measured with the right tools. In 2006 the ITU introduced a standard for measuring loudness in broadcasts, BS1770. Since it is the only open standard for loudness measurement available, and music producers demand that their songs be treated equally on all streaming services, we recommend using BS1770 to measure the loudness of music productions.
Track Normalisation Or Album Normalisation?
The main issue is which type of loudness normalisation to select: track normalisation or album normalisation. With track normalisation, all tracks are made equally loud. With album normalisation, only the loudest tracks of the albums are made equally loud, and the other tracks keep the relative loudness they had on their album. If one listens to an album, album normalisation makes the most sense. But when streaming, people do not just listen to albums as a whole but also to randomly picked tracks in shuffled playlists. So the question is: does album normalisation work for a shuffled playlist too?
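As a rough illustration of the difference between the two modes, here is a minimal Python sketch. It assumes the loudness of each track has already been measured in LUFS. The function names, the example album and the -14 LUFS target are illustrative assumptions, not part of any streaming service's implementation.

```python
from typing import Dict

TARGET_LUFS = -14.0  # illustrative target level, an assumption for this sketch


def track_normalisation_gains(track_lufs: Dict[str, float],
                              target: float = TARGET_LUFS) -> Dict[str, float]:
    """Gain in dB per track so that every track plays at the target loudness."""
    return {name: target - lufs for name, lufs in track_lufs.items()}


def album_normalisation_gains(track_lufs: Dict[str, float],
                              target: float = TARGET_LUFS) -> Dict[str, float]:
    """One shared gain in dB: the loudest track lands on the target and
    the other tracks keep their relative loudness."""
    album_gain = target - max(track_lufs.values())
    return {name: album_gain for name in track_lufs}


# Hypothetical album: a loud single and a soft intermezzo, 5 LU apart.
album = {"single": -9.0, "intermezzo": -14.0}
track_gains = track_normalisation_gains(album)
album_gains = album_normalisation_gains(album)
print(track_gains)
print(album_gains)
```

With track normalisation the two songs receive different gains and end up equally loud. With album normalisation they receive the same gain, so the intermezzo stays 5 LU softer than the single, as it was on the album.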
Survey Of 4.2 Million Albums
To help answer this question, we carried out a large survey of 4.2 million albums in the TIDAL catalogue, in cooperation with TIDAL. A proposal was developed to use album normalisation, where the loudest track of each album is normalised to -14 LUFS and the other tracks are aligned to the relative level they have on their respective albums. These levels would then also be used when tracks are played in a randomly shuffled playlist with other albums’ tracks. This proposal was tested against track normalisation in a shuffled playlist of 24 songs with 38 subjects. It turned out that 80% of the subjects preferred album normalisation, even though the tracks were selected to have an extreme difference in loudness of up to 10 LU.
Target Loudness For Mobile Devices
The second question is: what should the target level be for mobile playback? The current generation of mobile devices, such as smartphones, has limited headroom in order to comply with governmental hearing protection laws in the EU. Because of this, AES TD1004 recommends keeping the loudness of content above -20 LUFS on mobile devices.
Recommendations to Music Streaming Services
Based upon these answers we produced the following recommendations.
- Turn album normalisation ON by default. With album normalisation, the loudest tracks of all albums are adjusted to the same loudness during playback.
- Use the industry standard ITU BS1770-4 measurement so that mastering engineers can predict the result of their work for all services at once.
- In mobile devices, use a target level of -14 LUFS for the loudest track of an album.
- In a living room setting, lower target levels such as -18 LUFS to -20 LUFS are recommended. The advantage is that albums of more dynamic genres such as classical and jazz will also become loudness normalised.
- To avoid clipping, only attenuate tracks, never apply positive gain. If the loudest track of an album is softer than the target level, all tracks of the album will play soft.
- In the preferences, it should be possible to turn loudness normalisation off, for people who are concerned about automatic changes to the data, or for research purposes.
- Although it seems attractive to add a third option of “track normalisation” in the preferences to satisfy all users, this can potentially confuse many people. It is not recommended.
- Provide playback device manufacturers with the option to merge loudness normalisation with playback gain, since this offers the highest possible loudness normalisation and sound quality. In this case, the loudness metadata of a track is sent downstream all the way to the final gain stage of the system. The playback device manufacturer then merges the normalisation data with the user-set playback gain and applies just a single gain change. (Note that if the system is floating point throughout, both gains can also be set independently.) The advantage is that over a large range of playback levels, both very dynamic and compressed albums will be perfectly aligned in loudness, and there is no need to select a ‘target level’ anymore.
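Two of the recommendations above, attenuation only and merging the normalisation gain with the playback gain, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names and the `user_volume_db` parameter are hypothetical, and a real playback chain would need to handle far more than this.

```python
def normalisation_gain_db(loudest_track_lufs: float,
                          target_lufs: float = -14.0) -> float:
    """Album gain in dB, attenuation only: never apply positive gain,
    so albums softer than the target simply play soft."""
    return min(0.0, target_lufs - loudest_track_lufs)


def merged_gain_db(normalisation_db: float, user_volume_db: float) -> float:
    """Merge normalisation and the user's volume setting into one gain
    change (gains in dB add up), applied once at the final gain stage."""
    return normalisation_db + user_volume_db


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to the linear factor applied to the samples."""
    return 10.0 ** (gain_db / 20.0)


loud_album = normalisation_gain_db(-8.0)           # attenuated by 6 dB
soft_album = normalisation_gain_db(-16.0)          # left alone, plays 2 dB low
living_room = normalisation_gain_db(-8.0, -18.0)   # lower stationary target
total = merged_gain_db(loud_album, user_volume_db=-20.0)
print(loud_album, soft_album, living_room, total, db_to_linear(total))
```

Applying the sum as a single gain change avoids scaling the audio twice, which is the sound quality argument made in the last recommendation.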
The Research In More Detail
Analysis of the TIDAL database of 4.2 million albums showed that until the end of the 90’s albums had a peak in the distribution at around -14 LUFS, while in the 00’s and 10’s that moved up to -8 LUFS. The loudness war is real.
Here is a graph that shows the distribution of the loudest track of all albums of the database:
87% of all albums have a loudest track that is louder than -14 LUFS. Part of the remaining 13% were manually evaluated and most of those albums were either classical or jazz, or were made in the 80’s. Many of these are -15 or -16 LUFS, so would play just 1 or 2 dB low if a target of -14 LUFS was chosen.
Here is a graph that shows the distribution of the softest track of all albums:
In just 2% of all albums, the softest track is as loud as the loudest. That means that track normalisation will change the artistic loudness characteristics of 98% of all albums ever made. In 72% of the albums, the softest track is 6 LU or less below the level of the loudest track. If album normalisation is used and the loudest track is aligned to -14 LUFS, the softest tracks of these albums would still play at -20 LUFS or louder, within the range of the AES TD1004 recommendation of -16 to -20 LUFS.
A group of albums from the remaining 28% (with a difference of 7 LU or larger) was manually evaluated and it was found most of the albums were either classical or jazz. Among the pop albums in this group, there were some quite recent ones. Apparently, artists still love to add soft tracks to their albums sometimes, as an intermezzo. The question was if the low loudness of such tracks would still be appreciated when they are played outside the album context, in a shuffled playlist. We took the loudest and softest tracks from 12 albums with substantial differences of 7 to 10 LU. These were put in a randomly shuffled playlist of both track normalised and album normalised style, which was tested with 38 subjects.
The result was that 71% of the subjects preferred album normalisation blindly. Another 9% said they would never accept track normalisation if it was turned on by default, and in that case they’d rather accept the level differences in the shuffled playlist. This means that 80% prefer album normalisation if normalisation is to be turned on by default. That percentage would likely be larger for albums with more typical loudness differences between tracks, since the testing done here was on material with very large differences.
Regarding the optimal target level, normalising the loudest track to -14 LUFS seems quite a good choice. If it is lowered to, for instance, -16 LUFS, a larger number of ‘loudest tracks’ will be correctly normalised, but at the same time many more soft tracks will fall below the -20 LUFS lower limit that is recommended by AES TD1004. Conversely, if the target is raised to, let’s say, -12 LUFS, a lot of albums made before the 2000’s will lose normalisation. At the moment Spotify and TIDAL have a track normalisation target of around -14 LUFS. If they switch to album normalisation at -14 LUFS, most contemporary hit songs will remain at the same level, since they are usually among the louder tracks of their albums.
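The trade-off described here can be made concrete with a small Python sketch. The four albums below are invented for illustration and are not taken from the TIDAL data. The function name and the attenuation-only rule are assumptions that follow the recommendations earlier in this article.

```python
from typing import List, Tuple


def evaluate_target(albums: List[Tuple[float, float]], target: float,
                    floor: float = -20.0) -> Tuple[float, float]:
    """For a list of (loudest, softest) track loudness pairs in LUFS, return
    (share of albums whose loudest track reaches the target,
     share of albums whose softest track plays below the floor)."""
    normalised = 0
    too_soft = 0
    for loudest, softest in albums:
        gain = min(0.0, target - loudest)  # attenuation only
        if loudest >= target:
            normalised += 1
        if softest + gain < floor:
            too_soft += 1
    return normalised / len(albums), too_soft / len(albums)


# Hypothetical albums, NOT real survey data.
albums = [(-8.0, -13.0), (-9.0, -16.0), (-13.0, -15.0), (-15.0, -19.0)]
for target in (-12.0, -14.0, -16.0):
    print(target, evaluate_target(albums, target))
```

On this toy data, lowering the target from -14 to -16 LUFS normalises every album but pushes more soft tracks below -20 LUFS, while raising it to -12 LUFS keeps all soft tracks above the floor but leaves half of the albums below the target.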
In a stationary situation, such as in a living room, it makes sense to lower the target level, since there are no headroom limitations in that case. Broadcasts are normalised to -23 LUFS (in Europe) or -24 LUFS (US), and by lowering the target of the loudest track to around -20 LUFS, a switch between music and broadcasts would show few loudness jumps. Additionally, music genres with large dynamics, such as classical music, will then be loudness normalised along with pop music. Care should be taken to loudness normalise system sounds as well. It is recommended that mobile devices automatically switch to a lower loudness target if they stream to a stationary device, for instance via AirPlay or Google Chromecast.
Additional Notes About Album Normalisation
- A special property of album normalisation is that an artist can still choose track normalisation for his or her album, for instance by releasing albums with just one track, or by mastering an album in such a way that all tracks are equally loud. That means that album normalisation offers the artist the creative option to select track normalisation. On the other hand, if track normalisation were the standard, there would be no option to release an album with album normalisation, except by putting all songs in one single track, which is highly inconvenient.
- Album normalisation comes in two types: normalising to the average loudness of the whole album, or to the loudest track of the album. The ‘average album loudness’ variant is currently implemented in Apple iTunes and in ReplayGain. Unfortunately, it has a few drawbacks. Someone could master an album with many soft songs and just one loud song, which would then play louder than the loud songs from other albums, and so the loudness war would continue. Another disadvantage is that the playback loudness of a given track is only known once the full album is finished and measured as a whole. This is unacceptable for a mastering engineer. All in all, we recommend never using the ‘average album loudness’ version but always the ‘loudest track’ version.
- Some people have suggested switching automatically between track normalisation and album normalisation, based on the user’s behaviour. Our research showed that 71% of the subjects had a blind preference to use album normalisation in a shuffled playlist, so this automatic switch may not be the ideal solution. But apart from that, it will be hard to implement such behaviour by the music streaming service, because users may decide to play the full album after they have started a track. That track was then started at ‘track loudness’ level but the next track would have to be played at ‘album loudness’, which of course breaks the loudness sequence.
- If one looks at the question “track or album normalisation?” in a wider perspective it is clear that track normalisation has an artistic problem because it deprives the artists of part of their artistic freedom in the creation of a music album. Using track normalisation on classical music is obviously unacceptable to all users. It would be a bit cynical to have loudness normalisation limit the artistic freedom of musical artists since the aim of normalisation is to stop the loudness war that currently limits that artistic freedom. For these reasons alone, any loudness normalisation that is turned on by default would have to be album normalisation.
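The drawback of the ‘average album loudness’ variant mentioned in the notes above can be shown with a short Python sketch. It rests on a simplifying assumption: the album loudness is approximated as the energy mean of equally long tracks, which is not a full BS1770 measurement of the whole album. The function names and both example albums are hypothetical.

```python
import math
from typing import List

TARGET_LUFS = -14.0  # illustrative target


def average_album_loudness(track_lufs: List[float]) -> float:
    """Approximate album loudness as the energy mean of the track values.
    Assumes equally long tracks; a simplification, not a BS1770 measurement."""
    mean_energy = sum(10.0 ** (l / 10.0) for l in track_lufs) / len(track_lufs)
    return 10.0 * math.log10(mean_energy)


def playback_level(track: float, reference: float,
                   target: float = TARGET_LUFS) -> float:
    """Loudness at which a track plays when the album reference
    (average or loudest track) is aligned to the target."""
    return track + (target - reference)


# Hypothetical albums: nine soft songs plus one loud song, versus uniformly loud.
exploit_album = [-20.0] * 9 + [-6.0]
normal_album = [-8.0] * 10

# 'Average album loudness' variant: the lone loud song escapes normalisation.
avg_exploit = playback_level(-6.0, average_album_loudness(exploit_album))
avg_normal = playback_level(-8.0, average_album_loudness(normal_album))

# 'Loudest track' variant: both loud songs land on the target.
max_exploit = playback_level(-6.0, max(exploit_album))
max_normal = playback_level(-8.0, max(normal_album))
print(avg_exploit, avg_normal, max_exploit, max_normal)
```

Under the average variant, the loud song on the mostly soft album plays several dB louder than the songs of the uniformly loud album, which is the incentive that keeps the loudness war going. Under the loudest track variant, both play at the target.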
Eelco would like to thank TIDAL, Maurits Lamers of HKU and the MLA6 group. We would like to thank Eelco and the Loudness Petition Group for allowing us to reproduce these articles here.
As someone who has been involved in loudness for 5 years, as well as teaching and training in the broadcast sector to help broadcasters update their workflows to match the loudness delivery specs based on the universal loudness standard BS1770, I have found these two articles on music streaming loudness normalisation a revelation.
In broadcasting, we have effectively adopted a track normalisation model, where the loudness of each piece of content whether it is a trailer, advert, news, documentary or drama is measured and then normalised by our equivalent of the music mastering engineer. Then, all being well, the transmission process doesn't interfere with it all the way to the consumer.
This system has a lot of merits, but there have always been concerns voiced by people like Jake Knott and Rob Ashard, to name just two from here in the UK, about the lack of a broadcast equivalent of album normalisation, since there are shows that we would like to be a little louder, like light entertainment shows, and other shows that would benefit from being just a little quieter.
It is also interesting that all of this track versus album normalisation is being discussed in the context of mobile devices, which implies that this content is being consumed in high ambient noise environments. This is a challenge broadcast content creators also have to face, in both domestic and mobile locations, unlike our film colleagues, who have the advantage of a more controlled replay environment. So the fact that album normalisation is the outcome, even with test sequences deliberately chosen to be at the extreme end of the spectrum, in my opinion gives weight to the need for broadcast standards authorities like the EBU and ATSC, together with the DPP, which is doing great work in producing unified delivery specs for the UK and now the US, to consider an album-type normalisation amendment.
Coming back to the music streaming sector and track versus album normalisation, if you had asked me what the best thing to do was before I had read the work of the Loudness Petition Group, I would have replied instantly with track normalisation, and so this work has been a complete revelation to me. It is really interesting that Eelco was able to get the support of TIDAL to analyse their catalogue, and that the test with the playlist they assembled confirmed the album normalisation theory.
That said, the album normalisation model sits very well with the comfort zone of +2 to -3 LU that I talk about in my Understanding Loudness tutorial series, which is supported by research showing that if broadcast content stays within this comfort zone, consumers don't feel the need to reach for the volume control.
What I would like to see now is a larger survey of users, with playlists that are closer to the real world and do not have such an extreme range of tracks, to confirm the findings from this survey.
Finally, I must thank the team at the Loudness Petition Group of Eelco Grimm, Bob Katz, Matt Mayfield and Ian Shepherd for their work so far. Please consider going and signing the petition on Change.org, and I look forward to your thoughts and comments on all of this below.