AUDIOGEEK: Has YouTube Started Normalizing Audio For Uploads?
YouTube is by far one of the most popular ways to consume music the world over. The traffic that Tower of Doom videos (Tower Sessions, Re/Amp, etc.) receive from YouTube viewers literally dwarfs anything we have ever tried to use as a media platform for our company before and it seems like it is only getting more prevalent with time.
As a music producer and audio engineer myself, I know how this has affected my own workflow and thousands of other aspiring producers out there as well. However, lately we’ve noticed some changes to the way YouTube handles audio, so we ran some tests and wanted to share our findings in case any of you have been noticing some discrepancies. But first let’s look at the big picture:
Some of the concepts and issues discussed below are super
nerdy and will be incredibly boring for the average music fan.
Continue reading at your own risk!
Seriously though, if you're not into longass articles about audio
production please feel free to click on another article link to the right.
The good thing about YouTube is that it has made it super easy for literally anyone and their mother to upload and release content to the world, and its global audience is growing exponentially every day.
At the time of this writing, our own TOWERofDOOM YouTube channel has amassed over 164,000 subscribers and has over 51 million views. That's a lot of traffic!
What's even more surprising is that this channel growth only began to take off during the lifespan of our YouTube show, Tower Sessions. The channel has been active since 2006 (there are some really dumb early uploads that you can all still watch if you dig deep enough) but it only really hit its stride when we started producing and uploading our live performance episodes, and this made us start paying more and more attention to how our audio was translating on YouTube.
The bad thing about YouTube is the audio quality.
Sound engineers will spend hours or even days obsessing about how to translate their mixes onto the platform properly, in the hope that whatever they're hearing on their headphones or studio monitors will be what everyone else is hearing throughout the world.
So, you spend absurd amounts of time mixing and mastering your work until you have the perfect audio upload, only to be sorely disappointed when you watch it live online and it's streaming at a whopping 128 kbps! What is this, 1995? Take note: this is even before reading any comments, which are undoubtedly just ready and waiting to rip your engineer ego a new one.
Is YouTube F*cking With Your Master?
Initially, we were quite happy with the way YouTube represented our Tower Sessions audio and you could even upgrade the quality by switching over to a higher resolution if you wanted to (480p, 720p, 1080p and so on).
Unfortunately, this doesn't seem to be the case anymore. Tests performed by fellow audiogeeks out there have found that YouTube is now consistently streaming audio at a lowly 128 kbps for all video resolutions.
Upon discovering this, I decided to conduct some tests of my own just to see if this was all true and I actually stumbled across something entirely different in the process!
YouTube is Most Likely Auto-Normalizing Audio for Uploads
I spent some time running the YouTube audio output through my Focusrite Saffire Pro 40 Stereo In Loop feature direct into my trusty Cubase DAW and recorded audio streaming from video to video (awesome functionality by the way: it lets you record anything playing back in your web browser, or anything else running on your computer for that matter).
I made sure to pick songs from different producers, artists, genres, etc. The YouTube audio controls were left maxed out and streaming into my DAW on one long continuous track.
After going through several videos, I noticed that while some of the videos were streaming your average "super-loud mastered" audio, most of them (especially those uploaded recently in 2015) were averaging at a relatively "soft" -4 to -3 dB on my Cubase meters.
At first, I chalked it up to engineers uploading softer masters in general, since the whole loudness wars thing has been debated to death... but then I noticed that almost every audio stream I had recorded was being displayed at the same -4 to -3 readings from video to video.
Here's a long audio recording of several different YouTube videos played in succession. The waveforms look different but most of them averaged at around -4 to -3. Seen above, only 2 of the songs I randomly selected were still as loud as your usual master. The waveforms are displayed in white against a yellow background.
So, here I started thinking: there's just no way that ALL the engineers who mixed all these different songs, between different genres and different artists, are ALL reading or watching the exact same "YouTube Audio Loudness" tutorial!
Even our own catalog of uploads will differ from year to year as our own tastes and audio preferences mature through time. It's just impossible that everyone was uploading this audio at the same average loudness. So, I decided to conduct a test to be sure.
Testing 1, 2, 3...
I used some of my favorite loudness measurement tools and plugins (any true audiogeek knows what I'm talking about) and measured the loudness of one of my own recent uploads. The audio I mastered for it was leveled out like any other modern loud master would be. I gave it a few measures of headroom for the AAC conversion that our Tower of Doom video team uses for our YouTube uploads. I quickly dialed in some rough compliance settings dictated by Mastered for iTunes (MFIT) loudness standards, and called it a day.
SHAMELESS PLUG: By the way, I am a certified Mastered for iTunes mastering engineer for any of you in the Philippines that might need one... as far as I know, there aren't too many of us down here in the ASEAN region so please feel free to let me know if you guys need any help with MFIT compliance.
We then uploaded the video to YouTube and, once it went live, I took sample recordings of my uploaded audio, one at every resolution available (this was to help me wrap my head around the whole 128 kbps issue at different video resolutions) and I saved them within my Cubase project.
Here is the same upload being recorded at different video resolutions.
They are all playing at 128 kbps.
A week later, I rerecorded the same uploaded audio into the same Cubase project, making sure that all of the same factors were in place so I could easily compare the 2 recordings.
Lo and behold, there it was! YouTube had auto-normalized my initial audio down to the same -4 to -3 dB range. It was clear as day from the moment I saw the waveform coming together within my project: the week-old audio was visibly smaller in the waveform display than the one I had recorded on upload day.
The file to the left is the recorded audio from upload day. Notice the Cubase channel meter displaying -0.0.
The file to the right is the audio from a week later. The channel meter is now displaying -3.9 for the same audio.
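If you want to run the same kind of comparison outside a DAW, here is a minimal Python sketch of the idea. It is only an illustration: the function names are made up for this example, and a synthetic sine wave stands in for the two recordings (the "week later" copy is simply scaled down by the 3.9 dB seen on the meter). It measures the peak level of each recording and the gain offset between them.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0), similar to a peak meter reading."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak)

def rms(samples):
    """Root-mean-square level of the samples (linear, not dB)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def gain_offset_db(reference, later):
    """Gain difference in dB between two recordings of the same material."""
    return 20 * math.log10(rms(later) / rms(reference))

# Hypothetical stand-in data, NOT the real recordings:
# one second of a full-scale 440 Hz sine, and the same sine turned down by 3.9 dB.
rate = 44100
day_one = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
scale = 10 ** (-3.9 / 20)
week_later = [s * scale for s in day_one]

print("day one peak (dBFS):   ", round(peak_dbfs(day_one), 1))
print("week later peak (dBFS):", round(peak_dbfs(week_later), 1))
print("gain offset (dB):      ", round(gain_offset_db(day_one, week_later), 1))
```

With real recordings you would load the two captures as sample lists (scaled to the -1.0 to 1.0 range) in place of the synthetic sine. If the offset comes out the same across the whole file, that points to a plain gain change rather than extra compression or limiting.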
My theory is that YouTube is auto-normalizing audio for uploaded content across the board. It will leave things as is initially when your upload is fresh, but will end up normalizing your audio as soon as it finds time to get around to it!
During this test, I also decided to compare waveforms from past uploads in the same manner and also found the same things. Several of our videos that were all uploaded close to your usual loud master levels were all now normalized to the same levels retroactively.
It should be said, though: some of our Tower of Doom uploads have not been affected just yet and are still playing at the super loud standard you'd expect. I found this to be true for other uploads around YouTube as well, but I actually had a harder time finding loud uploads overall. This certainly wasn't the case only a few months ago in 2014. Is it only a matter of time before all of them are normalized this way?
YouTube is a BEAST
All of this is particularly interesting to me because it means that YouTube is somehow feeding ALL the audio from ALL their videos through some sort of normalizing beast of an algorithm, that is churning out re-mastered versions for uploads.
This is not only happening to brand new uploads rather quickly (one week after the fact is an incredible turnaround time if you consider that almost 300 hours of video are uploaded to YouTube every single minute) but it is also happening to ALL of the existing content already uploaded to their site! Audio quality aside, this really must be some sort of technical feat to put into motion.
Another possibility could be that YouTube is applying the audio normalization in real time as it encodes the video for playback. This would explain how it is retroactively adjusting other videos from the past, but it doesn't explain the random old videos that haven't been affected or the fact that my upload was not normalized during the initial recording on day one.
Questions, questions, questions...
I think all of this is actually a good thing, but it sure is surprising that no one from YouTube or Google has really announced it. All I found searching for it were Google and YouTube support forum requests for an auto-normalize feature, so it would seem that they are rolling this out silently.
At the end of the day, this will possibly bring life and dynamics back to the audio engineer realm. It won't really matter for audio to be mastered super loud anymore, because an auto-normalizing feature will make crushed masters sound quite small after being processed.
I found this to be true for some of the louder masters we uploaded over the last three years that were affected by YouTube's alleged normalizing feature. In comparison to the softer uploads that were also normalized, the louder masters were given less perceived volume by the algorithm and sounded weaker or smaller by default.
This is most likely an effect of less dynamic transient information being available in the waveform upon encoding. I'm guessing the algorithm probably analyzes this and gives louder uploads lower overall volume capacity, thus normalizing the file to a much softer perceived volume than more dynamic uploads with more transient information available within the file.
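To make that guess a bit more concrete, here is a rough Python sketch of why a crushed master can end up "smaller" after normalization. Everything in it is an assumption for illustration: nobody outside YouTube knows what their algorithm measures, so this uses plain RMS as a crude stand-in for loudness, a made-up target of -16 dB, and hard clipping as a cartoon version of heavy limiting. The test signal is a synthetic burst pattern, not a real mix.

```python
import math

def rms(samples):
    """Root-mean-square level of the samples (linear, not dB)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def to_db(value):
    """Convert a linear amplitude to dB relative to full scale (1.0)."""
    return 20 * math.log10(value)

def normalize_to_rms(samples, target_db):
    """Scale the samples so their RMS level lands on target_db."""
    gain = 10 ** (target_db / 20) / rms(samples)
    return [s * gain for s in samples]

def crush(samples, ceiling):
    """Cartoon 'loud master': hard-clip at the ceiling, then bring the peaks back up to full scale."""
    clipped = [max(-ceiling, min(ceiling, s)) for s in samples]
    return [s / ceiling for s in clipped]

# Hypothetical test signal: four decaying 110 Hz bursts, a stand-in for punchy transients.
rate = 8000
dynamic = [
    math.sin(2 * math.pi * 110 * n / rate) * math.exp(-6 * (n % 2000) / 2000)
    for n in range(rate)
]
crushed = crush(dynamic, 0.2)

TARGET_DB = -16.0  # assumed target level, not a published YouTube figure

results = {}
for name, signal in (("dynamic", dynamic), ("crushed", crushed)):
    out = normalize_to_rms(signal, TARGET_DB)
    results[name] = (to_db(rms(out)), to_db(max(abs(s) for s in out)))
    print(name, "RMS dB:", round(results[name][0], 1), "peak dB:", round(results[name][1], 1))
```

Both versions come out at the same RMS level, but the crushed one ends up with much lower peaks because its transients were flattened before the gain was pulled down. That matches what I heard, though a real normalizer almost certainly uses a more perceptual loudness measure than simple RMS.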
Of course, all of this is really just conjecture until we have someone from within YouTube or Google come out with some real specs about the issue. Until then, I hope this analysis helps any of you out there wondering why your uploads are sounding different!
Here's a video explanation (from the f*ckin man, Ian Shepherd #masteringlikeaboss) of how crushing transients in exchange for loudness will affect your audio upon auto-normalization. Of course, listening to everything through your studio monitors at 128 kbps kind of defeats the purpose, but the argument is explained pretty well. Though I personally believe in adjusting my own loudness tastes in order to properly serve our clients at Tower of Doom, I think it's also really important for engineers to appreciate creating music in both forms (softer with dynamic punch / super loud ghetto blaster explosion). Everyone should at least try a little bit of both, because either way you'll have learned a bit more.
Hi everyone! This is AudioGeek, a new segment I decided to start that will contain more technical audio engineer information for any of you out there looking for new resources.
Back in 2001-2002 when I started my journey as an audio engineer and producer, there was nothing available to me other than super technical forums and websites that I had to wade through on my own. I only had a very basic understanding of audio engineering in general and it was very difficult to learn how to properly create and record music without someone to guide me.
Nowadays, things are WAY different and there is a YouTube tutorial for almost anything I can think of! Sometimes, I wish I could rewind time and tell myself from all those years ago about all the mistakes I was making on my work, but it is what it is and here we are today, with more help available than most of us even know what to do with, all at the cost of a quick Google search.
I'm sure any of you budding producers out there will know: there is already a WEALTH of audio engineering tutorials available online and I really don't want to waste anyone's time repeating anything. Instead, I'd like to try to write about things that haven't been talked about too much and add to the knowledge that is already out there.
I spend most of my time working in Tower of Doom studios so I don't really know how often I will be able to work on these posts but let's see how it goes! If any of you have any ideas for topics or questions you'd like me to answer, please feel free to let me know in the comments below and I will try my best to get back to you.
Hopefully, I can write these articles for any of you out there that may be interested in learning more about audio production and the like. I only hope I can offer some help to someone out there that is starting out like I was all those years ago, with no one to approach for guidance.
Thanks and I hope you guys enjoyed my first article!