Old 2 weeks ago
  #1
Audio Alchemist
 
Lagerfeldt's Avatar
YouTube Sample Rate for Music Videos

There's a lot of confusion about what sample rate to use for YouTube.

YouTube's own guidelines for music videos
Bit depth: 24 bit is recommended, 16 bit accepted
Sample rate: 44.1 kHz is recommended, higher rates accepted
Channels: Stereo
Format: PCM (WAV or AIFF)

If you insist on delivering compressed audio the guidelines state:

Codec: AAC-LC
Sample rate: 44.1 kHz
Bit rate: 320 kbps or higher, 256 kbps accepted
Channels: Stereo

(source: Encoding specifications for music videos - YouTube Help)

YouTube's own guidelines for other or general material
Codec: AAC-LC
Channels: Stereo or Stereo + 5.1
Sample rate: 96 kHz or 48 kHz

Other formats are clearly accepted even though it's not stated, but the above is the official recommendation. Apparently YouTube doesn't want uncompressed audio here, which means transcoding is bound to happen. D'oh!

Other tests I've done don't show any penalty to the video quality if you upload full quality PCM audio instead of the recommended AAC, so there you go. I did this test to see if the total size of the video (i.e. the audio contribution in terms of file size) would compromise the video quality.

(source: Recommended upload encoding settings - YouTube Help)

Test
Obviously there's some potentially conflicting information so I decided to test what actually sounds or measures best, and what happens at different source sample rates.

Test files
· 24 bit PCM WAV
· 20 Hz to 20 kHz slow sine sweep generated at the actual target sample rate
· Identical sample peak, true peak and integrated loudness, headroom of 6 dB
· Accompanying video is 1080p HQ with no lossy compression of audio.
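
For anyone who wants to repeat the test, here is a minimal sketch of how such test files could be generated with nothing but the Python standard library. It is not the tool used for the files above. The sweep shape (logarithmic), the 30 second duration, the mono layout and the function name `write_sine_sweep` are all assumptions for illustration; only the 20 Hz to 20 kHz range, the 24 bit PCM WAV format and the 6 dB of headroom come from the list above.

```python
import math
import wave

def write_sine_sweep(path, sample_rate, duration_s=30.0,
                     f_start=20.0, f_end=20000.0, peak_dbfs=-6.0):
    """Write a mono 24 bit PCM WAV containing a logarithmic sine sweep.

    Duration, sweep shape and mono layout are illustrative assumptions.
    Returns the number of frames written.
    """
    n_frames = int(sample_rate * duration_s)
    amplitude = 10 ** (peak_dbfs / 20.0)   # -6 dBFS is roughly 0.501 of full scale
    full_scale = 2 ** 23 - 1               # largest positive 24 bit sample value
    k = math.log(f_end / f_start)          # natural log of the frequency ratio
    frames = bytearray()
    for n in range(n_frames):
        t = n / sample_rate
        # Phase is 2*pi times the integral of f(t) = f_start * exp(k * t / T)
        phase = (2 * math.pi * f_start * duration_s / k
                 * (math.exp(k * t / duration_s) - 1))
        sample = int(round(amplitude * full_scale * math.sin(phase)))
        frames += sample.to_bytes(3, "little", signed=True)  # 24 bit little endian
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(3)                # 3 bytes per sample = 24 bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

Calling `write_sine_sweep("sweep_48000.wav", 48000)` once per rate (44100, 48000, 96000) gives one file per source sample rate, each generated natively at that rate rather than resampled.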

Why a simple sine sweep? It covers the full frequency spectrum and it's easy to detect distortion or aberrations. Actual music might mask problems in the converted file (which could be a good thing in real life, though).

Maybe I'll do a square wave test as well to see what happens with inter-sample peaks and how this could also affect the loudness reading slightly in a negative fashion.
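
As a rough illustration of what an inter-sample peak is, the textbook case is a sine rather than a square wave: a full-scale sine at a quarter of the sample rate with a 45 degree phase offset. The sketch below is only that textbook case, not the planned square wave test, and the helper name `sample_peak_dbfs` is made up for the example.

```python
import math

def sample_peak_dbfs(samples):
    """Highest absolute sample value in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

# Full-scale sine at fs/4, phase offset by 45 degrees:
# every sample lands on +/-0.7071 and never on the crest of the waveform.
sample_rate = 48000
freq = sample_rate / 4
samples = [math.sin(2 * math.pi * freq * n / sample_rate + math.pi / 4)
           for n in range(64)]

peak = sample_peak_dbfs(samples)  # about -3.01 dBFS
```

The underlying waveform still peaks at 0 dBFS, so a sample peak meter under-reads it by about 3 dB. That gap is what a true peak meter tries to close by oversampling.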

Captured audio
· Converted audio from YouTube was captured digitally
· Converted audio was sharply low pass filtered @ 15 kHz by YouTube
· Converted audio was lowered in level by YouTube to match YouTube’s loudness normalization target

System playback @ 44.1 kHz

Results
Uploaded source: 44.1 kHz
Aliasing artifacts: None
Other noise: Very low noise around the AAC noisefloor @ 20-500 Hz
True Peak level: -14.93 dBTP
Sample peak level: -15.01 dBFS
Max. momentary loudness: -13.1 LUFS (I) (loudest in test)
Total RMS level: -16.52 dB (loudest in test)

Uploaded source: 48 kHz
Aliasing artifacts: Yes
Other noise: No
True Peak level: -15.63 dBTP (lowest true peak in test)
Sample peak level: -15.69 dBFS (lowest sample peak in test)
Max. momentary loudness: -13.2 LUFS (I)
Total RMS level: -16.53 dB

Uploaded source: 96 kHz
Aliasing artifacts: None
Other noise: No
True Peak level: -14.87 dBTP (highest true peak in test)
Sample peak level: -14.93 dBFS (highest sample peak in test)
Max. momentary loudness: -13.2 LUFS (I)
Total RMS level: -16.53 dB
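
To put the peak numbers side by side, here is a small arithmetic sketch. It assumes that "headroom of 6 dB" means the uploaded files peaked at -6 dBFS, which is not stated explicitly; the captured values are the sample peak levels listed above.

```python
SOURCE_PEAK_DBFS = -6.0  # assumption: "headroom of 6 dB" = -6 dBFS source peak

captured_sample_peak = {  # dBFS, taken from the results above
    "44.1 kHz": -15.01,
    "48 kHz": -15.69,
    "96 kHz": -14.93,
}

def level_change(captured_dbfs, source_dbfs=SOURCE_PEAK_DBFS):
    """Peak level change between upload and capture, in dB."""
    return round(captured_dbfs - source_dbfs, 2)

changes = {rate: level_change(p) for rate, p in captured_sample_peak.items()}
spread = round(max(captured_sample_peak.values())
               - min(captured_sample_peak.values()), 2)
```

Under that assumption the three uploads came back about 8.9 to 9.7 dB lower in sample peak, and the spread between the highest (96 kHz) and lowest (48 kHz) peak is 0.76 dB.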

Conclusion
It's hard to say what will sound worse on YouTube, but the 48 kHz conversion is the only one that has aliasing artifacts, which is a big no-no in my book. The 48 kHz version also features the lowest peak values after conversion for some reason.
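
For readers unsure what aliasing means here: any component above half the sample rate that isn't filtered out before resampling folds back into the audible band. The sketch below shows only that folding arithmetic. The function name and the example frequencies are hypothetical, and nothing here claims to describe where in YouTube's chain the artifacts actually arise.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Frequency at which a tone at f_hz shows up after sampling at
    sample_rate_hz with no band-limiting (folded into 0..sample_rate/2)."""
    f = f_hz % sample_rate_hz
    return sample_rate_hz - f if f > sample_rate_hz / 2 else f

alias_frequency(10000, 44100)  # 10000: below Nyquist, unchanged
alias_frequency(30000, 44100)  # 14100: folded back into the audio band
```

With a sine sweep the folded components move in the opposite direction to the sweep itself, which is why aliasing tends to show up as extra diagonal lines in a spectrogram.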

Distortion is nearly identical between the 44.1 and 96 kHz files. The 48 kHz file seems to have fewer problems in the low mids, but it has distortion in more areas in the high end.

On the other hand the 44.1 kHz conversion was the only one to feature some measurable noise around the AAC noisefloor, approximately at -100 dBFS, which I would expect to either be in all or none of the conversions.

For now I think it's safe to conclude that if you've mastered a track with a target rate of 44.1 kHz for digital aggregation or CD, there's no point in doing a specific 48 kHz version for YouTube. It could even be detrimental to the sound.

However, a 96 kHz version shouldn't be bad if you're already working at that sample rate, but upsampling later won't help.

Disclaimer and other practical considerations
Here's the catch:

Many or most video editors by default work in 48 kHz sessions and will import a 44.1 kHz master into that session, automatically converting it to 48 kHz using whatever crap SRC is built into the video software. It's lazy, but that's the way it is.
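
One reason the quality of that conversion matters: 44.1 kHz to 48 kHz is not a simple ratio. A minimal sketch of the arithmetic, with `src_ratio` as a made-up helper name:

```python
from fractions import Fraction

def src_ratio(source_rate, target_rate):
    """Exact resampling ratio as (upsample factor, downsample factor)."""
    r = Fraction(target_rate, source_rate)
    return r.numerator, r.denominator

src_ratio(44100, 48000)  # (160, 147)
src_ratio(48000, 96000)  # (2, 1)
```

Going from 44.1 to 48 kHz means interpolating by 160 and decimating by 147 (or an equivalent filter design), so the result depends heavily on how good the converter's filtering is. Doubling from 48 to 96 kHz is a far simpler case.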

So ironically you might be better off doing your own high quality 48 kHz SRC (using e.g. iZotope RX 7) for the video guy/gal, even though it's likely worse for YouTube in the end. A better solution would be to educate the video editor and label, if you have the guts, but then there's VEVO...

VEVO is apparently different. Specs here are 44.1 kHz for AAC, but 24 or 16 bit 48 kHz for PCM, with Little Endian byte order during video export. So there's no way around a 48 kHz version for VEVO if the specs are to be believed, i.e. the video might be rejected on upload if it's PCM and not 48 kHz.

Pics

44.1 kHz sine sweep after conversion


48 kHz sine sweep after conversion, notice the aliasing lines in the background not present in the other conversions


96 kHz sine sweep after conversion

Download or open in new tabs to blow up details.

Last edited by Lagerfeldt; 2 weeks ago at 04:04 PM..
Old 2 weeks ago
  #2
Lives for gear
 
Virtalahde's Avatar
 

Verified Member
Thanks for the insight, good stuff!

Quote:
Originally Posted by Lagerfeldt View Post
Many or most video editors by default work in 48 kHz sessions and will import a 44.1 kHz master into that session, automatically converting it to 48 kHz using what ever crap SRC built into the editor.

So ironically you might be better off doing your own high quality 48 kHz SRC'ed for the video editor even though it's likely worse for YouTube.
Which is exactly what I do. If SRC results in overs, I SRC an unlimited master and re-apply the limiter settings @ 48 kHz. Usually I also lower the ceiling a little further.

Many people editing videos also seem to be completely unaware of what sample rate they are using. I have battled this many times..
Old 2 weeks ago
  #3
Audio Alchemist
 
Lagerfeldt's Avatar
It's all going to play through an iPhone anyway. At 48 kHz. ;-)
Old 2 weeks ago
  #4
Lives for gear
Thanks for this Holger!
Old 2 weeks ago
  #5
Lives for gear
 
Giuseppe Zaccaria's Avatar
 

Thanks for the detailed insight Holger!
My experience with this subject, after extensive listening, is that 44.1 kHz 24 bit is the best one for streaming services.
Old 2 weeks ago
  #6
Audio Alchemist
 
Lagerfeldt's Avatar
Many streaming services don't accept 24 bit files yet. So 16 bit is still necessary in many places.

The difference between a dithered 16 bit master and a 24 bit master is negligible, but a different sample rate can wreak havoc with the material in comparison.
Old 2 weeks ago
  #7
This is great Lagerfeldt!
I've been meaning to research this for a long time - you saved me the trouble.
thanks!

C
Old 2 weeks ago
  #8
Here for the gear
 

Quote:
Originally Posted by Lagerfeldt View Post
Test files
· 24 bit PCM WAV
· 20 Hz to 20 kHz slow sine sweep generated at the actual target sample rate
· Identical sample peak, true peak and integrated loudness, headroom of 6 dB
· Accompanying video is 1080p HQ with no lossy compression of audio.
Lagerfeldt, What video codec did you use?

It's most common for video editors to use H.264 for YouTube, which doesn't even have the option to contain lossless WAV audio, only AAC or MP3 (at least it doesn't have this option in Premiere...)
Thank you!
Old 1 week ago
  #9
Audio Alchemist
 
Lagerfeldt's Avatar
Quote:
Originally Posted by DooBop View Post
Lagerfeldt, What video codec did you use?

It's most common for video editors to use H.264 for YouTube, which doesn't even have the option to contain lossless WAV audio, only AAC or MP3 (at least it doesn't have this option in Premiere...)
Thank you!
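H.264 video can be delivered with lossless audio. H.264 itself only covers the picture; whether the audio can be lossless depends on the container and the export tool.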

In fact I did several H.264 encoded (test) videos with lossless audio, but Apple ProRes 422 is my preferred codec after extensive comparisons.

As you correctly point out, for some reason Adobe Premiere Pro doesn't have an option for lossless audio with the standard H.264 export, only with H.264 for Blu-ray via Adobe Media Encoder. I guess audio isn't their strong suit.

I use Final Cut Pro X and Apple Compressor, which allows for very detailed export settings and encoding tweaks, including H.264 encoding with lossless linear PCM audio.
Old 1 week ago
  #10
Here for the gear
 

Thanks Lagerfeldt!

Quote:
Originally Posted by Lagerfeldt View Post
System playback @ 44.1 kHz
Does this mean that YouTube re-samples everything to 44.1 kHz?
If yes, it only makes sense that a 44.1 kHz file will sound better in the end.

I also assume that the audio is being converted by YouTube to compressed AAC or MP3?
In this case, I wonder what's better:
A. To upload the best sounding WAV file and let YouTube compress it, or
B. To try to find out the exact specifications of the compression YouTube uses, do it myself, and hope the audio will be left untouched by YouTube.
Old 1 week ago
  #11
Audio Alchemist
 
Lagerfeldt's Avatar
Option A is the only real choice.

YouTube will always encode your audio.

This potentially leads to transcoding (=double encoding) if your audio is already lossy encoded, regardless of how closely you match YouTube's own settings in your uploaded video.

AFAIK YouTube still uses a variety of audio codecs for different platforms, including AAC, Opus, Vorbis, and even MP3. Anything you give YouTube will be chewed up and spat out in these formats depending on what platform is used to play the video.
Old 1 week ago
  #12
Here for the gear
Thanks a lot Lagerfeldt! I've been struggling with this for a long time... Like mastervargas said, you saved me the trouble.

I'm still wondering about 24 vs 16 bits for YouTube?

Quote:
Originally Posted by Lagerfeldt View Post

On the other hand the 44.1 kHz conversion was the only one to feature some measurable noise around the AAC noisefloor, approximately at -100 dBFS, which I would expect to either be in all or none of the conversions.
With a -100 dBFS noisefloor it seems that we are closer to 16 than 24 bits of dynamic range. To my mind it's still useful to upload a 24 bit master to avoid the dither noise from the 24 > 16 bit conversion.

What do you think?
Old 1 week ago
  #13
Audio Alchemist
 
Lagerfeldt's Avatar
That's a 16 bit-ish noisefloor yes.
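
As a rough sanity check, the textbook figure for an ideal N bit quantizer with a full-scale sine is about 6.02 × N + 1.76 dB. The sketch below just evaluates that formula; the function name is made up for the example. Keep in mind that a noisefloor read off a spectrum display also depends on the analyzer settings, so this is only a ballpark comparison.

```python
import math

def quantization_snr_db(bits):
    """Theoretical SNR of an ideal N bit quantizer for a full-scale sine,
    i.e. roughly 6.02 * N + 1.76 dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

snr_16 = quantization_snr_db(16)  # about 98.1 dB
snr_24 = quantization_snr_db(24)  # about 146.3 dB
```

A floor around -100 dBFS sits right next to the 16 bit figure and nowhere near the 24 bit one.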

It's still prudent to upload the 24 bit version instead of 16 bit, though.

That's mainly theoretical, since the difference between a dithered 16 bit and the original 24 bit source is already extremely small. After lossy conversion I doubt you can even measure the difference, let alone hear it.

The reason isn't to avoid the dither noise from the 24 bit > 16 bit conversion per se, but to give the lossy encoding process as many "original" bits to work with as possible.

Lossy formats like AAC don't have a fixed bit depth and can theoretically benefit from having a 24 bit source.
Old 1 week ago
  #14
Lives for gear
 
cheu78's Avatar
 

Verified Member
Thanks a lot Lagerfeldt!




Cheu
Old 1 week ago
  #15
Lives for gear
 

Thanks for the information.

Just to clarify. Everything eventually plays at 44.1k on YouTube?

I tend to use mostly 48k, as it seems the best overall compromise. But it's the worst option for YouTube conversion?