Those of us who post “look at me play guitar” videos on YouTube are quite interested in the audio quality of the resulting clips. I have some ideas about how to optimize uploads and downloads, and also some recordings that you can use to judge the current audio quality, at least as it pertains to solo acoustic guitar.
The first thing to say loudly is that YouTube is a rapidly changing environment. In five years we’ve gone from heavily compressed SD to 1080p HD and more. The audio has improved from 94 kbps AAC audio to a recent high of 152 kbps. The future will see new upload formats, new transcoding tools, new download formats and bandwidth. So anything in this post is subject to change at any time.
Everything Gets Recompressed
As far as I know, YouTube always transcodes (recompresses) every upload. This is their stated process and I’ve never seen evidence otherwise. Folks who try to choose a format that avoids transcoding are going in the wrong direction. Instead, it makes sense to me to search for the most recent guidelines from YouTube and follow them. Here’s the YouTube help page “Video Encoding”. There’s a lot of information for specific video editors, a list of supported formats, and more. But I would boil the basic advice down to this – upload high quality video and audio.
Good Video Gets Good Audio
Since we’re focusing on audio quality here, let me start out with an important bit of knowledge that might seem contradictory. Your video format can limit your audio quality. YouTube delivers video downloads in several formats and sizes, but the largest format it will provide is the one you uploaded. So if you provide a 360P or 480P file, YouTube will offer nothing larger than that for download. More significantly, the audio in that download will only be 129 kbps AAC, compared to the 152 kbps delivered with higher resolution video formats.
In order to get the maximum quality audio with your YouTube download, your upload should be at least 720P (1280×720 pixels). The audio should be the highest quality you can supply: preferably uncompressed PCM (as in a WAV file), otherwise a high bit rate MP3 or AAC. This results in large files and long uploads, but the current YouTube uploader has been quite reliable for me, and the upload can take place in the background or unattended.
To give a more concrete idea of what I’m talking about, here are the statistics on one of my clips, Uncle Sonny Chillingworth’s Slack Key #2 (Mahina’s Trot) (http://www.youtube.com/watch?v=RjV1zzM6RDM), posted at my slack key channel (http://www.youtube.com/franguidry).
|Format|Resolution|Video bit rate|Audio codec|Audio bit rate|
|---|---|---|---|---|
|240|426×240|270 Kbps|MP3|64 Kbps|
|360|640×360|298 Kbps|AAC|129 Kbps|
|480|854×480|484 Kbps|AAC|129 Kbps|
|720|1280×720|1242 Kbps|AAC|152 Kbps|
|1080|1920×1080|2873 Kbps|AAC|152 Kbps|
We can see three levels of audio quality along with the five levels of video. The 1080 file delivers 10 times the video data contained in the 240 stream, so we need a pretty high speed connection to view these files without buffering and pausing. The 720 stream is the smallest video format that delivers the 152 Kbps audio. For most of us the 720 stream is the best choice, since it’s more likely to stream smoothly but includes the highest quality audio.
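The comparisons in the paragraph above can be checked with a few lines of Python. This is only a sketch: the numbers are copied from the table for this one clip, the function names are made up for illustration, and adding the video and audio rates together to estimate the bandwidth a stream needs is my assumption, since it ignores container overhead.

```python
# Stream statistics copied from the table above:
# (format, video bit rate in Kbps, audio codec, audio bit rate in Kbps).
STREAMS = [
    (240, 270, "MP3", 64),
    (360, 298, "AAC", 129),
    (480, 484, "AAC", 129),
    (720, 1242, "AAC", 152),
    (1080, 2873, "AAC", 152),
]


def video_ratio(streams, high, low):
    """How many times more video data the `high` format carries than `low`."""
    rates = {fmt: video for fmt, video, _codec, _audio in streams}
    return rates[high] / rates[low]


def smallest_format_with_best_audio(streams):
    """Smallest video format that still gets the highest audio bit rate."""
    best_audio = max(audio for _fmt, _video, _codec, audio in streams)
    return min(fmt for fmt, _video, _codec, audio in streams if audio == best_audio)


def total_kbps(streams, fmt):
    """Video plus audio bit rate (assumption: container overhead is ignored)."""
    for stream_fmt, video, _codec, audio in streams:
        if stream_fmt == fmt:
            return video + audio
    raise ValueError(f"no stream with format {fmt}")


if __name__ == "__main__":
    print(f"1080 vs 240 video data: {video_ratio(STREAMS, 1080, 240):.1f}x")
    print(f"Smallest format with best audio: {smallest_format_with_best_audio(STREAMS)}")
    print(f"Approximate bandwidth for 720: {total_kbps(STREAMS, 720)} Kbps")
```

For this clip the 720 stream needs well under half the bandwidth of the 1080 stream while carrying the same audio, which is why I suggest it for most listeners.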
What Do You Hear?
That brings up a fascinating question: how good is YouTube audio? It certainly has been terrible in the past, at least my results were terrible and many other people complained. Have the recent upgrades improved the quality enough to match the quality of the original file we uploaded? If you’ve read some of my other posts about comparing audio, you know what I’m going to say: ABX! Listening tests that are not well controlled are worse than no test at all, because they convey bad information. Good listening tests are level matched, use the same performance, and are double blind.
I’m going to make a couple of files available for you to compare, one extracted from the video I uploaded, the other from the same video after YouTube processed it and I downloaded it. In the past I’ve sent the comparison key in response to any post or email requesting it, but this time I’ll only send the key if you convince me that you performed an ABX test on the files.
What Is ABX?
ABX is a testing technique for confirming an audible difference between two test subjects. The two subjects are A and B, and one of them is presented as X, the unknown. The listener identifies X as either A or B, and repeats this evaluation enough times to rule out lucky guessing as the explanation for the result. For hardware devices it requires a lot of trickery and trouble, but for digital recordings ABX is a snap. There are free ABX tools for both the PC and Mac. This post at the Hydrogenaudio Forum describes ABX in more detail and explains why a number of trials is necessary. This Wiki article explains why the rule of thumb for a demonstration of difference is 13 successes out of 16 trials.
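To see where a threshold like 13 out of 16 comes from, it helps to work out how often pure guessing would do that well. The sketch below assumes each trial is an independent coin flip (a 50% chance of naming X correctly) and adds up the binomial tail. The function name is mine, not something taken from an ABX tool.

```python
from math import comb


def guessing_probability(correct, trials):
    """Chance of at least `correct` right answers in `trials` by pure guessing.

    Assumes every trial is an independent 50/50 guess.
    """
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


if __name__ == "__main__":
    for correct in (12, 13, 14, 16):
        p = guessing_probability(correct, 16)
        print(f"{correct}/16 or better by guessing: {p:.4f}")
```

Under that assumption, 13 or more correct out of 16 happens by luck only about 1% of the time, while 12 or more happens nearly 4% of the time. That is why a handful of trials proves very little and a longer run with a high score is persuasive.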
I created a blog post and a video explaining how to download and use the foobar2000 ABX tool. Please note that my current advice on the number of trials would be a minimum of 16 trials with at least 13 correct answers.
Mac users can download ABXer, another free utility that manages ABX sessions for you.
For Your Listening Pleasure
So here are a couple of files you can download and compare:
These are the audio tracks from a video I posted recently on my slack key channel, Uncle Sonny Chillingworth’s Slack Key #2 (Mahina’s Trot). I shot this video on a Panasonic Lumix GH2 and captured the audio using a Sony PCM-D50. I tweaked the audio a bit when I edited the video in Edius and exported a Blu-ray format video clip with a 48 kHz 16 bit stereo PCM audio stream (I used 48 kHz because that is the default in Edius, not because I think the audio quality is any different from 44.1 kHz). This is an uncompressed format, like WAV or AIFF, and requires a bit rate of 1536 Kbps, quite a big pipe.
When I captured the YouTube 1080 download, the audio had been compressed to a 152 Kbps 44.1 kHz stereo AAC stream. The audio is 1/10th the size after YouTube processes it. That audio compression is very useful, because it delivers smooth streaming over a slower connection and it leaves more bandwidth for the video side. But does the result deliver satisfactory audio quality?
I used FFmpeg to extract the audio from the original upload clip. I loaded the 1080 download MP4 from YouTube into REAPER and rendered the audio stream as a 48 kHz 16 bit PCM stereo file, matching the one from the upload clip. Of course, REAPER also uses FFmpeg to manage video.
Now it’s up to you. Download the files; they’re big, so be prepared for a wait. Download and install an ABX tool for your OS. Take 16 tries at telling the files apart, and let me know how it goes. Can you tell the difference between the original and the file with 10 to 1 compression?
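The 1536 Kbps figure and the 10 to 1 ratio mentioned above follow directly from the PCM format, and the extraction step can be sketched the same way. Treat this as an illustration only: the file names are hypothetical, the function names are mine, and the FFmpeg options shown (`-vn` to drop the video, `-c:a pcm_s16le` for 16 bit PCM, `-ar` and `-ac` for sample rate and channel count) are one reasonable way to pull a WAV out of a clip, not necessarily the exact command I ran.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in Kbps (1 Kbps = 1000 bits/s)."""
    return sample_rate_hz * bit_depth * channels / 1000


def ffmpeg_extract_args(source, destination, sample_rate_hz=48000, channels=2):
    """Build an FFmpeg argument list that writes the audio as 16 bit PCM.

    `source` and `destination` are placeholder file names (assumptions).
    """
    return [
        "ffmpeg",
        "-i", source,              # input clip
        "-vn",                     # ignore the video stream
        "-c:a", "pcm_s16le",       # 16 bit little-endian PCM, as used in WAV
        "-ar", str(sample_rate_hz),
        "-ac", str(channels),
        destination,
    ]


if __name__ == "__main__":
    original = pcm_bit_rate_kbps(48000, 16, 2)
    youtube_aac = 152  # Kbps, from the 1080 download described above
    print(f"48 kHz 16 bit stereo PCM: {original:.0f} Kbps")
    print(f"Compression ratio: {original / youtube_aac:.1f} to 1")
    # To run the extraction, pass the list to subprocess.run(args, check=True).
    print(" ".join(ffmpeg_extract_args("upload_clip.m2ts", "original.wav")))
```

Rendering both files to the same 48 kHz 16 bit PCM format matters for the comparison, because it keeps the file format itself from giving away which one is which.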
The reason I need your help is that it is impossible for me to prove the files don’t sound different. I can only demonstrate that I personally can’t hear a difference, and I’m not very motivated to hear the difference. I’m biased. Someone who is confident that a difference exists is likely to try harder, listen more intently and longer, to find that difference. So if a few of you can demonstrate that you hear a difference, I’m going to agree with you.
This entry was posted on Saturday, March 17th, 2012 at 12:34 pm and is filed under Audio, Comparisons, Video.