Audio Bitdepth

From wiki.jriver.com
Revision as of 15:40, 3 May 2015 by Glynor (talk | contribs)

Sound is simply a wave, and digital audio is the digital representation of this wave. The digital representation is achieved by "sampling" the magnitude of an analog signal many times per second. This can be thought of conceptually as recording the "height" of the wave many times per second.

Audio Bitdepth describes how precisely the height of the wave is measured.

Bits

There are several ways to describe the precision used for measuring the height of the sound wave.

One common unit in digital audio, and the unit used inside Media Center, is bits. This is where the name bitdepth comes from.

Bitdepth describes the number of binary digits (0s or 1s, since computers are binary) used for each height measurement of the sound wave.

For example, an audio CD is 16bit. This means each measurement of the sound wave will have one of 65536 (2^16) values.

A good DAC is 24bit, meaning each measurement will have one of 16777216 (2^24) values.
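The relationship between bits and the number of possible values can be checked with a few lines of C++. This is only an illustrative sketch; the function name LevelCount is made up for the example and is not part of Media Center.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct values an integer sample with the given bitdepth can take.
// Limited to fewer than 64 bits so the result fits in a 64-bit unsigned integer.
std::uint64_t LevelCount(unsigned bits)
{
    assert(bits < 64);
    return std::uint64_t{1} << bits; // 2^bits
}
```

Calling LevelCount(16) gives the 65536 values of an audio CD, and LevelCount(24) gives the 16777216 values of a 24bit DAC.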

Decibels (S/N ratio)

It's also possible to express the bitdepth as a signal-to-noise (S/N) ratio in decibels. Each bit is worth about 6dB (20 × log10(2) ≈ 6.02dB), so:

  • 16bit = 96dB S/N
  • 24bit = 144dB S/N
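The figures above use the rounded value of 6dB per bit. A small sketch of the underlying arithmetic, with a hypothetical helper named DynamicRangeDb, shows where the number comes from:

```cpp
#include <cassert>
#include <cmath>

// Theoretical range in decibels covered by an integer sample with `bits` bits:
// 20 * log10(2^bits), which works out to roughly 6.02 dB per bit.
double DynamicRangeDb(int bits)
{
    assert(bits > 0);
    return 20.0 * std::log10(2.0) * bits;
}
```

With the unrounded factor, 16bit comes out slightly above 96dB and 24bit slightly above 144dB, which is why the rounded figures are the ones usually quoted.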

Conversion

Converting from fewer bits to more bits is perfectly lossless. Conceptually, adding bits is like adding zeroes at the end of a decimal. For example, the number "10" might become "10.0" or "10.00" if you add more bits, but all three representations are perfectly identical.
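In integer terms, the "adding zeroes" idea amounts to appending zero bits. The sketch below, using made-up helper names, widens a 16bit sample to 24bit and back. It illustrates the concept only and is not how Media Center stores audio internally.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 16bit sample to 24bit by appending eight zero bits (multiply by 256).
std::int32_t Widen16To24(std::int16_t sample)
{
    return static_cast<std::int32_t>(sample) * 256;
}

// Narrow a padded 24bit sample back to 16bit by dropping the eight zero bits.
std::int16_t Narrow24To16(std::int32_t sample)
{
    assert(sample % 256 == 0); // only valid for samples that were padded
    return static_cast<std::int16_t>(sample / 256);
}
```

Every 16bit value survives the round trip unchanged, because the extra bits carry no information of their own.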

When Media Center reads audio data, all audio is first converted to 64bit. This ensures that any processing, such as digital volume, Replay Gain, or any other enabled DSP, is done with as much precision as possible. It also puts the data into a format that is efficient for a computer to handle, and allows tracks of varying bitdepths to transition seamlessly.

When outputting data to a soundcard or DAC, the 64bit data is converted back to the format required by hardware. This is often 24bit for high-end DACs.

The transition from 16bit to the output bitdepth (often 24bit) is bit-perfect. Again, it's like "10" vs "10.0".
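The same holds when the sample passes through a 64bit floating point value on the way. The sketch below assumes a common convention in which full scale is 1.0 (so a 16bit sample is divided by 32768 and a 24bit output is multiplied by 8388608). The convention and the function names are assumptions for illustration, not a description of Media Center's internals.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a 16bit sample to a 64bit floating point value in [-1.0, 1.0).
double DecodeSample16(std::int16_t sample)
{
    return sample / 32768.0;
}

// Convert a 64bit floating point value in [-1.0, 1.0) to a 24bit sample.
std::int32_t EncodeSample24(double value)
{
    assert(value >= -1.0 && value < 1.0);
    return static_cast<std::int32_t>(std::lround(value * 8388608.0));
}
```

Because both scale factors are powers of two, no rounding takes place: the 24bit output is exactly the 16bit input with eight zero bits appended.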

Output Bitdepth

It is recommended to output to your soundcard or DAC using the highest bitdepth that the hardware supports. This is 24bit for most high-end DACs.

If you play 16bit input, you might feel inclined to output 16bit data even though your DAC is 24bit. At best, this will sound the same as outputting 24bit, and it has two important drawbacks:

  • Transitioning between 16-bit and 24-bit source material will require reopening the audio hardware (making gapless transitions impossible)
  • If you apply any digital processing, including volume, the sound quality will be worse
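The second drawback can be illustrated with a single sample. The sketch below halves the volume of a 16bit sample and measures how far the output lands from the ideal result at a given output bitdepth. The helper names and the full-scale-equals-1.0 convention are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Round a value in [-1.0, 1.0) to the nearest step of the given output bitdepth.
double QuantizeTo(double value, int bits)
{
    assert(bits >= 2 && bits <= 32);
    const double scale = std::ldexp(1.0, bits - 1); // 2^(bits - 1)
    return std::round(value * scale) / scale;
}

// Error introduced by the output stage after applying a linear gain.
double OutputError(std::int16_t source, double gain, int outputBits)
{
    const double processed = (source / 32768.0) * gain;
    return std::fabs(processed - QuantizeTo(processed, outputBits));
}
```

For the sample 12345 with a gain of 0.5, the ideal result falls halfway between two 16bit steps, so OutputError(12345, 0.5, 16) is half a 16bit step, while OutputError(12345, 0.5, 24) is zero because the 24bit output has room for the extra precision.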

ASIO

If you use ASIO, the output bitdepth selection is ignored. This is because ASIO automatically delivers audio to the soundcard in the highest bitdepth that it supports. In other words, the ASIO framework (which is very good) has the advice above about using the highest bitdepth available built in.

Bit-Perfect

The precision offered by Media Center's 64bit audio engine is billions of times greater than the best hardware can utilize. In other words, it is bit-perfect on all known hardware.

To demonstrate the incredible precision of 64bit audio, imagine applying 100 million random volume changes (huge changes from -100 to 100 dB), and then applying those same 100 million volume changes again in the opposite direction.

Amazingly, you will have the exact same signal at 32bit after 200 million huge volume changes as when you started.

In other words, this incredible number of changes results in a bit-perfect output at 32bit, which is the highest hardware output bitdepth (most high-end hardware is 24bit).

This also means that a single volume change and a series of 100 million volume changes that add up to the same net result produce bit-identical output.

Bitdepth of Lossy Formats

Lossy formats like MP3 use floating point math to build their output values. There is no true or correct bitdepth in this case.

Media Center preserves the full 64bit precision when converting from the lossy format to PCM and uses as much precision as possible during output.

In other words, taking a 16bit input file, encoding as MP3, and then playing it in Media Center will cause 64bit data to be delivered to the playback engine. This does not mean that the file has improved, only that Media Center does the best job possible dealing with the MP3 (or other lossy) data that is not inherently precise to some number of bits.

C++ Proof of Example Above

This is the proof of the example listed above. It's only interesting if you are a programmer. It's provided here for completeness' sake, and to show that the information above is based on real-world, repeatable results.

// constants
const int nIterations = 100 * MILLION; // apply lots of volume changes
double dValue = 0.9323402123; // starting value (doesn't really matter what value we choose since it changes by huge random amounts)


// take a snapshot of the value at 32-bit (the highest used bitdepth of any hardware; most high-end hardware is 24-bit)
int nValue32Bit1 = CConvertFromFloatToInteger<32>::Convert(dValue);


// apply lots of random volume changes
JRDoubleArray aryDecibelChanges;
for (int i = 0; i < nIterations; i++)
{
    double dDecibelsChange = (fabs(dValue) > 1.0) ? JRMath::GetRandomNumber(0.0f, -100.0f) : JRMath::GetRandomNumber(0.0f, 100.0f);
    dValue *= CDecibels::GetMultiplierFromDecibels(dDecibelsChange);
    aryDecibelChanges.Add(dDecibelsChange);
}


// apply inverse of volume changes
for (int i = 0; i < nIterations; i++)
{
    double dDecibelsChange = -aryDecibelChanges[i];
    dValue *= CDecibels::GetMultiplierFromDecibels(dDecibelsChange);
}


// take another snapshot of the value at 32-bit after all the volume changes
int nValue32Bit2 = CConvertFromFloatToInteger<32>::Convert(dValue);


// test for changes
int nDelta = nValue32Bit1 - nValue32Bit2;
ASSERT(nDelta == 0);
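
The listing above depends on classes that are internal to Media Center. For readers who want to try the experiment themselves, here is a rough standard-library approximation. Everything in it is an assumption made for the sketch rather than Media Center's actual code: the helper names, the decibel-to-gain formula 10^(dB/20), the use of std::mt19937_64 for random numbers, and the mapping of full scale 1.0 to 2^31 for the 32-bit snapshot.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

// Linear multiplier for a volume change expressed in decibels.
double GainFromDecibels(double decibels)
{
    return std::pow(10.0, decibels / 20.0);
}

// Snapshot of a floating point sample as a 32-bit integer value,
// assuming full scale 1.0 corresponds to 2^31.
std::int64_t SnapshotAt32Bit(double value)
{
    return std::llround(value * 2147483648.0);
}

// Apply `iterations` random volume changes, then the inverse of each one,
// and return the difference between the 32-bit snapshots before and after.
std::int64_t RoundTripDelta(int iterations, unsigned seed)
{
    assert(iterations > 0);
    std::mt19937_64 generator(seed);
    std::uniform_real_distribution<double> decibels(0.0, 100.0);

    double value = 0.9323402123; // same starting value as the listing above
    const std::int64_t before = SnapshotAt32Bit(value);

    std::vector<double> changes;
    changes.reserve(iterations);
    for (int i = 0; i < iterations; i++)
    {
        // attenuate when above full scale, boost otherwise, so the value stays bounded
        const double change = (std::fabs(value) > 1.0) ? -decibels(generator) : decibels(generator);
        value *= GainFromDecibels(change);
        changes.push_back(change);
    }

    // apply the inverse of every volume change
    for (double change : changes)
        value *= GainFromDecibels(-change);

    return SnapshotAt32Bit(value) - before;
}
```

A return value of 0 corresponds to the ASSERT in the original listing. The iteration count is a parameter so the sketch can be run quickly with a smaller number, such as RoundTripDelta(100000, 1); running it with 100 million iterations needs roughly 800 MB of memory for the stored changes.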
