Old 5th February 2007   #7
georgia
Lives for gear
 
 
Join Date: Dec 2006
Location: NY NY
Posts: 1,249

Thread Starter
Some Dolby info regarding metadata

Since metadata like Dialnorm is becoming more and more important in post, here's some data from Dolby to spark some thinking...


Dolby ®
Metadata Guide

Dolby, Pro Logic, and the double-D symbol are registered trademarks of Dolby Laboratories.
Surround EX is a trademark of Dolby Laboratories. Issue 3
© 2005 Dolby Laboratories, Inc. All rights reserved. S05/14660/16797
Dolby Laboratories, Inc.

A Guide to Dolby Metadata

Metadata provides unprecedented capability for content producers to deliver the highest quality audio to consumers in a range of listening environments. It also provides choices that allow consumers to adjust their settings to best suit their listening environments. In this document, we first discuss the concept of metadata:
• Metadata overview
We then discuss the three factors controlled by metadata that most directly affect the consumer’s experience:
• Dialogue level
• Dynamic range control (DRC)
• Downmixing
Finally, we define each of the adjustable parameters, and provide sample combinations:
• Individual parameters
• Metadata combinations
1 Metadata Overview
Dolby® Digital and Dolby E are both data-rate reduction technologies that use metadata. Metadata is carried in the Dolby Digital or Dolby E bitstream, describing the encoded audio and conveying information that precisely controls downstream encoders and decoders. In normal operation, the encoded audio and metadata are carried together as a data stream on two regular digital audio channels (AES3, AES/EBU, or S/PDIF). Metadata can also be carried as a serial data stream between Dolby E and/or Dolby Digital equipment. Metadata allows content providers unprecedented control over how original program material is reproduced in the home. Dolby Digital is a transmission bitstream (sometimes called an emission bitstream) intended for delivery to the consumer at home through a medium such as DTV or
by one metadata stream. The consumer’s Dolby Digital decoder reproduces the program audio according to the metadata parameters set by the program creator, and according to settings for speaker configuration, bass management, and dynamic range that are chosen by the consumer to match his specific home theater equipment and environmental conditions.

Dolby E is a distribution bitstream capable of carrying up to eight channels of encoded audio and metadata. The number of programs ranges from one single program (Program Config: 5.1) to eight individual programs on a single Dolby E stream (Program Config: 8 × 1). Each program is discrete, with its own metadata in the Dolby E stream. Some metadata parameters in a Dolby E stream automatically configure a Dolby Digital encoder at the point of transmission, while others affect only the consumer’s Dolby Digital decoder operation. Dolby E is a professional technology used for broadcast applications, such as program origination and distribution; the Dolby E bitstream carries the entire metadata parameter set. Dolby Digital, used for consumer applications, such as transmission to the home or for DVD authoring, employs a subset of the full metadata parameter set called Dolby Digital metadata; the Dolby Digital bitstream carries only
those parameters necessary for proper decoding by the consumer. Metadata is first inserted during program creation or mastering, and is carried through transmission in a broadcast application or directly onto a DVD. The metadata provides control over how the encoded bitstream is treated at each step on the way to the consumer’s decoder.

Here’s an example of how it works:
In a broadcast truck parked outside a football stadium, the program mixer chooses the appropriate metadata for the audio program being created. The resulting audio program, together with metadata, is encoded as Dolby E and sent to the television station via fiber, microwave, or other transmission link. At the receiving end of this transmission, the Dolby E stream is decoded back to baseband audio and metadata.
The audio program and the metadata are monitored, altered, or re-created as other elements of the program are added in preparation for broadcast. This new audio program/metadata pair, re-encoded as Dolby E, leaves the postproduction studio and passes through the television station to Master Control, where many incoming Dolby E streams are once again decoded back to their individual baseband digital audio/metadata programs. The audio program/metadata pair that is selected to air is sent to the transmission Dolby Digital encoder, which encodes the incoming audio program according to the metadata stream associated with it, thereby simplifying the transmission process. Finally, the Dolby Digital signal is decoded in the consumer’s home, with metadata providing the information for that decoding process. Through the use of metadata, the mixer in the truck has been able to control the home decoder for the sporting event, while segments such as news breaks, commercials, and station IDs are similarly decoded, each using metadata carried within each individual segment.

This control, however, requires the producer to set the metadata parameters correctly, since they affect important aspects of the audio and can seriously compromise the final product if set improperly. Although most metadata parameters are transparent to consumers, certain parameters affect the output of a home decoder, such as downmixing for a specific speaker configuration, or when the consumer chooses Dynamic Range Control to avoid disturbing family and neighbors.

The Dolby E bitstream contains both the 5.1- and two-channel programs’ encoded audio, and each program’s metadata. The Dolby Digital bitstream contains a single program’s encoded audio and corresponding metadata.

Metadata Flow from Production to Consumer
In the simplest terms, there are two functional classifications of metadata:
• Professional: These parameters are carried only in the Dolby E bitstream. They are used to automatically configure a downstream Dolby Digital encoder, allowing maximum control by the content producer over how the encoded bitstream is treated at each step on the way to the consumer’s decoder.
• Consumer: These parameters are carried in both the Dolby E and the Dolby Digital bitstream. The consumer’s Dolby Digital decoder uses these parameters to create the best possible audio program on each consumer’s playback system. Consumer parameters include the DRC values, which are ultimately enabled by the end user’s selection, as discussed in Section 3, Dynamic Range Control.

Both types of metadata can be examined, modified, or passed through during encoding.

Special Parameters
There are other professional parameters included in the Dolby E bitstream that are not under direct user control, such as Timecode and Pitch Shift.

Timecode
Dolby E bitstreams carry timecode information in hours:minutes:seconds:frames format.

Pitch Shift
The Pitch Shift parameter can be generated automatically by a Dolby E decoder to control the Dolby Model 585 Time Scaling Processor. If the input to the Dolby E decoder is not at normal play speed (as with varispeed or program play), then the Pitch Shift Code parameter indicates the amount of audio pitch shifting required to restore the original program pitch.

Dialogue Level
Dialogue Level (also known as dialogue normalization or dialnorm) is perhaps the single most important metadata parameter. The Dialogue Level setting represents the long-term A-weighted average level of dialogue within a presentation, Leq(A). This level can be quantified with the Dolby Model LM100 Broadcast Loudness Meter. When received at the consumer’s Dolby Digital decoder, this parameter setting determines a level shift in the decoder that sets, or normalizes, the average audio output of the decoder to a preset level. This aids in matching audio volume between program sources. In broadcast transmission, the proper setting of Dialogue Level ensures that the consumer receives a standard listening level, so switching channels or watching a television program through the commercial breaks doesn’t require adjusting the volume. Using the same standard for all content, whether conveyed by broadcast television, DVD, or other media, enables the consumer to switch between sources and programs while maintaining a comfortable and consistent listening level. The proper setting of the Dialogue Level parameter also enables the Dynamic Range Control profiles chosen by the content producer to work as intended in less-than-optimal listening environments, and is essential in any content production, whether it is for transmission in a broadcast stream or for direct distribution to consumers, as with DVDs.
Note: Programs without dialogue, such as an all-music program, still require a careful setting of the Dialogue Level parameter. When setting the parameter for such content, it is useful to compare the program to the level of other programs. The goal is to allow the consumer to switch to your program without having to adjust the volume control.

The Scale
The scale used in the Dialogue Level setting ranges in 1 dB steps from –1 to –31 dB. Contrary to what you might assume at first, a setting of –31 represents no level shift in the consumer’s decoder, and –1 represents the maximum level shift. Here’s why: Dolby Digital consumer decoders normalize the average output level (that is, the output level averaged over time using the equivalent loudness method, Leq(A)) to –31 dBFS (31 dB below 0 dB full-scale digital output) by applying a shift in level based on the Dialogue Level parameter setting.
Note: The –31 dBFS Leq(A) should not be confused with the station reference level (often –18 or –20 dBFS). It is common to have different Leq(A) values for program material that has the same reference level. An average loudness level of –31 dBFS Leq(A) is quite compatible with facilities running at a variety of reference levels.
When a decoder receives an input signal with a Dialogue Level setting of –31, it applies no level shift to the signal because this indicates to the decoder that the signal already matches the target level and therefore requires no shift. In contrast, a louder program requires a shift to match the –31 dB standard. When the Dialogue Level parameter setting is –21, the decoder applies a 10 dB level shift to the signal. When the setting is –11, it applies a 20 dB level shift, and so on.
A Simple Rule:
31 + (dialogue level value) = Shift applied
Example:
31 + (–21) = 10 dB
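
The simple rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the guide; the function name `dialnorm_shift_db` is made up for this example and is not part of any Dolby tool or API.

```python
def dialnorm_shift_db(dialogue_level: int) -> int:
    """Return the level shift (in dB) a decoder applies for a Dialogue Level setting.

    Implements the guide's rule: 31 + (dialogue level value) = shift applied.
    The valid range of -31 to -1 in 1 dB steps is taken from the guide.
    """
    if not -31 <= dialogue_level <= -1:
        raise ValueError("Dialogue Level must be between -31 and -1")
    return 31 + dialogue_level


# The three settings discussed in the text.
for setting in (-31, -21, -11):
    print(f"Dialogue Level {setting}: shift of {dialnorm_shift_db(setting)} dB")
```

A setting of –31 gives a shift of 0 dB, and each 1 dB step toward –1 adds 1 dB of shift, up to 30 dB.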

The most important point to remember is that in setting the Dialogue Level parameter, you are providing your listener with an essential service. For your listeners, setting this level properly means:
• The volume level is consistent with other programs.
• The DRC profiles you make available to them work as you intend.
Once dialogue level is set, you can set up DRC profiles to further benefit the consumer.

Dynamic Range Control
Different home listening environments present a wide range of requirements for dynamic range. Rather than simply compressing the audio program at the transmission source to work well in the poorest listening environments, Dolby Digital encoders calculate and send Dynamic Range Control (DRC) metadata with the signal. This metadata can then be applied to the signal by the decoder to reduce the signal’s dynamic range. Through the proper setting of DRC profiles during the mastering process, the content producer can provide the best possible presentation of program content in virtually any listening environment, regardless of the quality of the equipment, number of channels, or ambient noise level in the consumer’s home.
Many Dolby Digital decoders offer the consumer the option of defeating the Dynamic Range Control metadata, but some do not. Decoders with six discrete channel outputs (full 5.1-channel capability) typically offer this option. Decoders with stereo, mono, or RF-remodulated outputs, such as those found on DVD players and set-top boxes, often do not. In these cases, the decoder automatically applies the most appropriate DRC metadata for the decoder’s operating mode. The Dolby Digital stream carries metadata for the two possible operating modes in the decoder. The operating modes are known as Line mode and RF mode due to the type of output they are typically associated with. Line mode is typically used on decoders with six- or two-channel line-level outputs and RF mode is used on decoders that have an RF-remodulated output. Full-featured decoders allow the consumer to select whether to use DRC and if so, which operating mode to use. The consumer sees options such as Off, Light Compression, and Heavy Compression instead of None, Line mode, and RF mode. Advanced decoders may also allow custom scaling of the DRC metadata. All that needs to be done during metadata authoring, or encoding, is selection of the dynamic range control profiles for Line mode and RF mode. The profiles are described in the following sections.
Note: While the use of DRC modes during decoding is a consumer-selectable feature, the Dialogue Level parameter setting is not. Therefore, setting the Dialogue Level parameter properly is essential before previewing a DRC profile.

Line Mode
Line mode offers these features:
• Low-level signal boost compression scaling is allowed.
• High-level signal cut compression scaling is allowed when not downmixing.
• The normalized dialogue level is reproduced from the decoder at a constant loudness level of –31 dBFS Leq(A), assuming the Dialogue Level parameter is set correctly.
Line-level or power-amplified outputs from two-channel set-top decoders, two-channel digital televisions, 5.1-channel digital televisions, Dolby Digital A/V surround decoders, and outboard Dolby Digital adapters use Line mode.
Consumer control of the dynamic range is limited when downmixing. Products with stereo or mono outputs do not usually allow consumer scaling of Line mode. This is because these devices are usually downmixing (for example, when receiving a 5.1-channel signal). However, in these products, the consumer may have a choice between Line mode and RF mode.

RF Mode
In RF mode, high- and low-level compression scaling is not allowed. When RF mode is active, that compression profile is always fully applied. RF mode is designed for products (such as set-top boxes) that generate a downmixed signal for connection to the RF/antenna input of a television set; however, it is also useful in situations where heavy DRC is required, for example, when small PC speakers are used for DVD playback. In RF mode, the overall program level is raised 11 dB; this results in dialogue being reproduced at a level of –20 dBFS Leq(A), while the peaks are limited to prevent signal overload in the D/A converter. By limiting headroom, severe overmodulation of television receivers is prevented. The 11 dB gain provides an average loudness level that compares well with existing analog television broadcasts. In some situations it may be necessary to further constrain signal peaks above the average dialogue level so that there is less than 20 dB headroom. The selection of a suitable RF mode profile achieves this.
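
To make the Line mode and RF mode levels concrete, here is a rough sketch that combines the dialnorm shift with the 11 dB RF mode gain quoted above. The function name and the idea of passing in a "measured" dialogue level are assumptions for illustration only; DRC gain and peak limiting are deliberately left out.

```python
RF_MODE_GAIN_DB = 11  # overall program gain in RF mode, per the guide


def reproduced_dialogue_dbfs(measured_dialogue_dbfs, dialogue_level, mode="line"):
    """Estimate the dialogue level (dBFS Leq(A)) at the decoder output.

    measured_dialogue_dbfs: actual long-term dialogue level of the program.
    dialogue_level: the Dialogue Level (dialnorm) value carried in metadata.
    mode: "line" or "rf" (hypothetical labels for the two operating modes).
    """
    shift_db = 31 + dialogue_level            # shift the decoder applies
    level = measured_dialogue_dbfs - shift_db
    if mode == "rf":
        level += RF_MODE_GAIN_DB
    elif mode != "line":
        raise ValueError("mode must be 'line' or 'rf'")
    return level


# Correctly set metadata: program dialogue measured at -24, dialnorm set to -24.
print(reproduced_dialogue_dbfs(-24, -24, "line"))
print(reproduced_dialogue_dbfs(-24, -24, "rf"))
# Mis-set metadata: same program, but dialnorm left at -31.
print(reproduced_dialogue_dbfs(-24, -31, "line"))
```

With correct metadata the dialogue lands at –31 dBFS in Line mode and –20 dBFS in RF mode. With dialnorm left at –31 on a program whose dialogue really sits at –24, the decoder applies no shift and the program plays 7 dB louder than its neighbors.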

Dynamic Range Control Profiles
Six preset DRC profiles are available to content producers: Film Light, Film Standard, Music Light, Music Standard, Speech, and None.

• Film Light
Max Boost: 6 dB (below –53 dB)
Boost Range: –53 to –41 dB (2:1 ratio)
Null Band Width: 20 dB (–41 to –21 dB)
Early Cut Range: –26 to –11 dB (2:1 ratio)
Cut Range: –11 to +4 dB (20:1 ratio)

• Film Standard
Max Boost: 6 dB (below –43 dB)
Boost Range: –43 to –31 dB (2:1 ratio)
Null Band Width: 5 dB (–31 to –26 dB)
Early Cut Range: –26 to –16 dB (2:1 ratio)
Cut Range: –16 to +4 dB (20:1 ratio)

• Music Light (No early cut range)
Max Boost: 12 dB (below –65 dB)
Boost Range: –65 to –41 dB (2:1 ratio)
Null Band Width: 20 dB (–41 to –21 dB)
Cut Range: –21 to +9 dB (2:1 ratio)

• Music Standard
Max Boost: 12 dB (below –55 dB)
Boost Range: –55 to –31 dB (2:1 ratio)
Null Band Width: 5 dB (–31 to –26 dB)
Early Cut Range: –26 to –16 dB (2:1 ratio)
Cut Range: –16 to +4 dB (20:1 ratio)

• Speech
Max Boost: 15 dB (below –50 dB)
Boost Range: –50 to –31 dB (5:1 ratio)
Null Band Width: 5 dB (–31 to –26 dB)
Early Cut Range: –26 to –16 dB (2:1 ratio)
Cut Range: –16 to +4 dB (20:1 ratio)

• None
No DRC profile selected. The dialogue level parameter (dialnorm) is still applied.

These choices are available to the content producer for both Line mode and RF mode. The content producer chooses which of these profiles to assign to each mode; when the consumer or decoder selects a DRC mode, the profile chosen by the producer is applied.

In addition to the DRC profile, metadata can limit signal peaks to prevent clipping during downmixing. This metadata, known as overload protection, is inserted by the encoder only if necessary. For example, consider a 5.1-channel program with signals at digital full-scale on all channels being played through a stereo, downmixed line-level output. Without some form of attenuation or limiting, the output signal would clip. Correct setting of the Dialogue Level and DRC profiles normally prevents clipping and unnecessary application of automatic overload protection.
Note: DRC profile settings are dependent on an accurate dialogue level setting. Improper setting of the dialogue level parameter may result in excessive and audible application of overload-protection limiting.
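
One way to read the profile figures listed above is as a static gain curve. The sketch below does this for Film Standard only, and it rests on an assumption that is not stated in the guide: that an N:1 ratio means the gain changes by (1 – 1/N) dB for each dB of input level, as in a conventional compressor. Under that reading the boost range reproduces the listed 6 dB maximum boost. The class and function names are hypothetical, and real encoders also apply time smoothing that is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrcProfile:
    boost_start: float      # below this level the maximum boost applies
    boost_end: float        # top of boost range / bottom of null band
    boost_ratio: float
    null_end: float         # top of null band / start of early cut range
    cut_start: float        # top of early cut range / start of cut range
    early_cut_ratio: float
    cut_ratio: float


# Figures copied from the Film Standard entry in the guide.
FILM_STANDARD = DrcProfile(-43, -31, 2, -26, -16, 2, 20)


def static_gain_db(level_db: float, p: DrcProfile) -> float:
    """Gain in dB for a given level (assumed compressor-style ratio semantics)."""
    def slope(ratio: float) -> float:
        return 1 - 1 / ratio

    if level_db <= p.boost_end:
        # Boost grows as the level falls, then holds at the maximum boost.
        clamped = max(level_db, p.boost_start)
        return (p.boost_end - clamped) * slope(p.boost_ratio)
    if level_db <= p.null_end:
        return 0.0  # null band: no gain change
    if level_db <= p.cut_start:
        return -(level_db - p.null_end) * slope(p.early_cut_ratio)
    early_total = (p.cut_start - p.null_end) * slope(p.early_cut_ratio)
    return -early_total - (level_db - p.cut_start) * slope(p.cut_ratio)


for level in (-50, -37, -28, -21, -16, 4):
    print(f"{level:>4} dB -> gain {static_gain_db(level, FILM_STANDARD):+.1f} dB")
```

Because the null band sits around –31 dB, the curve only behaves as intended when dialogue has actually been normalized to that level, which is why the guide insists on setting Dialogue Level before previewing a profile.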

Downmixing
Downmixing is a function of Dolby Digital that allows a multichannel program to be reproduced over fewer speaker channels than the number for which the program is optimally intended. Simply put, downmixing allows consumers to enjoy a DVD or digital television broadcast without requiring a full-blown home theater setup.
As with stereo mixing where the mix is monitored in mono on occasion to maintain compatibility, multichannel audio mixing requires the engineer to reference the mix to fewer speaker channels to ensure compatibility in downmixing situations. In this way, Dolby Digital, using the metadata parameters that control downmixing, is an “equal opportunity technology,” in that every consumer who receives the Dolby Digital data stream can enjoy the best audio reproduction possible, regardless of the playback system.
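
To see why a stereo downmix of a hot 5.1 program can overload, here is a back-of-the-envelope sketch. The –3 dB center and surround mix coefficients are an assumption (commonly used values, not taken from this excerpt), the LFE channel is ignored, and all channels are treated as fully in phase, which is the worst case.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def downmix_peak(front=1.0, center=1.0, surround=1.0,
                 center_mix_db=-3.0, surround_mix_db=-3.0):
    """Worst-case peak of one stereo output: front + center + surround, in phase.

    Inputs are linear peak amplitudes, where 1.0 is digital full scale.
    The mix levels are illustrative assumptions, not values from the guide.
    """
    return (front
            + db_to_gain(center_mix_db) * center
            + db_to_gain(surround_mix_db) * surround)


peak = downmix_peak()                  # every channel at full scale
overshoot_db = 20 * math.log10(peak)
print(f"peak = {peak:.3f} x full scale ({overshoot_db:+.2f} dB)")
```

Under these assumptions the summed output is well above full scale, so something in the chain (the dialnorm shift, the DRC profile, or overload protection) has to bring it back down before the D/A converter.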
It is important to consider the output signals from each piece of equipment that can receive a Dolby Digital program in the home. Table 2 shows the output types from different equipment.