MPEG works by looking at a picture and noticing
what in the scene and between the scenes might be redundant information and
what is always changing. MPEG reduces the amount of data it needs to store by
identifying this redundant information. For instance, if the scene was a man
talking against a static background (the classic talking head), there is much
in the scene that is redundant - a great expanse of wall, the clothes the man
is wearing, etc. If the scene, on the other hand, were a panning shot of a field
of tall grass blowing in the wind, then there might be much in this image that
is unique and changing both within the frame and from frame to frame. By using
variable bit rate compression, fewer bits can be applied to the static scene
and more to the very complicated one (see MPEG2 COMPRESSION for more
information).
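As a rough illustration of the redundancy described above, the following Python sketch measures how much of one frame is simply repeated in the next. It is a toy measure, not MPEG's actual analysis (a real encoder works on blocks of pixels and compensates for motion). The function name unchanged_fraction and the tiny 4x4 greyscale frames are assumptions made for the example.

```python
def unchanged_fraction(prev_frame, next_frame, tolerance=0):
    """Return the fraction of pixels that differ by at most `tolerance`."""
    total = 0
    same = 0
    for row_a, row_b in zip(prev_frame, next_frame):
        for a, b in zip(row_a, row_b):
            total += 1
            if abs(a - b) <= tolerance:
                same += 1
    return same / total if total else 1.0


# "Talking head": a flat wall where a single pixel changes between frames.
wall = [[50, 50, 50, 50] for _ in range(4)]
talking_head_next = [row[:] for row in wall]
talking_head_next[2][1] = 90

# "Blowing grass": every row shifts sideways, so every pixel changes.
grass_prev = [
    [10, 200, 30, 180],
    [90, 15, 220, 40],
    [60, 240, 5, 130],
    [170, 25, 210, 75],
]
grass_next = [row[1:] + row[:1] for row in grass_prev]

print(unchanged_fraction(wall, talking_head_next))  # 0.9375
print(unchanged_fraction(grass_prev, grass_next))   # 0.0
```

The higher the unchanged fraction, the less new information the encoder has to spend bits on.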
One development that will cause the trade-off between
bit rate and program length to be re-examined is the introduction of dual-layer
and dual-sided discs. As these disc configurations and their manufacturing
become more commonplace, the ability to combine relatively high bit rates with
large amounts of content on a disc will make the DVD format even more
impressive.
WHAT FORMAT IS THE SOURCE MASTER?
A digital component Source Master on D1 or Digital Betacam is preferred
because the compression process itself is also component. In component
recordings, the color information is recorded separately from the black and
white information. This results in a master where the colors are sharper and
the artifacts commonly associated with composite recordings such as "chroma
crawl" are absent. Some of these composite artifacts create additional
non-redundant information, which wastes precious bits in the encoding
process. An analogue component recording such as one from
BetacamSP is actually preferable to a Digital Composite Master such as D2 for
this very reason. This is not to say that one cannot encode from a composite
source, but a digital component master is always preferable. If a composite or
analogue source is provided, we will create a digital component master from
which we will encode.
WILL THE DISC CONTAIN ADDITIONAL MATERIAL SUCH AS LOGOS, TRAILERS, INTERVIEWS, ETC.?
In determining the overall bit rate, all video
material, including FBI warnings, logos, and trailers, needs to be taken into
account. In addition, it is helpful to make sure that all material that needs
to be encoded is available at the same time so that encoding can take place
efficiently.
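The budgeting described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a formula from the DVD specification: the function name, the list of segment durations, and the simplification that every audio track runs for the whole program are all hypothetical, and capacity is taken in decimal gigabytes.

```python
def average_video_bitrate_mbps(capacity_gb, segment_minutes, audio_track_kbps,
                               overhead_fraction=0.0):
    """Estimate the average video bit rate (Mbit/s) that fits on a disc.

    capacity_gb       -- disc capacity in decimal gigabytes (e.g. 4.7)
    segment_minutes   -- durations of ALL video material (feature, trailers, ...)
    audio_track_kbps  -- bit rate of each audio track, assumed to run throughout
    overhead_fraction -- hypothetical share of the disc reserved for everything else
    """
    total_seconds = sum(segment_minutes) * 60
    if total_seconds <= 0:
        raise ValueError("no video material to budget for")
    usable_bits = capacity_gb * 1e9 * 8 * (1 - overhead_fraction)
    total_bps = usable_bits / total_seconds
    audio_bps = sum(audio_track_kbps) * 1000
    return (total_bps - audio_bps) / 1e6


# Hypothetical title: 120 min feature, 6 min of trailers, 1 min of logos and warnings,
# with one 384 kb/s and one 192 kb/s audio track.
rate = average_video_bitrate_mbps(4.7, [120, 6, 1], [384, 192])
print(f"average video bit rate: {rate:.2f} Mbit/s")
```

Adding a trailer or another language track lowers the figure for everything else on the disc, which is why all material should be known before encoding starts.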
HOW MANY AUDIO TRACKS AND IN WHAT CONFIGURATION?
Obviously, the number and type of additional
tracks will have a bearing on the overall average bit rate for the disc. In
addition, the preparation of language tracks for AC-3 encoding is an
important element in the making of a DVD. The format supports Dolby AC-3,
which can exist either in a stereo configuration that can be compatible with
Pro Logic surround decoding or as 5.1 channel surround sound. This 5.1 channel
sound is so named because there are five full-range channels (front left,
front center, front right, rear left and rear right) plus a subwoofer channel
(the .1 in the name). If these tracks already
exist for theatrical release, it is possible that they can be used for DVD.
However, these tracks may require additional audio post production to optimize
them for the DVD format. If 5.1 channel tracks do not exist, then they will
need to be prepared. In addition to the lead time necessary to prepare sound
tracks, it is important to be able to locate additional foreign language
elements and to verify that they exist in the correct format and to the same
master to which video is being encoded. One of the challenges in efficient DVD
title creation is to ensure that all the elements necessary for authoring
(all video, audio, and graphic elements) are available and approved so that
authoring can proceed.
An interesting specification built into the DVD
format is that Linear PCM audio (such as used on Compact Audio Discs) can also
be used. The DVD format also supports 96 kHz uncompressed PCM audio, which has
more than twice the sampling rate, and therefore bandwidth, of today's audio
CDs. The fidelity of this sound format is phenomenal, although the files are
quite large and will consume additional "real estate" on the disc if used.
WILL THE DISC CONTAIN SUBTITLES, CLOSED CAPTIONING OR OTHER SUBPICTURE INFORMATION?
Another important element of the DVD format is
subpicture information. In addition to the need to gather this information and
to ensure that it is in a format compatible with the DVD authoring process, it
is necessary to create and approve any subpictures that will be created
specifically for the disc. Closed captions, subtitles and other previously
created text data will need to be converted to a computer file that can be
read by the authoring system.
The DVD format supports up to 32 channels of
subpicture information. These subpictures are essentially graphic files that
are "bit-mapped" or overlaid onto the pictures one pixel at a time.
The color palette for subpictures consists of 16 different colors and
contrast values; 4 colors and 4 different contrast values can be displayed on
the screen in one channel at a time. By using the various channels of
subpicture information, which can be triggered by time code or button
presses, relatively sophisticated graphics, even simple animation, can be
put onto the screen. Subpictures can scroll and fade on the screen and can
change every field. More basic uses of subpictures are captions, subtitles
or other text information.
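The overlay idea can be sketched as follows. This is an illustration only, not the bitstream format used by DVD players: it assumes each subpicture pixel is a code from 0 to 3 that selects one of the four active colors and one of the four active contrast values, that contrast levels run from 0 (transparent) to 15 (opaque), and that contrast acts as a simple linear blend. The grey-ramp PALETTE and the function name overlay_pixel are hypothetical.

```python
# Hypothetical 16-entry palette: a grey ramp, purely for illustration.
PALETTE = [(17 * i, 17 * i, 17 * i) for i in range(16)]


def overlay_pixel(video_rgb, code, color_select, contrast_select):
    """Blend one subpicture pixel over one video pixel.

    video_rgb       -- (r, g, b) of the underlying picture
    code            -- subpicture pixel value, 0..3
    color_select    -- four palette indices (0..15), one per code
    contrast_select -- four contrast levels (assumed 0..15), one per code
    """
    color = PALETTE[color_select[code]]
    alpha = contrast_select[code] / 15  # assumed linear blend factor
    return tuple(
        round(alpha * c + (1 - alpha) * v) for c, v in zip(color, video_rgb)
    )


colors = [0, 15, 8, 3]      # the 4 colors active in this channel
contrasts = [0, 15, 15, 7]  # code 0 is fully transparent background

print(overlay_pixel((10, 20, 30), 0, colors, contrasts))  # (10, 20, 30)
print(overlay_pixel((10, 20, 30), 1, colors, contrasts))  # (255, 255, 255)
```

Changing the contrast values over time, without touching the pixel codes, is one way a fade could be produced under these assumptions.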
WILL THE DISC BE RELEASED IN A 4x3 OR 16x9 FORMAT?
The DVD format allows for the specification of
an aspect ratio in the display of the disc. The disc itself can be programmed
so that a 16x9 format transfer is displayed that way on 16x9
televisions, and also to force the display of a scope or 1.85:1 film in a
"letterbox" on a 4x3 television set.
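The geometry of a letterbox is simple arithmetic, sketched below. The figure of 480 active lines for a 4x3 display and the function name letterbox_lines are assumptions made for the example; the sketch only shows how much of the screen height a wider picture occupies, not how a player actually performs the conversion.

```python
def letterbox_lines(display_lines, display_aspect, film_aspect):
    """Return (picture_lines, black_bar_lines) when a wider picture is
    shown at full width on a narrower display."""
    picture = round(display_lines * display_aspect / film_aspect)
    return picture, display_lines - picture


# A 16x9 transfer and a 1.85:1 film shown on a 4x3 set with 480 active lines.
print(letterbox_lines(480, 4 / 3, 16 / 9))  # (360, 120)
print(letterbox_lines(480, 4 / 3, 1.85))    # (346, 134)
```

The black bar total is split between the top and bottom of the screen.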
WILL THE DISC BE PAL OR NTSC?
DVD discs can be PAL or NTSC. To create an NTSC
disc, the video is encoded from a 30 frame 525 line Digital Component or NTSC
source master. To create a PAL disc, the video is encoded from a 25 frame 625
line Digital Component or PAL source master. The format also has different
audio configurations in its NTSC and PAL versions. In NTSC countries, AC-3
audio or Linear PCM audio playback capability may be used and MPEG2 audio
playback capability is optional. In PAL countries, MPEG audio or Linear PCM
audio playback capability may be used and AC-3 audio playback capability is
optional.
This means that in order to have an NTSC and
PAL product, the disc needs to be mastered, encoded and authored in both
formats.
WHAT WILL THE ON-SCREEN MENUS LOOK LIKE?
The DVD format allows you to 'program' what the
disc does when it is put in the player, what options you allow the user to
select and how the user can access the content. The design of the menus and the
on-screen controls requires both good graphics and good interface design. The
features of the player and the remote control can be enabled via the menu
screens. The goal is to provide a straightforward interface to the player while
still creating a compelling, polished graphic look. The pre-mastering company's
graphic department can help create menus and screen graphics that
differentiate your brand and product. The use of still images, animation, and
graphics can enhance the "look and feel" of the disc.
WILL THE DISC CONTAIN CHAPTER STOPS THAT ALLOW A "JUMP" TO A PARTICULAR PART OF THE CONTENT?
Like laser discs, the DVD format can provide
for "Chapter Stops" that can branch a user to a particular scene.
Unlike laser discs, a DVD menu screen can be programmed with images of the
particular scene, with on-screen buttons or "hot spots" that, when
selected, cause the disc to begin playing the selected scene. These menus can
be quite attractive, especially when thumbnail images are used for these
buttons or "hot spots".
WILL THE DISC CONTAIN CONTENT THAT NEEDS TO BE "BRANCHED" BASED ON USER INTERACTION?
The DVD format can also include multiple
versions of a title on the same disc. An example of this might be a Director's
Cut and the Studio Release on the same disc. Discussions as to how to identify
the different material and to best prepare it for encoding and authoring are
essential.
MPEG2 COMPRESSION.
High-quality video compression is the single
most enabling technology for DVD. To illustrate the level of compression
required for two hours of high quality video, consider this: the raw data
storage requirements for uncompressed CCIR-601 resolution 4:2:2 serial digital
video are roughly 20 megabytes per second. For a 120-minute movie, this would
require 144 Gigabytes of storage space, before accounting for audio. With DVD
capable of storing 4.7 Gigabytes of data, compression ratios of roughly 40:1
are required in order to fit the video for a feature film along with the audio
and sub-titles on a single-sided disc.
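The arithmetic behind these figures can be checked with a short Python sketch. It uses only the numbers quoted above (20 megabytes per second, 120 minutes, 4.7 gigabytes, 40:1) and assumes decimal units; the function names are hypothetical.

```python
def uncompressed_gigabytes(megabytes_per_second, minutes):
    """Raw storage needed for uncompressed video, in decimal gigabytes."""
    return megabytes_per_second * minutes * 60 / 1000


def compression_ratio(raw_gb, available_gb):
    """How many times the raw video must shrink to fit the available space."""
    return raw_gb / available_gb


raw = uncompressed_gigabytes(20, 120)
print(raw)                                    # 144.0
print(round(compression_ratio(raw, 4.7), 1))  # 30.6 if video could use the whole disc
print(raw / 40)                               # 3.6 GB of video at a 40:1 ratio
```

Read this way, the 40:1 figure corresponds to roughly 3.6 gigabytes of video, which leaves about 1.1 gigabytes of a 4.7 gigabyte disc for audio, subtitles and other data.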
MPEG2 was born out of a continuing desire and
demand to achieve the best possible audio and video quality for digital
storage applications. It follows on the heels of MPEG1 which took the first
steps in bridging the gap between analog and digital technologies to achieve
broadcast quality audio and video.
WHAT IS MPEG?
MPEG (Moving Picture Experts Group) is the ISO/IEC
working group formed in 1988, with contributors from all over the world,
to ensure that the compression of digital audio and video signals followed a
defined standard. Among the contributing companies were Sony, Matsushita, JVC,
Toshiba, Thomson, Motorola, C-Cube, LSI Logic, Texas Instruments, Digital
Equipment, AT&T and many others, including Philips.
In 1992, the efforts of this group resulted in
a standard for audio and video coding known as MPEG1. MPEG1 was aimed at
coding audio and video at a bit rate of about 1.5 Mbit/s, which corresponds
to the bit rate of a CD-ROM. MPEG1 has been successfully implemented in
applications such as CD-i and Video CD, and a variety of broadcast
applications, including DSS (Digital Satellite System), DAB (Digital Audio
Broadcast), DVB (Digital Video Broadcast), and Satellite feeds to cable
networks. MPEG-encoded files are available from a growing number of platforms
including CD-ROM and the Internet. Among new applications, MPEG1 will also
have a place in DVD products.
Since its inception and widespread application,
MPEG1 has been well accepted as a boon to digital storage and transmission.
MPEG has also become part of many other standards such as the new Digital
Audio Visual Council (DAVIC) standard which will be incorporated in numerous
digital television decoders; the ITU-R recommendation (BS 1115) for emission,
contribution and distribution; the ETSI Standard on Digital Audio Broadcasting
(ETS 300401, February 1995); and the ITU-R draft recommendation for Digital
Terrestrial Television Broadcast.
THE EMERGENCE OF MPEG2.
The compression work for MPEG1 was based on
film and other progressive sources, but its bit rate is not suited for
standard broadcast interlaced video with good compression. To bridge this gap,
the MPEG committee continued in its development of the technology. From their
work, MPEG2 was created and became a standard in November 1994. MPEG2 has been
presented as the audio/video solution for supporting bit rates beyond 5 Mbits/sec.
Today MPEG2 delivers picture and sound quality equal to TV studio standards.
WHAT IS MPEG2 AUDIO?
MPEG2 audio is a compatible extension of the
MPEG1 audio coding which enables the transfer of mono, stereo, or multichannel
audio in a single bitstream. It can operate at data rates from 32 kb/s up to
more than 1 Mb/s, and supports sampling rates of 32, 44.1 and 48 kHz. For
stereo, a typical application would operate at an average data rate of 128-256
kb/s. A multichannel movie soundtrack requires an average bit rate of 320-640
kb/s, depending on the number of channels (5 to 7, plus a subwoofer channel)
and the complexity of the encoded audio.
WHAT ABOUT MPEG2 VIDEO?
Where MPEG2 shines is in its higher bit rate
and support of interlaced video. The video compression of MPEG2 is much more
sophisticated than that of MPEG1. MPEG2 provides 60 fields/s and supports bit
rates of 6 - 8 Mbit/s. It specifies the syntax and semantics of a compressed
video bit stream (and the parameters such as bit rates, picture sizes and
resolutions which may be applied) and how it must be decoded to reconstruct
the picture.
MPEG2 fully supports the coding of interlaced
video material. This level of performance, combined with new optical disc
technology, makes it possible to record a full-length feature film, with
digital surround sound, on a single high-density optical disc.
MPEG2 supports virtually every video standard
worldwide, including PAL (Phase Alternating Line), SECAM, and NTSC (National
Television System Committee).
VARIABLE BIT RATE CODING.
In both video and audio, certain sections are
more complex than others, e.g. a picture of a tree is more complex than a
clear blue sky. As a result, the number of bits needed to faithfully encode
varies with the program material. In order to encode in the most effective
way, it is advantageous to save bits from the simple sections and use them to
code complex ones. This is what variable bit rate coding does. Although the
coding of video is completely different from that of audio, the principle
works equally well for both. This can be illustrated by movie soundtracks in
which love scenes are interrupted by gunfights or a sudden thunder clap which
enhances the suspense of a thriller.
BENEFITS OF VARIABLE OVER FIXED BIT RATE ENCODING.
Operating at a certain (average) bit rate, a
fixed or constant bit rate encoder provides variable quality. For fragments
that are simple to code, the fixed bit rate encoder can apply more than enough
bits. But it does not have enough bits available for the complex fragments and
coding artifacts may become noticeable.
A variable bit rate encoder provides fixed
quality. It always applies the number of bits necessary to encode without
noticeable artifacts. It has been suggested that a fixed bit rate encoder,
constantly operating at the same peak data rate as a variable bit rate
encoder, will sound equally good or better. But the variable bit rate encoder
needs the high peak data rate for only a small fraction of the total time.
Applying a fixed bit rate encoder in this manner is clear overkill: the
capacity it wastes could instead carry another variable data rate bitstream
with, for example, a second language version.
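The comparison above can be made concrete with a toy model. In the sketch below each number stands for the bits a segment needs in order to be coded without noticeable artifacts; the values, units and function names are hypothetical and no real encoder is being modeled.

```python
def cbr_shortfall(needs, rate):
    """Bits each segment is short of when every segment gets the same `rate`."""
    return [max(0, need - rate) for need in needs]


def spare_capacity_at_peak(needs):
    """Bits a fixed-rate stream running at the peak rate carries beyond
    what a variable-rate stream would need for the same material."""
    peak = max(needs)
    return peak * len(needs) - sum(needs)


needs = [2, 2, 4, 8, 2, 6]         # hypothetical per-segment requirements
average = sum(needs) // len(needs)  # 4

print(cbr_shortfall(needs, average))  # [0, 0, 0, 4, 0, 2]
print(spare_capacity_at_peak(needs))  # 24
```

At the average rate the fixed encoder starves the two complex segments; at the peak rate it starves nothing, but in this example it spends as many bits again as the variable-rate stream needs in total, which is the room a second bitstream could occupy.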
MPEG2 IS COMPATIBLE WITH MPEG1.
Unlike other competitive systems, MPEG2 has
been designed with compatibility in mind. With the ever-growing number of
applications of MPEG1, especially in the entertainment, satellite broadcasting
(DSS) and multimedia markets, this compatibility will provide the consumer
with a cross-platform format to enjoy high-quality audio and video
reproduction.
The core of the MPEG2 bitstream is an MPEG1
bitstream. This enables fully compatible decoding with an MPEG1 decoder. An
MPEG2 video decoder can decode MPEG1 video and an MPEG2 audio decoder can
decode MPEG1 audio. Likewise, an MPEG1 audio decoder can decode MPEG2
multichannel audio, providing a stereo output signal. This compatibility
eliminates the need to transfer two separate bit streams (one for stereo and
another one for the multichannel audio program). For example, a future upgrade
of DSS with multichannel audio will not make existing set top boxes obsolete.
The existing boxes will reproduce excellent stereo; the new products will
produce high quality multichannel.
MPEG2 APPLICATIONS
Today's computer systems support
MPEG1 audio and video, and many will soon support MPEG2. MPEG2 is the video
standard for DVD players and is being touted as the next standard in
applications where broadcast quality is essential. MPEG2 is also the preferred
DVD audio format for 50Hz countries and an option for the 60Hz countries.
AC-3 HISTORY.
While conceived and later chosen for HDTV in
the US, AC-3 was actually implemented for the cinema first, making it
practical to provide multichannel digital sound with 35 mm prints. In order to
retain an analog track so that these prints could play in any cinema, it was
decided to place the new digital optical track between the sprocket holes, a
key factor in defining its maximum practical bit-rate. It was also well
documented that a 5.1-channel format would best satisfy the requirements of
theatrical film presentation. Altogether, these needs dovetailed with the HDTV
requirements that led to AC-3's conception.
Dolby Digital, the film sound format with AC-3
as its keystone, debuted in cinemas in June of 1992. Within less than two
years, more than 50 feature films had been released in the new format, and
nearly 600 cinemas in 27 countries had been equipped for playback of the
digital track. This experience confirmed that prints with both digital and
analog tracks could be manufactured economically, that such prints would play
in any cinema, and that the AC-3 coded digital track provided high audio
quality with extraordinary resistance to wear and tear.
Just as important, Dolby Digital provides a
unique springboard for consumer formats based on AC-3, enabling the
accumulation of invaluable experience in mixing, recording, and distributing
multichannel digital audio. It is also fostering a library of program material
immediately available for consumer release, and has facilitated the
development of cost-efficient IC decoder technology. Dolby AC-3 is the only
multichannel perceptual coding technology with this kind of real-life
experience behind it.
Dolby Digital (Surround AC-3) in the Home.
Dolby Digital for the home (aka Dolby Surround
Digital or Dolby Surround AC-3) forms the final link from multichannel program
producer to home listener. Like the film format, it provides separate channels
for left, right, and center speakers at the front; two surround speakers at
the sides; and a subwoofer at the listener's option.
Unlike analog Dolby Surround with its single
band-limited surround channel (usually played over two speakers), Dolby
Digital features two completely independent surround channels, each offering
the same full range fidelity as the three front channels. As a result, true
stereo surround effects can be achieved for an expanded sense of depth,
localization, and overall realism. And because Dolby Digital maintains
complete separation of the audio channels, it is as suited to music-only
recordings and broadcasts as it is to video formats. Thus it has the potential
to open up new worlds of multichannel sound reproduction.
That isn't all that Dolby Digital can do. While
Dolby Digital is heard in cinemas with a full complement of loudspeaker
channels, a standardized playback level, and full dynamic range capabilities,
home listening circumstances vary markedly. Therefore, for Dolby Digital
consumer formats, AC-3 has been designed to satisfy many diverse requirements.
At the outset, at least, while some listeners
will have multichannel systems, most will be listening in mono or conventional
stereo. Those with Dolby Surround systems will want a two channel matrix
encoded output from their decoders. Many listeners may prefer a restricted
dynamic range, but others will wish to experience the full dynamic range of
the original signal. Techniques to satisfy these and other needs have been
designed in from the beginning:
- Data identifying each program's original
production format (mono, stereo, matrixed or discrete surround) can be
sent to eliminate confusion at playback or reception.
- Program material can be coded when it is
originally mixed so that subjectively constant, dialogue-keyed loudness is
maintained as the listener switches between program sources. No alteration
of program dynamics is involved, only playback volume.
- Decoders can be designed to provide optimum
mix downs from multichannel programming, such as a matrix-encoded
two-track mix for analog Dolby Surround decoding, a conventional stereo
mix, or even a mono mix.
- When programs with wide dynamic range, such
as movie soundtracks, are played at low volume, the system can apply
appropriate compression to preserve low-level content. The degree of
compression can be made to vary according to need.
- The listener can program the Dolby Surround
Digital decoder to route non-directional low bass only to those channels
in the system which have wide range speakers or subwoofers.
Dolby Digital offers a dramatic step forward
in listener involvement and excitement. It provides program producers,
directors, recording engineers, and performers with unprecedented creative
opportunities. And it offers remarkable media adaptability within a
single, far-reaching technological framework.
HOW DOLBY AC-3 WORKS.
The digital audio coding used on Compact Discs (16-bit PCM) yields a total
range of 96 dB from the loudest sound to the noise floor. This is achieved by
taking 16-bit samples 44,100 times per second for each channel, an amount of
data often too immense to store or transmit economically, especially when
multiple channels are required. As a result, new forms of digital audio
coding--often known as "perceptual coding"--have been developed to
allow the use of lower data rates with a minimum of perceived degradation of
sound quality.
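The two figures quoted for Compact Disc audio follow from simple formulas, sketched here in Python. The function names are hypothetical; the dynamic range uses the usual rule that each bit contributes about 6 dB.

```python
import math


def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def dynamic_range_db(bits_per_sample):
    """Ratio between the largest and smallest PCM step, in decibels."""
    return 20 * math.log10(2 ** bits_per_sample)


print(pcm_kbps(44100, 16, 2))          # 1411.2 kb/s for stereo CD audio
print(round(dynamic_range_db(16), 1))  # 96.3 dB
```

Every extra channel adds another 705.6 kb/s at CD quality, which is why multichannel sound makes lower-rate coding attractive.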
Dolby AC-3 is the first perceptual coding
designed specifically to code multichannel digital audio. It is also the only
one to benefit from the development of two other successful perceptual coding
systems, Dolby AC-1 and AC-2, and from the development of what are in essence
analog perceptual coding systems: the full gamut of Dolby professional and
consumer noise reduction systems. Indeed, Dolby Laboratories' unique
experience with audio noise reduction is essential to AC-3's effective data
rate reduction: the fewer the bits used to describe an audio signal, the
greater the noise.
Dolby noise reduction works by lowering the
noise when no audio signal is present, while allowing strong audio signals to
cover or mask the noise at other times. Thus it takes advantage of the psycho
acoustic phenomenon known as auditory masking. Even when audio signals are
present in some parts of the spectrum, Dolby NR reduces the noise in the other
parts so the noise remains imperceptible. This is because audio signals can
only mask noise that occurs at nearby frequencies.
AC-3 has been designed to take maximum
advantage of human auditory masking. It divides the audio spectrum of each
channel into narrow frequency bands of different sizes optimized with respect
to the frequency selectivity of human hearing. This makes it possible to
sharply filter coding noise so that it is forced to stay very close in
frequency to the frequency components of the audio signal being coded. By
reducing or eliminating coding noise wherever there are no audio signals to
mask it, the sound quality of the original signal can be subjectively
preserved. In this key respect, a perceptual coding system like AC-3 is
essentially a form of very selective and powerful noise reduction.
In Dolby AC-3, bits are distributed among the
filter bands as needed by the particular frequency spectrum or dynamic nature
of the program. A built-in model of auditory masking allows the coder to alter
its frequency selectivity (as well as time resolution) to make sure that a
sufficient number of bits are used to describe the audio signal in each band,
thus ensuring noise is fully masked. AC-3 also decides how the bits are
distributed among the various channels from a common bit pool. This technique
allows channels with greater frequency content to demand more data than
sparsely occupied channels, for example, or strong sounds in one channel to
provide masking for noise in other channels.
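The idea of a common bit pool can be illustrated with a deliberately simple sketch. It is not the AC-3 allocation algorithm, which is driven by a masking model; here each channel merely reports a hypothetical "demand" number and receives a proportional share of the pool. The function name and the demand values are assumptions made for the example.

```python
def allocate_from_pool(pool_bits, demands):
    """Split `pool_bits` among channels in proportion to their demands."""
    total = sum(demands)
    if total == 0:
        return [0] * len(demands)
    shares = [pool_bits * d // total for d in demands]
    # Integer division can leave a few bits over; give them to the
    # most demanding channels so the whole pool is used.
    leftover = pool_bits - sum(shares)
    by_demand = sorted(range(len(demands)), key=lambda i: demands[i], reverse=True)
    for i in by_demand[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical moment in a soundtrack: busy left and right channels,
# quiet center and subwoofer, silent surrounds.
demands = [5, 1, 5, 0, 0, 1]
print(allocate_from_pool(1000, demands))  # [417, 83, 417, 0, 0, 83]
```

With separate fixed budgets per channel, the silent surrounds would hold bits they cannot use; with a shared pool those bits go where the signal is.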
Dolby AC-3's sophisticated masking model and
shared bit pool arrangement are key factors in its extraordinary spectrum
efficiency. Furthermore, where other coding systems have to use considerable
(and precious) data to carry instructions for their decoders, AC-3 can use
proportionally more of the transmitted data to represent audio, which means
higher sound quality.
Technically speaking, AC-3 can process at least
20-bit dynamic range digital audio signals over a frequency range from 20 Hz
to 20 kHz ±0.5 dB (-3 dB at 3 Hz and 20.3 kHz). The bass effects channel covers
20 to 120 Hz ±0.5 dB (-3 dB at 3 and 121 Hz). Sampling rates of 32, 44.1, and
48 kHz are supported. Data rates range from as low as 32 kb/s for a single
mono channel to as high as 640 kb/s, thereby covering a wide range of
requirements. Typical applications include 384 kb/s for 5.1-channel Dolby
Surround Digital consumer formats, and 192 kb/s for two-channel audio
distribution.
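To put these data rates in perspective, the following sketch estimates the disc space one coded track occupies and how it compares with uncompressed PCM. The 120-minute running time, the choice of 16-bit 48 kHz PCM as the reference, and the treatment of the bass effects channel as a sixth full channel are assumptions made for the example; the function names are hypothetical.

```python
def stream_megabytes(kbps, minutes):
    """Storage used by a constant-rate stream, in decimal megabytes."""
    return kbps * 1000 * minutes * 60 / 8 / 1e6


def reduction_ratio(channels, sample_rate_hz, bits_per_sample, coded_kbps):
    """Uncompressed PCM data rate divided by the coded data rate."""
    pcm_kbps = channels * sample_rate_hz * bits_per_sample / 1000
    return pcm_kbps / coded_kbps


print(stream_megabytes(384, 120))           # 345.6 MB for a 5.1 track
print(stream_megabytes(192, 120))           # 172.8 MB for a two-channel track
print(reduction_ratio(6, 48000, 16, 384))   # 12.0
```

Under these assumptions a 5.1-channel track takes roughly a twelfth of the space the same channels would need as uncompressed PCM.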