01.
MPEG
VIDEO ///
COPYRIGHT ©
1956/2002 TVI/josiecory
WHAT'S
MPEG? Q&A
This page provides a background of the
theory behind, and techniques involved
with MPEG VIDEO. It also covers the more
technical aspects of the MPEG standard.
This is presented in questions and answers
format.
02.
MPEG
AUDIO
This
page provides a background of the theory
behind, and techniques involved with MPEG
AUDIO. It also covers the more technical
aspects of the MPEG standard. This is
presented in questions and answers
format.
03.
MPEG
General
This page provides a background of the
theory behind, and techniques involved
with MPEG General. It also covers the more
technical aspects of the MPEG standard.
This is presented in questions and answers
format.
04.
Blu-ray
Disc
format
05.
Philips
DVD - Real-Time RW Recorder
Requirements
Making
Sure Your Video Product is Video/DVD/CD/MP3
Compatible . . . is just as important as
Protecting Your Broadcasting Licensing
Rights
Verification
Process for +RW, DVD+RW and VRA DVD, VideoCD
products. The Logo That Counts -
VRA
LOGO - A MUST
ALL RIGHTS RESERVED
Television International - Smart90.com - Lookradio.com Extras
as there are seconds in a year
MPEG Applications
MPEG Audio is an ISO (International Organization for Standardization) and IEC
(International Electrotechnical Commission) world standard -
open to everyone on an equal, non-discriminatory basis.
MPEG Business Applications
MPEG Consumer Applications
Advantages of MPEG technology are:
Super Audio CD, The Natural
The launch of the
Compact Disc had a great impact on the global Music
Industry. Customers were provided with new levels of
performance and convenience, and CD rapidly grew to become
the World's most popular audio format. However, some
discerning listeners were not fully satisfied with the sound
quality of CD, and longed for something better. The
introduction of Super Audio CD by Philips and Sony finally
provides even the most discerning listener with the highest
possible audio quality as a result of breakthroughs in the
fields of recording technology, data encoding techniques,
data storage and laser optics. Not only that, but SACD was
developed from the outset to have full compatibility with
CD, thereby safeguarding consumers' investments, and
providing them with a smooth migration path from CD to Super
Audio CD. It was also designed to take advantage of the move
towards multi-channel that is rapidly gaining in
popularity.
General information on Super Audio CD
Mount Rainier
Mount Rainier enables
native OS support of data storage on CD-RW. This makes the
technology far easier to use and allows the replacement of
the floppy. This is done by having defect management in the
drive, by making the drive 2k addressable, by using
background formatting, and by standardizing both command set
and physical layout. The new standard is promoted by Compaq,
Microsoft, Philips, and Sony and is supported by 38 industry
leaders: OS vendors, PC OEMs, ISVs, chip makers, and media
makers.
01. MPEG VIDEO
MPEG video is a worldwide standard - so all MPEG equipment is the same?
The MPEG video standard
allows MPEG compatible equipment to inter-operate, because
the bitstreams are standardised. However, the way the actual
encoding process is implemented to generate the bitstream is
up to the encoder designer. Therefore, not all equipment will
necessarily produce the same quality video (at a given
bit rate). There will be a range of products available, at
different price levels, which the consumer can choose from
to suit their own application.
What is the difference between MPEG1 video and MPEG2 video?
MPEG1 is standardised at
a maximum bit rate of 1.856 Mbit/s (for CD based
applications) and does not support fully interlaced video.
It is, however, possible to use higher bit rates for
increased video quality in applications that are not CD
based (e.g. broadcast). MPEG2 video is an extension of
the MPEG1 video standard, which supports fully interlaced
video. It has been proven to provide studio quality pictures
at bit rates between 4-9 Mbit/s. Both MPEG1 and MPEG2
support additional layers which allow for the addition of
other types of data along with the video data. MPEG1 and
MPEG2 are different in their throughput of information.
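The throughput difference can be estimated from the picture sizes quoted in this answer. The following is a rough sketch only: it counts luminance samples per second before compression, and it assumes nominal rates of 30 pictures/s for the 240- and 480-line formats and 25 pictures/s for the 288- and 576-line formats. The names `FORMATS` and `pixel_rate` are illustrative and do not come from the MPEG standard.

```python
# Uncompressed luminance sample rates for the picture sizes quoted in the text.
# Picture rates are nominal values assumed for this illustration.
FORMATS = {
    "SIF NTSC": (352, 240, 30),
    "SIF PAL": (352, 288, 25),
    "CCIR-601 NTSC": (720, 480, 30),
    "CCIR-601 PAL": (720, 576, 25),
}


def pixel_rate(width, height, pictures_per_second):
    """Return luminance samples per second for one picture format."""
    return width * height * pictures_per_second


rates = {name: pixel_rate(*fmt) for name, fmt in FORMATS.items()}
ratio_ntsc = rates["CCIR-601 NTSC"] / rates["SIF NTSC"]
ratio_pal = rates["CCIR-601 PAL"] / rates["SIF PAL"]

for name, rate in rates.items():
    print(f"{name:14s} {rate:>11,d} samples/s")
print(f"CCIR-601 / SIF ratio: {ratio_ntsc:.2f} (NTSC), {ratio_pal:.2f} (PAL)")
```

Under these assumptions the CCIR-601 formats carry a little over four times as many samples per second as SIF, which is in line with the "more than four times" figure given in this answer.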
MPEG1 is generally recognised to produce video with SIF
resolution (352 x 240/288). MPEG2 handles CCIR-601
resolution (720 x 480/576) as well as the MPEG1 defined
resolutions. Although MPEG1 can produce higher resolution,
the bandwidth used is greatly reduced and delivers lower
quality video (30 field/s). The amount of data processed by
MPEG2 can be more than four times that of MPEG1 and is
recognised as full 60 field/s video. MPEG2 will be used most
in applications where bandwidth is a lesser consideration
than quality.
What is interlaced video?
Video pictures can be
represented in two different ways: 'progressive' and
'interlaced'. Progressive video, as used in movies and
computer screens, means the picture is constructed, top down,
line by line. Interlaced video, as used in a television set,
displays the odd lines (the odd field) first. Then it displays
the even lines (the even field). Each pair forms a frame and
there are 50(60)* of these fields displayed every second (or
25(30)* frames every second). This is referred to as
interlaced video. *PAL(NTSC). Using interlaced video means
that a picture can be encoded as a frame or a field. A frame
of interlaced video consists of two fields which are samples
of the full vertical image separated by the time of the
field period, the lines of one image falling exactly between
the lines of the other. The standard for displaying any sort
of non-film video is 30 frames per second for NTSC systems,
and 25 frames per second for PAL/SECAM systems. This simply
means that the video is made up of 30 (or 25) pictures or
'frames' for every second of video. Additionally, these
frames are split in half (odd lines and even lines), to form
what are called 'fields'. Film format video is progressive
and has a frame rate of 24 frames per second. MPEG2 supports
conversion of film format to interlaced 25 or 30 frames per
second; for 30 frames per second this uses a technique
called 3:2 pull down.
Is MPEG1 video compatible with MPEG2 video?
MPEG video is forwards
compatible but not backwards compatible. This means that an
MPEG2 decoder will correctly decode an MPEG1 bitstream to
produce an MPEG1 quality picture. However an MPEG1 decoder
cannot decode an MPEG2 bitstream. This means that all back
catalogues of MPEG1 encoded material will not become
obsolete when equipment fitted with MPEG2 decoders is
installed.
Why encode as frames and fields?
MPEG2 video uses the
benefits of interlaced video to further increase encoding
efficiency and picture quality. Slow moving pictures are
best encoded by combining the Fields into a single frame and
then encoding the frame. Where there is a large amount of
fast movement, it is most efficient to encode each field
separately. MPEG2 allows switching between the two modes on
a block-by-block basis.
What is a profile?
Not all parts of the
MPEG2 standard are used for MPEG2 video applications.
Profiles provide a means of defining subsets of the syntax
and semantics of the MPEG2 standard. Profiles are used to
create a 'tool set' for a certain specific application. By
taking the definitions of the MPEG2 bitstream, the profile
is built up for the video encoding process. Profiles can be
scalable or non-scalable. Scalable profiles are used to
encode video which is to be used in real time transmission
links because the decoder does not need to decode the whole
bitstream to reconstruct the picture. The drawback of
scalable profiles is the complexity of the encoding process.
Non-scalable profiles are much less complicated, can produce
higher quality pictures and are more suited to encoding for
storage on fixed capacity medium (optical, magnetic).
Profiles are coupled with various levels to completely
define the MPEG2 video encoding/decoding behaviour.
What is a level?
A level is the set of
definitions in the MPEG standard for physical parameters
such as bit rates, picture sizes and resolutions. There are
four levels specified by MPEG2: High level, High 1440, Main
level, and Low level. MPEG2 Video Main Profile and Main
Level has sampling limits at CCIR 601 parameters (PAL and
NTSC). Profiles limit syntax (i.e. algorithms), whereas
Levels limit encoding parameters (sample rates, frame
dimensions, coded bitrates, buffer size etc.). Together,
Video Main Profile and Main Level (abbreviated as MP@ML)
keep complexity within current technical limits, yet still
meet the needs of the majority of applications. MP@ML is the
most widely accepted combination for most cable and
satellite TV systems; however, different combinations are
possible to suit other applications.
How is the video information actually encoded?
Encoding of video
information is achieved by using two main techniques. These
are termed spatial and temporal compression. Spatial
compression involves analysis of a picture to determine
redundant information within that picture, for example by
discarding frequencies that are not visible to the human
eye. Temporal compression is achieved by only encoding the
difference between successive pictures. Imagine a scene
where at first there is no movement, then an object moves
across the picture. The first picture in the sequence
contains all the information required until there is any
movement, so there is no need to encode any of the
information after the first picture until the movement
occurs. Thereafter, all that needs to be encoded is the part
of the picture that contains movement. The rest of the scene
is not affected by the moving object because it is still the
same as the first picture. The means by which it is
determined how much movement is contained between two
successive pictures is known as motion estimation
prediction. The information obtained from this process is
then used by motion compensated prediction to define the
parts of the picture that can be discarded. This means that
pictures cannot be considered in isolation. A given picture
is constructed from the prediction from a previous picture,
and may be used to predict the next picture. There is also
the need to have pictures which are not used in any
reference for random access. Therefore MPEG2 defines three
picture types:
I (Intraframe) pictures. These are encoded without reference
to another picture, to allow for random access.
P (Predictive) pictures. These are encoded using motion
compensated prediction on the previous picture and therefore
contain a reference to the previous picture. They may
themselves be used in subsequent predictions.
B (Bi-directional) pictures. These are encoded using motion
compensated prediction on the previous and next pictures,
which must be either an I or a P picture. B pictures are not
used in subsequent predictions.
The I, P and B pictures can be formed into a group of
pictures (GOP). Each picture type (I, P, B) in turn provides
increased opportunity to remove redundancy. An I picture is
encoded with the least compression (only spatially redundant
information is removed). P and B pictures also use motion
compensation to remove temporally redundant information.
B pictures offer the most compression.
Typical bit allocations (30 Hz CCIR 601 I 4Mbit/s)
What is a group of pictures?
This is the grouping of
I, B and P pictures into a specified sequence known as a
group of pictures (GOP). The group must start and end with
an I picture to allow for random access to the group, and
contains B and P pictures in between in a specified sequence
(determined by the designer). A group can be made of
different lengths to suit the type of video being encoded,
for example it is better to use a shorter group length for a
film which contains a lot of fast moving action with complex
scenes. A group length is typically between 8 - 24 pictures.
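As a small illustration of how a designer might lay out such a sequence, the sketch below builds one GOP in display order from two hypothetical parameters: the group length and the spacing between reference (I or P) pictures. Neither the function names nor the regular spacing rule are prescribed by the text; they are assumptions made for this example. The sketch shows a single group only, so the I picture that closes it is the one that opens the following group.

```python
def gop_pattern(length, anchor_spacing):
    """Return the picture types of one GOP in display order.

    length: number of pictures in the group (assumed parameter).
    anchor_spacing: distance between I/P pictures (assumed parameter).
    """
    if length < 1 or anchor_spacing < 1:
        raise ValueError("length and anchor_spacing must be positive")
    types = []
    for index in range(length):
        if index == 0:
            types.append("I")          # group starts with an I picture
        elif index % anchor_spacing == 0:
            types.append("P")          # predicted from the previous I or P
        else:
            types.append("B")          # predicted from both neighbours
    return "".join(types)


def gop_duration_seconds(length, frames_per_second):
    """Time between random access points for a given group length."""
    return length / frames_per_second


pattern = gop_pattern(12, 3)
print(pattern)                          # IBBPBBPBBPBB
print(gop_duration_seconds(12, 25))     # 0.48
```

With a group length of 12 on a 25 frames per second system, a random access point occurs roughly every half second.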
Commonly used GOP sizes are 12 for 50 Hz systems, 16 for 60
Hz systems. GOPs are optional in an MPEG2 bitstream, but are
mandatory in DVD video, to achieve an SMPTE timebase. A
bitstream with no GOP header can be directly accessed at a
specific point using the sequence header.
How does motion estimation prediction work?
Motion estimation
prediction is a method of determining the amount of movement
contained between two pictures. This is achieved by dividing
the picture to be encoded into sections known as
macroblocks. The size of a macroblock is 16 x 16 pixels.
Each macroblock is searched for the closest match in the
search area of the picture it is being compared with. Motion
estimation prediction is not used on I pictures, however B
and P pictures can refer to I pictures. For P pictures, only
the previous picture is searched for matching macroblocks.
In B pictures both the previous and next pictures are
searched. When a match is found, the offset (or motion vector)
between them is calculated. The matching parts are used to
create a prediction picture, by using the motion vectors.
The prediction picture is then compared in the same manner
to the picture to be encoded. Macroblocks which have a match
have already been encoded, and are therefore redundant.
Macroblocks which have no match to any part of the search
area in the picture to be encoded represent the difference
between the pictures, and these macroblocks are encoded.
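The matching step described above can be sketched as an exhaustive block search. This is a minimal illustration, not how any particular encoder works: it assumes pictures are lists of rows of luminance values, uses the sum of absolute differences (SAD) as the match criterion, and works at whole-pixel accuracy. The function names, the SAD criterion and the `search_range` parameter are all assumptions made for the example.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size=16, search_range=7):
    """Find the best match in ref for the block of cur at (cx, cy).

    Returns (mx, my, cost), where (mx, my) is the offset from the block
    position to the matching position in the reference picture.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = cx + mx, cy + my
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    return best[1], best[2], best[0]


# Demo: the current picture is the reference moved 2 pixels right, 1 down.
ref = [[(7 * x + 13 * y) % 251 for x in range(16)] for y in range(16)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(16)]
       for y in range(16)]
print(full_search(cur, ref, 4, 4, size=4, search_range=3))  # (-2, -1, 0)
```

A cost of zero means the block was found unchanged in the reference picture, so only the motion vector needs to be sent for it.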
What is meant by a search area?
A search area is used in
the motion compensated prediction process, to determine the
area that the encoder searches in the previous picture for
each macroblock. When the comparison is made, it can be on a
pixel or half pixel basis. A half pixel search is more
accurate and produces higher quality pictures, but is more
time consuming. The MPEG2 video standard defines that the
motion vectors must be transmitted in the half pixel format,
even if the search was only pixel accurate. By interpolating
adjacent pixels a much more accurate motion prediction
picture is obtained than using individual pixels. There are
many ways of defining the way in which macroblocks are
compared in the search area. Three widely recognised
methods are:
A full block motion estimation search, where macroblocks are
compared in the entire search area to seek a matching
macroblock. This process requires a large computational
effort.
A telescopic motion estimation search, which reduces the
search time by looking for a match initially in every fourth
macroblock. When a near match is obtained, every second
macroblock is searched, then every macroblock until the
search has 'homed in' on the best match.
A hierarchical motion estimation search, where before the
search is made, the two pictures to be compared are filtered
to reduce the search area by a factor of four. This is a
common technique used in MPEG2 video encoders.
How does a decoder reconstruct the picture?
Decoding the MPEG
bitstream is essentially the reverse process to encoding.
The spatial information is retrieved from the encoded
bitstream by an inverse DCT and dequantizing procedure. This
restores the original frequency coefficients (as far as the
accuracy of the encoder quantizing process allows). The
decoder reconstructs temporal information in the picture by
using the transmitted macroblocks which were matched to
replace redundant macroblocks discarded during encoding. The
position of the replaced macroblocks is obtained from the
motion vectors, which are included in the MPEG bitstream.
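A rough sketch of this reconstruction step is given below. It assumes that, for each predicted block, the decoder has a motion vector and a small block of decoded difference values (here called the residual, which is an assumption of this example rather than a term used in the text), and that sample values are clamped to the 8-bit range. The function name and argument layout are hypothetical.

```python
def reconstruct_block(ref, residual, bx, by, motion_vector, size):
    """Rebuild one block from a reference picture and a motion vector.

    ref: previously decoded picture (list of rows of sample values).
    residual: size x size decoded differences (assumed for this sketch).
    (bx, by): top-left position of the block being rebuilt.
    motion_vector: (mx, my) offset to the matching block in ref.
    """
    mx, my = motion_vector
    block = []
    for dy in range(size):
        row = []
        for dx in range(size):
            predicted = ref[by + my + dy][bx + mx + dx]
            value = predicted + residual[dy][dx]
            row.append(max(0, min(255, value)))  # keep within 8-bit range
        block.append(row)
    return block


ref_picture = [[10 * y + x for x in range(8)] for y in range(8)]
rebuilt = reconstruct_block(ref_picture, [[1, 0], [0, -1]], 4, 4, (-2, -1), 2)
print(rebuilt)  # [[33, 33], [42, 42]]
```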
The decoder needs two memory stores, one to hold the
previous picture, one the next picture (to handle
bi-directional pictures).
Can you use a variable bit rate for video encoding?
In any given video
section, certain parts contain more movement than others or
more fine detail. For example a clear blue sky is simpler to
encode than a picture of a tree. As a result the number of
bits needed to faithfully encode without artifacts varies
with the video material. In order to encode in the best
possible way, it is advantageous to save bits from the
simple sections and use them to encode complex ones. This
is, in a simple way, what variable bit rate encoding does,
however the process by which the bit rates are calculated is
complex. Variable bit rate encoding can be carried out in
one or two passes of the video data. For fixed size storage
applications such as DVD, the amount of encoded video
information must be known in advance, therefore two passes
of the video information are required. This ensures that the
amount of data is not too small (quality compromised) or too
large (not enough storage space). The first pass is used to
analyse and store encoding information about the video data,
the second pass uses this information to perform the actual
encoding. Where the amount of encoded data produced is not
so critical, encoding can be carried out in one pass of the
input video.
What are the advantages of using a variable bit rate?
The advantage of using a
variable bit rate is mainly the gain it gives in encoding
efficiency. For fixed storage media (e.g. DVD) the
variable bit rate is ideal. By reducing the amount of space
needed to store the video (whilst retaining very high
quality), it leaves more space on the medium for inclusion
of other features e.g. multiple language soundtracks,
extra subtitle channels, interactivity, etc. The other
important feature of the variable bit rate system is that it
gives constant video quality for all complexities of program
material. A constant bit rate encoder provides variable
quality.
VARIABLE BIT RATE = CONSTANT QUALITY
CONSTANT BIT RATE = VARIABLE QUALITY
It is possible to use a
variable bit rate in, for example a satel-
So why encode video with a fixed bit rate?
For some applications,
it is necessary to transmit the encoded video information
with a fixed bit rate. For example, in broadcast mediums
(satellite, cable, terrestrial etc.), practical limitations
mean that current transmission is restricted to using a
fixed bit rate. This is why fixed bit rate MPEG2 encoders
are available. It is true that a fixed bit rate encoder is
not as efficient as the variable bit rate system, however
the MPEG2 system still provides very high quality video for
both encoding methods. Very importantly, fixed bit rate
encoding can also be carried out in real time, i.e. one pass
of the video information. For live broadcasts, satellite
link-ups etc. the real time encoding capability is
essential.
What video formats can MPEG2 handle?
MPEG2 can encode video
for both PAL/SECAM (and PAL+) and NTSC formats. It also
allows for different aspect ratios, i.e. 16:9 and 4:3. MPEG2
video includes a system to convert progressive film format
video (24 frames/sec) to the interlaced 25 frames/sec of PAL
and the 30 frames/sec of NTSC. For NTSC this process is
called 3:2 pull down. It is achieved by taking the
progressive video sequence and repeating selected fields to
increase the frame rate.
What picture sizes can MPEG video handle?
MPEG2 defines a range of
picture sizes to suit a range of different applications.
What about subtitles?
MPEG1 allows only open
caption subtitles. The MPEG2 bitstream makes a provision for
up to 32 different closed caption subtitle channels in
addition to the audio and video information. These subtitles
can be used to provide 32 different language subtitle
channels, one of which is selected for playback at the
decoder using the MPEG system layer.
What are the applications of MPEG2 video?
Cable TV networks are using MPEG2 as the standard for
compressing and decompressing video for distribution and for
broadcasting. They want high quality video and have the
bandwidth needed to handle high bit rates.
DBS (Direct Broadcast Satellite) will use MPEG2 video for
direct broadcast. Multi-source channel rate control methods
are employed to optimally allocate bits between several
programs on one data carrier. An average of 150 channels
is planned.
HDTV (High-Definition Television, also known as ATV). The
U.S. Grand Alliance, a consortium of companies, has already
agreed to use the MPEG2 Video and Systems syntax
(including B-pictures). Interlaced and progressive modes
will be supported.
DVD. The developers of DVD have defined in its specification
that MPEG2 video is to be the video encoding standard. This
is made possible by the greatly increased capacity of DVD
(up to a maximum of 17 GB). DVD can also take advantage of
the efficiency increases of variable bit rate encoding, not
presently possible in broadcast systems. For DVD, it will
also be possible to make the playback interactive by
including, for example, several different camera angles of
the same scene which the viewer can switch between on
playback. It is also possible to include several different
storylines which the viewer can select between.
Video on Demand (VOD) encompasses nearly all video based
applications, but the most common application referred to
regarding VOD is movies on demand. Initially in hotels and
hospitals, and eventually in our homes, all of us will have
an interactive television system from which we can order
which movie we want, when we want. The technology exists
today for this application although VOD to the home is
some time away from a large scale implementation. This
application is also planning the use of MPEG2 video.
However, VOD in hotels is well underway in many areas of
the U.S. and around the world.
02. MPEG AUDIO
MPEG2 audio is a
compatible extension to MPEG1 audio encoding, which enables
the transmission of mono, stereo, or multichannel audio in a
single bitstream. It can operate at a wide range of bitrates
(8 kbit/s up to more than 1 Mbit/s) and supports sampling
rates of 16, 22.05, 24, 32, 44.1 and 48 kHz. For stereo, a
typical application would operate at an average bit rate of
128-256 kbit/s. A multichannel movie soundtrack requires an
average bit rate of 320-640 kbit/s, depending on the number
of channels and the complexity of the audio to be encoded.
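To put the stereo bit rates quoted above in perspective, the sketch below compares them with the bit rate of uncompressed stereo audio. The reference figures (44.1 kHz sampling, 16 bits per sample, two channels, as used on CD) are not taken from this page and are assumed here purely for the comparison; the function names are illustrative.

```python
# Assumed reference: linear PCM as used on CD (not stated in the text).
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM audio, in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(pcm_bits_per_second, coded_kbit_per_second):
    """How many times smaller the coded stream is than the PCM stream."""
    return pcm_bits_per_second / (coded_kbit_per_second * 1000)


cd_bps = pcm_bit_rate(CD_SAMPLE_RATE_HZ, CD_BITS_PER_SAMPLE, CD_CHANNELS)
print(f"Uncompressed stereo: {cd_bps / 1000:.1f} kbit/s")
for kbps in (128, 256):
    ratio = compression_ratio(cd_bps, kbps)
    print(f"{kbps} kbit/s is about {ratio:.1f} times smaller")
```

Under these assumptions, the 128-256 kbit/s stereo range corresponds to a reduction of roughly 5.5 to 11 times relative to the uncompressed signal.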
MPEG2 defines an extension for five full bandwidth channels
plus a low frequency enhancement (LFE) channel, termed 5.1
multichannel. With an additional compatible extension, seven
channels are possible (7.1 multichannel).
How does MPEG audio work?
In devising an encoding
method, the basis had to be the human ear. Although not a
perfect device for acoustic reception, advantage was taken
of one of its characteristics: a non-linear and adaptive
threshold of hearing. The threshold of hearing is the level
below which a sound is not heard. It varies with frequency
and, of course, between individuals. Most people's hearing
is most sensitive between 2 and 5 kHz. Whether a person
hears a sound or not depends on the frequency of the sound
and whether the amplitude is above or below that person's
hearing threshold at that frequency. The threshold of
hearing is adaptive, and is constantly changed by the sounds
heard. For example, an ordinary conversation in a room is
perfectly audible under normal conditions. However, the same
conversation in the vicinity of a loud noise, such as an
aircraft passing low overhead, is impossible to hear due to
the distortions introduced to the hearing thresholds of the
individuals concerned. When the aircraft has gone the
hearing thresholds return to normal. Sounds that are
inaudible due to dynamic adaptation of the hearing threshold
are said to be 'masked'. This effect is universal but is of
particular relevance in music. An orchestral instrument
playing fortissimo will, to a greater or lesser extent, make
the sound of some other instruments inaudible to the human
ear. When the music is recorded, however, all the
frequencies go on the medium because the response of the
recording device is flat, i.e. it is not dynamically
adaptive. When the recording is played the masked
instruments will not be audible to the listener. A linear
recording, as used on CD, is inefficient in this respect. To
make the best use of a recording medium the parts of the
medium that contain inaudible data can better be used to
store audible data. In this way a fixed capacity recording
medium can contain a considerably increased amount of audio
without any loss of quality. Also, the demands on a
transmission link carrying the information are reduced.
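The idea of a frequency-dependent threshold, and of dropping components that fall below it, can be illustrated with a toy model. The curve used here is a commonly quoted approximation of the threshold of hearing in quiet; it is not taken from this page or from the MPEG standard. The `masker_raise_db` parameter is a deliberately crude, hypothetical stand-in for masking: it lifts the whole threshold by a fixed amount, whereas real masking depends on the frequency and level of the masking sound.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in quiet, in dB SPL (assumed model)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def audible(components, masker_raise_db=0.0):
    """Keep only the (freq_hz, level_db) components above the threshold.

    masker_raise_db is a hypothetical uniform lift of the threshold that
    stands in for a loud masking sound.
    """
    return [(freq, level) for freq, level in components
            if level > threshold_in_quiet_db(freq) + masker_raise_db]


components = [(1000, 20), (3000, 5), (12000, 5)]
most_sensitive_hz = min(range(100, 16001, 100), key=threshold_in_quiet_db)

print(most_sensitive_hz)                  # lies between 2 and 5 kHz
print(audible(components))                # quiet room
print(audible(components, 30.0))          # with a loud masker present
```

In this toy model the quiet 12 kHz component is never kept, and with the threshold raised by 30 dB none of the three components needs to be stored.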
Is MPEG2 compatible with MPEG1?
The MPEG2 standard was
designed with compatibility being a major consideration.
With the ever-growing number of applications of MPEG1 audio,
especially in the entertainment, satellite broadcasting
(DSS) and multimedia fields, this compatibility will provide
the consumer a cross-platform format to enjoy high quality
audio reproduction. The core of the MPEG2 bitstream is an
MPEG1 bitstream, which enables fully compatible decoding by
an MPEG1 audio decoder. In addition, the need to transfer
two separate bit streams (one for stereo and another one for
the multichannel audio program) is avoided. In other
words, a future upgrade of e.g. DSS with multichannel audio
will not make existing set top boxes obsolete. The existing
ones will reproduce stereo, the new ones high quality
multichannel sound.
How does an MPEG1 decoder handle multichannel input?
An MPEG1 decoder will be
supplied in the MPEG1 part of the bitstream with an
appropriate (stereo) 'downmix' of all channels in the
multichannel frame. The left and right channel of the stereo
signal contain components of all the channels, according to
the equations in the compatibility matrix. The MPEG1
(stereo) decoder decodes the stereo part