Other standards


The ISO and IEC standards bodies have formed two committees that have defined video/audio compression, namely the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG). These two committees have defined the JPEG, MPEG1, MPEG2 and MPEG4 standards. While these standards are more applicable to broadcasting and multimedia, some of the standards are used in videoconferencing.

1 JPEG

JPEG is defined in ISO/IEC standard 10918-1 (also published as ITU-T Recommendation T.81). This is an international standard for the compression and coding of continuous-tone still images. The standard includes several methods of compression, depending on the intended application. JPEG is normally used as a ‘lossy’ method of compression, as it loses some detail during the coding/decoding process; in return, it can be adjusted to be very economical in terms of data rate. ‘Lossless’ algorithms, on the other hand, can be decoded to reproduce the original detail exactly but require higher data rates for transmission.

2 MPEG1

This is a popular standard for the compression and coding of moving images and sound. It is the format used to store material on CD-ROM and CD-i; the maximum data rate is about 1.5 Mbit/s. MPEG1 has three elements:

  • MPEG1 ISO/IEC 11172-1 defines the MPEG1 multiplex structure, i.e. the way in which the digital audio/video/control data is combined;
  • MPEG1 ISO/IEC 11172-2 defines the MPEG1 video compression and coding;
  • MPEG1 ISO/IEC 11172-3 defines the MPEG1 audio coding.

MPEG1 is a widely used compression format and has been used for CD-ROM production. It has an upper video resolution of 352 x 288 pixels (i.e. CIF) which, while adequate for many applications, represents only a quarter of the SDTV (Standard Definition Television) resolution of 704 x 576. Because of this limitation, the MPEG2 standard was developed to meet the needs of the broadcasters.
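The ‘quarter of SDTV’ figure follows directly from the pixel counts, as this short check shows. The variable and function names are chosen here for illustration only.

```python
# Compare the pixel counts of CIF and SDTV as quoted in the text.
CIF = (352, 288)    # width, height in pixels
SDTV = (704, 576)   # width, height in pixels

def pixel_count(resolution):
    """Return the total number of pixels in a (width, height) frame."""
    width, height = resolution
    return width * height

cif_pixels = pixel_count(CIF)
sdtv_pixels = pixel_count(SDTV)
ratio = sdtv_pixels / cif_pixels

print(f"CIF:  {cif_pixels} pixels")
print(f"SDTV: {sdtv_pixels} pixels")
print(f"SDTV carries {ratio:.0f} times as many pixels as CIF")
```

Halving both the width and the height divides the pixel count by four, which is why CIF is a quarter of the SDTV frame.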

3 MPEG2

  • MPEG2 ISO/IEC 13818-1 defines MPEG2 data stream formats.
  • MPEG2 ISO/IEC 13818-2 defines MPEG2 video coding.
  • MPEG2 ISO/IEC 13818-3 defines MPEG2 audio coding.

MPEG2 is essentially a ‘compression toolbox’ which uses all the MPEG1 tools but adds new ones. MPEG2 is backward compatible with MPEG1, i.e. an MPEG2 decoder can decode all MPEG1-compliant data streams. MPEG2 has various levels of spatial resolution, dependent on the application.

  • Low level, i.e. 352 x 288 pixels (CIF resolution)
  • Main level, i.e. 720 x 576 pixels (Phase Alternating Line (PAL) TV resolution)
  • High level, i.e. 1440 x 1152 pixels (high definition TV)
  • High level wide screen, i.e. 1920 x 1152 pixels.

MPEG2 has further options regarding the algorithms used for coding and compressing the information; these are known as ‘profiles’.

  • Simple Profile uses a simple Encoder and Decoder but requires a high data rate.
  • Main Profile requires a more complex Encoder and Decoder at a greater cost but requires a lower data rate.
  • Scalable Profiles allow a range of algorithms to be transmitted together, e.g. basic encoding for decoding by an inexpensive decoder and enhanced encoding, which can be accessed by a more sophisticated and more expensive decoder.
  • High Profile caters for High Definition Television (HDTV) broadcasts.

The most common MPEG2 combination is Main profile at Main level, which is used for television broadcasting. Depending on the quality required, the data rate can vary from 4 to 9 Mbit/s. Data rates for the whole MPEG2 family can vary between 1.5 and 100 Mbit/s.
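To put the Main profile, Main level data rates in context, the sketch below estimates the compression ratio relative to uncompressed video. It reads the quoted figure as a range of 4 to 9 Mbit/s and rests on two assumptions that are not stated in the text: a frame rate of 25 frames per second and 4:2:0 sampling at 8 bits per sample (an average of 12 bits per pixel).

```python
# Rough compression ratio for Main profile at Main level (720 x 576).
# Assumptions (not from the text): 25 frames/s, 4:2:0 sampling at 8 bits per
# sample, i.e. an average of 12 bits per pixel for the uncompressed picture.
WIDTH, HEIGHT = 720, 576
FRAMES_PER_SECOND = 25
BITS_PER_PIXEL = 12

raw_bps = WIDTH * HEIGHT * FRAMES_PER_SECOND * BITS_PER_PIXEL

def compression_ratio(coded_mbit_s):
    """Ratio of the uncompressed bit rate to a coded rate given in Mbit/s."""
    return raw_bps / (coded_mbit_s * 1_000_000)

print(f"Uncompressed: {raw_bps / 1_000_000:.1f} Mbit/s")
for rate in (4, 9):
    print(f"{rate} Mbit/s -> roughly {compression_ratio(rate):.1f}:1")
```

Under these assumptions the coded stream is somewhere between about one fourteenth and one thirtieth of the uncompressed rate; different frame rates or sampling formats would shift the figures.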

4 MPEG4

MPEG4 is a comprehensive format that builds on the MPEG1 and MPEG2 standards. It is designed to provide a mechanism whereby multimedia content can be exchanged freely between the producers (e.g. the broadcasters and record companies), the distributors (telephone companies, cable networks, Internet Service Providers (ISPs) etc.) and the consumers. This content can be audio, video and/or graphic material. The delivery can be one-way or interactive and may be streamed in real time. This all-encompassing standard spans digital broadcasting, interactive graphics and multimedia over the Internet and includes 3G multimedia phones. The standard has numerous profiles for audio, video, graphics etc. MPEG4 AVC, sometimes referred to as ‘MPEG4 part 10’, is the one most likely to be met with in videoconferencing.

4.1 MPEG4 AVC

ISO/IEC collaborated with the ITU to develop this standard, which is also known as H.264. It is expected to eventually replace the MPEG2 and MPEG4 standards in many areas due to its more efficient coding algorithms. It is claimed that bandwidth can be reduced by 50% when compared to H.263 compression. Another big advantage of H.264 is its inbuilt IP adaptation layer, allowing it to integrate into fixed IP, wireless IP and broadcast networks with ease. It is also expected to find new applications in areas such as Asymmetric Digital Subscriber Line (ADSL).

5 Motion Joint Photographic Experts Group (MJPEG)

While MPEG encoding is now used extensively, it has serious limitations for some applications, particularly videoconferencing. The MPEG encoding/compression process, in common with H.261/H.263 coding of video signals, functions by eliminating a high proportion of the redundant spatial and temporal picture elements. Doing this takes a considerable amount of time (termed latency). In practice this ‘latency’ demands that the audio signal be delayed by a similar amount so that lip synchronisation can be preserved within a conference. To ensure realism, ‘echo cancellers’ then have to be incorporated to reduce echo between sites to an acceptable level.

If the temporal structure of the vision signal is left intact and JPEG frames are joined together, the resultant coding is called MJPEG or Motion JPEG. This signal format can overcome most of the latency problems. Unfortunately, no single standard has yet evolved for joining the JPEG frames together, so MJPEG itself is not an international standard in the way that the JPEG and MPEG formats are.
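Since there is no single standard way of joining the frames, the sketch below assumes the simplest possible arrangement: complete JPEG images concatenated back to back. It separates them again using the JPEG Start Of Image (0xFFD8) and End Of Image (0xFFD9) markers. The function name is hypothetical, and real streams may wrap the frames in a container or carry embedded thumbnails, which this naive scan does not handle.

```python
# Naive splitter for a byte stream of back-to-back JPEG images.
# Assumption: frames are simply concatenated, with no container format and
# no embedded thumbnails (which would contain their own SOI/EOI markers).
SOI = b"\xff\xd8"  # JPEG Start Of Image marker
EOI = b"\xff\xd9"  # JPEG End Of Image marker

def split_frames(stream):
    """Return the complete SOI..EOI frames found in `stream`, in order."""
    frames = []
    position = 0
    while True:
        start = stream.find(SOI, position)
        if start == -1:
            break                      # no further frame starts
        end = stream.find(EOI, start + len(SOI))
        if end == -1:
            break                      # trailing frame is incomplete
        frames.append(stream[start:end + len(EOI)])
        position = end + len(EOI)
    return frames

# Placeholder payloads stand in for real compressed image data.
frame_a = SOI + b"first frame payload" + EOI
frame_b = SOI + b"second frame payload" + EOI
recovered = split_frames(frame_a + frame_b)
print(len(recovered), "frames recovered")
```

Each recovered frame can be decoded on its own, without reference to its neighbours, which is the property that keeps MJPEG latency low.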

MJPEG coding/compression reduces the redundant spatial picture elements but does not affect the temporal elements (i.e. redundancy between successive frames). The process therefore generates far less latency than MPEG or H.261 systems and it is found that echo cancellers are generally not necessary. This reduces cost and has the potential to improve sound quality.
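The difference between the two kinds of redundancy can be seen in a toy example. The tiny 4 x 4 ‘frames’ below are invented for illustration: removing temporal redundancy means sending only what changed since the previous frame, so the decoder must hold that previous frame, whereas an intra-frame scheme such as MJPEG sends every frame in full.

```python
# Toy contrast between inter-frame (temporal) and intra-frame coding.
# The 4 x 4 frames are hypothetical; real codecs work on blocks and use
# motion compensation, which this sketch does not attempt.
previous_frame = [
    [10, 10, 10, 10],
    [10, 20, 20, 10],
    [10, 20, 20, 10],
    [10, 10, 10, 10],
]
current_frame = [
    [10, 10, 10, 10],
    [10, 20, 25, 10],   # a single pixel has changed
    [10, 20, 20, 10],
    [10, 10, 10, 10],
]

# Inter-frame view: only the difference from the previous frame is needed.
difference = [
    [cur - prev for prev, cur in zip(prev_row, cur_row)]
    for prev_row, cur_row in zip(previous_frame, current_frame)
]
changed = sum(1 for row in difference for value in row if value != 0)
total = sum(len(row) for row in current_frame)

# The decoder rebuilds the frame from the previous frame plus the difference.
rebuilt = [
    [prev + delta for prev, delta in zip(prev_row, diff_row)]
    for prev_row, diff_row in zip(previous_frame, difference)
]

print(f"Inter-frame: {changed} of {total} values differ from the previous frame")
print(f"Intra-frame: all {total} values are coded for every frame")
```

The inter-frame approach has far less to send, but the dependence on earlier frames is part of what adds processing delay; the intra-frame approach trades a higher data rate for lower latency.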

This reduction in latency is quite marked. For MJPEG CODECs the end-to-end audio delay is typically 60 milliseconds, whereas for an ISDN2 system (i.e. 128 kbit/s) the delay could be as much as 400 milliseconds (i.e. almost half a second). MJPEG encodes only vision signals, so another coding algorithm (usually G.711) or high quality Pulse Code Modulation (PCM) is used for the audio information.
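To give a feel for what these delays mean for the audio path, the sketch below works out how much G.711 audio must be held back to stay in step with the picture. It reads the two delays as 60 ms and 400 ms (the latter being ‘almost half a second’) and uses the standard G.711 parameters of 8000 samples per second at 8 bits per sample, i.e. 64 kbit/s. The function name is hypothetical.

```python
# Size of the audio delay buffer needed to match a given video latency.
# G.711 parameters: 8000 samples per second, one byte per sample (64 kbit/s).
G711_SAMPLES_PER_SECOND = 8000
G711_BYTES_PER_SAMPLE = 1

def delay_buffer_bytes(delay_ms):
    """Bytes of G.711 audio that span `delay_ms` milliseconds."""
    samples = G711_SAMPLES_PER_SECOND * delay_ms // 1000
    return samples * G711_BYTES_PER_SAMPLE

mjpeg_buffer = delay_buffer_bytes(60)    # delay read as 60 ms
isdn_buffer = delay_buffer_bytes(400)    # delay read as 400 ms

print(f"60 ms delay:  {mjpeg_buffer} bytes of audio held back")
print(f"400 ms delay: {isdn_buffer} bytes of audio held back")
```

The buffer itself is small in either case; what matters for the conference is the delay it represents, since a round trip of several hundred milliseconds makes echo far more noticeable.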