Understanding H.264 Video Compression

How the new H.264 standard for surveillance video compression works, and when to use it


H.264 is the latest official video compression standard. It follows on from the highly successful MPEG-2 and MPEG-4 video standards and offers improvements in both video quality and compression. The most significant benefit for IP video systems is the ability to deliver the same high-quality, low-latency digital video with savings of between 25 and 50 percent on bandwidth and storage requirements. Put another way, the goal is to deliver significantly higher video quality for the same bandwidth.

What is H.264?

H.264 is a video codec (compressor/decompressor, or encoder/decoder) standard. A video codec compresses and decompresses digital video in order to reduce the bandwidth required to transmit the video and the capacity required to store it. This is needed because the raw data rate of uncompressed CCIR601 active digital video (720x480 pixel, 4:2:2 video at 30fps) is in excess of 158Mbps. That raw volume is over 300 times the capacity of a 512kbps ADSL connection and would allow only a little more than one hour of recording on an 80GB hard disk.
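The figures above can be reproduced with a few lines of arithmetic. The sketch below assumes 8-bit samples (the article does not state the sample depth), which gives 16 bits per pixel for 4:2:2 video, and treats the 80GB disk as 80 x 10^9 bytes. Under those assumptions the raw rate works out to about 166Mbps in decimal units, or about 158Mbps if a megabit is taken as 2^20 bits. The function name is illustrative only.

```python
def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in bits per second."""
    return width * height * bits_per_pixel * fps

# Assumption: 8-bit samples. With 4:2:2 sampling every pixel carries one luma
# sample and, on average, one chroma sample, so 16 bits per pixel.
BITS_PER_PIXEL_422 = 16

rate = raw_bitrate_bps(720, 480, BITS_PER_PIXEL_422, 30)

adsl_bps = 512_000        # 512kbps ADSL link
disk_bytes = 80 * 10**9   # 80GB disk, assuming decimal gigabytes

print(f"raw rate: {rate / 1e6:.1f} Mbps "
      f"({rate / 2**20:.1f} Mbps if 1 Mb = 2**20 bits)")
print(f"multiple of ADSL capacity: {rate / adsl_bps:.0f}x")
print(f"recording time on disk: {disk_bytes * 8 / rate / 3600:.2f} hours")
```

The same function can be reused for other resolutions, frame rates or sampling schemes by changing its arguments.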

Simply scaling the video to SIF resolution (352x240 pixels with a 4:2:0 video color sampling scheme at 30fps) and compressing it with standard utilities such as WinZip or Gzip could achieve 10:1 compression. However, at least 300:1 compression is needed to stream live video over an ADSL connection and to achieve 300 hours (12.5 days) of recording on an 80GB hard disk. This level of compression can be achieved with H.264.
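As a rough check on these ratios, the sketch below computes the bitrate and recording time that result from a given compression ratio. It assumes 8-bit samples (12 bits per pixel for 4:2:0), a decimal 80GB disk, and that both ratios are measured against the full-resolution CCIR601 source; the article does not spell out any of these, and the helper names are illustrative.

```python
CCIR601_BPS = 720 * 480 * 16 * 30   # 4:2:2 source, 16 bits per pixel
SIF_BPS = 352 * 240 * 12 * 30       # 4:2:0 SIF, 12 bits per pixel

def bitrate_after(ratio, raw_bps=CCIR601_BPS):
    """Bitrate in bits per second after compressing raw_bps by ratio:1."""
    return raw_bps / ratio

def recording_hours(bitrate_bps, disk_bytes=80 * 10**9):
    """Hours of video that fit on the disk at the given bitrate."""
    return disk_bytes * 8 / bitrate_bps / 3600

# Reduction obtained from scaling and chroma subsampling alone.
scaling_gain = CCIR601_BPS / SIF_BPS
print(f"scaling to SIF alone: {scaling_gain:.1f}:1")

for ratio in (10, 300):
    bps = bitrate_after(ratio)
    print(f"{ratio}:1 -> {bps / 1000:.0f} kbps, "
          f"{recording_hours(bps):.0f} hours on 80GB")

# Ratio needed to fit the source exactly into a 512kbps link.
print(f"ratio for 512 kbps: {CCIR601_BPS / 512_000:.0f}:1")
```

Under these assumptions, scaling to SIF accounts for a little over half of the 10:1 figure, and 300:1 brings the stream to roughly the capacity of the ADSL link and to a little over 300 hours on the disk.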

Implementing the Standard

Before looking at H.264 in more detail, it is important to understand the difference between a standard and an implementation of that standard. The two are very different, so a statement such as "H.264 provides better video quality than MPEG-2" is a little misleading.

H.264 is a video compression standard. The H.264 standard defines the syntax of a compliant bitstream, to which a compliant decoder must conform exactly, implementing all the necessary tools defined by the standard in order to decode the bitstream.

An H.264 encoder, conversely, can implement a subset of the syntax defined by the standard, provided it produces a compliant bitstream. The implementations and algorithms used within the encoder are also not defined by the standard; they are chosen by the designer of the encoder. As a result, H.264 encoders from different vendors will produce streams of differing quality at the same bitrate.

So it is more appropriate to say, "H.264 provides a richer syntax and toolset than MPEG-2 and as such allows the possibility of implementing a superior video encoder that can generate higher quality video for the same bitrate, and can generate the same quality video at a much lower bitrate."

This can be demonstrated using the reference software encoder (JM11) freely available from the International Organization for Standardization (ISO). The H.264 reference encoder allows a user to select which tools to use in order to encode a particular video sequence. The table below shows the result of encoding an identical video sequence using the H.264 reference encoder with different tools. The output bitstream from each test is a fully compliant H.264 bitstream, and all of the bitstreams are of equivalent video quality.

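Tools                                                                                       Bitrate (kbps)   Execution time (relative)
------------------------------------------------------------------------------------------  --------------   -------------------------
I-frame only encoding                                                                       2279             1
I and P-frames but with no motion estimation (0 search range)                               1055             1.5
I and P-frames with a +/-16 search using a simplified search algorithm                      453              14
I and P-frames using a full search algorithm with differing block size motion compensation  421              56

Execution time is the total execution time of the required coding tools only, relative to I-frame only encoding.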

This table clearly shows that the more tools and algorithms that are used, the greater the compression achieved for the same quality of video. However, it is also clear that the addition of tools comes at the expense of increased complexity - in this case measured by the execution time of the encoding process. It is this increase in complexity that often causes some tools or algorithms to be omitted from the design of an H.264 encoder.
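To put the trade-off in relative terms, the sketch below takes the bitrate and execution-time figures from the table and expresses each row as a bitrate saving against I-frame only encoding. The row labels are shortened and the function name is illustrative; the numbers are copied from the table.

```python
# (tool set, bitrate in kbps, relative execution time), copied from the table
results = [
    ("I-frame only", 2279, 1),
    ("I+P, no motion estimation", 1055, 1.5),
    ("I+P, +/-16 simplified search", 453, 14),
    ("I+P, full search, differing block sizes", 421, 56),
]

def summarize(rows):
    """Return (name, percent bitrate saving vs. first row, relative time)."""
    base_rate = rows[0][1]
    out = []
    for name, kbps, rel_time in rows:
        saving = 100 * (1 - kbps / base_rate)
        out.append((name, round(saving, 1), rel_time))
    return out

for name, saving, rel_time in summarize(results):
    print(f"{name:42s} {saving:5.1f}% lower bitrate, {rel_time:>4}x time")
```

Seen this way, the last step trims the bitrate by only about 7 percent relative to the previous row while quadrupling execution time, which illustrates why an encoder designer might leave the most expensive tools out.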
