Eye on Video: H.264 Compression

April 22, 2008
Efficiency gains over MPEG-4 and Motion JPEG could make H.264 the No. 1 format for surveillance

H.264, the newest video compression technology, represents a huge step forward for many industries. Without compromising image quality, an H.264 encoder can reduce the size of a digital video file by more than 80 percent compared with Motion JPEG compression and by as much as 50 percent compared with the MPEG-4 Part 2 standard. Because a video file then needs far less network bandwidth and storage space, users can save money or achieve much higher video quality at a given bit rate.
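To put those percentages in concrete terms, here is a rough back-of-the-envelope sketch in Python. The 10 Mbit/s Motion JPEG baseline, the 16-camera count and the 30-day retention period are hypothetical values chosen only for illustration, and the calculation assumes continuous recording at a constant bit rate.

```python
def storage_gb(bitrate_mbps, cameras, days):
    """Storage in gigabytes for continuous recording at a constant bit rate."""
    seconds = days * 24 * 60 * 60
    total_megabits = bitrate_mbps * seconds * cameras
    return total_megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


# Hypothetical baseline: one camera producing 10 Mbit/s of Motion JPEG.
MJPEG_MBPS = 10.0
# "More than 80 percent" smaller than Motion JPEG, taken here as exactly 80.
H264_MBPS = MJPEG_MBPS * 0.20
# If H.264 is 50 percent smaller than MPEG-4 Part 2, the MPEG-4 Part 2
# rate implied by these two figures is twice the H.264 rate.
MPEG4_MBPS = H264_MBPS / 0.50

for name, rate in [("Motion JPEG", MJPEG_MBPS),
                   ("MPEG-4 Part 2", MPEG4_MBPS),
                   ("H.264", H264_MBPS)]:
    print(f"{name:14s} {rate:5.1f} Mbit/s  {storage_gb(rate, 16, 30):9.1f} GB")
```

Real savings depend on scene content and on how a given encoder is implemented, so these numbers are best read as an upper-bound illustration rather than a sizing guide.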
With such efficient performance, industry experts predict H.264 (also known as MPEG-4 Part 10/AVC for Advanced Video Coding) will become the compression standard of choice in coming years. The technology has already been introduced in the latest mobile phones and digital video players, and is quickly gaining acceptance by end-users. Service providers, such as online video storage and telecommunications companies, are beginning to adopt H.264, and from all indications, the video surveillance industry will be no exception.
H.264 is the joint effort of standards-setting organizations in both the telecommunications and IT fields, which gives the technology the necessary pedigree to become the de facto open, licensed standard for video compression.

The Essence of Video Compression: Encoding and Decoding

Video compression is all about reducing and removing redundant video data so that a digital video file can be effectively transmitted and stored. An algorithm applied to the source video compresses the information. An inverse algorithm applied to the compressed information produces a video that shows almost the same content as the original source video. The algorithm pair is called a video codec (encoder/decoder).
Results from encoders that use the same compression standard may vary because the designer of an encoder can choose to implement different sets of tools defined by a standard. As long as the output of an encoder conforms to the standard's format and can be read by a compliant decoder, different implementations are possible. The performance of a standard therefore cannot be properly compared with other standards, or even with other implementations of the same standard, without first defining how it is being implemented.
A decoder, unlike an encoder, must implement all the required parts of a standard in order to decode a compliant bit stream, because the standard specifies exactly how a decompression algorithm should restore every bit of a compressed video.
The graphic on the right provides a bit rate comparison, given the same level of image quality, among four video standards: Motion JPEG, MPEG-4 Part 2 (no motion compensation), MPEG-4 Part 2 (with motion compensation) and H.264 (baseline profile).

Understanding Video Frame Options
Depending on the H.264 profile — a set of algorithmic features that a standard provides for specific applications — different types of frames such as I-frames, P-frames and B-frames may be used by an encoder.
An I-frame, or intra-frame, is a self-contained frame that can be independently decoded without any reference to other images. I-frames can be used to implement fast-forward, rewind and other random access functions.
A P-frame, which stands for predictive inter-frame, makes references to parts of earlier I- and/or P-frames to code the frame. P-frames usually require fewer bits than I-frames, but they are very sensitive to transmission errors because of their complex dependency on earlier P and I reference frames: an error in a reference frame carries over into every frame predicted from it.
A B-frame, or bi-predictive inter-frame, is a frame that makes references to both an earlier reference frame and a future frame.
Network cameras and video encoders will most likely use the H.264 profile called the baseline profile. The baseline profile uses only I- and P-frames, and, as a result, achieves low latency (the time it takes to compress, send, decompress and display a file), which is critical in surveillance and is particularly important in enabling real-time pan/tilt/zoom (PTZ) control.
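The dependency between frame types explains why I-frames serve as random access points. The Python sketch below is a simplified model, not a decoder: it assumes a baseline-style stream of I- and P-frames in which each P-frame references only the frame immediately before it (real H.264 encoders may reference several frames). The function name and the string representation of the stream are hypothetical.

```python
def frames_to_decode(frame_types, target):
    """Return the indices that must be decoded to display frame `target`.

    Assumes a stream of I- and P-frames in which each P-frame
    references only the frame immediately before it.
    """
    if frame_types[target] not in "IP":
        raise ValueError("this sketch handles only I- and P-frames")
    start = target
    # Walk back to the nearest I-frame, the only frame decodable on its own.
    while frame_types[start] != "I":
        start -= 1
        if start < 0:
            raise ValueError("no I-frame precedes the target frame")
    return list(range(start, target + 1))


stream = "IPPPIPPP"  # one I-frame every four frames
print(frames_to_decode(stream, 6))
print(frames_to_decode(stream, 4))
```

Jumping to an I-frame requires decoding one frame, while jumping to a P-frame requires decoding every frame back to the preceding I-frame. The spacing of I-frames is therefore a tradeoff between bit rate and seek time.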

Basic Methods of Reducing Data
A variety of methods can be used to reduce video data, both within an image frame and between a series of frames. Within an image frame, data can be reduced simply by removing redundant information. Between a series of frames, video data can be reduced by such methods as difference coding and block-based motion compensation.
In difference coding, a frame is compared with a reference frame (an earlier I- or P-frame) and only pixels that have changed with respect to the reference frame are coded. Because fewer pixel values are coded, less data has to be sent.
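A minimal Python sketch of difference coding follows. It treats a frame as a small grid of grayscale values and records a (row, column, value) entry for each changed pixel. This representation and the function names are assumptions made for illustration; real codecs work on blocks and entropy-code the result.

```python
def encode_difference(reference, frame):
    """List (row, col, value) for every pixel that differs from the reference."""
    changes = []
    for r, (ref_row, new_row) in enumerate(zip(reference, frame)):
        for c, (old, new) in enumerate(zip(ref_row, new_row)):
            if old != new:
                changes.append((r, c, new))
    return changes


def decode_difference(reference, changes):
    """Rebuild the frame by applying the coded changes to a copy of the reference."""
    frame = [row[:] for row in reference]
    for r, c, value in changes:
        frame[r][c] = value
    return frame


reference = [[10, 10, 10, 10],
             [10, 10, 10, 10],
             [10, 10, 10, 10]]
current = [[10, 10, 10, 10],
           [10, 99, 98, 10],
           [10, 10, 10, 10]]

changes = encode_difference(reference, current)
assert decode_difference(reference, changes) == current
print(f"coded {len(changes)} of {len(current) * len(current[0])} pixels")
```

In a mostly static scene only a handful of entries are produced, which is where the savings come from.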
In videos with a lot of motion, however, difference coding would not significantly reduce data. Instead, techniques such as block-based motion compensation would be more appropriate. Block-based motion compensation takes into account that much of what makes up a new frame in a video sequence can be found in an earlier frame, but perhaps in a different location. This technique divides a frame into a series of macroblocks — or blocks of pixels. Block by block, a new frame — such as a P-frame — can be composed or “predicted” by looking for a matching block in a reference frame. If a match is found, the encoder simply codes the position where the matching block is to be found in the reference frame. It takes fewer bits to encode just the motion vector than to encode the entire actual content of a block.
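The block search described above can be sketched in a few lines of Python. This is an exhaustive search over a small window that uses the sum of absolute differences (SAD) as the matching cost. The cost measure, the search range, the tiny 2x2 block and the function names are illustrative assumptions, not what any particular encoder does.

```python
def sad(reference, block, top, left):
    """Sum of absolute differences between `block` and the reference region at (top, left)."""
    return sum(abs(reference[top + r][left + c] - block[r][c])
               for r in range(len(block))
               for c in range(len(block[0])))


def find_motion_vector(reference, block, top, left, search_range=2):
    """Search around (top, left); return ((dy, dx), cost) of the best match."""
    rows, cols = len(block), len(block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + rows > len(reference) or x + cols > len(reference[0]):
                continue
            cost = sad(reference, block, y, x)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


# Reference frame: a bright 2x2 object at rows 1-2, columns 1-2.
reference = [[0] * 6 for _ in range(6)]
for r in (1, 2):
    for c in (1, 2):
        reference[r][c] = 200

# In the new frame the object sits at rows 2-3, columns 3-4.
block = [[200, 200], [200, 200]]
vector, cost = find_motion_vector(reference, block, top=2, left=3)
print(vector, cost)
```

A cost of zero means the block was found unchanged in the reference frame, so the encoder needs to send only the two numbers of the motion vector instead of the block's pixel values.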

Enhancing Compression Efficiency with H.264
H.264 takes video compression technology to a new level by introducing a more advanced intraprediction scheme for encoding I-frames. This scheme greatly reduces the bit size of an I-frame while maintaining a high-quality image by predicting smaller blocks of pixels within each macroblock in a frame. The technology finds matching pixels among the earlier-encoded pixels that border a new macroblock and reuses those pixel values that have already been encoded. As a result, intraprediction drastically reduces the bit size.
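The idea can be illustrated with a simplified Python sketch that predicts a 4x4 block from the already-encoded pixels directly above it and to its left. It tries three prediction modes (vertical, horizontal and DC, a subset of the modes H.264 defines for 4x4 blocks) and keeps the one that leaves the smallest residual. The cost measure and the function names are assumptions for illustration.

```python
def predict(mode, top, left):
    """Build a 4x4 prediction from the pixels above (`top`) and to the left (`left`)."""
    if mode == "vertical":      # each column repeats the pixel above it
        return [list(top) for _ in range(4)]
    if mode == "horizontal":    # each row repeats the pixel to its left
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # every pixel is the rounded mean of the neighbors
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")


def best_intra_mode(block, top, left):
    """Return (mode, residual) for the mode with the smallest absolute residual."""
    best = None
    for mode in ("vertical", "horizontal", "dc"):
        pred = predict(mode, top, left)
        residual = [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
        cost = sum(abs(v) for row in residual for v in row)
        if best is None or cost < best[0]:
            best = (cost, mode, residual)
    return best[1], best[2]


top = [50, 60, 70, 80]       # pixels just above the block
left = [55, 55, 55, 55]      # pixels just left of the block
block = [[50, 60, 70, 81] for _ in range(4)]  # vertical stripes

mode, residual = best_intra_mode(block, top, left)
print(mode, residual)
```

Only the chosen mode and the small residual need to be coded, rather than the 16 original pixel values.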
H.264 also improves block-based motion compensation used in encoding P- and B-frames. An H.264 encoder can choose to search for matching blocks — down to sub-pixel accuracy — in a few or many areas of one or several reference frames. The block size and shape can also be adjusted to improve a match. The high degree of flexibility in H.264’s block-based motion compensation pays off in crowded surveillance scenes, where the quality can be maintained for demanding applications. For areas of a frame where no matching blocks can be found in a reference frame, H.264 uses intra-coded macroblocks. Motion compensation is the most demanding aspect of a video encoder and the different ways and degrees with which it can be implemented by an H.264 encoder can have an impact on how efficiently video is compressed.
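Sub-pixel accuracy means the encoder can match against positions that lie between the pixels of a reference frame, which requires interpolating new sample values. The Python sketch below interpolates a half-pixel position along one row using the six-tap filter (1, -5, 20, 20, -5, 1) commonly cited for H.264 half-sample luma interpolation. Handling edges by clamping the index is a simplification assumed here, and the function name is hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # filter taps; they sum to 32


def half_pixel(row, i):
    """Interpolate the sample halfway between row[i] and row[i + 1]."""
    acc = 0
    for k, tap in enumerate(TAPS):
        # Clamp the index so positions near the edge reuse the border pixel.
        j = min(max(i - 2 + k, 0), len(row) - 1)
        acc += tap * row[j]
    value = (acc + 16) >> 5          # divide by 32 with rounding
    return min(max(value, 0), 255)   # clip to the 8-bit range


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
print(half_pixel(ramp, 3))  # sample between the values 30 and 40
```

Searching these in-between positions multiplies the number of candidates the encoder must evaluate, which is one reason motion compensation dominates encoder workload.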
H.264 also reduces the typical blocky artifacts seen in Motion JPEG and in earlier MPEG standards by using an in-loop deblocking filter. This filter smooths block edges using an adaptive strength to deliver an almost perfect decompressed video.

The Gathering Momentum for H.264
Pundits in telecommunications and IT expect H.264 to eventually replace other video compression standards in use today. In video surveillance applications, H.264 will most likely find the quickest traction in installations where there are demands for high frame rates and high resolution — such as highways, airports and casinos, where 30 frames a second are the norm. This is where the economies of reduced bandwidth and storage needs will deliver the biggest savings.
H.264 is also expected to accelerate the adoption of megapixel resolution cameras, since the compression technology can reduce the large file sizes and bit rates generated by those cameras without compromising image quality. There are tradeoffs, however. While H.264 provides savings in network bandwidth and storage costs, it will require more advanced network cameras and high-performance monitoring stations where the video is to be decoded and monitored.
As the H.264 format becomes more broadly available in network cameras, video encoders and video management software, it makes sense to choose products that support this new open standard for video compression. Other compression formats such as Motion JPEG will still be applicable for systems with low frame-rate requirements, typically below 4 frames per second. Ideally, select products supporting both Motion JPEG and H.264 for maximum flexibility.

About the author: Fredrik Nilsson is general manager of Axis Communications, a provider of IP-based network video solutions that include network cameras and video servers for remote monitoring and security surveillance. Contact him via email for further discussion.