Compression 101 and the Next Frontier

Oct. 27, 2008

Compression is one of the most important factors in determining quality and cost of ownership in a digital video system. Advances in processing power enable more sophisticated compression techniques. Better compression lowers the total cost of ownership (see chart on page 62) and enables remote access to quality video over the Internet and wide area networks. Most digital video products allow an installer to configure the quality to match the bandwidth and storage capabilities of the system.

Why compress video?

Uncompressed digital video is simply cost prohibitive to store and impossible to transmit over most networks. At VGA resolution (640x480) with 24-bit color, raw video generates roughly 900 Kbytes (kilobytes) for each frame of video. This translates to a whopping 74 megabits per second (Mbps) for a 10 frame per second (fps) video stream. Some 18 Mbps are still required for a 10 fps stream at a resolution of 320x240 pixels.
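The raw data rate of a stream is easy to check by hand. The short sketch below does the arithmetic, assuming 24 bits of color per pixel and counting a megabit as 1,000,000 bits; the function names are illustrative only.

```python
def raw_frame_kbytes(width, height, bits_per_pixel=24):
    """Size of one uncompressed frame in kilobytes (1 KB = 1,024 bytes)."""
    return width * height * bits_per_pixel / 8 / 1024


def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed data rate in megabits per second (1 Mbit = 1,000,000 bits)."""
    bits_per_frame = width * height * bits_per_pixel
    return bits_per_frame * fps / 1_000_000


# Assumed example resolutions, both at 10 frames per second.
for label, (w, h) in {"320x240": (320, 240), "VGA 640x480": (640, 480)}.items():
    print(f"{label}: {raw_frame_kbytes(w, h):.0f} KB per frame, "
          f"{raw_bitrate_mbps(w, h, 10):.1f} Mbps at 10 fps")
```

Under these assumptions a VGA frame is 900 KB and a 10 fps VGA stream runs at about 73.7 Mbps, while the same frame rate at 320x240 needs about 18.4 Mbps.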

Compression reduces the amount of data required to represent a video frame. The process works by applying mathematical algorithms to the data at the video source (a digital video recorder (DVR), encoder or network camera). The process must be reversed to decode the compressed video for viewing on a monitor.

Virtually all video compression techniques used in surveillance are “lossy,” meaning that some information is lost during the compression process. Fortunately, the human eye is forgiving and can tolerate some loss of information without impacting the perceived quality of the image.

Video compression standards fall into two basic categories: spatial (image) and temporal compression.

Spatial compression standards compress each frame of video individually and create an image stream by concatenating the frames in sequence. As more frames are created, more bandwidth and storage space are required. The relationship between recorded frames and storage is linear.
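Because every frame is compressed on its own, storage for a spatial codec can be estimated with simple multiplication. The sketch below assumes a hypothetical average compressed frame size of 20 KB (real sizes vary by scene, resolution and quality setting) and uses decimal units (1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes).

```python
def spatial_storage_gb(avg_frame_kb, fps, hours):
    """Estimated storage for a frame-by-frame (e.g. MJPEG) stream, in decimal GB."""
    frames = fps * hours * 3600          # total frames recorded
    total_bytes = frames * avg_frame_kb * 1000
    return total_bytes / 1_000_000_000


# Hypothetical 20 KB frames recorded for 24 hours.
for fps in (5, 10, 20):
    print(f"{fps} fps: {spatial_storage_gb(20, fps, 24):.2f} GB per day")
```

Doubling the frame rate doubles the storage, which is the linear relationship described above.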

Two popular spatial compression techniques are Wavelet and Motion JPEG or MJPEG. Wavelet is the grandfather of compression for digital video surveillance. An entire generation of DVRs has been built on wavelet compression. Most current network cameras have been built to support MJPEG. Since JPEG images pervade most Web pages, any PC can decode MJPEG video without a special decoder.

Temporal compression techniques like MPEG-4 (Part 2) do not compress each individual video frame in its entirety. Instead, MPEG-4 compresses mainly the differences between consecutive frames. The process works by periodically compressing a full frame called a reference or Intra frame (I-frame). It then compresses only the differences in the frames that follow, which are called difference or P-frames. The result is a significant reduction in the size of the P-frames and of the overall data stream. This data reduction is particularly evident during periods of low motion in the video scene, when there are fewer differences from frame to frame. (See related graphic on this page.)
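The I-frame and P-frame idea can be illustrated with a toy model. The sketch below is not MPEG-4: it skips motion estimation, transforms and lossy quantization, and simply stores a full frame every few frames and only the changed pixels in between. Frames are plain lists of pixel values, and the function names and the I-frame interval of 4 are assumptions made for illustration.

```python
def encode(frames, iframe_interval=4):
    """Return ("I", full_frame) or ("P", [(pixel_index, new_value), ...]) records."""
    stream = []
    previous = None
    for n, frame in enumerate(frames):
        if previous is None or n % iframe_interval == 0:
            stream.append(("I", list(frame)))            # full reference frame
        else:
            changes = [(i, new) for i, (new, old) in enumerate(zip(frame, previous))
                       if new != old]
            stream.append(("P", changes))                # differences only
        previous = frame
    return stream


def decode(stream):
    """Rebuild every frame; a P-frame is meaningless without a prior I-frame."""
    frames, current = [], None
    for kind, payload in stream:
        if kind == "I":
            current = list(payload)
        else:
            if current is None:
                raise ValueError("cannot decode a P-frame before the first I-frame")
            current = list(current)
            for i, value in payload:
                current[i] = value
        frames.append(current)
    return frames


# A mostly static 8-pixel scene: one pixel changes in the third frame.
still = [10] * 8
moved = [10, 10, 99, 10, 10, 10, 10, 10]
video = [still, still, moved, moved, still]
stream = encode(video)
print([(kind, len(payload)) for kind, payload in stream])
```

In this example the P-frames carry zero or one entries against eight for each I-frame, which mirrors why low-motion scenes compress so well with temporal codecs.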

H.264 is the next step in the compression evolution. The codec resulted from efforts by two standards groups and is therefore known by several names. MPEG-4 Part 10, H.264 and Advanced Video Coding (AVC) all refer to the same compression method. H.264 couples the same general motion estimation techniques found in MPEG-4 with more advanced mathematics to further reduce data rates while preserving video quality.

Nicol Verheem, chief executive officer of technology provider OpenVideo of Orange County, California, expects H.264 to become the de facto compression standard for video surveillance. Verheem said H.264 can provide a data rate reduction of about 25 percent versus equivalent MPEG-4 video. Verheem noted that unlike MPEG-4 (Part 2), H.264 provides full support for high definition (megapixel) video. Relative to MPEG-4, Verheem added that H.264 will have better interoperability between manufacturers since the standard is more fully defined and leaves less room for variations in implementation. H.264 is starting to appear in digital video products such as Sony's IPELA camera line.

Balancing quality and bandwidth with MPEG-4

MPEG-4 and H.264 give the system designer the flexibility to balance video quality with storage and bandwidth consumption. Most commercial network cameras and encoders provide a range of configuration settings to tune the compression to different levels of video quality and data rates. For products using MPEG-4/H.264, this flexibility makes it very difficult to anticipate storage requirements or to compare the data rate performance of one product over another. The end result greatly depends on the degree of motion in the video scene and how the product's compression has been tuned.

Of course every manufacturer will establish default settings for the compression. But a few key values can usually be configured during system installation to customize performance.

Compression level – As with most compression methods, the level of data rate reduction (often controlled through a quantization setting) can be configured. Usually this setting is expressed as a value (0 to 100) or an enumerated setting (high, medium, low).

I-frame settings – Reducing the frequency of I-frames saves bandwidth but can reduce the quality of the image, particularly when motion is high. Conversely, increasing the number of I-frames will increase the data rate. Another side effect is that long periods between I-frames can increase the delay in switching a video monitor from one camera to another. When a PC switches to a new MPEG-4 video source, it must receive an I-frame before it can begin to decode the video stream and render video on the monitor.
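The switching delay caused by the I-frame interval can be estimated with a rough calculation. The sketch below assumes I-frames arrive at a fixed interval and that the operator switches cameras at a random moment, so the average wait is half the interval. It ignores network and decoding delays, and the function name is illustrative.

```python
def iframe_wait_seconds(iframe_interval_frames, fps):
    """Return (average, worst-case) seconds a viewer waits for the next I-frame."""
    worst = iframe_interval_frames / fps   # switched just after an I-frame went by
    average = worst / 2                    # switch time assumed uniformly random
    return average, worst


# Assumed settings: one I-frame every 30 frames on a 10 fps stream.
avg, worst = iframe_wait_seconds(30, 10)
print(f"average wait {avg:.1f} s, worst case {worst:.1f} s")
```

With one I-frame every 30 frames at 10 fps, the viewer waits 1.5 seconds on average and up to 3 seconds before video appears.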

Bit rate control – Setting a maximum bit rate and choosing a “variable” or “constant” bit rate mode is a good way of controlling the bandwidth used by the MPEG-4 video stream. Variable Bit Rate (VBR) settings will adjust the bit rate according to the image complexity. When in VBR mode the system will use more bandwidth when the activity in the image increases and less bandwidth when the monitored area is quiet.

Leaving the maximum bit rate as unlimited will provide consistently good image quality, but at the expense of increased bandwidth use whenever there is more activity in the image. Limiting the bit rate to a defined value will prevent excessive bandwidth use, but frames or image quality will be lost whenever the scene demands more than the limit allows.

When using Constant Bit Rate (CBR) you can set a fixed target bit rate, which ensures that the level of bandwidth consumed is predictable and will not change. This mode is desirable when video is accessed over a remote network connection with limited bandwidth. Of course when in CBR mode, the codec will sacrifice frames or image quality to maintain its target rate.
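The trade-off between the bit rate modes can be seen in a toy model. The sketch below treats each second of video as having a "demand": the bit rate the encoder would need to hold its target quality. The demand figures, the 512 kbps limit and the function names are all hypothetical; real encoders use far more elaborate rate control.

```python
def vbr(demand_kbps, max_kbps=None):
    """Bit rate follows scene demand, optionally clipped at a maximum."""
    return [d if max_kbps is None else min(d, max_kbps) for d in demand_kbps]


def cbr(demand_kbps, target_kbps):
    """Bit rate is pinned to the target regardless of scene demand."""
    return [target_kbps for _ in demand_kbps]


def shortfall(demand_kbps, actual_kbps):
    """Bits the encoder wanted but could not spend: a rough proxy for lost quality."""
    return [max(0, d - a) for d, a in zip(demand_kbps, actual_kbps)]


# Hypothetical scene: quiet, quiet, burst of motion, heavy motion, quiet.
demand = [200, 250, 900, 1200, 300]

unlimited = vbr(demand)              # tracks demand exactly
capped = vbr(demand, max_kbps=512)   # saves bandwidth during the burst
constant = cbr(demand, 512)          # predictable, even when the scene is quiet

print("VBR unlimited:", unlimited, "shortfall", shortfall(demand, unlimited))
print("VBR capped:   ", capped, "shortfall", shortfall(demand, capped))
print("CBR:          ", constant, "shortfall", shortfall(demand, constant))
```

In this model the unlimited VBR stream never falls short but peaks at 1,200 kbps. The capped VBR and CBR streams both stay at or below 512 kbps and both fall short during the two high-motion seconds, which is where frames or image quality would be sacrificed.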

Ultimately, the advanced compression algorithms in MPEG-4 and H.264 require more processing power and therefore add cost to the DVR or network camera product. The storage savings will usually justify the higher product cost. Dealers and integrators must also be aware of the configuration parameters associated with temporal codecs, which can optimize the video quality for the available bandwidth and storage.

Tom Galvin of NetVideo Consulting is a network video specialist. NetVideo Consulting (www.netvideoconsulting.com) provides consulting services and product evaluations that enable successful networked video solutions.