Jason Spielfogel is product marketing manager for megapixel camera technology firm IQeye. In his article, he explains the difference between temporal and frame-by-frame video compression, and why his firm has used MJPEG compression.
Lately there has been a lot of discussion in the industry about MPEG-4, H.264, and other temporal compression formats, so I wanted to take a moment to discuss the strengths and weaknesses of the different compression formats.
"Temporal compression" techniques (MPEG-4, H.263 and H.264) provide advantages over "frame-by-frame compression" techniques (MJPEG, which we use at IQeye) when there is a fixed background and/or not a lot of motion in the scene. In these applications, they can compress images heavily, producing small file sizes with very little loss of image quality. Temporal compression also makes it easier to synchronize audio with video.
However, temporal compression offers little or no advantage over frame-by-frame compression when there is a lot of motion in the scene or when the background is changing (as with a mechanical PTZ camera).
To explain this, let's first talk about the principles underlying temporal compression. MPEG-4 and related temporal techniques revolve around an "Index" frame or "I-Frame" and one or more "Reference" frames or "R-Frames" (the MPEG standards themselves call these intra-coded I-frames and predicted P- and B-frames). The Index frame is essentially the same as a single MJPEG frame, with most of the image quality and compression characteristics of such a frame. R-Frames, on the other hand, only record the changes that have occurred since the Index frame. This means that if there is little movement in the scene, the R-Frames are relatively small. So if 30 frames of MJPEG total 3 megabytes, the equivalent MPEG-4 stream (if there is little or no motion) may be only 300 kilobytes, or one-tenth the size.
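The arithmetic above can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of how any real encoder allocates bits: the function names and the idea of treating each R-Frame as a fixed fraction of the I-Frame size are assumptions made for illustration, and the 100-kilobyte frame size is simply 3 megabytes divided by 30 frames (taking 1 megabyte as 1,000 kilobytes).

```python
# Simplified size model for one group of frames (illustrative assumption,
# not how a real MPEG-4 encoder allocates bits).

def mjpeg_total_kb(frame_kb, frames):
    """Every MJPEG frame is compressed on its own, so sizes simply add up."""
    return frame_kb * frames


def temporal_total_kb(i_frame_kb, r_frame_fraction, frames):
    """One I-Frame followed by (frames - 1) R-Frames.

    r_frame_fraction is a hypothetical knob: the size of each R-Frame
    relative to the I-Frame (near 0 for a still scene, near 1 for a scene
    that changes completely between frames).
    """
    r_frames = frames - 1
    return i_frame_kb * (1 + r_frames * r_frame_fraction)


if __name__ == "__main__":
    frame_kb = 100  # 3 megabytes / 30 frames, from the example above
    print("MJPEG:", mjpeg_total_kb(frame_kb, 30), "KB")
    # Low motion: each R-Frame assumed to be roughly 7% of the I-Frame size.
    print("Temporal, low motion:",
          round(temporal_total_kb(frame_kb, 2 / 29, 30)), "KB")
    # High motion: each R-Frame assumed to be as large as the I-Frame.
    print("Temporal, high motion:",
          round(temporal_total_kb(frame_kb, 1.0, 30)), "KB")
```

With the low-motion assumption the temporal stream comes to about 300 kilobytes, matching the one-tenth figure; with R-Frames as large as the I-Frame, it comes to the same 3,000 kilobytes as MJPEG.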
However, if the entire scene is moving (as with a PTZ camera), or if there is substantial movement within the scene of a fixed camera, then the R-Frames can be at or near the same size as the I-Frame. Additionally, particularly in scenes with high movement, temporal compression can create "artifacts," or remnants of previous R-Frames, that persist until the next I-Frame refreshes the scene. Finally, temporal compression, unlike MJPEG, does not benefit nearly as much from a reduction in frame count (that is, images per second). This is because the first frames to be dropped with temporal compression are the R-Frames, which are already very small, so eliminating some or most of them does not have an appreciable effect on overall bandwidth.
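To put rough numbers on the frame-count point, here is a minimal Python sketch. Everything in it is an assumption chosen for illustration: one 100-kilobyte I-Frame is kept every second, each R-Frame is a fixed 7 kilobytes in a low-motion scene, and the function names are hypothetical. A real encoder would likely produce larger R-Frames as the remaining frames get further apart in time, which this model ignores.

```python
# Illustrative model of frame-rate reduction. All numbers are assumptions:
# one 100 KB I-Frame is kept every second and each R-Frame is a fixed 7 KB.

def mjpeg_kb_per_second(frame_kb, fps):
    """Every MJPEG frame is full size, so bandwidth scales with frame rate."""
    return frame_kb * fps


def temporal_kb_per_second(i_frame_kb, r_frame_kb, fps):
    """One I-Frame per second plus (fps - 1) smaller R-Frames."""
    return i_frame_kb + (fps - 1) * r_frame_kb


def saving_percent(before, after):
    """Bandwidth saved, as a percentage of the original rate."""
    return 100 * (before - after) / before


if __name__ == "__main__":
    for fps in (30, 10):
        print(fps, "fps:",
              mjpeg_kb_per_second(100, fps), "KB/s MJPEG,",
              temporal_kb_per_second(100, 7, fps), "KB/s temporal")
```

With these assumed sizes, going from 30 to 10 images per second cuts the MJPEG stream from 3,000 to 1,000 KB/s (about 67 percent), while the temporal stream only falls from 303 to 163 KB/s (about 46 percent), because the I-Frame still has to be sent.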
Some temporal compression schemes are also impractical today at resolutions larger than D1, which includes all megapixel imager resolutions. This is because MPEG-4 is a more CPU-intensive compression scheme. Even a 1.3-megapixel camera has nearly four times as many pixels as a D1-resolution sensor, meaning the compression processor must work nearly four times as hard to achieve the same results. At even higher resolutions, current IP cameras just don't have the muscle to temporally compress those streams. Even if they could, decoding such a high-resolution temporal stream would require a very high-end graphics card per camera stream, never mind multiple streams being decoded simultaneously by a single machine or server.
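The "nearly four times" figure is easy to check. The short Python sketch below assumes D1 means 720x480 (the NTSC variant) and that a 1.3-megapixel imager is 1280x1024; both dimensions are assumptions, as the text above does not spell them out, and the names are illustrative.

```python
# Pixel counts for the resolutions discussed above. The exact dimensions
# are assumptions: D1 is taken as 720x480 (NTSC) and 1.3 megapixels as
# 1280x1024.

D1_NTSC = (720, 480)
MEGAPIXEL_1_3 = (1280, 1024)


def pixel_count(resolution):
    """Total number of pixels in a (width, height) resolution."""
    width, height = resolution
    return width * height


def pixel_ratio(larger, smaller):
    """How many times more pixels the larger resolution has."""
    return pixel_count(larger) / pixel_count(smaller)


if __name__ == "__main__":
    print(pixel_count(D1_NTSC), "pixels at D1")
    print(pixel_count(MEGAPIXEL_1_3), "pixels at 1.3 megapixels")
    print(f"ratio: {pixel_ratio(MEGAPIXEL_1_3, D1_NTSC):.2f}x")
```

Under those assumptions the ratio works out to about 3.79, so the encoder has roughly 3.8 times as many pixels to process per frame.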
People often mistakenly assume frame-by-frame techniques such as MJPEG are bandwidth hogs. However, a competent network designer will always take worst-case scenarios into account, and so must consider the bandwidth impact of a lot of scene motion with both temporal and frame-by-frame techniques. Again, if the scene has a great deal of motion, both techniques consume about the same bandwidth, in which case neither has a real bandwidth advantage.
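A worst-case estimate of this kind might look like the following Python sketch. It assumes that under heavy motion every frame, whether MJPEG or temporal, is about as large as a full frame, and it counts 1 kilobyte as 1,000 bytes. The function name and the sample figures (100-kilobyte frames, 10 images per second, four cameras) are hypothetical and not taken from any real deployment.

```python
# Worst-case bandwidth sketch: assumes every frame from every camera is
# full size, which is the heavy-motion case for either compression type.

def worst_case_mbps(frame_kb, fps, cameras=1):
    """Aggregate worst-case bandwidth in megabits per second.

    frame_kb: assumed size of a full frame in kilobytes (1 KB = 1,000 bytes)
    fps:      images per second from each camera
    cameras:  number of cameras sharing the link
    """
    kilobits_per_second = frame_kb * 8 * fps * cameras
    return kilobits_per_second / 1000


if __name__ == "__main__":
    # Hypothetical installation: four cameras, 100 KB frames, 10 images/s.
    print(worst_case_mbps(100, 10, cameras=4), "Mbit/s")
```

For the hypothetical installation shown, the link would need to carry 32 Mbit/s in the worst case, regardless of which compression family the cameras use.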
On the other hand, frame-by-frame techniques deliver a consistent file size, making it easier to predict bandwidth. They can also provide higher image quality, as they don't suffer from the "compression artifacts" or "compression blur" found with temporal compression when there is motion. This allows frame-by-frame techniques like MJPEG to deliver higher-quality, more consistent images in scenes with a lot of motion or when a mechanical PTZ camera is being used.
Finally, a last issue with MPEG-4 is standardization. While there is a published standard for MPEG-4, in practice this standard is not always followed in full, so most MPEG-4 formats are not alike. Companies that try to integrate with them have to do extra work and, in some cases, pay a licensing fee. Since MJPEG is a single, strictly adhered-to standard, it has proven easy to integrate into NVRs and other systems.
About the author: Jason Spielfogel is product marketing manager for IQinVision (IQeye).