Eye on Video: Intelligent video architecture

What to consider when deciding upon centralized or distributed surveillance analytics


There are two main drawbacks to this PC server architecture:

1. Consumes processing power - The server handles many of the processor-intensive tasks such as transcoding the video, managing the storage and processing the video for analysis.
2. Supports relatively few cameras - Since the processing tasks require considerable power, each server can only process video from a relatively small number of cameras.

Distributed architecture options

Instead of overloading a central point such as a DVR or PC server, distributed architectures spread the processing to different elements in the network. This reduces bandwidth consumption and improves system scalability.

Network-centric processing. In typical network video systems, switches and routers send video to the appropriate components in the system. As video streams through these gateways, the data can be analyzed so that only the essential content, or even just the metadata about an image (such as the number of people passing through an area), is extracted and streamed to the security system operator. This prescreening saves the network from the bandwidth overload that would occur if every frame of recorded video were streamed over the network for analysis at some central point. This architecture would work well in most systems, regardless of size.
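To make the idea of streaming metadata instead of video concrete, here is a minimal Python sketch. It assumes some upstream analysis step has already labeled the objects in a frame; the function names, the record fields, the camera name and the 50,000-byte frame size are all hypothetical values chosen for illustration, not figures from any real product.

```python
import json


def summarize_frame(camera_id, timestamp, detections):
    """Reduce one analyzed frame to a small metadata record.

    `detections` is a hypothetical list of labels produced by an
    upstream analysis step, e.g. ["person", "person", "car"].
    """
    return {
        "camera": camera_id,
        "time": timestamp,
        "people": sum(1 for label in detections if label == "person"),
    }


def encode(record):
    """Serialize the record as compact JSON bytes for transmission."""
    return json.dumps(record, separators=(",", ":")).encode("utf-8")


# Hypothetical example values, for illustration only.
ASSUMED_FRAME_BYTES = 50_000  # assumed size of one compressed frame

record = summarize_frame("lobby-1", 1700000000, ["person", "car", "person"])
payload = encode(record)
print("people counted:", record["people"])
print("metadata bytes:", len(payload), "vs assumed frame bytes:", ASSUMED_FRAME_BYTES)
```

Under these assumptions the metadata record is a few dozen bytes, which is the kind of saving the prescreening approach relies on.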

There are two main drawbacks to network-centric architecture:

1. High cost - Switches and routers with the requisite additional processing power cost more.
2. Greater complexity - Additional components add design complexity to the network.

Intelligence at the edge. The most scalable, cost-effective and flexible architecture is based on processing as much of the video as possible inside the network cameras or video encoders. Analog cameras, however, lack this 'intelligence at the edge' ability to analyze video. This architecture would be a good match for any size surveillance system running from one to thousands of cameras.
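As a rough illustration of what processing inside the camera can mean, the Python sketch below decides locally whether a frame is worth transmitting by checking for motion within a defined region. It uses simple frame differencing, which is only one possible technique and is not taken from any particular camera's firmware; the function names and both thresholds are hypothetical.

```python
def motion_in_region(prev, curr, region, pixel_delta=25, min_changed=0.02):
    """Return True if enough pixels changed inside `region`.

    prev, curr: grayscale frames as lists of rows of 0-255 integers.
    region: (top, left, bottom, right), with bottom/right exclusive.
    pixel_delta, min_changed: hypothetical tuning thresholds.
    """
    top, left, bottom, right = region
    total = (bottom - top) * (right - left)
    if total <= 0:
        return False
    changed = sum(
        1
        for y in range(top, bottom)
        for x in range(left, right)
        if abs(curr[y][x] - prev[y][x]) > pixel_delta
    )
    return changed / total >= min_changed


def frames_to_send(frames, region):
    """Yield the indices of frames that show motion in the region."""
    for i in range(1, len(frames)):
        if motion_in_region(frames[i - 1], frames[i], region):
            yield i


# Tiny 4x4 example: one pixel brightens inside the watched region.
blank = [[0] * 4 for _ in range(4)]
moved = [row[:] for row in blank]
moved[1][1] = 200
print(list(frames_to_send([blank, blank, moved, moved], (0, 0, 2, 2))))
```

Only the frame where the scene changes inside the region is selected; identical consecutive frames, and changes outside the region, are never transmitted.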

There are numerous advantages to this distributed architecture:

1. Minimized bandwidth usage - Cameras and encoders can be programmed to transmit video only when they detect motion in a defined area of a scene. This dramatically reduces bandwidth consumption and the number of operators needed to review transmissions. Cameras and encoders can also extract license plate information or a headcount from a frame and send just the essential data with a few snapshots, instead of consuming bandwidth with several hours of unfiltered video.
2. Reduced server costs - Servers typically process four to 16 video streams in a centralized solution. When cameras do the processing, servers can handle more than 100 video streams. For people counting or license plate recognition applications, the resulting data (rather than the video stream) can be sent directly into a database, further reducing the load on servers.
3. Improved surveillance analysis - When network cameras process raw video data before it is degraded by lossy compression formats such as MPEG-4, the quality of analysis improves greatly. Server processing power is also no longer consumed decompressing or transcoding the video packets prior to processing, work that would otherwise dramatically increase the number of servers required to process transmissions.
4. Lower operating costs - With fewer servers needed, power consumption and maintenance costs drop. This also spares environments without server rooms from having to build special facilities to support their surveillance networks.
5. Lower equipment investment costs - Reducing network bandwidth usage by streaming only essential information (metadata and snapshots) gives users the option to deploy more moderately priced network components that can easily support the reduced data rates.
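The server figures in the list above translate into a simple sizing calculation. The Python sketch below assumes a hypothetical 400-camera site and treats 100 streams per server as a conservative stand-in for "more than 100"; the function name is illustrative.

```python
import math


def servers_needed(cameras, streams_per_server):
    """Return how many servers are required for the given camera count."""
    if streams_per_server <= 0:
        raise ValueError("streams_per_server must be positive")
    return math.ceil(cameras / streams_per_server)


CAMERAS = 400  # hypothetical site size, for illustration only

scenarios = [
    ("centralized, 4 streams per server", 4),
    ("centralized, 16 streams per server", 16),
    ("edge processing, 100 streams per server", 100),
]
for label, capacity in scenarios:
    print(f"{label}: {servers_needed(CAMERAS, capacity)} servers")
```

With these assumptions the same site needs between 25 and 100 servers when analysis is centralized, but only 4 when the cameras do the processing.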

There are two main drawbacks with this architecture:

1. High processing power required - Not all network cameras and video encoders have enough processing power to run intelligent video algorithms at the edge. Newer-generation products are addressing this limitation.
2. Multiple camera inputs needed - Some complex intelligent video algorithms, such as multi-camera tracking, need information from several cameras to work properly, so they can run only in a centralized server configuration.