Eye on Video: Intelligent video architecture

What to consider when deciding upon centralized or distributed surveillance analytics


Studies have shown that as the scope of video surveillance installations continues to grow, the sheer volume of video being recorded increases the likelihood that operators and security guards will miss incidents or fail to notice suspicious behavior in time to prevent crime from happening. In fact, a 2002 study published in Security Oz Magazine concluded that, "After 12 minutes of continuous video monitoring, an operator will often miss up to 45 percent of screen activity. After 22 minutes of viewing, up to 95 percent is overlooked."

Other studies report that finding incidents in stored video is so time-consuming that many businesses view as little as one percent of the video they record. So despite increasing their surveillance coverage, companies are actually starting to experience greater security risks to their people and facilities.

To help facilities improve their surveillance effectiveness, vendors are developing a host of intelligent video (IV) applications that automatically analyze video data and glean useful information for security personnel. IV systems use complex mathematical algorithms to extract moving objects or other recognizable forms from the recorded video, while filtering out irrelevant images or movement. Intelligent decision-making rules govern the data search to determine if the events recorded in the video are normal, or if they should be flagged as alerts to security staff or police.
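
To make the idea concrete, here is a minimal Python sketch of the two stages described above: extracting change from the video and applying a decision rule to it. It assumes simple frame differencing on tiny grayscale frames, which is only one of many possible techniques and far cruder than a commercial IV algorithm. The names Rule, changed_pixels and should_alert, along with the threshold values, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

Frame = List[List[int]]  # rows of grayscale pixel values, 0-255


@dataclass
class Rule:
    """Hypothetical decision rule: alert when enough pixels have changed."""
    name: str
    min_changed_pixels: int  # smaller changes are treated as irrelevant movement


def changed_pixels(prev: Frame, curr: Frame, threshold: int = 30) -> int:
    """Count pixels whose brightness differs by more than `threshold`."""
    return sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for p, c in zip(row_prev, row_curr)
        if abs(p - c) > threshold
    )


def should_alert(prev: Frame, curr: Frame, rule: Rule) -> bool:
    """Apply the rule to one pair of consecutive frames."""
    return changed_pixels(prev, curr) >= rule.min_changed_pixels
```

Given a rule such as Rule("intrusion", min_changed_pixels=2), a frame in which a single pixel flickers slightly would be filtered out, while a frame in which several pixels change sharply would be flagged for the operator.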

Over the next six installments of this column (on SecurityInfoWatch.com and in Security Technology & Design magazine), we'll cover an array of topics associated with intelligent video applications that enhance the value of surveillance systems. We'll begin this series by focusing on architectural options.

Where to locate video intelligence

There are two broad choices for IV system architecture: you can centralize the intelligence or distribute it to the endpoints. In a centralized architecture, cameras and sensors collect and transmit video and other information to a centralized server for analysis. In a distributed architecture, network cameras, video encoders and network components such as switches have the intelligence to process the video and extract relevant information themselves.
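
The difference between the two choices comes down to where the analysis runs and what is passed along afterward. The Python sketch below models only that analytics path, under simplifying assumptions: frames are plain byte strings, and analyze is a hypothetical stand-in for a real IV algorithm. It does not model recording, and a real distributed system may still stream video for storage.

```python
from typing import List, Optional, Tuple


def analyze(frame: bytes) -> Optional[str]:
    """Hypothetical stand-in for an IV algorithm: flags frames containing 0xFF."""
    return "alert" if b"\xff" in frame else None


def centralized(frames: List[bytes]) -> Tuple[int, List[str]]:
    """Cameras forward every frame; the central server runs the analysis."""
    received = list(frames)  # all video reaches the server
    alerts = [a for a in map(analyze, received) if a is not None]
    return len(received), alerts


def distributed(frames: List[bytes]) -> Tuple[int, List[str]]:
    """Each endpoint analyzes its own frames and forwards only the results."""
    alerts = [a for a in map(analyze, frames) if a is not None]
    return len(alerts), alerts  # only extracted information is forwarded
```

Both functions return the number of items handed to the central system and the alerts raised. For the same input, the alerts are identical; what changes is how much material the central system has to receive and process to produce them.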

Centralized architecture options

In centralized architectures, the cameras transmit all the video back to the recording device, which runs the intelligent video algorithms. In infrastructures with analog cameras, this recording device is typically a multi-function DVR. In network video systems, it is usually a PC server with video management software.

DVR-based processing. A built-in encoder card converts the video from analog to digital format and then performs intelligent analysis -- anything from people counting to vehicle license plate recognition. The IV-enabled DVR compresses the video, records it, and distributes resulting alarms and video output to authorized operators. This architecture would be most effective for smaller systems with four to eight cameras.

There are several drawbacks to this DVR architecture:

1. Not scalable or flexible - DVRs are built with a fixed number of inputs, so adding even one camera beyond that number means adding another DVR, which makes incremental expansion costly.
2. Non-supportive of essential network utilities - As proprietary, embedded devices, DVRs cannot be easily networked and do not support risk-mitigating tools such as firewalls and virus protection.
3. Limited computing power - DVRs were traditionally designed to store and display video from a limited number of cameras. When running newer IV applications that require a lot of processing power, they can support only a fraction of the cameras for which they were originally designed.
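
The first and third drawbacks compound each other, as a little arithmetic shows. The Python sketch below estimates how many DVRs a site would need. The function name dvrs_needed and the iv_capacity_fraction parameter are hypothetical, and the fraction of inputs a DVR can still serve while running IV applications is an assumed figure that will vary by product and workload.

```python
import math


def dvrs_needed(cameras: int, inputs_per_dvr: int,
                iv_capacity_fraction: float = 1.0) -> int:
    """Estimate the number of DVRs required for a given camera count.

    iv_capacity_fraction is the assumed share of a DVR's inputs it can
    still handle while running IV applications (1.0 means no reduction).
    """
    if cameras < 0 or inputs_per_dvr < 1:
        raise ValueError("cameras must be >= 0 and inputs_per_dvr >= 1")
    if not 0 < iv_capacity_fraction <= 1:
        raise ValueError("iv_capacity_fraction must be in (0, 1]")
    # A DVR always serves at least one camera, however heavy the workload.
    usable_inputs = max(1, math.floor(inputs_per_dvr * iv_capacity_fraction))
    return math.ceil(cameras / usable_inputs)
```

With eight-input units, dvrs_needed(8, 8) is 1, but dvrs_needed(9, 8) is 2: the ninth camera requires a whole second DVR. If IV processing is assumed to cut usable capacity to a quarter, dvrs_needed(8, 8, 0.25) rises to 4 for the same eight cameras.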

PC server-based processing. Commercial off-the-shelf PC servers overcome the scalability and flexibility limitations of DVRs by pushing digitization and compression out to the network cameras and encoders. Network camera video goes directly to the server over the network. Video encoders digitize analog camera video before transmitting it over the network to the server. This architecture would work best for medium-sized systems with eight to 16 cameras.
