Eye on Video: Applying object-based intelligent video

Oct. 21, 2008
How to detect and characterize moving objects

The previous column in this series (see Applying Pixel-Based Intelligent Video) discussed pixel-based intelligent video (IV) applications that send alerts to security personnel when a camera detects a significant number of pixel changes in the view window. This month's article focuses on object-based IV applications and the sophisticated algorithms that enable systems to recognize and categorize objects in a video frame. Next month, we will discuss specialized intelligent video applications.

Object-based intelligent video applications fall into two broad categories: object recognition/classification and object tracking. Whether you decide to deploy this video intelligence at the server or distribute it to your surveillance system endpoints depends on the equipment you use and the demands of the environment in which it is being operated. (For more information on intelligent video architecture, see the article Intelligent Video Architecture: Deciding whether to centralize or distribute your surveillance analytics.)

When to use object-based analysis
Object-based analysis is a useful surveillance tool for observing the behavior of objects, such as monitoring the flow of traffic in a given area. It is also useful for counting objects within a scene, such as the number of people in a supermarket checkout line. Keeping track of objects, such as a suitcase abandoned in the middle of a busy airport or someone loitering in front of an automatic teller machine, is another important capability of the technology.

How classification works
Object-based IV applications go through three steps: detection, segmentation and classification. In detection, the system compares the pixels in each frame of a video sequence to a reference frame to determine which objects are moving. In segmentation, the system extracts the moving objects that were detected and assigns them descriptive signatures such as color, size, direction of motion and time. In classification, the system sorts the segmented objects into object types, such as a person or a car, and keeps the descriptors that characterize each one, such as color, size or direction of motion. This descriptive information is called metadata.
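The three steps can be illustrated with a minimal Python sketch. It is not how any particular product works: it assumes grayscale frames stored as lists of pixel rows, a fixed difference threshold of 30, and a toy classification rule (taller than wide means "person"). The function names and the threshold are hypothetical.

```python
from collections import deque

def detect(frame, reference, threshold=30):
    """Detection: flag pixels that differ from the reference frame."""
    return [[abs(p - r) > threshold for p, r in zip(frow, rrow)]
            for frow, rrow in zip(frame, reference)]

def segment(mask):
    """Segmentation: group touching changed pixels into separate objects."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill one connected region (4-connectivity).
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            # The descriptive signature (metadata) for this object.
            objects.append({"area": len(pixels),
                            "width": max(xs) - min(xs) + 1,
                            "height": max(ys) - min(ys) + 1})
    return objects

def classify(obj):
    """Classification: toy rule, assumed for illustration only."""
    labeled = dict(obj)
    labeled["type"] = "person" if obj["height"] > obj["width"] else "vehicle"
    return labeled
```

Calling `[classify(o) for o in segment(detect(frame, reference))]` yields one metadata record per moving object. Real systems use far more robust background models and classifiers than this sketch suggests.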

Once the system creates metadata about objects in an image area, it applies a set of criteria to that metadata. For instance, if the metadata shows a person walking the wrong way, an individual abandoning a bag, or a car entering a restricted area, the system can raise an alert in real time or retrieve the appropriate video from storage to be used as evidence in criminal proceedings.
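As a rough illustration of applying criteria to metadata, the Python sketch below checks one object record against two rules. The record fields (`type`, `position`, `direction`), the rectangular zone format and the function name are assumptions made for this example, not a standard metadata schema.

```python
def check_rules(obj, restricted_zone, allowed_direction):
    """Return a list of alert messages for one object's metadata.

    obj: dict with hypothetical keys "type", "position" (x, y), "direction".
    restricted_zone: (x_min, y_min, x_max, y_max) in image coordinates.
    allowed_direction: the direction people are expected to walk.
    """
    alerts = []
    x, y = obj["position"]
    x_min, y_min, x_max, y_max = restricted_zone

    # Rule 1: a car inside the restricted rectangle.
    if obj["type"] == "car" and x_min <= x <= x_max and y_min <= y <= y_max:
        alerts.append("car in restricted area")

    # Rule 2: a person moving against the permitted direction.
    if obj["type"] == "person" and obj["direction"] != allowed_direction:
        alerts.append("person walking the wrong way")

    return alerts
```

For example, `check_rules({"type": "person", "position": (5, 5), "direction": "out"}, (0, 0, 10, 10), "in")` returns a single wrong-way alert, because the zone rule here applies only to cars.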

Counting objects
Once the IV system detects and classifies an object, you can leverage the information for a variety of purposes. People counting, especially in retail environments, is one of the primary applications for this technology because it provides a wealth of data to assist store managers in optimizing store layout and customer service.

• Analyzing customer traffic patterns - Retailers can count the number of people entering and exiting the store, passing through certain aisles or stopping by a particular merchandising display to leverage sales of impulse items or groups of items.

• Managing queues - Management can count the number of people standing in a queue for service, such as at a checkout counter, an airport ticket counter or a passport/security control point, to calculate when to open more stations.

• Detecting tailgating - Object-based IV applications are particularly valuable in access control because the system can send an alert when one person swipes an access control badge to open a door and multiple people enter the facility.

• Crowd control - The ability to count the number of people gathered in a particular area of a scene means the IV system can send a warning when a capacity threshold is about to be exceeded or when something is impeding traffic flow.
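Once counts are available, the decisions described in the list above reduce to simple comparisons. The Python sketch below shows three of them. The figures used (four people per open station, a warning at 90 percent of capacity) and the function names are illustrative assumptions, not recommendations from this column.

```python
import math

def stations_needed(queue_length, people_per_station=4):
    """Queue management: how many service stations should be open."""
    return max(1, math.ceil(queue_length / people_per_station))

def is_tailgating(badge_swipes, people_entered):
    """Tailgating: more people came through the door than badges were swiped."""
    return people_entered > badge_swipes

def capacity_status(count, capacity, warn_fraction=0.9):
    """Crowd control: warn before the capacity threshold is exceeded."""
    if count > capacity:
        return "exceeded"
    if count >= warn_fraction * capacity:
        return "warning"
    return "ok"
```

With these assumed settings, a queue of nine people calls for three open stations, and one badge swipe followed by three people entering is flagged as tailgating.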

Tracking objects
In most security situations, you need to keep track of where objects and people travel within a facility. Object-based analysis first segments a particular object in a camera view and then tracks it as it moves around within that view or from one camera to another. This is particularly useful for monitoring ingress and egress to a building or flagging suspicious packages in high traffic areas.

• Exposing perimeter breaches - Whether you call it a digital fence, a tripwire or a virtual fence, an object-based IV system lets you designate a line or area in a facility where access control breaches may occur. If an object goes past that line in a particular direction or an object enters or leaves a certain area, the system can send an alert to security personnel.

• Spotting abandoned objects - With heightened concerns about explosives left behind in bags and packages, many public transportation facilities, government buildings and retail malls deploy object-based analysis as a critical component of their surveillance systems. The IV application watches an area, keeping track of all the objects in it. When a previously moving object becomes stationary and stays that way for a certain period of time, the system raises an alert and shows the security system operator the object of concern.

• Detecting loitering - A loitering function tracks how many people linger in a certain area, such as a parking lot or in front of an automatic teller machine, and for how long, since lingering could be an indication of malicious intent.
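Two of the checks in the list above can be sketched in a few lines of Python once a tracker supplies object positions. The tripwire test uses the sign of a cross product to tell which side of the line a point is on. The stationary test measures how long an object has stayed near its latest position. The track format, the 2-unit tolerance and the 60-second limit are assumptions for illustration only.

```python
def side_of_line(point, line_start, line_end):
    """Positive on one side of the line, negative on the other, 0 on it."""
    (px, py), (ax, ay), (bx, by) = point, line_start, line_end
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def crossed_tripwire(prev_pos, curr_pos, line_start, line_end):
    """Return +1 or -1 for the crossing direction, or 0 for no crossing."""
    before = side_of_line(prev_pos, line_start, line_end)
    after = side_of_line(curr_pos, line_start, line_end)
    if before * after >= 0:
        return 0  # both positions on the same side (or touching the line)
    # The movement must also pass between the two ends of the tripwire.
    s1 = side_of_line(line_start, prev_pos, curr_pos)
    s2 = side_of_line(line_end, prev_pos, curr_pos)
    if s1 * s2 > 0:
        return 0
    return 1 if after > 0 else -1

def stationary_time(track, tolerance=2.0):
    """Seconds the object has stayed within `tolerance` of its latest position.

    track: list of (timestamp_seconds, x, y) samples, oldest first.
    """
    if not track:
        return 0.0
    t_last, x_last, y_last = track[-1]
    start = t_last
    for t, x, y in reversed(track):
        if abs(x - x_last) > tolerance or abs(y - y_last) > tolerance:
            break  # the object was still moving at this sample
        start = t
    return t_last - start

def is_abandoned(track, min_seconds=60.0, tolerance=2.0):
    """Flag an object that has been stationary for at least `min_seconds`."""
    return stationary_time(track, tolerance) >= min_seconds
```

Alerting on only one return value of `crossed_tripwire` gives the direction-sensitive behavior described for perimeter breaches. The same dwell-time measurement, applied to objects classified as people, could serve as a starting point for a loitering check.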

Techniques for tracking people and objects
There are a number of ways to track a particular person or object moving in a camera's view. An operator can select the person or object of interest or program the system to make the choice.

• Using a pan/tilt/zoom (PTZ) camera - A PTZ camera can automatically lock onto a person and keep that individual in sight, including zooming in to give security staff a better view. Once the person moves out of range, the camera finds another moving object to track. The application is useful in low-traffic environments like parking lots and hallways since it provides a view of the object without operator intervention. Some PTZ tracking systems, however, get confused as to what they should be tracking if more than one object appears in the camera view.

• Using multiple cameras - One of the most difficult IV applications to automate is multi-camera people tracking, also called camera hand-offs. This technique helps security personnel keep a particular suspect in constant view, even in a location or facility covered by a large number of cameras, by handing off the tracking of a particular object from one camera to another.

Pointers for real-world deployment
To ensure that your object-based intelligent video application works accurately, there are a couple of factors you should address:

• Camera placement. For people counting, you should place the camera immediately above the entrance. The height depends on the lens you choose and the width of the entrance. The size of the person passing under the camera must be larger than 6 percent of the camera's total horizontal field of view. It is important to choose a camera with sufficiently high resolution to enable surveillance operators to clearly distinguish the people passing under the camera.

• Field of view. For single-camera tracking applications, use a camera with a wide field of view. This allows a security operator to track a person of interest over a broader area. Network cameras with a 140° field of view will even allow an operator to zoom into a particular area without losing video quality. Cameras with a 360° field of view may seem well-suited to this type of tracking, too, but they generally prove impractical because their resolution is often too low to provide sufficient image detail.
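The 6 percent guideline for people counting can be checked with basic trigonometry. The Python sketch below assumes a simple pinhole-style model in which the width covered at a given distance is 2 × distance × tan(FOV / 2), with the distance measured from the camera down to the person's shoulders. The 0.5 m shoulder width in the example is an assumed figure, and the function names are hypothetical.

```python
import math

def view_width(distance_m, horizontal_fov_deg):
    """Width of the scene covered at a given distance from the camera."""
    return 2 * distance_m * math.tan(math.radians(horizontal_fov_deg) / 2)

def person_fraction(person_width_m, distance_m, horizontal_fov_deg):
    """Share of the horizontal field of view that one person occupies."""
    return person_width_m / view_width(distance_m, horizontal_fov_deg)

def max_distance(person_width_m, horizontal_fov_deg, min_fraction=0.06):
    """Largest camera-to-person distance that still meets the guideline."""
    half_angle = math.radians(horizontal_fov_deg) / 2
    return person_width_m / (min_fraction * 2 * math.tan(half_angle))
```

Under these assumptions, a 0.5 m wide person seen from 2 m through a 90° lens fills about 12.5 percent of the view, comfortably above the 6 percent minimum. The same lens reaches the limit at a distance of roughly 4.2 m.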

As was mentioned in last month's article on pixel-based intelligent video, no IV application is infallible. But with careful adherence to some best practices, your installation can achieve between 90 and 95 percent accuracy.

About the author: Fredrik Nilsson is general manager of Axis Communications, a provider of IP-based network video solutions that include network cameras and video encoders for remote monitoring and security surveillance. He can be reached via email at [email protected].