Martin Gren is the founder of Axis Communications and the inventor of the first IP network camera.
Ever since the invention of the first analog video camera, it has been natural to compare these devices to the human eye. Focus, light sensitivity, iris, lens, focal length and aperture are terms used to describe both. Surveillance cameras were created to see what we as humans couldn’t. Yet in the analog CCTV world, the comparisons stopped at seeing.
In the world of IP video, cameras are computers that can see. When we talk of computers, we talk of artificial intelligence. We talk of memory. Today we can compare an IP video system to the human eye and the brain.
There are some areas today where the IP camera bests even our own human abilities, but there are also qualities in which a surveillance system will never replace human intelligence or intuition. How do we stack up today against our IP video devices, and where will we fall behind in the future?
Seeing is believing
Let’s start with the obvious comparison: the IP camera and the human eye. While there’s no perfect calculation, the whole eye is said to have a total resolution of more than 100 megapixels, but this is hardly usable for surveillance and it’s not the actual resolution that our brain (the VMS) computes.
While the eye wins outright for overall resolution, one can argue that the usable resolution, meaning what the sharp central part of the retina delivers and what the brain computes at a given time, varies greatly; it can be roughly estimated at between five and 10 megapixels, depending on the person’s eyesight. Still, given that lens technology is not on par with the higher resolutions in security cameras (maxing out around five megapixels for professional surveillance) and that most 10 to 20MP cameras lack frame rate and image quality around the edges of the scene, it is a clear win for the human eye.
There’s one main reason that lenses are not keeping pace with IP camera and sensor development, and therefore still trail the human eye: Moore’s Law. Unlike the IT components inside a camera, optical components like the lens do not follow Moore’s Law. So while lenses are taking longer to evolve, IP camera developers are using the ever-growing processing power in the cameras to look beyond pure resolution and improve overall image clarity with better light sensitivity.
Many of us suffer from poor night vision. But unlike our eyes, cameras can leverage IR wavelengths to produce a black-and-white image at night. Analog held one final advantage over IP regarding light sensitivity, yet neither analog nor IP could produce color images in the dark. Both of these milestones were passed during the last year with the introduction of color-at-night Lightfinder technology. Here, Moore’s Law is really kicking in with sensor development, and we can expect a lot of progress in low-light video. Also, as CMOS sensor technologies evolve, five-megapixel cameras can now be almost as light sensitive as the human eye, and cameras at HDTV and VGA resolutions can be much more light sensitive than the eye.
And then of course there is the ability to see with absolutely no light at all—which no human can do. For this we now have professional-grade, all-digital thermal network cameras that can be integrated into an IP-based surveillance system. Thermal cameras can detect humans and objects in complete darkness as well as poor visibility conditions and are no longer just for military use.
Wide dynamic range is another hot issue related to the sensor and image processing. The human eye is said to have a contrast range of up to 120 dB. Compared with the best wide dynamic range network cameras on the market, it’s a dead heat. However, when humans try to see during constant contrast change, the eyes get very tired and a headache is likely. So in the long run, and especially when fighting direct sunlight, the camera is better than the human eye, with no need for sunglasses.
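To put the 120 dB figure in perspective, here is a minimal sketch that converts a dynamic range in decibels into a linear contrast ratio and into photographic stops. It assumes the 20·log10 convention commonly used for image sensors; the function names are hypothetical and chosen only for this illustration.

```python
import math


def db_to_contrast_ratio(db: float) -> float:
    """Convert dynamic range in dB to a linear contrast ratio.

    Assumes the 20*log10 convention often used for image sensors.
    """
    return 10 ** (db / 20)


def contrast_ratio_to_stops(ratio: float) -> float:
    """Express a linear contrast ratio as stops (doublings of light)."""
    return math.log2(ratio)


if __name__ == "__main__":
    ratio = db_to_contrast_ratio(120)
    print(f"120 dB is a contrast ratio of about {ratio:,.0f}:1")
    print(f"That is roughly {contrast_ratio_to_stops(ratio):.1f} stops")
```

Under that convention, 120 dB means the brightest detail in a scene can be about a million times brighter than the darkest detail that is still distinguishable.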
Now that we’ve covered resolution and light sensitivity of the camera versus the eye, the next comparison is with field of view and mechanical speed. The eye has a field of view of approximately 75 to 95 degrees and a pan-tilt speed of roughly 900 degrees/second. If we compare this with current PTZ cameras, the human eye is faster than the majority and still beats the autofocus algorithms of most cameras. Thus, improving autofocus will be a priority for manufacturers in the coming years.
However, since the human eye lacks optical zoom, IP security cameras have a major leg up. We continue to see improvements in the evolution of optics and motors in PTZ cameras that Darwin can’t keep up with.
But remember, just as the human eye can suffer infections and obstructions, so can security cameras. Dirt, fog, dust and even spider webs affect the camera as much as they do our own eyes. Because cameras cannot brush debris away from their lenses, the installation environment and housing are increasingly important and will see further development.
Unlike our eyes, however, the biggest leg up for the camera is that it never needs to sleep!
From detection to analysis
The no-need-for-rest feature of cameras means that video analytics is superior at performing around-the-clock, monotonous tasks like people counting, cross-line detection and license plate recognition (LPR). Think about the patience you would need to sit on the side of a highway and note every license plate that drives by. But when it comes to more advanced analytics, the human brain and intuition win over a security camera in most aspects.
In controlled environments, advanced analytics work really well. Face detection by a video surveillance camera in a crowd is something we can still only dream of, but face detection in a controlled environment can be deployed successfully. This intelligent feature will play a big future role not only in access control, but also in more unique applications like retail customer reward programs.
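Cross-line detection is a good example of how simple such a monotonous task is for a machine. The sketch below is one possible way to do it, not how any particular camera does it: it assumes some tracker already supplies an object's previous and current position in image coordinates, and it tests whether that movement crosses a virtual line using a standard segment-intersection check. All names are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def _orientation(a: Point, b: Point, c: Point) -> float:
    """Cross product of (b - a) and (c - a).

    The sign tells which side of the directed line a -> b the point c is on.
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses_line(prev_pos: Point, curr_pos: Point,
                 line_start: Point, line_end: Point) -> bool:
    """Return True if the move prev_pos -> curr_pos strictly crosses the line.

    Touching the line or moving along it is not counted as a crossing.
    """
    d1 = _orientation(line_start, line_end, prev_pos)
    d2 = _orientation(line_start, line_end, curr_pos)
    d3 = _orientation(prev_pos, curr_pos, line_start)
    d4 = _orientation(prev_pos, curr_pos, line_end)
    return d1 * d2 < 0 and d3 * d4 < 0


def crossing_direction(prev_pos: Point, curr_pos: Point,
                       line_start: Point, line_end: Point) -> int:
    """Return 0 for no crossing, +1 for a move from the negative side of
    the line to the positive side, and -1 for the opposite direction."""
    if not crosses_line(prev_pos, curr_pos, line_start, line_end):
        return 0
    return 1 if _orientation(line_start, line_end, curr_pos) > 0 else -1


if __name__ == "__main__":
    # Hypothetical virtual line drawn across a doorway.
    line = ((0.0, 0.0), (0.0, 10.0))
    print(crosses_line((-1.0, 5.0), (1.0, 5.0), *line))
    print(crossing_direction((1.0, 5.0), (-1.0, 5.0), *line))
```

Counting people in each direction is then a matter of adding up the +1 and -1 results per tracked object, which a camera will happily do all night.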
Humans still effective
When it comes to detecting strange behavior and forensics, there is nothing like a guard or operator. While advanced behavioral analytics are improving, the human element will be important for many years to come—even if CSI and other TV shows would like you to believe otherwise.
The key to the future is mining all the high-quality video data that IP cameras capture and considering new and novel uses for this information. The retail market will be the biggest winner in the future. Analytics will continue to improve, especially as software developers from all walks of life are attracted to the surveillance industry with the goal of developing applications to run inside the camera itself, but a human will nearly always be required for this aspect of the industry to thrive.
However, when talking about analytics and software, there is the rising issue of potential patent lawsuits attempting to block the use of a specific algorithm. This is happening in our industry as well as many others, including the mobile phone market.
One solution could be to pool patent fees among the patent holders in order to share these innovations with the world while keeping overall costs down for the end-user. This will leave us free to innovate and drive business. Until then, we as people will have an advantage over surveillance systems for many years because it’s not possible to patent humans (fortunately!).
We all have personal memories that we can look back on in an instant. I’m not a neurology specialist, but it’s astonishing to me how our brains can analyze the pictures and videos from our past and keep recording for many years. Here, even the most advanced computers lag far behind humans. That’s good news for police officers interviewing folks about a crime, even if eyewitness testimony sometimes proves shaky.
Humans are said to have short-term and long-term memories. So too do surveillance systems. Consider long-term memories as the server-based and NVR systems with the ability to download and store video for long periods of time. Local, edge-based recording is then short-term memory—which is improving in the camera not through memory exercises, but through Moore’s Law.
Edge-based storage in a small camera system environment has many benefits and it will continue to get better. Today’s standard for a modern surveillance system is HDTV and, if configured properly, the user will never miss a single frame. With the evolution of SD cards, which are now available in 64 and 128 GB with more storage to come, we will easily be able to store weeks of high-quality video inside the camera or the encoder in the coming years.
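How many days actually fit on a card depends mostly on the average video bitrate and on how much of the day the camera records. The sketch below is a back-of-the-envelope estimate only: the bitrate and duty-cycle values are hypothetical inputs, not measurements from any particular camera, and real figures vary with codec, scene activity and frame rate.

```python
def recording_days(capacity_gb: float, bitrate_mbps: float,
                   duty_cycle: float = 1.0) -> float:
    """Estimate how many days of video fit on a storage card.

    capacity_gb  -- card size in decimal gigabytes (1 GB = 1e9 bytes)
    bitrate_mbps -- assumed average video bitrate in megabits per second
    duty_cycle   -- fraction of the time the camera records
                    (1.0 = continuous, 0.25 = e.g. motion-triggered)
    """
    if capacity_gb <= 0 or bitrate_mbps <= 0:
        raise ValueError("capacity and bitrate must be positive")
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    capacity_bits = capacity_gb * 1e9 * 8
    bits_per_second = bitrate_mbps * 1e6 * duty_cycle
    return capacity_bits / bits_per_second / 86_400  # seconds per day


if __name__ == "__main__":
    for card_gb in (64, 128):
        continuous = recording_days(card_gb, bitrate_mbps=2.0)
        triggered = recording_days(card_gb, bitrate_mbps=2.0, duty_cycle=0.25)
        print(f"{card_gb} GB: {continuous:.1f} days continuous, "
              f"{triggered:.1f} days at 25% recording time")
```

With these assumed numbers, a 128 GB card holds a little under six days of continuous video, and recording only a quarter of the time stretches that to several weeks.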
And just as edge-based storage grows, so too does Internet bandwidth and availability. Just as our own human behavior has dictated the rising use of Gmail, online banking, movie streaming, personal storage, file sharing and other cloud-based services, a similar need for anytime, anywhere video access and offsite storage has led to the emergence of hosted video. While edge recording is perfect for single-site deployments, hosted video has seen success where the end-user has multiple, dispersed sites to monitor.
Yet a question I often get when speaking about the growth of edge-based and hosted video storage is whether these technology trends mean the end of the VMS. The answer is simple: These cameras still need to be managed, and nothing does this better than a good VMS!
The real game changer will be in the smaller camera count market, where we will soon see edge storage replace the DVR. This trend may be even more disruptive when we combine good in-camera edge recording with analytics. A third layer comes when we combine edge recording with hosted video. Using analytics with increased edge storage capacities will be attractive because this solution does not require continuous Internet bandwidth.
So while we humans can recall even our earliest memories, an IP surveillance system has the most reliable long-term memory, and its short-term memory growth is far outpacing our own.
Man vs. machine
When we compare man vs. machine in the surveillance world, the one certainty is that we need to work together for maximum efficiency today and into the future.
Humans have higher pixel vision, but the IP camera helps us see in difficult light and pure darkness. Guards and officers in the field can scan quickly for signs of trouble, while their colleagues in the command post use cameras to zoom in for a closer—and safer—look. Our brains can analyze a scene and predict behavior thanks to human intuition, but the IP camera is there to help with repetitive tasks without getting bored or falling asleep at the wheel. Our long-term memories are unrivaled in the animal world, but the camera never lies or misremembers.
Moore’s Law keeps on working to give us more processing power and usable resolution, while the latest human evolution seems to be that we are growing larger, taller and, yes, wider—just like our old analog TVs!
IP video will continue to improve, and humans must adapt to get the most out of the technology. After all, in a footrace, it’s clear that Mr. Moore is much faster than Mr. Darwin.
Martin Gren is the founder of Axis Communications, Chelmsford, Mass., and the inventor of the first IP network camera.