Tech notes from the Embedded Vision Summit

May 27, 2022
Innovations on display at show hold enormous potential for video surveillance and the security industry as a whole

Last week, the Embedded Vision Summit (EVS) was held in Santa Clara, Calif., by the rapidly evolving Edge AI & Vision Alliance. In humans and other mammals, the sympathetic and parasympathetic nervous systems activate the fight-or-flight response to a threat or perceived danger, and then restore a state of calm. Like our autonomic nervous system, embedded systems use visible light, 3D imaging, LiDAR, radar, infrared, acoustic, and energy sensors to detect and classify objects, make basic decisions and perform actions. Exhibitors at EVS supply the entire process, or “end-to-end solution,” of training, inference, sensing, processing, control, and decision improvement.

Comparing the EVS ecosystem with the security industry puts a great deal in perspective and answers many questions, such as who actually creates an IP video system capable of screening for weapons at openings, or one that reads vehicle license plates, detects the number of passengers and issues a violation for driving illegally in an HOV lane. Each of the exhibitors below has a role in creating vision solutions with deep learning algorithms.

In the security industry, the “manufacturer” has a “factory” and geographical sales divisions where distributors manage a dealer and system integrator network that ultimately sells the manufacturer's product or service to end users. Logistics contractors, regional solution providers and service providers help out along the way. The embedded vision ecosystem begins several steps before a security industry manufacturer, with an original design manufacturer (ODM) creating the sensor and an original equipment manufacturer (OEM) integrating the sensors into a functional system-on-chip (SoC), module, board, assembly or machine. The end-to-end solution provider includes all those items and unifies them with modules from a computer vision (CV) application developer that tests and trains the modules with actual or synthetic data. The solution provider then either sells the finished product to a security “manufacturer,” joining the security industry ecosystem, or sells it through value-added resellers and distributors to end users that participate in other sales channels.

EVS presented participants from multiple channels with the most comprehensive opportunity possible to learn, participate and sell, through basic education sessions, “Deep-Dive” workshops and hundreds of exhibitors. Visitors would often meet CEOs as well as sales channel managers at each booth. Jeff Bier, founder of the Edge AI and Vision Alliance, described EVS as “fresh inspiration, ideas, know-how, and connections that will speed your journey towards solving real-world problems using visual AI.”

AlwaysAI

An end-to-end application solution provider, AlwaysAI can manage an entire project to transform a physical machine into an advanced driver assistance system (ADAS) or an AI processor that receives multiple IP video streams at different resolutions, executes different deep learning algorithms, and leverages the advanced power management and tera-operations per second (TOPS) required for a wide range of uses.

One such solution was demonstrated at the Hailo booth, which featured one of the convolutional neural network algorithms developed by CVEDIA for a Lanner Electronics AI processor. In the security industry, the use of privacy zones is common. However, embedded systems may need to process a moving object that must be obscured or redacted, but still recognized. This is a requirement in today’s correctional institutions, where private inmate areas are recorded in full detail but the individuals’ bodies are blurred or redacted so a monitoring operator can only see that a person is there. In the event of an investigation, the unredacted version is available for review by prison officials.

They demonstrated the single algorithm being executed in one window on a development laptop, while the other window showed the code, scrolling rapidly as the algorithm accurately predicted the movement of the people, always keeping up with the redaction, yet providing a non-obscured view of the background scene. This was a significant demonstration of the typical relationship between hardware provider (Lanner), developer (CVEDIA), and end-to-end provider (AlwaysAI).
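The article does not describe how the demonstrated algorithm works internally. As a rough illustration of the redaction idea only, the sketch below pixelates one region of a grayscale frame while leaving the background and the original frame untouched. The `redact` function, the tiny test frame and the `person_box` coordinates are all hypothetical; a real system would take the box from a person detector and would likely use a different obscuring method.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale image as rows of 0-255 values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def redact(frame: Frame, box: Box, block: int = 2) -> Frame:
    """Return a copy of frame with the box pixelated; the input is not modified."""
    top, left, bottom, right = box
    out = [row[:] for row in frame]  # copy, so the unredacted frame survives
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            # Replace every pixel in this block with the block's mean value.
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


# Hypothetical 4x6 frame and a hypothetical detector box around one person.
frame = [[(x + y) * 10 for x in range(6)] for y in range(4)]
person_box = (1, 2, 3, 4)
redacted = redact(frame, person_box)
```

Keeping `frame` intact alongside `redacted` mirrors the workflow described above: operators see the obscured view, while the full-detail recording remains available for an investigation.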

Arduino Pro

Arduino is known by DIYers as one of the two most popular microcomputer hardware suppliers. Arduino Pro, a line of industrial systems-on-module (SoM), now brings pre-tested multi-core processors, a pre-loaded Linux OS, device security, and low-energy Wi-Fi/Bluetooth connectivity to Edge AI industrial requirements.

Two new products, the PORTENTA X8 and NICLA Vision, come ready to use for security surveillance, predictive maintenance and prototyping applications. The X8 has nine cores over three advanced microcomputers within a compact 25x66mm form factor, including input/output control via 22 digital I/O pins and eight analog input pins. NICLA Vision is a tiny 22.8mm square with a 2MP color camera, 6-axis motion sensor, distance sensor and dual-core processor.

With both the X8 and NICLA, users can deploy AI algorithms and ML on the edge and connect to Arduino Pro Cloud for end-to-end system management. Linux OS is preloaded onboard and MicroPython is supported, making it a plug-and-play solution capable of running device-independent software and connecting to peripherals and the Arduino Pro Cloud for remote OS and application updates.

CVEDIA

Known for its development of synthetic data, which permits AI devices to run “out-of-box” or pre-trained algorithms, CVEDIA is a strong AI development partner to ODM Hailo and OEM Lanner Electronics. As a solution provider, Lanner Electronics is leveraging CVEDIA for two significant smart city applications: Monitoring Traffic Control (MTC) through vehicle classification and direction, and protecting Vulnerable Road Users (VRU) such as pedestrians, bicyclists, and motorcyclists, who are easily injured or killed in car-dominated road spaces.

With the scarcity of thermal training data publicly available for MTC and VRU, the latest Hailo/CVEDIA partnership is a significant security/smart city/public safety/transportation leap forward as synthetic data is used to develop thermal AI solutions at scale.

Almost every scalable hardware product introduction at EVS has pushed GPU-based solutions closer to legacy status. CVEDIA's thermal perimeter security solution runs at 55 frames per second (FPS), with a latency of 18.5 ms and power consumption of only 1.8W on the Hailo-8 AI processor. This is about 20 times faster than many GPU-based solutions on the market.
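To put those three figures in relation to each other, the short calculation below derives the time budget per frame and the energy spent per frame. Only the FPS, latency and power values come from the figures quoted above; the derived quantities are simple arithmetic for illustration, not vendor-published metrics.

```python
FPS = 55.0         # frames per second, as quoted above
LATENCY_MS = 18.5  # reported latency in milliseconds, as quoted above
POWER_W = 1.8      # reported power draw in watts, as quoted above

# Time available per frame at this frame rate.
frame_period_ms = 1000.0 / FPS

# Energy per frame: watts are joules per second, so divide by frames per second.
energy_per_frame_mj = POWER_W / FPS * 1000.0

print(f"frame period:     {frame_period_ms:.1f} ms")
print(f"reported latency: {LATENCY_MS:.1f} ms")
print(f"energy per frame: {energy_per_frame_mj:.1f} mJ")
```

The reported latency is close to one frame period, which suggests each frame is processed in roughly the time it takes the next one to arrive.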

DECI

DECI is a deep learning development, modeling and benchmarking platform that offers significant performance increases of deep learning algorithms across multiple hardware types. Less code runs faster, and the DECI inference platform allows developers to quickly test how many frames per second are processed for a given algorithm at full HD for a fixed scene. At EVS, DECI’s multi-stream demonstration showed better than 30% FPS improvement with complex moving object recognition.

RingCentral, a cross-platform cloud, edge and VoIP mobile communications provider, uses DECI to shorten product launch and update cycles and to improve overall performance.

Hailo

On the way to having the most diverse partner network across multiple industries, the Hailo booth was the most active, next to Lanner and Arduino Pro. The Hailo-8 AI processor is capable of 26 tera-operations per second (TOPS) at an ultra-low power consumption of 2.5 W. Higher TOPS at low power means more complex object recognition algorithms, such as pose detection, can run at higher FPS and across multiple streams at full HD. This moves edge AI forward and means the cloud can be used less for continuous processes like object recognition and more for intermittent processes like performance verification, updates and authentication.

At the Lanner Electronics booth, Hailo was well represented in products capable of up to an incredible 150 TOPS, the equivalent of about 30 high performing IP cameras or about 50 iPhone 13 smartphones on average.

Immervision

When Immervision first arrived on the security industry scene and exhibited at the ISC and ASIS conferences in the late 1990s, it demonstrated the first immersive 360-degree industrial lens and dewarping software. Through the 2000s and 2010s, Immervision evolved into one of the highest-quality OEM 360-degree optical solutions for security, video conferencing, and industrial mobile devices. End users purchasing panoramic IP cameras often do not realize they are using Immervision, which is both ODM and OEM to security manufacturers and embedded device hardware providers.

Another significant announcement at EVS is the Immervision IONODES’ PERCEPT body camera for corporate security. With ultra-wide panomorph lenses featuring a 180-degree field of view integrated into PERCEPT, physical security workers with a body camera can effectively capture, in real time, any potential security threats or issues in high resolution. From the mounting system to the advanced power management system capable of running for days and the cloud management service, PERCEPT represents a simple, high-performance way for corporate security to improve safety and act quickly in the event of an active shooter. In other words, you may be distracted, but your immersive body-worn camera has a wider field of view with the potential to run CV algorithms.

Perhaps of greatest importance is the embedded IONODES 4K camera, which connects to Wi-Fi, Bluetooth and LTE and integrates with a GPS receiver and 9-axis motion sensor, allowing users to closely follow the field of view in real time, stabilize the image and provide location tracking, whether for security, prevention, monitoring or entertainment applications.

Inuitive

The “Robot on a Chip” introduced at EVS by Inuitive includes the powerful family of multi-core vision processors NU4000 and NU4100 that support 3D imaging (3D depth) with CV, DL, convolutional neural networks optimized for vision and sensor fusion, and an integrated high-quality dual ISP. The NU4000 is then joined with the tiny M4.5S module, which integrates 3D sensing and image processing with AI capabilities to provide robotic devices with human-like visual understanding. The power-efficient M4.3WN is a self-sufficient depth sensor (at narrow FOV), vision and AI processor module, functioning as an independent sub-system that generates fully processed depth information for its host. An RGB camera sensor, registered to the depth sensor, and two monochromatic fisheye cameras for tracking complete the visual suite. This opens up a range of lightweight, low-power uses such as VR head-mounted displays (HMD), AR/MR glasses, smaller autonomous robots and higher-performing drones.

Lanner Electronics

As mentioned in the Hailo section, Lanner Electronics’ two primary product introductions at the show were the Falcon H8, a high-performance, cost-efficient PCIe AI accelerator card, and a family of AI processors.

The Falcon H8 is expected to dominate machine and server integration for VMS, BMS, V2X and virtually anything involving sensors and sensor fusion, supplanting the GPU at a lower cost per channel and far higher performance per watt. Power usage is important even in continuously powered devices, as optimum efficiency is realized in a machine built with components that themselves manage their thermal load. Essentially, run cool, run fast. Run hot, fail fast.

The staggering features of the Falcon H8 do not seem consistent with its lower target price: there is scalable support for 4, 5 or 6 Hailo-8 AI processors with extremely high performance and power efficiency, reaching 156 TOPS and 8,000 FPS on ResNet-50, along with a full suite of software development tools. This clearly supplants GPU-based machines.
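The 156 TOPS figure lines up with six Hailo-8 processors at the 26 TOPS each quoted earlier. The sketch below shows the same arithmetic for each supported configuration. It assumes peak throughput scales linearly with processor count, which is an idealization; the `card_tops` helper is hypothetical and not part of any vendor toolkit.

```python
HAILO8_TOPS = 26  # per-processor peak figure quoted in the article


def card_tops(num_processors: int) -> int:
    """Peak TOPS for a card, assuming linear scaling with processor count."""
    return num_processors * HAILO8_TOPS


# The three configurations the article lists for the card.
for n in (4, 5, 6):
    print(f"{n} x Hailo-8: {card_tops(n)} TOPS")
```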

With AI algorithms running at the camera edge, the cost per channel grows with the cost of the more powerful camera required. The individual cameras have to work harder, which consequently consumes more power and generates more heat.

The Foxconn AI processor with Hailo-8 M.2 AI Acceleration Module performs a continuous 26 tera-operations per second (TOPS) and is capable of processing 15 UHD streams from IP cameras at very low power.  The AI processor unit can be placed between the IP cameras and a video management system (VMS), where an additional video stream is processed, delivering far more actionable real time visual data in a quickly deployed upgrade. 

An outage in the fiber infrastructure linking IP cameras and the command center may occur at an aggregation point close to the command center itself, resulting in a wider outage. The use of an AI processor unit closer to a “natural” aggregation point of 10-15 existing IP video cameras permits the economical connection to another branch of fiber infrastructure serving a redundant command center. Infrastructure outages near either aggregation point would not result in service interruption, as the AI processor unit is powerful enough to stream to multiple decoding locations. This also allows fast deployment at venues requiring temporary surveillance and a mobile command center.

Consolidating AI stream processing of a suite of visual sensors in a 15:1 ratio drastically reduces the cost per channel to purchase and operate. A small city with 500 IP cameras and video analytics applications at an emergency operations center can present an upgrade challenge. Locating approximately 40 AI processors closer to clusters of existing cameras delivers the benefits of multiple AI algorithms without depending on increased network traffic. In addition, multiple output streams from the AI processor can serve mobile command centers quickly and with the same user experience as the emergency operations center.
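The processor count for the 500-camera example depends on how many cameras each unit serves. The sketch below works through cluster sizes in the 10-15 camera range mentioned above; the `processors_needed` helper is hypothetical, and a real deployment would also be shaped by where cameras physically cluster and by network topology.

```python
import math


def processors_needed(cameras: int, cameras_per_processor: int) -> int:
    """Round up: a partly filled cluster still needs its own processor."""
    return math.ceil(cameras / cameras_per_processor)


CAMERAS = 500  # the small-city example from the article

for per_unit in (10, 12, 15):
    units = processors_needed(CAMERAS, per_unit)
    print(f"{per_unit} cameras per processor -> {units} processors")
```

At the full 15:1 ratio the count is 34, and at 10 cameras per unit it is 50, so a figure of about 40 corresponds to clusters averaging 12 to 13 cameras.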

The Road Ahead

The Edge AI & Vision Alliance has its own ecosystem that, as described in the introduction, has touchpoints to the security, public safety, facility, and real estate management industries. The problem with any advanced ecosystem is the inability of other markets receiving these products to understand and compare growing categories like sensor fusion, edge AI, hyper performance in PCI machines, AI inferencing, AI processors, AI optimization and testing platforms, and device modularization. The first measurable impact will be in the area of AI processing at scale, where facilities and cities with “normal” HD IP cameras are reinvigorated with CV algorithms that run on many streams, at high FPS and low power. After EVS, the upgrade path for these users is looking bright.

About the Author:

 Steve Surfaro is Chairman of the Public Safety Working Group for the Security Industry Association (SIA) and has more than 30 years of security industry experience. He is a subject matter expert in smart cities and buildings, cybersecurity, forensic video, data science, command center design and first responder technologies. Follow him on Twitter, @stevesurf.