Today, you cannot open a magazine or walk a tradeshow floor without being inundated with the word “convergence.” But “risk management” and “high availability” are quickly joining this buzzword, and for good reason. When you tie everything together and run it all on the same network, you are putting all the proverbial eggs into one basket. That can be risky if not handled properly. But many of today’s IT systems and IP networks are designed with fault-tolerant features and capabilities that let them rival the reliability (and availability) once offered only by our plain old telephone system (note that this is not the same as our cell phone networks).
A system’s availability can be expressed as the percentage of time it is operational over an entire year, with both planned and unplanned downtime counting against it. A frequently used benchmark for mission-critical systems, such as a telephone system, is to allow only about 5 minutes of downtime over an entire year, or 99.999-percent availability. Let’s examine several approaches that the IT world has used in combination to reach this level of availability (and reliability), and how they can be used to support your mission-critical security systems and applications.
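To make the "five nines" figure concrete, the short Python sketch below converts an availability percentage into the downtime it allows per year. The function name and the choice of a 365.25-day average year are assumptions made for illustration, not part of any standard.

```python
# Assumption: an average year of 365.25 days is used for the conversion.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime (in minutes per year) implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


# Compare a few common availability targets.
for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% available -> {minutes:.2f} minutes of downtime per year")
```

Under this assumption, 99.999 percent works out to a little over 5 minutes per year, which is where the commonly quoted figure comes from.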
Network Infrastructure Redundancy and Failover
Unlike many traditional analog security systems, which have several single points of failure, IP networks designed to best practices feature redundant switches and routers with multiple interconnected paths to get from point A to point Z. This approach is used for the Internet as well as many organizations’ intranets. In these networks, IP network designers will commonly split the redundant links between two other devices so that if one link or the connected device fails, there is another path for data to travel (see Figure 1). When a given connection fails, the intelligent features (including network protocols) in these devices will automatically switch over to the alternate link, eliminating the need for manual intervention. For greater efficiency, these redundant links and devices can be made operational all of the time, not just for failover. As a result, the redundant links and devices provide traffic load-balancing in addition to higher system and application availability.
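The idea of an alternate path can be illustrated with a toy model. The Python sketch below is not how real switches and routers fail over (they rely on network protocols built into the devices); it only shows that, in a topology with redundant links, a path still exists after one link fails. The device names and the topology are hypothetical.

```python
from collections import deque

# Hypothetical topology: an edge switch connects to two redundant core
# switches, each of which connects to the switch serving the servers.
LINKS = {
    ("edge", "core_a"), ("edge", "core_b"),
    ("core_a", "server_sw"), ("core_b", "server_sw"),
}


def find_path(links, source, target, failed_links=frozenset()):
    """Breadth-first search for a path that avoids failed links.

    Returns the list of devices on the path, or None if no path exists.
    """
    neighbors = {}
    for a, b in links:
        if (a, b) in failed_links or (b, a) in failed_links:
            continue  # skip links that are currently down
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in sorted(neighbors.get(node, [])):  # sorted for repeatable output
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_path(LINKS, "edge", "server_sw"))
print(find_path(LINKS, "edge", "server_sw", failed_links={("edge", "core_a")}))
```

With all links healthy, the search goes through `core_a`; with the `edge` to `core_a` link marked as failed, it finds the route through `core_b` instead. Only when both uplinks fail does the function return `None`.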
In some cases, even network edge devices or hosts (such as servers or PCs) will feature dual network connections. Assuming each network connection goes to a different network device (usually a switch), this provides an additional measure of fault-tolerance and resiliency. This may be referred to as “dual homing.” To be fair, it should also be recognized that intrusion and access control systems have featured dual connections (such as a plain old telephone system modem connection and, perhaps, a cellular connection) to achieve a similar level of accessibility. Nonetheless, dual homing is prevalent in many application servers found in data centers, and is becoming more popular in various networked physical security edge devices.
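A rough sense of why dual homing helps can be had from a simple calculation. The sketch below assumes the two connections fail independently of each other (which is the reason each should go to a different network device) and uses made-up availability figures. Real-world connections often share power, cabling routes, or other dependencies, so treat the result as an upper bound rather than a prediction.

```python
def combined_availability(*availabilities: float) -> float:
    """Availability of redundant connections, each given as a fraction (0.0 to 1.0).

    Assumption: failures are independent, so the host is unreachable only
    when every connection is down at the same time.
    """
    all_down = 1.0
    for availability in availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


# Hypothetical figures: each connection alone is 99.9% available.
single = 0.999
dual = combined_availability(single, single)
print(f"single connection: {single:.4%}")
print(f"dual-homed:        {dual:.4%}")
```

Under the independence assumption, two connections that are each 99.9 percent available combine to roughly 99.9999 percent.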
While network protocols will automatically route traffic to a viable path, network devices will also send real-time messages to the network operations management consoles so that failed devices and/or links can be flagged for remediation. Messages can even be sent as e-mails or to pagers, reaching network administrators or key application users (such as security system operators) wherever they may be, which facilitates even faster resolution.
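The notification flow described above can be sketched as a small dispatcher that fans a link-failure event out to several recipients. This is only an illustration: the class and function names are hypothetical, and the handlers are stand-ins for a management console, an e-mail gateway, or a pager service, none of which is actually contacted here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LinkEvent:
    device: str
    link: str
    status: str  # "up" or "down"


def format_alert(event: LinkEvent) -> str:
    """Build a human-readable alert message for a link event."""
    return f"ALERT: link {event.link} on {event.device} is {event.status}"


class AlertDispatcher:
    """Fans a failure alert out to every subscribed handler (hypothetical design)."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        self._handlers.append(handler)

    def notify(self, event: LinkEvent) -> int:
        """Send an alert for "down" events; return how many handlers were called."""
        if event.status != "down":
            return 0
        message = format_alert(event)
        for handler in self._handlers:
            handler(message)
        return len(self._handlers)


# Stand-in recipients: lists that simply collect the messages they receive.
console_log: List[str] = []
operator_inbox: List[str] = []

dispatcher = AlertDispatcher()
dispatcher.subscribe(console_log.append)
dispatcher.subscribe(operator_inbox.append)
dispatcher.notify(LinkEvent(device="core_a", link="port 1", status="down"))
print(console_log)
```

In a real deployment, the subscribed handlers would hand the message to whatever console, e-mail, or paging mechanism the organization already uses.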