Network Downtime is a Bad Time

Aug. 21, 2014
No longer just a network inconvenience, inevitable system downtime can undermine a company

There is a cruel reality for every IT team: system downtime is inevitable. No matter what defenses are in place, no matter the amount of resources spent trying to avoid it, the unexpected loss of a critical system—including mail servers, processors, and file servers—will happen at one point or another.

That said, no administrator can, or should, sit idly by while downtime wreaks havoc on the organization. It’s still IT’s responsibility to keep fighting the good fight, and do what’s possible to minimize the risk and effects of downtime.

Downtime Still as Pervasive as Ever

As recently as a decade ago, downtime was perceived more as an inconvenience than a huge problem, and it often was blamed on just “the Internet.” It was frustrating, sure, but employees could distract themselves with other tasks while they waited for the network to come back online. Today, the frustration of losing access to mission-critical processes is still as pervasive as ever, but the issue has moved far beyond the boundaries of an occasional inconvenience.

To start, nearly 90 percent of organizations experience downtime, and a third deal with it at least once a month, according to a recent survey by GlobalSCAPE, Inc. A similar survey by Dun & Bradstreet found that nearly 60 percent of Fortune 500 companies experience at least 1.6 hours of downtime every week—and while the frequency is alarming, the consequences of downtime are devastating.

Three Ways Downtime Undermines Companies

Financial Losses

Productivity is the first and most prevalent casualty of downtime. Lost files and slow email may not carry an assigned dollar value, but organizations lose money every minute a core system is unavailable.

Globalscape’s study revealed that, for most enterprises, a single hour without a core system costs between $250,000 and $500,000. At that rate, a Fortune 500 company that experiences the minimum of 1.6 hours of downtime a week can lose between $400,000 and $800,000 a week, or between $20.8 million and $41.6 million a year.
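
For readers who want to sanity-check those figures, here is a minimal sketch of the arithmetic. The hourly cost range comes from Globalscape’s survey and the 1.6 hours per week from Dun & Bradstreet; everything else is simple multiplication.

```python
# Back-of-the-envelope check of the figures above. The hourly cost range
# comes from Globalscape's survey, the 1.6 hours/week from Dun & Bradstreet;
# the rest is simple arithmetic.
HOURS_DOWN_PER_WEEK = 1.6
COST_PER_HOUR = (250_000, 500_000)  # low and high estimates, USD
WEEKS_PER_YEAR = 52

weekly = [rate * HOURS_DOWN_PER_WEEK for rate in COST_PER_HOUR]
annual = [w * WEEKS_PER_YEAR for w in weekly]

print(f"Weekly loss: ${weekly[0]:,.0f} to ${weekly[1]:,.0f}")   # $400,000 to $800,000
print(f"Annual loss: ${annual[0]:,.0f} to ${annual[1]:,.0f}")   # $20,800,000 to $41,600,000
```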

It may be difficult to assign a specific dollar amount to employee productivity, but one thing is certain: the more senior the employee, the greater the financial loss. Of the executive-level employees surveyed, 100 percent said they experience downtime, and more than half deal with it at least once a month.

Data Loss

Downtime frustrates employees when important files become inaccessible. Not having access to the right file can leave an employee in a tough position, but losing the file altogether is the worst-case scenario, and it is all too common. Nearly half of all employees have lost important data and emails when core systems have gone down, according to Globalscape’s study. Senior-level employees are the most affected: 75 percent of those surveyed reported losing critical communications and data during unexpected downtime.

Security and Compliance

It should go without saying, but today’s state of security is less than ideal. Major breaches occur too frequently: Yahoo, eBay, Gmail, Target, any number of government agencies or healthcare providers. While breaches are often attributed to external threats, nearly two-thirds of all data breaches are due to negligence, system glitches, and human error, according to the Ponemon Institute. Downtime only serves to elevate the risk of an accidental insider breach.

According to another recent survey, when core systems go down, employees are likely to turn to their own, more familiar methods of handling sensitive work data, putting that data at significant risk and placing it outside any regulatory oversight. Consider that:

  • 63 percent of employees have used removable storage devices, such as USB drives, to handle confidential corporate data
  • 45 percent have used third-party consumer sites, such as Dropbox, to share and store corporate data
  • Nearly a third have used personal email to send sensitive work files

What’s worse, 74 percent of employees who use these methods to share confidential data think that IT administrators would approve. The reality: when employees use insecure, unmanaged tools to share sensitive information, they introduce major security and compliance risks into the organization.

When Downtime Complicates Physical Systems

Taking downtime a step further, it is important to remember that physical systems run our businesses, too. What happens when one of those systems goes down because of malicious intent, a system outage, or internal human error? The ramifications can be costly, both financially and to your employees’ health and well-being. If your electronic badge system is down, you could end up paying employees to stand in the parking lot. In extreme climates, a failed heating or cooling system can make the office an unsafe place to work. A fire prevention system that is down can create serious health and safety issues, or worse, keep your doors closed. Most downtime is an annoyance, but with physical systems it can mean the difference between your business being open or closed.

Minimizing Downtime When It Counts

Despite what business end users might think, it’s not uncommon for servers to become overloaded, shut down, and require manual intervention before the system is fully restored, a process that can take anywhere from minutes to hours to a full workday.

Unexpected downtime isn’t the only problem. At least once a month, IT receives a laundry list of new patches that require immediate attention if the organization wants to stay ahead of existing threats. That means companies need to take core systems offline on a regular basis, something that leaves a mark on the bottom line. Mitigation for planned outages is therefore just as important as mitigation for unexpected downtime.

Rather than wasting time and resources attempting to avoid inescapable downtime, IT needs to focus on minimizing risk. Here are five strategies to do just that:

Vet your vendors -- Regardless of what uptime assurances a business might have, key vendors may be unable to commit to the same level of availability. It’s critical to assess the level of availability promised in vendor and partner contracts to ensure all facets of the business are protected. If your partners’ systems consistently fall short of SLAs, consider implementing an active-active cluster or exploring alternative vendors.
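
When reviewing those contracts, it helps to translate an availability percentage into the downtime it actually permits. Here is a minimal sketch of that conversion, assuming a simple 24x7 service window; the percentages are illustrative, not figures from any particular SLA.

```python
# Translate an SLA availability percentage into the downtime it allows,
# assuming a simple 24x7 service window (percentages are illustrative).
HOURS_PER_YEAR = 365 * 24

for sla in (99.0, 99.9, 99.99):
    allowed_hours = HOURS_PER_YEAR * (1 - sla / 100)
    print(f"{sla}% uptime allows roughly {allowed_hours:.1f} hours of downtime per year")
```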

Deploy active-active clusters -- In an attempt to avoid downtime, many IT professionals have implemented active-active or active-passive clustering, but active-passive configurations can still leave organizations at risk. According to Globalscape’s report, respondents relying on active-passive clustering reported losing 34 percent more data and important communications than those using active-active clustering. Active-passive configurations may have been adequate in the past, but they no longer cut it in today’s business environment. Active-active clustering provides more reliable uptime for core systems, which makes for lower-risk, more efficient organizations.
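
The difference comes down to the failover window. The toy model below is not a real clustering implementation, and the request rate and durations are invented purely for illustration, but it shows why a standby node that must be promoted after a failure costs more than nodes that are already serving traffic.

```python
# Toy model of the failover window; not a real clustering implementation.
# The request rate and durations below are purely illustrative.

def requests_dropped(outage_seconds, failover_seconds, requests_per_second):
    """Requests lost while no node is able to serve traffic."""
    return min(outage_seconds, failover_seconds) * requests_per_second

RPS = 200        # illustrative request rate
OUTAGE = 300     # a five-minute node failure

# Active-passive: the standby must detect the failure and be promoted
# before it can take traffic, so every request in that window is lost.
print("active-passive:", requests_dropped(OUTAGE, failover_seconds=60, requests_per_second=RPS))

# Active-active: the surviving nodes are already serving, so the window
# with nowhere to send traffic is effectively zero.
print("active-active: ", requests_dropped(OUTAGE, failover_seconds=0, requests_per_second=RPS))
```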

Implement load-balancing systems with room to grow -- Scalability is key for any IT process, and ensuring uptime is no exception. When paired with load-balancing features, highly scalable servers help businesses prevent outages caused by overloaded systems and keep processes running during scheduled outages. If one node becomes unavailable while performing several actions on the same file, for example, other nodes can continue to process essential commands without difficulty or risk of downtime.
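
As a rough sketch of that idea, the snippet below routes each request to the next healthy node in a pool, so taking one node offline for patching does not interrupt processing. The node names and health map are hypothetical, and a production load balancer would of course use real health checks rather than a hard-coded table.

```python
from itertools import cycle

# Minimal round-robin load-balancing sketch: requests go to the next
# healthy node, so taking one node offline (for patching, say) does not
# interrupt processing. Node names and the health map are hypothetical.
NODES = ["node-a", "node-b", "node-c"]
HEALTHY = {"node-a": True, "node-b": False, "node-c": True}  # node-b down for maintenance

_rotation = cycle(NODES)

def dispatch(request):
    """Route a request to the next healthy node, skipping unavailable ones."""
    for _ in range(len(NODES)):
        node = next(_rotation)
        if HEALTHY[node]:
            return f"{request} -> {node}"
    raise RuntimeError("no healthy nodes available")

for req in ("upload report.xlsx", "rename report.xlsx", "archive report.xlsx"):
    print(dispatch(req))
```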

Choose only one vendor to meet your needs -- Attempting to stitch together multiple file servers can introduce unwanted and unnoticed system-to-system vulnerabilities. Multiple vendors can give rise to frequent system glitches and leave sensitive data at risk on a failing server. Limiting an IT environment to a single provider ensures compatibility and takes competing systems out of the equation.

Conduct regular system and process classification audits -- System audits are a rudimentary measure for ensuring uptime, but why stop there? IT teams should also perform regular process classification audits to streamline organizational practices. It’s important to identify which systems and processes warrant a high-availability solution and which do not. Focus on the systems whose loss has the most significant impact when unexpected downtime occurs, and work to secure those first. These audits help organizations assess where funds are allocated and which parts of their infrastructure need more resources.
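
One simple way to frame such an audit is to rank systems by expected downtime impact. The sketch below does exactly that; every system name and figure is invented for illustration, and a real audit would pull these numbers from the organization’s own incident history and cost estimates.

```python
# Hypothetical classification sketch: rank systems by expected downtime
# impact to decide which ones warrant a high-availability investment.
# All system names and figures are invented for illustration.
systems = [
    # (name, cost per hour of downtime in USD, expected outage hours per year)
    ("file transfer server", 250_000, 8),
    ("badge access system",   40_000, 4),
    ("intranet wiki",           2_000, 20),
]

for name, hourly_cost, outage_hours in sorted(
        systems, key=lambda s: s[1] * s[2], reverse=True):
    print(f"{name}: expected annual impact ${hourly_cost * outage_hours:,}")
```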

It’s easy to blame IT when employees lose access to core systems, but poor availability is often the product of insufficient systems and incompatible vendors. Organizations need to implement the right hardware and software components, and expand their regular audit processes, to manage downtime effectively. IT won’t always be able to prevent the loss of core systems entirely, but it can take measures to minimize these occurrences and their impact.

About the Author

James Bindseil is President and CEO of Globalscape, a managed file transfer solutions provider that helps organizations securely and efficiently send and receive files and data.