IT Security: Network Troubleshooting

Six proven steps to figuring out what’s wrong with your IT infrastructure


• Is there a way to roll back the system to its previous, pre-error configuration?

If IT reviews these questions and still cannot determine the cause of the problem and restore the system, it must resort to problem analysis to find the root cause of the event. Documenting issues and their resolutions pays off here: if a problem recurs, the IT department can follow a systematic process and reach a resolution quickly.

 

IT Security Network Troubleshooting: Six Steps

Because many users share the network, a troubleshooting strategy has to weigh which services take priority and which do not. The first step is to define the severity of the incident, usually classified as major, moderate or minor. These classifications range from a “major” service interruption that disrupts core operations to a “minor” issue that may affect only a single user.
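As a rough illustration of the severity classes described above, the following Python sketch maps an incident to a class. The function name, its parameters and the rule that anything affecting more than one user counts as "moderate" are assumptions made for this example, not a standard.

```python
def classify_severity(affected_users: int, core_service_down: bool) -> str:
    """Return 'major', 'moderate' or 'minor' for an incident.

    Hypothetical rule set: a disrupted core service is always major,
    a single affected user is minor, everything in between is moderate.
    """
    if core_service_down:
        return "major"
    if affected_users > 1:
        return "moderate"
    return "minor"


if __name__ == "__main__":
    print(classify_severity(affected_users=500, core_service_down=True))
    print(classify_severity(affected_users=1, core_service_down=False))
```

In practice the thresholds would come from your own incident management policy.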

Many aspects of network troubleshooting are governed by Service Level Agreements (SLAs). An SLA determines which actions must be taken if the problem cannot be resolved within a specific time period, and at what point the issue must be escalated and upper management notified.
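The escalation rule can be sketched in a few lines of Python. The time limits in SLA_LIMITS and the function name needs_escalation are hypothetical; a real SLA defines its own limits and notification steps.

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets per severity class (assumed values).
SLA_LIMITS = {
    "major": timedelta(hours=1),
    "moderate": timedelta(hours=4),
    "minor": timedelta(hours=24),
}


def needs_escalation(severity: str, opened_at: datetime, now: datetime) -> bool:
    """True once an unresolved incident has been open longer than its SLA limit."""
    return now - opened_at > SLA_LIMITS[severity]


if __name__ == "__main__":
    opened = datetime(2024, 1, 1, 9, 0)
    check_time = datetime(2024, 1, 1, 10, 30)
    print(needs_escalation("major", opened, check_time))
    print(needs_escalation("minor", opened, check_time))
```

A check like this could run on a schedule against open tickets so that escalation does not depend on someone remembering the deadline.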

There are six fundamental steps to troubleshooting:

Step One: Physically inspect the problem devices, if possible, to make sure that they are powered and that all cables and connections are properly seated. This is often the most basic and easiest problem to resolve. If you cannot see any cabling, connection or power issues, the next action is to restart the devices. Watch the system carefully, as the source of the error is quite often displayed during the device startup process.

Step Two: Analyze the problem. Troubleshooting requires being able to identify what is causing the problem so that it can be resolved. Questions to ask may include: Is the device too hot or too cold? Does it exhibit symptoms that differ from those of other systems? Can the problem be isolated?
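One way to ask whether a device behaves differently from other systems is to compare a reading, such as temperature, against the same reading on its peers. The sketch below is only an illustration: the function name, the device names and the tolerance are assumptions, and the median is used simply because one faulty device does not distort it much.

```python
from statistics import median


def find_outliers(readings: dict, tolerance: float) -> list:
    """Return the names of devices whose reading differs from the
    group median by more than the given tolerance."""
    mid = median(readings.values())
    return sorted(
        name for name, value in readings.items()
        if abs(value - mid) > tolerance
    )


if __name__ == "__main__":
    # Hypothetical temperature readings in degrees Celsius.
    temps = {"sw1": 41.0, "sw2": 43.0, "sw3": 42.0, "sw4": 71.0}
    print(find_outliers(temps, tolerance=10.0))
```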

Step Three: Now you must delve deep into your troubleshooting ninja skills. Use all available information to form a theory about what caused the issue. Network logs, documentation and attempts to recreate the steps that trigger the error all help you arrive at a theory that makes sense as a root cause. The best way to prove a root cause is to test the theory by duplicating or imitating the conditions that started the problem. If this can be done without creating additional issues on your network, it is a great practice. A lab in which to recreate the scenarios that cause IT problems is invaluable for this kind of testing.

Step Four: Isolate the trouble components by grouping them into categories (hardware, software, peripheral or configuration) so that you can develop theories and test each category in turn. This phase is usually time-consuming and frustrating, which is why it is so important to have documentation available, such as network inventory mapping and baseline configuration information. Run the tools again to see whether the configuration or state has changed since the system was last working. These details help you understand which areas are likely problem areas and which are working properly.
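Checking for a change from the baseline configuration can be as simple as a text diff. The following sketch uses Python's standard difflib module; the function name config_changes and the sample configuration lines are made up for illustration, and it assumes both configurations are available as plain text.

```python
import difflib


def config_changes(baseline: str, current: str) -> list:
    """Return the lines removed from ('-') or added to ('+') the baseline."""
    diff = list(difflib.unified_diff(
        baseline.splitlines(), current.splitlines(),
        fromfile="baseline", tofile="current", lineterm="", n=0,
    ))
    # The first two lines are the '---' and '+++' file headers; skip them,
    # then keep only added or removed lines (dropping '@@' hunk markers).
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    baseline = "hostname sw1\nvlan 10\nmtu 1500"
    current = "hostname sw1\nvlan 20\nmtu 1500"
    for change in config_changes(baseline, current):
        print(change)
```

An empty result suggests the configuration matches the baseline, which points the investigation toward hardware or other non-configuration causes.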

Step Five: While isolating components, you may find that you need to repair or replace something. Start with a known good baseline and insert the replacement into your network environment. Follow appropriate security guidelines, and do not use default passwords, which can expose your network to hackers or other unauthorized access. Configure the devices according to information security policies, procedures, standards and industry best practices. You may need to replace and configure multiple devices until you have “rooted” or “weeded” out the bad components in your system.
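A replacement device can be checked for default credentials before it goes live. This is a minimal sketch: the DEFAULT_CREDENTIALS set and the function name are hypothetical, and a real audit would rely on vendor-specific lists and your own security policy rather than this short example.

```python
# Hypothetical sample of factory-default username/password pairs.
DEFAULT_CREDENTIALS = {
    ("admin", "admin"),
    ("admin", "password"),
    ("root", "root"),
}


def uses_default_credentials(username: str, password: str) -> bool:
    """True if the pair matches a known default (username is case-insensitive)."""
    return (username.lower(), password) in DEFAULT_CREDENTIALS


if __name__ == "__main__":
    print(uses_default_credentials("Admin", "admin"))
    print(uses_default_credentials("netops", "S7r0ng-pass"))
```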