10 Disaster Recovery Exercise Best Practices
Disaster recovery exercises, when conducted correctly, help businesses prepare for worst-case IT scenarios.
Published January 23, 2017 by Jack Kapustka
How you deal with a network outage, recover from it, and the steps you take to prevent the issue that caused the outage from happening again are crucial considerations for your organization to address in your disaster recovery plan.
Once you have a disaster recovery plan in place, it’s also important to use disaster recovery exercises to test your plan and adjust for any shortcomings.
Although most businesses claim they conduct a full exercise of their disaster recovery plan at least once per year, anecdotal evidence suggests that the majority of these exercises are neither comprehensive nor thorough.
The Microsoft Business Blog shared 10 disaster recovery exercise best practices, reprinted with permission below, to help businesses update and improve their disaster recovery exercise program. Microsoft used data from Forrester, a technology-focused market research company, in making its recommendations.
1. Define Specific Exercise Objectives Up Front
Exercising for the sake of exercising is a waste of time. Make sure that there are clear and concrete objectives and goals set up front that will help determine the ultimate success of an exercise. One objective may be as simple as, “Verify our stated recovery time and recovery point objectives.” You could orient other objectives around training, such as, “Familiarize the database administrators with the plans for recovering Oracle.”
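An objective such as verifying recovery time and recovery point objectives only works if the exercise records timestamps to compare against the stated targets. The following is a minimal sketch of that comparison, not a prescribed method: the `RecoveryObjective` structure, the `evaluate_exercise` function, the service name, and all of the figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjective:
    """Stated targets for one service (hypothetical structure)."""
    service: str
    rto: timedelta  # recovery time objective: maximum tolerated downtime
    rpo: timedelta  # recovery point objective: maximum tolerated data loss


def evaluate_exercise(objective, outage_start, service_restored, last_good_backup):
    """Compare what the exercise measured against the stated objectives."""
    recovery_time = service_restored - outage_start
    data_loss_window = outage_start - last_good_backup
    return {
        "service": objective.service,
        "recovery_time": recovery_time,
        "rto_met": recovery_time <= objective.rto,
        "data_loss_window": data_loss_window,
        "rpo_met": data_loss_window <= objective.rpo,
    }


# Illustrative values only; none of these figures come from the article.
objective = RecoveryObjective(
    "orders-db", rto=timedelta(hours=4), rpo=timedelta(minutes=15)
)
result = evaluate_exercise(
    objective,
    outage_start=datetime(2017, 1, 23, 9, 0),
    service_restored=datetime(2017, 1, 23, 12, 30),
    last_good_backup=datetime(2017, 1, 23, 8, 40),
)
print(result)
```

In this made-up run the service came back within the recovery time objective, but the last usable backup was older than the recovery point objective allows, which is the kind of finding a clearly stated objective is meant to surface.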
2. Include Business Stakeholders
Business owners play a vital role in your disaster recovery exercises, and they need to be involved from the start of the exercise until you have recovered all services. All business stakeholders should verify the successful recovery of services.
This has the dual benefit of ensuring that you have properly recovered business processes with all of their critical components as well as ensuring that business stakeholders know what to expect in terms of recovery capabilities and performance at the recovery site during an actual declaration.
3. Rotate Staff Responsibilities
It’s important that the person who wrote the disaster recovery plan is not the same person who executes the test, as that individual is unlikely to be available in a real disaster. Some companies Forrester interviewed went so far as to have employees with little specific knowledge of a system execute those tests, such as a system administrator running the database disaster recovery test. An important secondary benefit of a disaster recovery exercise is training; by assigning staff to take on new roles during exercises, you are essentially cross-training staff in different areas.
4. Develop Specific Risk Scenarios For Your Exercises
Many companies conduct their disaster recovery exercises without specific scenarios; they tell the response team to assume the data center is “a smoking hole.”
It is important, however, to define specific risk scenarios even for disaster recovery testing, for two main reasons: 1) specific scenarios give the response team a more realistic situation to react to, and 2) different scenarios require different actions from the IT staff.
For example, the disaster recovery plan for a short outage at the primary data center that only requires resuming operations would be different from a long-term outage that requires failover (and eventually failback), which in turn would be different from scenarios where only portions of the IT infrastructure were down.
5. Run Joint Exercises With Business Continuity Teams
In our research, Forrester found that many business continuity and disaster recovery teams run all of their exercises separately and often fail even to communicate when they run exercises. However, you should aim to exercise the full business continuity and disaster recovery plans concurrently at least once per year. This is especially important if the data center is in the same location as the head office.
6. Vary Exercise Types From Technical Tests to Walk-Throughs
A common misconception in IT is that walk-throughs and tabletop exercises are not necessary for disaster recovery exercises. While it’s true that these types of exercises won’t test the technical capabilities of a failover, they are still critical for training, awareness, and preparedness.
Interviewees told us that the majority of the time, exercises that didn’t go as planned actually struggled most with communication and employees’ understanding of their roles during the exercise. Non-technical exercises such as walk-throughs and tabletops will help make these processes go more smoothly.
7. Test All IT Infrastructure Concurrently at Least Once Per Year
Waiting longer than a year risks too much change in IT environments and personnel — you need to bring new staff members up to speed on disaster recovery plans. The most advanced firms run full disaster recovery tests as often as four times per year.
In between full tests, most firms conduct component tests that vary in frequency depending on the criticality of the systems and rate of change in the environment.
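One way to turn "frequency depends on criticality and rate of change" into a schedule is a simple scoring rule. The sketch below is an assumption made for illustration, not a formula from Microsoft or Forrester: the 1-to-3 scales, the function name `component_test_interval_days`, and the division rule are all hypothetical. The only figure taken from the text above is the one-year ceiling.

```python
MAX_INTERVAL_DAYS = 365  # the article's ceiling: test at least once per year


def component_test_interval_days(criticality, change_rate):
    """Suggest the number of days between component tests.

    Both inputs use a hypothetical scale of 1 (low) to 3 (high).
    Higher criticality or a faster-changing environment shortens the interval.
    """
    for name, value in (("criticality", criticality), ("change_rate", change_rate)):
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {value!r}")
    # Assumed rule: divide the yearly ceiling by the product of the two scores.
    return MAX_INTERVAL_DAYS // (criticality * change_rate)


# Hypothetical systems and scores, for illustration only.
systems = {
    "payment gateway": (3, 3),
    "internal wiki": (1, 1),
    "reporting database": (2, 1),
}
for system, (criticality, change_rate) in systems.items():
    days = component_test_interval_days(criticality, change_rate)
    print(f"{system}: test every {days} days")
```

Whatever rule you choose, the useful property is that no system can drift past the yearly limit and that the most critical, fastest-changing systems are tested most often.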
8. Identify Members for the Core Disaster Recovery Response Team
The stress of working under time and resource constraints for long hours, often during nights and weekends, is something people cope with in different ways. If you are putting together a core response team to lead IT recovery, it’s important to pick people who can work under extreme amounts of pressure (and sleep deprivation). During an exercise or test, identify those individuals who can remain calm and collected.
9. Learn From Your Mistakes
The point of running disaster recovery exercises is to find potential barriers to recovery while in a controlled environment. If you aren’t encountering problems during your exercises and tests, it’s more than likely you aren’t looking hard enough, aren’t testing thoroughly enough, or you have designed scenarios for recovery that are too simple. When you complete exercises and tests and you have identified problem areas, use what you have learned to update plans and create best practice documents.
10. Report Results to Stakeholders
If your business has recently made significant investments in improving preparedness, most likely executives, business owners, and other stakeholders want to know what the return is on their investment — how prepared are you? Reporting exercise and test results regularly and in a timely fashion gives executives and business leaders visibility into your disaster recovery program. Remember that the results are not pass/fail but should detail aspects of recovery that went well and areas for improvement.
When was the last time you conducted a disaster recovery exercise?
Next Steps: Enhance Your Information Security
Disaster recovery is just one aspect of a comprehensive business continuity plan. You should also take steps to enhance your information security with an approach that employs common-sense password policies, network configuration, data security practices, and an eye for the ongoing social engineering attempts that threaten your systems.
Our white paper titled “Enhancing Information Security in an Unsecure World” is a great resource for businesses looking for ways to protect their high-risk data.