Business Continuity and Disaster Recovery Plans often rely on assumptions (read more about them here). Some Business Continuity plans are very effective response plans but assume that, during an incident, they will be the only plan invoked. That’s a highly blinkered view. Lessons learned from disaster events such as Superstorm Sandy show that when a major incident occurs, multiple disaster recovery, business continuity and crisis management plans are likely to be invoked simultaneously. Do these plans play well together? What happens when multiple plans are interdependent and compete for the same people and other resources?
It is imperative that the Business Continuity planning process (which includes IT Disaster Recovery planning) be guided by some basic Incident Management requirements to ensure that plans will be viable and manageable during a real business disruption (not just during an artificial exercise).
Actionable Plans
The primary purpose of a Disaster Recovery or Business Continuity Plan is to restore assets or operations within defined Recovery Time Objectives (RTO). Plans that are not focused on restoring services or assets are irrelevant at the time of a disruptive incident. Business Continuity Plans which are intended to facilitate Incident Response should be designed for execution. These ‘actionable’ plans contain 3 basic elements – Who, What & When:
- Who is assigned responsibility to complete the task? It is extremely important that Incident Managers know who is responsible for executing the task to ensure that those resources are available.
- What is the sequence of tasks? Which tasks have precedence, that is, which must be completed before other tasks can start? Once this sequence is known, it is possible to identify critical tasks that require more focus or resources to ensure that successor tasks are not delayed.
- When will each task be finished, that is, how long will it take to complete? Since the focus of the plan is to restore assets within a pre-defined RTO, understanding how long each task will take is critical to managing the incident response.
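The three elements above can be checked mechanically before a plan is ever invoked. The following is a minimal sketch, assuming a hypothetical `Task` record with an owner (Who), a list of predecessors (What) and a duration estimate in minutes (When); the field names, task names and team names are illustrative only and do not come from any particular planning tool.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    owner: str = ""                 # Who: team responsible for the task
    duration_min: int = 0           # When: estimated minutes to complete
    predecessors: list = field(default_factory=list)  # What: tasks that must finish first


def actionability_gaps(tasks):
    """List every place where a plan is missing its Who, What or When."""
    names = {t.name for t in tasks}
    gaps = []
    for t in tasks:
        if not t.owner.strip():
            gaps.append(f"{t.name}: no owner assigned (Who)")
        for p in t.predecessors:
            if p not in names:
                gaps.append(f"{t.name}: unknown predecessor '{p}' (What)")
        if t.duration_min <= 0:
            gaps.append(f"{t.name}: no duration estimate (When)")
    return gaps


# Hypothetical plan with deliberate gaps
plan = [
    Task("restore_database", "DBA team", 90),
    Task("deploy_application", "", 45, ["restore_database", "rebuild_app_server"]),
    Task("validate_service", "Business QA", 0, ["deploy_application"]),
]

gaps = actionability_gaps(plan)
for gap in gaps:
    print(gap)
```

A plan that returns an empty list here is not guaranteed to work, but one that returns gaps is unlikely to be executable under pressure.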
Assignment by Skillset
During the planning phase, as plans are being built, the responsibility for executing each task should be assigned to a group or team based on the ‘skills’ required to complete the task, rather than to a named individual. This helps ensure that enough qualified people are available to take on the task. To reduce the risk of a resource shortfall even further, include geographically diverse members in the group or team, just in case.
Collaboration
Because multiple plans might be invoked simultaneously, and because some tasks will be dependent upon the completion of tasks in other plans, the planning process must address the need for dynamic, real-time collaboration between Incident Managers and the many responder teams. Mechanisms should be in place to alert teams in real-time when their tasks are ready to be started, or if the projected status of the task has changed.
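One way to picture the alerting mechanism is a dependency map that spans plans. The sketch below is an illustration under stated assumptions: tasks are identified by a hypothetical `(plan, task)` pair, the plan, task and team names are invented, and the `alerts` list is a stand-in for whatever paging or messaging system an organization actually uses.

```python
def newly_ready(dependencies, completed, alerted):
    """Tasks whose predecessors are all complete and whose team has not yet been alerted."""
    ready = [
        task for task, preds in dependencies.items()
        if task not in completed
        and task not in alerted
        and all(p in completed for p in preds)
    ]
    return sorted(ready)


# Hypothetical cross-plan dependencies: keys and predecessors are (plan, task) pairs
dependencies = {
    ("network_dr", "restore_wan_link"): [],
    ("payroll_bc", "relocate_staff"): [],
    ("payroll_bc", "start_payroll_run"): [
        ("network_dr", "restore_wan_link"),
        ("payroll_bc", "relocate_staff"),
    ],
}
owners = {
    ("network_dr", "restore_wan_link"): "Network team",
    ("payroll_bc", "relocate_staff"): "Facilities team",
    ("payroll_bc", "start_payroll_run"): "Payroll team",
}

completed, alerted = set(), set()
alerts = []  # stand-in for a real notification channel


def complete_and_alert(task=None):
    """Record a completed task (if any) and alert the teams whose tasks just became ready."""
    if task is not None:
        completed.add(task)
    for ready in newly_ready(dependencies, completed, alerted):
        alerted.add(ready)
        plan, name = ready
        alerts.append(f"{owners[ready]}: '{name}' in plan '{plan}' is ready to start")


complete_and_alert()                                      # plans invoked
complete_and_alert(("network_dr", "restore_wan_link"))    # payroll run still blocked
complete_and_alert(("payroll_bc", "relocate_staff"))      # payroll run now ready
for line in alerts:
    print(line)
```

The payroll team is alerted only once both predecessors are complete, even though one of them belongs to a different plan.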
For this collaboration to be effective, some sort of critical-path-tracking system must be employed. In a perfect world, this would be an automated workflow that dynamically updates the To-Do task list of each responder based on their role in the recovery process. The manual workaround to facilitate this type of collaboration is a conference bridge, a supply of colored Post-it® Notes and a conference room wall festooned with plan execution Gantt charts.
In either scenario, those Gantt charts are important. Lack of knowledge of the interdependence of plan tasks, and the time required to complete each task, leaves Incident Managers completely in the dark – or at least grasping frantically to maintain an understanding of what’s going on with the 10, 30 or 100 plans in progress.
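The information a Gantt chart conveys, task interdependence plus task duration, is enough to compute a critical path. The sketch below uses the standard forward and backward pass of the critical path method; the task names, durations and the 240-minute RTO are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def schedule_slack(durations, predecessors):
    """Return (projected finish, slack per task) using a forward and backward pass."""
    # static_order() yields every task after all of its predecessors
    order = list(TopologicalSorter(predecessors).static_order())

    earliest_finish = {}
    for t in order:
        start = max((earliest_finish[p] for p in predecessors.get(t, [])), default=0)
        earliest_finish[t] = start + durations[t]
    end = max(earliest_finish.values())

    latest_finish = {t: end for t in order}
    for t in reversed(order):  # successors are processed before their predecessors
        latest_start = latest_finish[t] - durations[t]
        for p in predecessors.get(t, []):
            latest_finish[p] = min(latest_finish[p], latest_start)

    slack = {t: latest_finish[t] - earliest_finish[t] for t in order}
    return end, slack


# Hypothetical durations in minutes and task dependencies
durations = {
    "restore_database": 90,
    "rebuild_app_server": 60,
    "deploy_application": 45,
    "validate_service": 30,
}
predecessors = {
    "restore_database": [],
    "rebuild_app_server": [],
    "deploy_application": ["restore_database", "rebuild_app_server"],
    "validate_service": ["deploy_application"],
}
RTO_MIN = 240  # hypothetical Recovery Time Objective

projected, slack = schedule_slack(durations, predecessors)
critical = sorted(t for t, s in slack.items() if s == 0)
print("projected finish:", projected, "within RTO:", projected <= RTO_MIN)
print("critical tasks:", critical)
```

Tasks with zero slack are the ones where any delay pushes out the projected restoration time, so they are where an Incident Manager would focus attention first.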
Control Workflow
The tasks and workflow within Business Continuity Plans should be changeable at the time of an incident. Incident Managers should be able to have responders ‘skip’ tasks if the situation calls for it, or put tasks on ‘hold’ for reasons beyond anyone’s immediate control. Planning that includes the ability to control plan workflow helps ensure that the same plan can succeed in a real incident as well as in a BC or DR exercise.
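A small state model shows how ‘skip’ and ‘hold’ might interact with task dependencies. This sketch rests on two assumptions that a real plan would need to confirm: a skipped task counts as satisfied for its successors, and a task on hold blocks its successors until it is released and completed. The status names and task names are illustrative.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"
    SKIPPED = "skipped"
    ON_HOLD = "on hold"


def startable(predecessors, status):
    """Pending tasks whose predecessors are all done or skipped."""
    satisfied = {Status.DONE, Status.SKIPPED}
    return sorted(
        t for t, preds in predecessors.items()
        if status[t] is Status.PENDING
        and all(status[p] in satisfied for p in preds)
    )


# Hypothetical plan
predecessors = {
    "failover_dns": [],
    "notify_vendors": [],
    "restore_database": ["failover_dns"],
    "deploy_application": ["restore_database", "notify_vendors"],
}
status = {t: Status.PENDING for t in predecessors}

at_start = startable(predecessors, status)

status["failover_dns"] = Status.SKIPPED      # e.g. DNS was not affected
status["notify_vendors"] = Status.ON_HOLD    # e.g. vendor contacts unreachable
after_skip_and_hold = startable(predecessors, status)

status["restore_database"] = Status.DONE
while_held = startable(predecessors, status)  # deployment is blocked by the hold

status["notify_vendors"] = Status.DONE        # hold released and task completed
after_release = startable(predecessors, status)

print(at_start, after_skip_and_hold, while_held, after_release)
```

Because the same status changes can be applied in an exercise or in a live incident, one plan definition serves both.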
Issue Management
No matter how well plans are constructed or how often they are tested, issues are bound to arise when multiple plans are activated concurrently and many responders are involved in the restoration process. The planning process must define issue-handling, escalation and resolution protocols so that restoration can continue smoothly and issues are resolved in a timely manner.
To facilitate robust and effective Incident Management capability, the Business Continuity planning process in an organization must develop plans that are truly flexible, viable and executable. Planning and building plans with Incident Response in mind is critical to building a resilient organization.