Aim for Effective Incident Response, Not Just Disaster Recovery

Business disruptions that can’t be contained using Standard Operating Procedures (SOPs) can be classified as Incidents. An Incident may cause Disaster Recovery and/or Business Continuity Plans to be invoked.

Together, the activities those Plans set in motion make up an Incident Response.

Think of Incident Response in three distinct parts:

Activation. Knowing which Plans to invoke requires an Assessment of which assets (hardware, applications, interfaces, etc.) and services have been impacted. The Assessment sets the scope of the Incident Response. Combined with situational awareness information, an intelligent Assessment allows Incident Managers to invoke the appropriate Plans, Recovery Teams and Resources. Scenario-based Plans (fail over to an alternate datacentre, reconstruct at a DR site, restore Tier 0 & Tier 1 Apps…) can provide a quick-start Incident Response without a detailed Assessment.
Recovery. After Plan activation, monitoring execution is critical to the success of the Response. Plans constructed as a series of Tasks (rather than broad instructions) make it possible to track the status of each Task (on hold, in progress, completed, skipped, etc.) and to judge Plan progress against the expectations set by Plan tests. Monitoring must also let Incident Managers see escalated issues and resolve them, removing roadblocks. Status dashboards should be automated rather than maintained by hand on whiteboards or Post-it notes.
After Action. Following closure of an Incident, participants should compare actual outcomes against expectations (Was the Recovery Time Objective, or RTO, achieved? Were other objectives met?). They should review response activities to understand what happened, what worked, what went wrong, what could be improved, and what planning, resources or teams should be changed or added. Improvement efforts should be assigned and tracked to completion, and the results used to update existing Plans in preparation for revised Plan testing.
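
To make the Recovery and After Action ideas concrete, here is a minimal Python sketch of a Plan modelled as a list of Tasks. It is an illustration only, not a reference to any particular DR tool. The names Task, Status, plan_progress, slower_than_tested and rto_achieved are hypothetical, as are the sample tasks and durations. The RTO check assumes Tasks run one after another, so elapsed time is simply the sum of completed Task durations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    """Task states named in the Recovery paragraph."""
    ON_HOLD = "on hold"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"
    SKIPPED = "skipped"


@dataclass
class Task:
    name: str
    expected_minutes: int                 # assumption: taken from the last Plan test
    status: Status = Status.ON_HOLD
    actual_minutes: Optional[int] = None  # recorded once the Task is completed


def plan_progress(tasks: List[Task]) -> float:
    """Fraction of Tasks that are closed (completed or skipped)."""
    if not tasks:
        return 0.0
    closed = [t for t in tasks if t.status in (Status.COMPLETED, Status.SKIPPED)]
    return len(closed) / len(tasks)


def slower_than_tested(tasks: List[Task]) -> List[Task]:
    """Completed Tasks that took longer than the Plan test suggested."""
    return [
        t for t in tasks
        if t.status is Status.COMPLETED
        and t.actual_minutes is not None
        and t.actual_minutes > t.expected_minutes
    ]


def rto_achieved(tasks: List[Task], rto_minutes: int) -> bool:
    """After Action check, assuming Tasks ran sequentially."""
    elapsed = sum(
        t.actual_minutes or 0 for t in tasks if t.status is Status.COMPLETED
    )
    return elapsed <= rto_minutes


# Hypothetical sample Plan, for illustration only.
plan = [
    Task("Fail over database", 30, Status.COMPLETED, 45),
    Task("Restore Tier 0 apps", 60, Status.COMPLETED, 50),
    Task("Restore Tier 1 apps", 90, Status.IN_PROGRESS),
    Task("Notify stakeholders", 10, Status.SKIPPED),
]

print(f"Progress: {plan_progress(plan):.0%}")
print("Slower than tested:", [t.name for t in slower_than_tested(plan)])
print("RTO of 120 minutes met so far:", rto_achieved(plan, 120))
```

An automated dashboard could recompute figures like these whenever a Task changes state, and the same records give the After Action review its comparison of expected against actual durations.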

Disaster Recovery is more than just Plans. Success relies on effective Incident Response.