While observing an annual DR exercise, we learned quite a few things that are never discussed at industry conferences. The objectives of the exercise included restoring Tier 0 and Tier 1 applications (60 applications) running on 300+ servers, spanning Unix, Linux, Wintel, mainframe, networks and interfaces, and replicated storage. This is the introductory post in the “Recovery Optimized: The 5 Stakeholders” series.
We identified 5 different stakeholders and noted that each had differing requirements during the 5 days of the exercise. These can be summarized as:
- Incident Commanders – The Incident Command team kicked off the exercise, activated the necessary plans and invoked the relevant recovery teams. They were responsible for resolving any issues that needed the attention of other subject matter experts, ensuring the smooth flow of recovery tasks and providing ad-hoc logistical support (such as hoteling arrangements, transportation and ordering pizza for lunch, among many other similar activities).
- Recovery Teams – Within the DR Program, Recovery Teams are assigned to the infrastructure components (Network, Servers, Mainframe, Databases, Storage) that are in scope for recovery and that support the Tier 0/Tier 1 services. Recovery Teams focused on executing tasks, tracking predecessor/successor dependencies and reducing slack.
- Application Managers – Each Service (Application) was associated with an Application Manager Team. For example, the Customer Service Team (CST) supported the Customer Billing and Collection and Call Center services. The CST Manager constantly monitored and measured the restoration state of the services the team was responsible for. Their main concern was: At any given time, what is the state of service restoration?
- Client Testers – Before signing off that services had been restored, the client testers were tasked with validating the recovery of services, running test scripts, certifying that services were performing as designed, and updating the Executives & Stakeholders.
- Executives & Senior Managers – During the 5-day exercise, the executives established their Executive Command Room, with C-Suite and Senior Management taking turns round the clock to monitor the overall recovery effort. Their main concern was: When can we inform the business that critical services are back online?
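To make the Recovery Teams' concern with predecessor/successor dependencies and slack a little more concrete, here is a minimal Python sketch of a critical-path style slack calculation. It is only an illustration, not the tooling used in the exercise: the task names, durations (in hours) and dependencies are hypothetical, and `compute_slack` is a helper invented for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical recovery tasks: name -> (duration in hours, predecessor tasks).
# These values are made up for illustration only.
TASKS = {
    "restore_network": (2, []),
    "restore_storage": (3, ["restore_network"]),
    "restore_servers": (4, ["restore_network"]),
    "restore_database": (2, ["restore_storage", "restore_servers"]),
    "validate_billing": (1, ["restore_database"]),
}


def compute_slack(tasks):
    """Return (slack per task, total duration) using a forward and backward pass."""
    # Order tasks so that every predecessor comes before its successors.
    graph = {name: set(preds) for name, (_, preds) in tasks.items()}
    order = list(TopologicalSorter(graph).static_order())

    # Forward pass: the earliest a task can start is when all predecessors finish.
    earliest_start, earliest_finish = {}, {}
    for name in order:
        duration, preds = tasks[name]
        earliest_start[name] = max((earliest_finish[p] for p in preds), default=0)
        earliest_finish[name] = earliest_start[name] + duration

    total = max(earliest_finish.values())

    # Build the successor list for each task from the predecessor lists.
    successors = {name: [] for name in tasks}
    for name, (_, preds) in tasks.items():
        for pred in preds:
            successors[pred].append(name)

    # Backward pass: the latest a task can start without delaying the end.
    latest_start = {}
    for name in reversed(order):
        duration, _ = tasks[name]
        latest_finish = min((latest_start[s] for s in successors[name]), default=total)
        latest_start[name] = latest_finish - duration

    slack = {name: latest_start[name] - earliest_start[name] for name in tasks}
    return slack, total


if __name__ == "__main__":
    slack, total = compute_slack(TASKS)
    print(f"Total recovery time: {total} hours")
    for name, hours in slack.items():
        marker = "critical" if hours == 0 else f"{hours}h slack"
        print(f"{name}: {marker}")
```

In this made-up plan, every task except `restore_storage` has zero slack and therefore sits on the critical path, while `restore_storage` could slip by one hour without delaying the overall recovery. A view like this is one way a team could see where reducing slack or shortening a task actually moves the finish time.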
Does your BC/DR Program support the needs of these stakeholders? Which elements of the planning process can be tweaked to give you the ability to Monitor, Measure & Manage your DR exercises?