You encountered a major service outage that affected all users of the service for multiple hours. After several hours of incident management, the service returned to normal, and user access was restored. You need to provide an incident summary to relevant stakeholders following the Site Reliability Engineering recommended practices. What should you do first?
A. Call individual stakeholders to explain what happened.
B. Develop a post-mortem to be distributed to stakeholders.
C. Send the Incident State Document to all the stakeholders.
D. Require the engineer responsible to write an apology email to all stakeholders.
answerhappygod