Incident Management Best Practices
If you're here, chances are you're worried about how your team keeps the software it builds up and running. Everyone needs a plan for when things go wrong, and this guide aims to help you with yours.
This guide was written with the following incident values in mind:
- We know there is a problem before our customers do.
- Escalate, escalate, escalate (and communicate with customers).
- Shit happens; clean it up quickly.
- Always blameless.
- Never have the same incident twice.
Chapters
- On Call
- Incident Response
- Communicating to Users During Incidents
- Writing your first runbooks
- Guidelines for writing better runbooks