Proactive Incident Management: Seven Steps to Prepare for the Unexpected
In July 2024, cybersecurity company CrowdStrike released a software update that caused a global IT outage, grounding planes, stalling major medical centers and downing systems across sectors. The incident was a painful reminder that large-scale correlated failures are intrinsic to the connected systems powering our world. While we should absolutely wring every preventative lesson from this event, we must also accept the reality that unforeseen bad things will happen. Fortunately, there are several straightforward – but effective – practices that, in my experience, can help organizations prepare for, respond to and mitigate the impact of incidents when they inevitably occur.
1. Designate an Incident Manager
Identify an incident manager and put a clear escalation path in place before the next event occurs. The incident manager is responsible for coordinating resolution and for communicating status and next steps. Even the smallest teams need central coordination to ensure a swift and effective response. Without it, desperate but well-intentioned problem solvers may make the situation worse by correcting one component while unknowingly creating a new issue in another.
2. Plan for Communication Failures
Develop a plan for communicating across teams when messaging and/or email systems are down. A phone or text fan-out through reporting chains works well, provided team members are prepared with up-to-date contact information. This keeps communication lines open when information exchange is most essential.
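As a rough illustration, the phone fan-out described above can be sketched in a few lines of Python. Everything here is a hypothetical example rather than a prescribed tool: the names, the `reports` reporting-chain mapping and the `phone_book` are invented for illustration. The sketch builds a caller-to-callee plan from the reporting chain and flags anyone without a phone number on file, which is the gap that most often breaks a fan-out in practice.

```python
from collections import deque


def build_call_plan(reports, root):
    """Return (caller, callee) pairs in breadth-first order from `root`.

    `reports` maps each person to the list of people they call,
    mirroring the reporting chain (an assumed data shape).
    """
    plan = []
    seen = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in reports.get(caller, []):
            if callee in seen:
                continue  # skip anyone already reached (e.g. dotted-line reports)
            seen.add(callee)
            plan.append((caller, callee))
            queue.append(callee)
    return plan


def missing_contacts(plan, phone_book):
    """Return a sorted list of people in the plan with no phone number on file."""
    return sorted({callee for _, callee in plan if callee not in phone_book})


# Hypothetical reporting chain and contact list, for illustration only.
reports = {
    "incident_manager": ["eng_lead", "support_lead"],
    "eng_lead": ["alice", "bob"],
    "support_lead": ["carol"],
}
phone_book = {
    "eng_lead": "555-0101",
    "support_lead": "555-0102",
    "alice": "555-0103",
    "carol": "555-0105",
}

plan = build_call_plan(reports, "incident_manager")
gaps = missing_contacts(plan, phone_book)
```

Running a check like `missing_contacts` on a schedule, well before an incident, is one way to keep contact information current; a printed copy of the plan matters too, since the systems that host it may be the ones that are down.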
3. Establish an Event Classification System
Clearly define what constitutes a critical, high, medium or low-priority event, and set engagement and escalation protocols for each category. This structured approach helps ensure that the most impactful issues are prioritized and minimizes distractions from low-severity incidents, allowing for better resource allocation and a more focused and efficient response.
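One way to make such a classification concrete is to write it down as data plus a simple rule, so that nobody has to debate severity in the middle of an incident. The Python sketch below is illustrative only: the thresholds, the `classify` inputs and the protocol fields are assumptions chosen for the example, and each organization would need to set its own.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Protocol:
    page_incident_manager: bool
    notify_executives: bool
    update_interval_minutes: Optional[int]  # None means no scheduled updates


# Hypothetical engagement and escalation protocols per category.
PROTOCOLS = {
    Severity.CRITICAL: Protocol(True, True, 30),
    Severity.HIGH: Protocol(True, False, 60),
    Severity.MEDIUM: Protocol(False, False, 240),
    Severity.LOW: Protocol(False, False, None),
}


def classify(customers_affected_pct, data_at_risk, workaround_exists):
    """Map incident facts to a severity using assumed example thresholds."""
    if data_at_risk or customers_affected_pct >= 50:
        return Severity.CRITICAL
    if customers_affected_pct >= 10:
        return Severity.HIGH
    if customers_affected_pct > 0 and not workaround_exists:
        return Severity.MEDIUM
    return Severity.LOW


# Example: 15% of customers affected, no data at risk, no workaround.
severity = classify(15, data_at_risk=False, workaround_exists=False)
protocol = PROTOCOLS[severity]
```

Keeping the rule this simple is a deliberate choice in the sketch: a classification that takes more than a minute to apply tends to be skipped when it matters most.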
4. Conduct Root Cause Analysis
After resolving an issue, perform a thorough Root Cause Analysis and, if necessary, prepare a Correction of Errors document. When done well, these practices promote transparency, foster a culture of continuous improvement and help prevent similar failures in the future. Google's Site Reliability Engineering book offers helpful guidance on postmortems, along with an example.
5. Run Regular Simulations
Regularly conduct tabletop and “game day” simulations for the most likely or impactful scenarios. Simulating events like widespread workstation unavailability can prepare your team for high-impact failures similar to the CrowdStrike bug, as well as higher-likelihood scenarios. These exercises help refine response strategies and improve readiness.
6. Build Relationships with Incident Response Partners
Consider developing relationships with the appropriate incident response vendors – including your insurance provider and those providing forensic analysis and legal support – as well as with local law enforcement. This can be particularly important for incidents involving data loss or other security breaches. Having these relationships in place can help drive a quicker and more coordinated response when incidents arise.
7. Foster a Culture of Preparedness
Effective incident response requires a coordinated, cross-functional effort; it is not solely the responsibility of engineering teams. By creating a culture of preparedness that involves customer service, sales and marketing, PR and communications, organizations are better positioned to mitigate potential impacts on customers, prospects and brand.
By adopting these practices, we believe organizations can better navigate the complexities and inherent risks of today’s interconnected systems and mitigate the impact of inevitable failures. Emphasizing emergency preparedness throughout your organization ensures that when the unexpected happens, your team is ready to respond rapidly and effectively.
The content herein reflects the views of Summit Partners and is intended for executives and operators considering partnering with Summit Partners.