We’re launching a new feature today that lets anyone in your organization kick off your incident response process from Slack, with an appropriate severity level attached.
People are often afraid to open an incident, or even to share that they’re aware of something going wrong with your applications. At the other extreme, when everything is important, nothing is important: users frequently overestimate the impact of an incident and assign an inappropriately high severity level. FireHydrant eliminates both scenarios by using your organization’s rules to assign the most appropriate severity level to a new incident. Engineers and support staff should feel empowered to speak up and let the team know when they’ve noticed something isn’t quite right, while delegating the details.
Our new “severity matrix” lets you decide which functionalities of your site are most critical and what the escalation path should be. You can assign severities based on which cohort of users is affected: everyone, external customers, internal users, VIP users, etc. You can also adjust those severities based on how badly the functionality is impacted: unavailable, degraded, or affected by only a minor bug.
Did one of your customer support team members notice that single sign-on isn’t working for employees, but customers are not experiencing any difficulties? That’s likely not a SEV1, all-hands-on-deck kind of incident. Maybe that means you email a few people and invite them to a Slack incident room. Let your engineers handle escalating the incident if it turns out to be a larger issue, but avoid distracting the majority of your staff for a low-severity incident.
Are all customers seeing a 502 when visiting your site? That’s likely an alert-the-world type of incident and is definitely a SEV1. Push an alert to PagerDuty for the relevant on-call teams, open a Slack incident room, and email an internal status page link to all of your stakeholders.
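To make the idea concrete, here is a minimal sketch of how a severity matrix could be modeled as a lookup table, using the two examples above. This is not FireHydrant’s API or data model: every name here (`Audience`, `Impact`, `SEVERITY_MATRIX`, `ESCALATION`, `assign_severity`) is hypothetical, and the SEV3 rating for the single sign-on case and the SEV3 fallback are assumptions made purely for illustration.

```python
from enum import Enum


class Audience(Enum):
    """Which cohort of users is affected."""
    EVERYONE = "everyone"
    EXTERNAL_CUSTOMERS = "external customers"
    INTERNAL_USERS = "internal users"
    VIP_USERS = "VIP users"


class Impact(Enum):
    """How badly the functionality is impacted."""
    UNAVAILABLE = "unavailable"
    DEGRADED = "degraded"
    MINOR_BUG = "minor bug"


# Hypothetical rules: (functionality, audience, impact) -> severity.
# The SEV3 value is an assumed rating, not one stated by FireHydrant.
SEVERITY_MATRIX = {
    ("website", Audience.EXTERNAL_CUSTOMERS, Impact.UNAVAILABLE): "SEV1",
    ("single sign-on", Audience.INTERNAL_USERS, Impact.UNAVAILABLE): "SEV3",
}

# Hypothetical escalation paths, one per severity level.
ESCALATION = {
    "SEV1": [
        "push an alert to PagerDuty for the on-call teams",
        "open a Slack incident room",
        "email an internal status page link to all stakeholders",
    ],
    "SEV3": [
        "email a few people",
        "invite them to a Slack incident room",
    ],
}

# Assumed fallback when no rule matches; engineers can escalate later.
DEFAULT_SEVERITY = "SEV3"


def assign_severity(functionality, audience, impact):
    """Return the severity for a report, or the default if no rule matches."""
    return SEVERITY_MATRIX.get((functionality, audience, impact), DEFAULT_SEVERITY)


if __name__ == "__main__":
    severity = assign_severity("website", Audience.EXTERNAL_CUSTOMERS, Impact.UNAVAILABLE)
    print(severity)
    for step in ESCALATION[severity]:
        print(" -", step)
```

The person reporting the problem only answers three questions (what is broken, for whom, and how badly), and the organization’s rules decide the rest. In this sketch an unmatched report falls back to a low severity, which leaves escalation to the engineers who pick it up.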
FireHydrant lets you find the right balance: you stay aware of every incident impacting your applications while your team stays engaged and focused on their day-to-day responsibilities.
Invest in your incident response process; practice it often, just as you would any other skill in your organization or life. As Nida Farrukh said at Monitorama this year, “postmortems are a data-driven argument for specific change” in your systems. Use FireHydrant to learn from your incidents, then use that knowledge to make your applications more resilient and reduce your Mean Time To Relaxation.