Headquartered in Oakland, California, LaunchDarkly is a feature management platform that empowers teams to safely deliver and control software through feature flags. By separating code deployments from feature releases, LaunchDarkly enables teams to deploy faster, reduce risk, and iterate continuously. Over 1,000 organizations use LaunchDarkly to build, operate, and learn from their software.
Arun Bhalla has managed the backend engineering team at LaunchDarkly since 2017. He reached out to us when he needed to streamline incident management, gain more meaningful reporting, and centralize tooling. While evaluating technology, Arun was driven by the three key business challenges outlined below.
Incident Role Consolidation
One key consideration for the LaunchDarkly team was reducing the manual work required of the communications role. The communicator acted as a scribe and was also responsible for outward communication to customers and other internal teams. This was a cumbersome role for whoever received it, and details were frequently missed or never transferred from Slack to the Post Incident Review (PIR).
After implementing FireHydrant, LaunchDarkly has been able to fully automate the communications role. Through the integration with Statuspage.io and internally built public status pages, communication (or update reminders) can be automated with Runbooks, and status pages can be updated without ever leaving Slack. The Slack bot also acts as a fly on the wall, tracking each engagement and encouraging the team to collaborate in indicating which messages were critical to the incident’s resolution. Each piece of information can be automatically pulled into the PIR, with the result that 70% of the PIR is completed automatically and greater emphasis can be placed on learning and growth from incidents.
Clear Reporting & Visibility into Incidents
Before using FireHydrant, creating reports was time-consuming and manual. It required sourcing data from multiple platforms (Confluence, PagerDuty, and Statuspage) in order to identify time spent in an incident. That data then had to be moved into Excel, where charts were created by hand. Because incidents were managed in so many different places, these data points weren’t always accurate.
One benefit LaunchDarkly has enjoyed after implementing FireHydrant is our out-of-the-box reporting. FireHydrant not only tracks key incident stages automatically but also enables you to tag incidents, track key incidents against various services, and automate severity based on a unique matrix. The result is detailed reporting on MTTR, MTTA, and MTTD that is filterable by impacted infrastructure, severity, team ownership, or even whether action items in Jira remain unfinished.
Centralized Tooling to Manage Incidents
Another source of pain for LaunchDarkly was having to manage incidents across a handful of different tools. Incidents are expensive, and navigating numerous tools during an incident is time-consuming. It also reduced data integrity for reporting. Whether you engage with FireHydrant in our web app or through our Slack bot, our goal is that you’ll never have to leave the tool. FireHydrant integrates with Jira to track action items, Statuspage for updates, your CI/CD pipeline for change logs, and PagerDuty for metrics and on-call rotations, and we’ve got a lot of other exciting integrations on the way.