Last week, DevOps Institute Chief Ambassador Helen Beal and Moogsoft Director of SRE Thom Duran hosted the second-to-last episode of the Coffee Break with Helen Beal webinar series. The crowd that tuned in to Episode #9, “Intelligent Observability: Blamefree Retrospectives,” had some great questions (more on that later), and the hosts loved their participation! You can watch the one-hour episode on demand.
I highly recommend that you binge the series before the finale if you have not done so already.
So what happened in Episode #9? Well, our fictional character Dinesh, an SRE, learns how to leverage AIOps to run sustainable and blameless retrospectives. Thom and Helen’s passion for a blameless culture is evident throughout the episode. They dig into real-world examples ranging from past industries to a casual night out on the town (with wine). Thom’s personality shines when he talks about how to help teams overcome fatigue, address team friction, and handle post mortems.
Fun fact: the human brain has a 14-millisecond response time. Imagine how exhausted you must be after an 8+ hour shift staring at a computer and analyzing tons of data. Real-time data puts what’s happening right now in front of you. This data accelerates resolutions, which gives SREs more time, which in turn helps teams remain calm. Pretty cool stuff, right? Helen shares some great “take-home” tools on the subject, and you can access all 3 of them below.
Don’t worry; you will see Thom, Helen, and the rest of the crew again next month, on September 9th, at the Coffee Break series finale. The finale will launch the Observability Odyssey ebook, which incorporates all the chapters of Helen Beal’s series. It’s only goodbye for now.
Below is a selection of the live Q&A from the “Intelligent Observability: Blamefree Retrospectives” episode, and you can watch the complete webinar discussion on demand.
Q & A
- When a business loses half a million due to a 1-hour outage, niceties go out of the window, and Post Mortem sounds more palatable because business was dead and blameless/blamefree sound more “non-actionable” and pointless. How do you address this cultural friction in an organization?
- When a central team conducts a Post Mortem, the question of who did what comes to the foreground, disguised as fact-finding and research. Should the Post Mortem be started and handled by the Product Owner, or someone who leads and is accountable, to ensure we allow them to narrate what happened and why they missed it?
- What’s the end goal in terms of implementation? Can it be measured in the number of integrations or the number of teams using it? How would you know that somebody is doing well?
- How does Moogsoft help us make it easier to meet our SLIs and SLOs?
- Is there any approach to figure out beforehand, when we make some changes to a job, that might also be dependent on some other job, and the changes had an impact on that dependent job that ran later?