19Oct
20Jul
Episode 5: Mooving to… Practical Postmortems
So, what is a postmortem? Solidified in Google’s SRE handbook, a postmortem is defined as “a written record of an incident, its impact, the actions taken to mitigate or resolve it, the root cause(s), and the follow-up actions to prevent the incident from recurring.” Translating that to...
05May
More Tools + More People = Increased Complexity
Consider what happens if digital apps or services go down. Companies lose revenue, productivity drops, customer loyalty suffers, and the list of repercussions goes on, depending on the business. Indeed, modern business continuity is contingent on a well-functioning suite of consumer and commercial apps and...
26Apr
Continuous Availability vs. Continuous Change
All companies are going through some form of cloud adoption, whether that is a first-time cloud migration, hybrid cloud adoption, or extending a cloud-native stack with newer microservice architectures. But, according to a recent survey by Aptum*, only 39% of companies are completely satisfied with their current...
07Apr
Episode 4: Mooving to… Successful Engineering in the Remote World
Martha Sharpe, her husband, and four kids(!) moved to Atlanta to experience everything the city had to offer. She and her husband secured remote work, giving them the flexibility they were looking for. Everything was lining up for them just as they planned when the...
24Mar
Continuous Availability: How It’s Changed, and Why It’s Critical
Remember when Slack went down in early January? The three-hour outage, set off by AWS capacity issues, cost the company an untold amount of money. And the effects rippled across the enterprise. The outage devalued the company’s stock and seemed to send all 142,000 of...
02Mar
Turning Telemetry into Actionable Insight with Moogsoft Observability Cloud
Under the hood, Moogsoft Cloud extends AI-based intelligence so that it starts with raw observability data analysis. It discovers your infrastructure services, collects and analyzes their time-series metrics locally, and turns time-series metrics and event data from your existing tools into actionable insights....
22Feb
Episode 3: Mooving to… Stability: The Role of Catastrophic Failure in Software Design
In this episode of Mooving to… Stability: The Role of Catastrophic Failure in Software Design, we had the opportunity to chat with Jeff Atwood. Yes, that Jeff Atwood, of Coding Horror, Stack Overflow, and Discourse (Chief Happiness Officer). Jeff started writing 911 software in Boulder,...
25Jan
Episode 2: Mooving to Remix: Code You Will be Happy With
Episode 2 of Mooving to… dives into a new tool called Remix, a framework that helps you create front-end code you’ll love. This episode focuses on how the new web framework helps streamline your processes and eliminate downtime to the best of your ability. Thom Duran...
15Dec