Addressing large enterprise demands for software reliability automation, monitoring, reporting and improvement metrics, Gremlin makes new features available that reduce cost, risk and opportunity loss.
SAN FRANCISCO, Nov. 16, 2023 /PRNewswire-PRWeb/ -- Gremlin, provider of chaos engineering and reliability management software, today launched a suite of tools that allows cloud architects and other center-of-excellence leaders to define customized software reliability standards and measure progress toward them.
*New Capabilities and Benefits*
- Custom test suites: Define a suite of tests to set a consistent standard of reliability across services. These test suites can be designed to meet compliance requirements, ensure resilience to past outages or scale organizational best practices.
- Custom reliability scoring: Compare results from test suites across all services and systems to understand how they measure up to those standards, and how those results change over time.
- Custom enterprise-wide dashboards: Gain visibility into software reliability risks at the service, team and organizational levels with aggregated reliability scoring and risk data.
"Test suites provide organizations the ability to measure and enforce their own reliability standard," said Kolton Andrus, CTO and founder of Gremlin. "We make it very simple to build and roll these out to teams so they can execute these tests in a single click. Our reporting gives leadership visibility into their coverage, the risks in their systems, and which teams are making progress week over week."
BMO, a top-10 North American bank employing 50,000 people and the continent's third-largest payments processor, asked Gremlin to support the organization in its digital transformation, in which the bank is moving 60-80% of its workloads to the cloud by 2025, up from less than 1% in 2019. Working together, the bank and Gremlin developed custom reliability scoring, setting a new standard for the financial services industry.
"Our cloud transformation required a new approach to reliability," said Anantha Movva, head of site reliability and quality engineering at BMO. "With Gremlin, we incorporated reliability testing into our SDLC process, helping us validate code for reliability before going live. We've modernized how we look at software reliability by systematically discovering issues and fixing issues like misconfigurations in development rather than in production—where they can cause issues that impact our customers. It's a shift we have decided to make proactively based on where the industry is headed."
*Strategic Software Reliability: Aligning the Work of Engineers and Executives*
Software reliability has become a bigger challenge with the advent of complex cloud-native architectures. Organizations turned to SREs to fix the reliability problem, but with so many stakeholders and competing priorities, SREs simply can't do it all by themselves. Instead, a collaborative, organization-wide approach that aligns SREs with the rest of the organization is helping companies achieve greater reliability. That alignment depends on clear communication with executive leadership, who need the right data to see the cost of inaction.
Gremlin's new suite of tools is designed to facilitate this collaborative, organization-wide approach to strategic software reliability.
The new features work hand-in-hand with Detected Risks, a feature announced earlier this summer. Detected Risks supports executive leadership, SREs, platform engineers and cloud engineering teams in finding and fixing the most common hidden risks that can negatively impact reliability. The entire team can use it to see these risks across the organization, quickly identify who is responsible for fixing them and track how many of them have been resolved.
Learn more from Gremlin's on-demand webinar: "Building a Culture of Reliability: Why SREs Can't Do It Alone."
About Gremlin
Gremlin is the world's first enterprise-ready reliability platform with a mission to help every business build more reliable software. Gremlin provides everything teams need to test for the most common causes of incidents, highlighting the biggest risks to availability and delivering actionable insights that drive real reliability improvements. For more information, visit http://www.gremlin.com
Media Contact
Cristin Connelly Zegers, Gremlin, +1 404-931-6752, [email protected], www.gremlin.com
SOURCE Gremlin
