BUILD AGENTS THAT HELP

any engineer on any team handle any issue anywhere

IN YOUR TECH STACK

Developer Self-Service

- Issues with flaky CI/CD jobs
- Issues with non-prod infra and platforms
- Issues isolating bugs between teams
- Issues with observability in test, staging and production

AI SRE

- Issue root cause analysis
- Issue prioritization and blast radius analysis
- Issue financial and SLO analysis
- Issue remediation

The only AI SRE platform used to reduce observability costs

Background agents run LLM-optimized scripts ("tools") to find issues in your infra, apps and data. RunWhen is the only AI SRE platform that doesn't share the gaps you find in your observability stack.

[Image: the impact of driving down Kubernetes costs]

It feels like using ChatGPT...

...but focused on finding, prioritizing and fixing issues in your infra, apps and data.

ChatGPT answers questions about information that can be found on the web, and helps you decide what to do next.

RunWhen answers questions about issues in your environment, and helps your engineers decide what to do next.

any engineer can write a new tool

Tools are just LLM-optimized scripts. Our forward-deployed engineers (FDEs) will work with your team to build 30 new tools in 30 days, in addition to the tools that come out of the box.

every engineer can handle an issue

Issues are where human decisions are needed: research, route, remediate. The platform is designed to remove the toil before and after those decisions.

Your first thousand tools in minutes

Our installer configures thousands of tools from our library for your environment.
Production ready out of the box.

30 new tools in 30 days

Our forward-deployed engineers work with your team to build 30 new tools in 30 days, integrating more deeply with your apps, data and toolchain as part of a PoC.

Getting started with


FOREGROUND AGENTS

Ask questions about root cause analysis, configuration, cost, remediation and other topics.

The platform will suggest the tools to run or pull insights from the database of prior tool runs.

BACKGROUND AGENTS

Agents are constantly running tools in the background, identifying issues that need attention.

Ask about what happened yesterday, or connect issues to notifications, remediations, etc.

30 NEW TOOLS IN 30 DAYS

Our FDEs or our partners will work with your team to build new tools that bring the information you want from your infra, apps, data and workflows into each agent's context window.

You are in control.

THUMBS UP?

Get AI-enhanced feedback from your users, showing where new tools should be prioritized for investigation, remediation, reporting or other uses.

Product management built in by design.

3,432
AI SRE Tools in the library for cloud infrastructure, platform and applications
86,524
Autonomous AI Troubleshooting Sessions, saving time and reducing MTTR
2,562
Hours of downtime saved by AI-assisted triage, root cause analysis and remediation

Can my team deploy?

We work in the strictest financial services, healthcare and government environments in the industry.

- Hybrid SaaS and self-hosted deployment options. Air-gapped? No problem.
- Bring-your-own-LLM-endpoint. Best-in-class enterprise data security guarantees.
- Tested on all major clouds and various on-prem infrastructure configurations.

Need help with a business case?

Our team can help you build a business case for production environments, non-production environments, or both.

We typically do this after a 30-day PoV so we can use real production data from your environment.

Developer Productivity

“Developers ask us 10 questions per day. Each one implies they were blocked for about an hour. If they ask RunWhen AI Assistants, we get back 10 developer hours per day.”

Reliability vs Cloud Cost Trade-Offs

“RunWhen SLOs say this service is healthy 99.99% of the time. What if we drop to a 98% target and scale replica counts down by half?”
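To put rough numbers on that trade-off, here is a minimal sketch of how much downtime each target allows. It assumes a simple time-based availability SLO measured over a 30-day window; the function name and the window length are illustrative assumptions, not part of the RunWhen platform.

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability target permits over the window.

    Hypothetical helper for illustration only; assumes availability is
    measured as the fraction of time the service is healthy.
    """
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - slo_percent) / 100.0


if __name__ == "__main__":
    # Compare the two targets mentioned in the quote above.
    for target in (99.99, 98.0):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% over 30 days allows {minutes:.2f} minutes "
              f"({minutes / 60:.2f} hours) of downtime")
```

Under these assumptions, a 99.99% target leaves about 4.3 minutes of downtime per 30 days, while a 98% target leaves about 14.4 hours. Whether halving replica counts fits inside that larger budget depends on your own traffic and failure patterns.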

Scale Faster Than Headcount

“We have multiple cloud environments scaling up… I need either one more person per cloud environment or one person with ten RunWhen AI Assistants to cover both.”

Reduce Downtime

“RunWhen can do a minor incident RCA in 2 minutes that typically takes about an hour. Assuming one minor incident per month…”

Reduce Observability Spend

“We can gradually cut back our observability bills in non-prod environments as teams get used to asking RunWhen AI Assistants questions instead of using dashboards.”

Reliability Program Value

“In between incidents, we followed the RunWhen Reliability To-Do list on our tier-1 services. Our top SLOs went from 96% to 98%, on track for 99% before year end...”
