Give the gift of self-service to application developers, on-call colleagues, new platform engineers, DevSecOps colleagues, and others in your organization, and free up time for strategic projects.
Delegate to Troubleshooting Digital Assistants for a variety of use cases when your colleagues need help.
A 24/7 Troubleshooting Service Desk
Spend more time on improving platform capabilities and less time doing basic dev/test troubleshooting.
With 24/7, instant availability of AI troubleshooting assistance, platform owners will see a dramatic increase in 'customer satisfaction' from their app developers and start to free up engineering bandwidth for platform roadmap or strategic projects.
Accelerated Onboarding For On-Call Engineers
Most Kubernetes teams are too small to emulate the 6+ month on-call training programs at hyperscale companies. When engineers can ask Troubleshooting Digital Assistants for help 24/7, on-call is less intimidating and requires far less training. The Digital Assistants produce triage reports with quick summaries of the issues requiring further attention, as well as long-form descriptions of every command run, with its input and output, for learning purposes.
Automated "Pre-Incident" Response Reports
Connecting Digital Assistants to alert systems (PagerDuty, AlertManager, etc.) materially reduces MTTR (mean time to resolve), as they run hundreds of troubleshooting commands in response to alerts and build sophisticated summary reports in seconds.
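To make the alert-to-report flow concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not RunWhen's implementation: the PLAYBOOKS mapping, the alert dictionary shape, and the function names are all hypothetical, and the two kubectl commands are just examples of read-only diagnostics an alert might trigger.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class CommandResult:
    command: str
    exit_code: int
    output: str


def run_command(command: str) -> CommandResult:
    """Run one command without a shell and capture stdout plus stderr."""
    proc = subprocess.run(
        shlex.split(command), capture_output=True, text=True, timeout=30
    )
    return CommandResult(command, proc.returncode, proc.stdout + proc.stderr)


# Hypothetical mapping from alert name to read-only diagnostic commands.
# Placeholders such as {namespace} are filled from the alert's labels.
PLAYBOOKS = {
    "PodCrashLooping": [
        "kubectl get pods -n {namespace}",
        "kubectl get events -n {namespace} --sort-by=.lastTimestamp",
    ],
}


def build_triage_report(alert: dict, runner=run_command) -> dict:
    """Run the playbook for an alert and return a summary plus full details.

    Assumes alert looks like {"name": ..., "labels": {...}}; a label that a
    command template needs but the alert lacks raises KeyError.
    """
    templates = PLAYBOOKS.get(alert["name"], [])
    results = [runner(t.format(**alert.get("labels", {}))) for t in templates]
    failed = [r.command for r in results if r.exit_code != 0]
    return {
        "alert": alert["name"],
        # Short summary for quick triage.
        "summary": f"{len(results)} commands run, {len(failed)} need attention",
        "needs_attention": failed,
        # Long-form record of every command and its output.
        "details": [
            {"command": r.command, "exit_code": r.exit_code, "output": r.output}
            for r in results
        ],
    }
```

The runner parameter is injectable so the report logic can be exercised with a stub instead of a live cluster. A real integration would also need to receive and authenticate the alert webhook, which this sketch leaves out.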
Guard Rails for Production Access
Rather than give app developers production credentials, RunWhen users typically give credentials to Digital Assistants that can only access a pre-reviewed set of automated tasks.
This is managed at scale by giving different teams access to different Assistants. For example, any engineer at RunWhen can use Eager Eager (read-only access), but only a few can use Admin Ali (read-write access).
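The guard-rail idea can be sketched as a simple allowlist check in Python. This is a hypothetical model for illustration only: the task names, user names, and data structures are assumptions and do not describe how RunWhen stores or enforces permissions. Only the two Assistant names come from the text above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Assistant:
    name: str
    # Pre-reviewed tasks this Assistant's credentials may run.
    allowed_tasks: FrozenSet[str]
    # Users who may delegate to this Assistant; None means any engineer.
    allowed_users: Optional[FrozenSet[str]] = None


# Hypothetical task names; a real catalog would come from the reviewed task set.
READ_ONLY_TASKS = frozenset({"inspect-pod-logs", "list-recent-events"})
READ_WRITE_TASKS = READ_ONLY_TASKS | {"restart-deployment"}

ASSISTANTS = {
    "Eager Eager": Assistant("Eager Eager", READ_ONLY_TASKS),
    "Admin Ali": Assistant(
        "Admin Ali", READ_WRITE_TASKS, allowed_users=frozenset({"alice"})
    ),
}


def can_run(user: str, assistant_name: str, task: str) -> bool:
    """Return True only if the user may use the Assistant and the task is pre-reviewed."""
    assistant = ASSISTANTS.get(assistant_name)
    if assistant is None:
        return False
    if assistant.allowed_users is not None and user not in assistant.allowed_users:
        return False
    return task in assistant.allowed_tasks
```

With this shape, the developer never holds production credentials. Every request passes through two checks: who may use the Assistant, and which pre-reviewed tasks that Assistant may run.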
Structural Reduction of Observability Costs
Digital Assistants connected to key alerts can generate advanced triage reports whenever a key metric goes out of range. Since the triage reports typically include the output needed for troubleshooting, the vast majority of metrics, logs and traces stored in expensive observability systems become redundant for users embracing this approach.
RunWhen itself reduced its monitoring/logging bills by $40k/month with this approach across its dev clusters.
Integrating AI assistants opens your team up to a new level of efficiency and to workflows that keep your systems more reliable.
Copy/Paste Troubleshooting Cheat Sheet
RunWhen Local is a container that provides a searchable web interface with helpful copy-and-paste CLI commands for troubleshooting apps deployed to your Kubernetes environment.
These commands are tailored for your environment and contributed by a community of expert engineers dedicated to building the largest troubleshooting command library in existence.