Gain greater insight into system behavior in aggregate, across multiple dimensions. A “metric” is a measurement, or value, representing the operational state of your system at a given time. For example, the amount of free […]
Percentiles have become one of the primary service level indicators for representing the performance of real systems in monitoring. When used correctly, they provide a robust metric that can serve as the basis of mission-critical service level objectives. However, […]
A guide to the importance of, and techniques for, accurately quantifying your Service Level Objectives. This is the third in a multi-part series about Service Level Objectives. The first part can be found here and […]
Deriving meaningful insights from third-party logs has always been a difficult yet necessary task. Most analysis occurs after the fact, when something has gone wrong. Very few tools allow real-time monitoring of logs, so SREs have become […]
A simple primer on the complicated statistical analysis behind setting your Service Level Objectives. This is the second in a multi-part series about Service Level Objectives. The first part can be found here and the […]
In their excellent SLO workshop at SRECon2018 (program), Liz Fong-Jones, Kristina Bennett, and Stephen Thorne (Google) presented some best-practice examples for latency SLIs/SLOs. At Circonus we care deeply about measuring latency and about SRE techniques such as SLIs/SLOs. […]
Advanced analytics
Harness powerful analytics to proactively optimize performance, resolve incidents faster, and make smarter decisions with confidence.
Intelligent alerts
Real-time streaming alerts, analytic alerts, and composite alerts ensure you can prioritize issues, reduce false positives, and identify problems before they become outages.
Dashboards & visualizations
Quickly visualize, query, and correlate data from across your stack in real-time dashboards. Analyze metrics, traces, and logs across your entire environment within a single pane of glass.