Education Archives

Tuning AWS EC2 instances with CloudWatch metric analysis
What happens when cloud-based application infrastructure slows down? Twelve years ago, I attended a meetup at the San Francisco Perl Mongers group where an engineer from Amazon introduced the Elastic Compute Cloud service (EC2). At […]

Which block I/O scheduler is the best? We asked eBPF
Putting Linux kernel block I/O schedulers under the eBPF microscope
eBPF tracing is a broad and deep subject, and can be a bit daunting at first sight. However, when Brendan Gregg issued the dictum “Perhaps […]

How Safe is Your Home’s Air? The Internet of Things and Air Quality Monitoring during Wildfires
IoT-driven monitoring of air quality during the 2018 California wildfires
Over the past few weeks, the Camp Fire in Northern California and the Woolsey Fire in Southern California have devastated people and property. There has […]

The Problem with Percentiles – Aggregation brings Aggravation
Percentiles have become one of the primary service level indicators used to represent the performance of real systems in monitoring. When used correctly, they provide a robust metric that can serve as the basis of mission-critical service level objectives. However, […]

A Guide to Service Level Objectives, Part 3: Quantifying Your SLOs
A guide to the importance of, and techniques for, accurately quantifying your Service Level Objectives. This is the third in a multi-part series about Service Level Objectives. The first part can be found here and […]

Quantifying WordPress Performance Improvements with circonus-logwatch
Deriving meaningful insights from third-party logs has always been a difficult yet necessary task. Most analysis occurs after the fact, when something has gone wrong. Very few tools allow real-time monitoring of logs, so SREs have become […]

A Guide To Service Level Objectives, Part 2: It All Adds Up
A simple primer on the complicated statistical analysis behind setting your Service Level Objectives. This is the second in a multi-part series about Service Level Objectives. The first part can be found here and the […]

Latency SLOs Done Right
In their excellent SLO workshop at SRECon2018 (program), Liz Fong-Jones, Kristina Bennett, and Stephen Thorne (Google) presented some best-practice examples for latency SLIs/SLOs. At Circonus, we care deeply about measuring latency and about SRE techniques such as SLIs/SLOs. […]
