Fault Detection: New Features and Fixes

One of the trickier problems in fault detection is detecting the absence of data. Did the check run and produce no data? Did we lose the connection and miss the data? The latter cases are where we lost a bit of insight, which we sought to correct. The system is...

Updates From The Tech Team

Now that it is fall and the conference season is just about over, I thought it would be a good time to give you an update on some items that didn’t make our change log (and some that did), what is coming shortly down the road, and just generally what we have been...

Understanding Data with Histograms

For the last several years, I’ve been speaking about the lies that graphs tell us. We all spend time looking at data, commonly through line graphs, that actually show us averages. A great example of this is showing average response times for API requests. The...

Web Portal Outage

Last night circonus.com was unavailable for 34 minutes because the primary database server went down. Here is a breakdown of events; times are US/Eastern. 8:23 pm: kernel panic on primary DB machine; system rebooted but did not start up properly...