We just added a new feature to the UI which displays a graph for every metric in your account.
While the previous view (now called List View) did show a graph for each metric, these graphs were hidden by default. The new Grid View shows a full page of graphs, one for each metric. You can easily switch between Grid and List views as needed.
The screenshots below illustrate the old List View, the new layout options menu, and the new Grid View.
The grid-style layout provides an easy way to view a graph for each metric in the list. It lets you click to select as many metrics as you want and easily create a graph from them.
You can also:
Choose from 3 layouts with different graph sizes.
Define how the titles are displayed.
Hover over a graph to see the metric value.
Play any number of graphs to get real-time data.
We hope this feature is as useful to you as it has been to us. More information is available in our user documentation and below is a short video showing off some of these features:
Ok, we know a lot of you have been asking for tags in Circonus for a long time. Well, they’re finally here! The tags feature is currently in beta, and will be released to all customers very soon. (Tags have actually been available to API users for a while, just without UI support in the application.) Let’s jump right in and I’ll give you a quick overview of how tags will work in Circonus.
First Things First: What’s Taggable?
For this initial implementation of tags, you can tag Graphs, Worksheets, Check Bundles, Templates, and Maintenance Windows. You will also see tags on some other pages, such as Alerts, Rulesets, Hosts, and Metrics, but these items aren’t taggable. The tags you see on those pages are inherited from the associated Check Bundles.
In the near future, we’ll be adding to these lists. We are planning on making Annotations and Dashboards taggable, and have some other unique ways we’re planning on using tags to help categorize and group items in the UI.
So, How Does This Work?
First, you’ll need to add tags to some items. All tags are categorized in the system, but if you don’t want to categorize your tags, that’s ok. Simply use the “uncategorized” category for all your tags and the tags will be presented solo (without category) throughout the UI. We have a couple of categories which are already created in the system, but you can create as many custom categories as you wish.
Let’s go to one of the list pages containing taggable items (e.g. the Checks list page) and look for the tags toolbar under an item (it will have an outlined tag with a plus icon). Click the “+” tag to open the “Add Tag” dialog. First choose a category or use the “+ ADD Category” option to enter a new one, then the tags dropdown will be populated with the tags under that category. Choose a tag or enter a new one by choosing the “+ ADD Tag” option, then use the “Add Tag +” button to add the tag to the item.
When the tag is added to the UI, you’ll notice right away that each tag category has its own color. There is a limited set of pre-selected colors which will be assigned to categories as they are created. These particular colors have been chosen to maximize your eye’s ability to distinguish the categories at a glance, and also because they work well under both light and dark icons. You’ll also notice that the tag you added has its own icon. There’s a set of twelve icons which will be assigned to the first twelve tags in each category. Once a category has twelve tags, any further tags added to that category will receive blank icons. This system of colors and icons creates fairly unique combinations that should help you recognize tags at a glance without needing to read each tag every time. Note: taggable items can have unlimited tags.
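As a rough illustration of those assignment rules, the scheme could be modeled like the Python sketch below. The palette values, icon names, and function names are invented for this sketch; only the rules (colors assigned to categories in creation order, icons to the first twelve tags per category) come from the description above.

```python
# Hypothetical sketch of the color/icon assignment rules described above.
PALETTE = ["red", "green", "blue", "orange", "purple", "teal"]  # invented values
ICONS = [f"icon-{i}" for i in range(1, 13)]  # twelve distinct icons

def category_color(categories, category):
    """Categories receive palette colors in the order they were created."""
    index = categories.index(category)
    return PALETTE[index % len(PALETTE)]

def tag_icon(tags_in_category, tag):
    """The first twelve tags in a category get icons; later tags get none."""
    index = tags_in_category.index(tag)
    return ICONS[index] if index < len(ICONS) else None
```

Because the color comes from the category and the icon from the tag's position within it, the (color, icon) pair is usually enough to recognize a tag without reading it.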
After you add a tag to an item, you’ll also notice that a small set of summary tags is added (usually off to the right of the item). This shows the first few tags on the item, providing a way for you to quickly scan down the page and get a glimpse of the tags that are assigned to each item on the page.
One more note about tags and categories. Although you select them separately in the UI, when using the API the categories and tags are joined with a colon (“:”) as the separator. So a tag “windows” in the category of “os” would be represented as “os:windows” in the API.
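Converting between the UI's separate category/tag fields and the API's colon-joined form is a one-liner in either direction. A small Python sketch (the function names are ours, not the API's):

```python
def to_api_tag(category, tag):
    """Join category and tag with a colon, as the API represents them."""
    return f"{category}:{tag}"

def from_api_tag(api_tag):
    """Split an API tag string back into (category, tag)."""
    category, _, tag = api_tag.partition(":")
    return category, tag
```

So a tag "windows" in the category "os" round-trips through "os:windows".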
The power of tags is apparent once you start using tag filters. Look in the upper right corner of the page and you’ll see an outlined tag with a funnel icon and beside it a similar menu button. These are for setting tag filters and saving tag filter sets for easy application later. Click the funnel tag to open the “Tag Filters” dialog, and click the “Add +” button to add a filter to the dialog. You may add as many filters as you wish, and in each one all you have to do is choose a category and tag from the choices (you may not enter new tags or categories here; these are simply the tags you’ve already added to the system). Use the “x” buttons at the right to remove filters, or use the “Clear” button to remove all filters and start with a clean slate. Note: none of your changes in this dialog are applied until you click the “Apply” button. After clicking “Apply,” the page will refresh and you’ll see your newly applied filters at the top of the page. You can then use the “Tag Filters” dialog to change these filters, or you can use the menu button on the right to open the “Tag Filter Sets” dialog, where you may save & apply sets of tag filters for easy switching.
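To make the filtering behavior concrete, here is a small Python sketch of applying a set of tag filters to a list of items. The AND semantics (an item must carry every filter tag) are an assumption for illustration, as are the function names and data shapes:

```python
def matches_filters(item_tags, filters):
    """True when the item carries every filter tag (AND semantics assumed)."""
    tags = set(item_tags)
    return all(f in tags for f in filters)

def apply_filters(items, filters):
    """items: mapping of item name -> list of "category:tag" strings.
    Returns the names of items that pass all filters."""
    return [name for name, tags in items.items()
            if matches_filters(tags, filters)]
```

With no filters applied, every item passes, which matches the unfiltered default view.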
One important feature to note is the “sticky” checkbox that appears when you have one or more tag filters applied. By default (with “sticky” turned off), the tag filters you apply are only visible in the current tab. If you close the tab or open a new one, it will not retain the current tag filters. The benefit of this is that we’ve developed a system to allow you to have multiple concurrent tag filter views open side-by-side. So with the “sticky” setting off, you can open several tabs, use different tag filters in all of them, and each tab will retain its own tag filters as you navigate Circonus in that tab. If at any point you turn the “sticky” setting on, the tag filters from that tab will be applied universally and will override all the other tabs. And not only are “sticky” tag filters applied across all tabs, they’re remembered across all of your user sessions, so they will remain applied until you choose to change or remove them.
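The tab-local versus sticky behavior can be sketched as a small state model. The class and method names here are invented, and the real cross-session persistence is not modeled; the sketch only captures the override rule described above:

```python
class FilterState:
    """Tab-local tag filters by default; a sticky set overrides every tab."""

    def __init__(self):
        self.sticky = None   # shared across all tabs (and, in reality, sessions)
        self.per_tab = {}    # tab_id -> that tab's own filters

    def set_filters(self, tab_id, filters, sticky=False):
        if sticky:
            self.sticky = list(filters)     # overrides all tabs
        else:
            self.per_tab[tab_id] = list(filters)

    def filters_for(self, tab_id):
        if self.sticky is not None:
            return self.sticky              # sticky wins everywhere
        return self.per_tab.get(tab_id, [])
```

Until "sticky" is turned on, each tab keeps its own independent view.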
One unique feature we’ve already completed is Host Grouping. Head on over to the Hosts page and open the “Layout Options” by clicking on the grid icon at the right side of the page. You’ll see a new option labeled “Group By Tag Category.” If you choose a tag category there, the page will reorganize itself. You’ll now see a subtitle for each tag in the selected category, and under each subtitle you’ll see the Hosts which have Check Bundles with that tag. Because each Host can have many tags, including more than one tag in the same category, you may see a Host appear in more than one group. At the bottom of the page you’ll also see a grouping subtitled “Not in Category.” Under this group you’ll see all the Hosts which don’t have any Check Bundles with tags in the chosen category.
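Here is a minimal Python sketch of that grouping logic, assuming each Host carries a list of "category:tag" strings inherited from its Check Bundles. The function name and data shapes are ours, for illustration only:

```python
def group_hosts(hosts, category):
    """hosts: mapping of host name -> list of "category:tag" strings.
    Returns (groups, not_in_category); a host with several tags in the
    chosen category appears in several groups."""
    prefix = category + ":"
    groups, not_in_category = {}, []
    for host, tags in hosts.items():
        values = [t[len(prefix):] for t in tags if t.startswith(prefix)]
        if not values:
            not_in_category.append(host)  # the "Not in Category" bucket
        for value in values:
            groups.setdefault(value, []).append(host)
    return groups, not_in_category
```

Each key in `groups` corresponds to one subtitle on the page; `not_in_category` corresponds to the "Not in Category" grouping at the bottom.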
Now that it is fall and the conference season is just about over, I thought it would be a good time to give you an update on some items that didn’t make our change log (and some that did), what is coming shortly down the road and just generally what we have been up to.
CEP woes and engineering salvation.
The summer started out with some interesting challenges involving our streaming event processor. When we first started working on Circonus, we decided to go with Esper as a complex event processor to drive fault detection. Esper offers some great benefits and a low barrier of entry to stream processing: it places your events into windows that are analogous to database tables, then gives you the ability to query them with a language akin to SQL. Our initial setup worked well, and was designed to scale horizontally (federated by account) if needed. Due to demand, we started to act on this horizontal build-out in mid-March. However, as more and more events were fed in, we quickly realized that even when giving an entire server to one account, the volume of data could still overload the engine. We worked on our queries, tweaking them to get more performance, but every gain was wiped away by a slight bump in data volume. This came to a head near the end of May, when the engine started generating late alerts and alerts with incorrect data. By that point, too much work had gone into making Esper work for too little gain, so we started on a complete overhaul.
The new system was still in Java, but this time we wrote all the processing code ourselves. The improvement was incredible: events that once took 60ms to process now took on the order of 10µs. To validate the system, we split the incoming data stream onto the old and new systems and compared the data coming out. The new system, as expected, found alerts faster, and wherever we saw a discrepancy, the new system turned out to be correct. We launched this behind the scenes for the majority of users on May 31st, and completed the rollout on June 7th. Unless you were one of the two customers affected by the delinquency of the old system, this mammoth amount of work got rolled out right under your nose and you never even noticed; just the way we like it. In the end we collapsed our CEP system from 3 (rather saturated) nodes back to 1 (almost idle) node, and we have a lot more faith in the new code. Here is some eye candy showing the CEP processing time in microseconds over the last year. The green, purple, and blue lines are the old CEP being split out, and the single remaining green line is the current system.
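For readers curious what hand-rolled windowed stream processing looks like in miniature, here is a toy Python sketch of a per-metric sliding-window check. It is purely illustrative (names, window size, and alerting rule are all invented) and bears no relation to the actual Circonus code:

```python
from collections import defaultdict, deque

class StreamChecker:
    """Keep a short sliding window of values per metric and flag breaches."""

    def __init__(self, window=5, threshold=100.0):
        self.window = window
        self.threshold = threshold
        # Each metric gets its own fixed-size window; deque evicts old values.
        self.values = defaultdict(lambda: deque(maxlen=window))

    def ingest(self, metric, value):
        vals = self.values[metric]
        vals.append(value)
        # Alert only when the window is full and every value breaches.
        if len(vals) == self.window and all(v > self.threshold for v in vals):
            return f"ALERT {metric}"
        return None
```

Keeping state in plain in-process data structures like this, instead of routing every event through a general-purpose query engine, is the kind of change that can turn milliseconds per event into microseconds.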
We tend to look at this data internally on a logarithmic scale to better see the minor changes in usage. Here is the same graph but with a log base 10 y-axis.
Distributed database updates.
Next up were upgrades to our metric storage system. To briefly describe the setup: it is based on Amazon’s Dynamo. We have a ring of nodes, and as data is fed in, we hash the IDs and names to find which node the data goes on, insert the data, and use a rather clever means to deterministically find subsequent nodes to meet our redundancy requirements. All data is stored at least twice, and never twice on the same node. Theo gave a talk at last year’s Surge conference that is worth checking out for more details. The numeric data is stored in a highly compact proprietary format, while text data is placed into a Berkeley DB whenever it changes.
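A toy Python version of that ring placement might look like the following. The hashing scheme and class names are invented for illustration and are far simpler than the real "clever means" described above; the sketch only shows the core idea of deterministic placement with replicas on distinct nodes:

```python
import hashlib

class Ring:
    """Toy Dynamo-style ring: hash a key to a primary node, then walk the
    ring deterministically for replicas, never reusing a node."""

    def __init__(self, nodes):
        self.nodes = sorted(nodes)

    def _slot(self, key):
        digest = hashlib.sha256(key.encode()).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def nodes_for(self, key, copies=2):
        """Return `copies` distinct nodes responsible for this key."""
        start = self._slot(key)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(copies)]
```

Because placement is a pure function of the key and the node list, any node can recompute where data belongs, which is what makes destroying and rebuilding a node from its peers possible.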
The Berkeley DB decision came back to haunt us. We started to notice potential locking issues as the data size grew, and performance and disk usage weren’t quite where we wanted them to be. To solve this we wanted to move to leveldb. The code changes went smoothly, but then the problem arose: how do we get the data from one on-disk format to another?
The storage system was designed from the beginning to allow one node to be destroyed and rebuilt from the others. Of course, a lot of systems are like this, but who ever actually wants to try it with production data? We do. With the safeguards of ZFS snapshotting, over the course of the summer we would destroy a node, bring it up to date with the most recent code, and then have the other nodes rebuild it. Each destroy, rebuild, and bring-online cycle took the better part of a work day, and got faster and more reliable with each exercise as we cleaned up some problem areas. During the process, user requests were simply served from the active nodes in the cluster, and outside of a few minor delays in data ingestion, no users were impacted. Doing these “game day” rebuilds has given us a huge confidence boost that should a server go belly up, we can quickly be back to full capacity.
More powerful visualizations.
Histograms were another big addition to our product. I won’t speak much about them here; instead you should head to Theo’s post on them here. We’ve been showing these off at various conferences, and have given attendees at this year’s Velocity and Surge insight into the wireless networks with real-time dashboards showing client signal strengths, downloads and uploads, and total clients.
API version 2.
Lastly, we’ve received a lot of feedback on our API: some good, some indifferent, but a lot of requests to make it better. So we did. This rewrite was mostly from the ground up, but we did try to keep a lot of code the same underneath since we knew it worked (some is shared by the web UI and the current API). It more tightly conforms to what one comes to expect from a RESTful API, and for our automation-enabled users, we have added in some idempotence so your consecutive Chef or Puppet runs on the same server won’t create duplicate checks, rules, etc. We are excited about getting this out; stay tuned.
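The idempotence pattern, looking for an identical check before creating one, can be sketched in Python against a stand-in API. The real endpoints, field names, and matching rules will differ; this only illustrates the "create if absent" shape that makes repeated configuration runs safe:

```python
class FakeAPI:
    """In-memory stand-in for a real API client, for illustration only."""

    def __init__(self):
        self.checks = []

    def list_checks(self):
        return self.checks

    def create_check(self, spec):
        self.checks.append(spec)
        return spec

def ensure_check(api, target, metric):
    """Create the check only if an identical one doesn't already exist,
    so consecutive Chef or Puppet runs don't create duplicates."""
    spec = {"target": target, "metric": metric}
    for check in api.list_checks():
        if check == spec:
            return check          # already present: no-op
    return api.create_check(spec)
```

Running `ensure_check` twice with the same arguments leaves exactly one check behind, which is the property configuration-management runs rely on.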
It was a busy summer and we are looking forward to an equally busy fall and winter. We will be giving you more updates, hopefully once a month or so, with more behind the scenes information. Until then keep an eye on the change log.
The Circonus platform delivers time-series data alerts, graphs, dashboards, and machine-learning intelligence that help to optimize not just your operations, but also your business.
IRONdb is a highly resilient stand-alone time-series database designed to power Circonus. Learn more about IRONdb at www.irondb.io.