Surprised By Your Bills? 5 Essential Tips to Manage Cloud Kubernetes Costs

If you’re spending more than you expected on your Kubernetes deployment, you’re not alone. Many Kubernetes operators see higher costs than they predicted. That’s because, like many aspects of Kubernetes, working out how to manage or lower costs can be challenging. In this article, we provide 5 essential tips for achieving a more cost-efficient Kubernetes deployment.

#1: Decide whether or not the workload is a good candidate for Kubernetes

Kubernetes can provide significant cost benefits for many workloads, but not necessarily all. If you simply throw everything into Kubernetes and expect to save money, you can end up creating tech debt down the road or an operations nightmare. You need to decide whether or not each workload is a good candidate for Kubernetes.

Critical questions to consider are:

“Does this workload need to save state (like a database)?”
“Can it tolerate being spun up and down, and having additional instances of it created?”

If moving a workload to Kubernetes will slow down operations and require hiring additional resources, then it may not be a cost-efficient workload to run in Kubernetes.

#2: Understand the general resource requirements for the workload

Before moving your workloads to Kubernetes, you need to do your homework. You will have to make a decision on what size and type of nodes to use in your cluster, so it’s imperative to take inventory of the resource requirements — mainly CPU and memory — across all of your workloads. And for a stateful workload, you’ll need to factor in how much storage is needed as well. Having some idea of resource consumption and the amount of load it can handle, before needing to scale up, will help in overall cost estimation.
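
As a rough illustration of taking inventory, the sketch below totals CPU, memory, and storage across a set of workloads. The workload names and all of the figures are hypothetical placeholders, not measurements; in practice they would come from your own profiling and load tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Per-replica resource needs for one workload (all values hypothetical)."""
    name: str
    cpu_millicores: int   # 1000 millicores = 1 CPU core
    memory_mib: int
    replicas: int
    storage_gib: int = 0  # only relevant for stateful workloads


def total_requirements(workloads):
    """Sum per-replica needs across all replicas of all workloads."""
    return {
        "cpu_millicores": sum(w.cpu_millicores * w.replicas for w in workloads),
        "memory_mib": sum(w.memory_mib * w.replicas for w in workloads),
        "storage_gib": sum(w.storage_gib * w.replicas for w in workloads),
    }


# Example inventory; replace with numbers from your own measurements.
inventory = [
    Workload("web", cpu_millicores=250, memory_mib=512, replicas=4),
    Workload("worker", cpu_millicores=500, memory_mib=1024, replicas=2),
    Workload("db", cpu_millicores=1000, memory_mib=4096, replicas=1, storage_gib=100),
]

totals = total_requirements(inventory)
print(totals)
```

Totals like these are only a starting point for cost estimation; they say nothing yet about headroom for traffic spikes or about how the workloads will fit onto individual nodes.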

The good thing is that most, if not all, programming languages have some form of profiling support to gain an understanding of resource consumption. Pair profiling tools with load testing tools and you’ll get a good idea about how your application will perform under stress. Some developers may not bother to take this step, since workloads have performed well up to this point. But this approach can lead to extra costs. The information gathered during this exercise will inform many decisions you make down the road with regard to your Kubernetes journey.
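
To make the profiling step concrete, here is a minimal sketch using only Python’s standard library: `time.process_time` for CPU time and `tracemalloc` for peak memory. The `handle_request` function is a hypothetical stand-in for one unit of your application’s work, and a real exercise would pair measurements like these with a proper load testing tool.

```python
import time
import tracemalloc


def handle_request(n: int) -> int:
    """Hypothetical stand-in for one unit of application work."""
    data = [i * i for i in range(n)]
    return sum(data)


def profile(func, *args):
    """Run func once and return (result, CPU seconds, peak traced bytes)."""
    tracemalloc.start()
    cpu_start = time.process_time()
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, cpu_seconds, peak_bytes


result, cpu_seconds, peak_bytes = profile(handle_request, 100_000)
print(f"cpu={cpu_seconds:.4f}s peak_memory={peak_bytes / 1024:.0f} KiB")
```

A single run like this only measures one code path. Repeating it under realistic load is what turns the numbers into something you can base requests, limits, and node sizes on.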

#3: Select the right size nodes

There is unfortunately no silver bullet solution to selecting the right size nodes. What makes it even more difficult is that Kubernetes allows you to mix node sizes to suit your workload. Your choice will largely be informed by the number of applications you have to deploy, their resource requirements, and their scale/replication factor. When it comes down to it, remember that the main goal is to reduce the operations burden and take advantage of the increased utilization that containers and their orchestration can offer.
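
One way to start comparing node sizes is a back-of-the-envelope estimate of how many nodes of each size the total requirements would need. The sketch below is only that: the node shapes, the hourly prices (in cents), and the assumption that 90% of each node is usable by workloads are all hypothetical, and it ignores how individual pods actually fit onto nodes.

```python
# Hypothetical node shapes and prices; substitute your provider's real figures.
CANDIDATES = {
    "small": {"cpu_millicores": 2000, "memory_mib": 8192, "hourly_cents": 10},
    "medium": {"cpu_millicores": 4000, "memory_mib": 16384, "hourly_cents": 20},
    "large": {"cpu_millicores": 8000, "memory_mib": 32768, "hourly_cents": 40},
}


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def estimate(total_cpu_millicores, total_memory_mib, usable_percent=90):
    """Return {node type: (node count, total hourly cost in cents)}.

    usable_percent is an assumed allowance for system overhead on each node.
    """
    results = {}
    for name, node in CANDIDATES.items():
        usable_cpu = node["cpu_millicores"] * usable_percent // 100
        usable_mem = node["memory_mib"] * usable_percent // 100
        # Whichever resource runs out first determines the node count.
        count = max(
            ceil_div(total_cpu_millicores, usable_cpu),
            ceil_div(total_memory_mib, usable_mem),
        )
        results[name] = (count, count * node["hourly_cents"])
    return results


node_estimates = estimate(total_cpu_millicores=9000, total_memory_mib=20000)
for name, (count, cents) in node_estimates.items():
    print(f"{name}: {count} nodes, {cents} cents/hour")
```

With these made-up numbers the smaller nodes come out cheaper because less capacity is lost to rounding up, but that result is specific to the example. Larger nodes may win once per-node overhead and operational effort are counted.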

Something additional to consider: set up some type of observability and alerting, within or outside of your cloud provider. This lets you know how much you’re currently spending and alerts you when an additional node comes online, whether added manually by an ops engineer or automatically by cluster autoscaling. Tying that event to logs from kube-scheduler will give you an idea of which deployment triggered the need for more nodes. Evaluating node size and the node and pod affinity rules you apply is an exercise that should be done frequently to manage cloud Kubernetes costs.
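
The core of a "new node" alert can be as simple as comparing two snapshots of the cluster’s node names. The sketch below assumes you already collect those names periodically by some means; the node names and the `detect_new_nodes` helper are hypothetical, and a real setup would more likely rely on your monitoring stack or cloud provider’s alerting.

```python
def detect_new_nodes(previous, current):
    """Return node names present in the current snapshot but not the previous one."""
    return sorted(set(current) - set(previous))


# Hypothetical snapshots taken a few minutes apart.
before = ["node-a", "node-b"]
after = ["node-a", "node-b", "node-c"]

new_nodes = detect_new_nodes(before, after)
for name in new_nodes:
    # A real system would send this to a pager or chat channel instead.
    print(f"ALERT: node {name} joined the cluster")
```

The timestamp of such an alert is what you would then line up against the scheduler logs to see which workload needed the extra capacity.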

#4: Properly set your resource requests and limits

Based on the resource consumption analysis of your workloads, take the time to set your resource requests and limits properly. Requests tell the scheduler how much capacity to reserve for a pod when placing it, while limits cap what a container can use at runtime. This plays into the increased utilization that container orchestration offers. When armed with knowledge of resource requirements, Kubernetes can bin pack your nodes more efficiently, which reduces wasted resources.
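
To show why accurate requests matter for bin packing, the sketch below packs pods onto nodes by CPU request using the classic first-fit decreasing heuristic. This is a simplified illustration, not the algorithm the Kubernetes scheduler uses, and the request values and node capacity are hypothetical.

```python
def first_fit_decreasing(requests_millicores, node_capacity_millicores):
    """Pack CPU requests onto as few nodes as this heuristic can manage.

    Returns a list of nodes, each a list of the requests placed on it.
    """
    nodes = []  # requests placed on each node
    free = []   # remaining capacity on each node
    for request in sorted(requests_millicores, reverse=True):
        if request > node_capacity_millicores:
            raise ValueError(f"request {request}m exceeds node capacity")
        for i, remaining in enumerate(free):
            if request <= remaining:
                nodes[i].append(request)
                free[i] -= request
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([request])
            free.append(node_capacity_millicores - request)
    return nodes


# Hypothetical CPU requests (millicores) and a 2-core node.
pod_requests = [500, 1500, 300, 1200, 700, 800]
packed = first_fit_decreasing(pod_requests, node_capacity_millicores=2000)

used = sum(pod_requests)
capacity = len(packed) * 2000
print(f"{len(packed)} nodes, {used * 100 // capacity}% of CPU requested")
```

If the same pods had inflated requests, they would spread across more nodes while actual usage stayed the same, which is exactly the kind of waste that well-chosen requests avoid.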

Once requests and limits are set and workloads are deployed, you can implement more advanced features like the Horizontal Pod Autoscaler (HPA). The HPA is the core component of workload autoscaling in Kubernetes: it increases or decreases the number of pod replicas as load changes. HPAs can operate on resource metrics like CPU and memory out of the box, and can be configured to operate on custom metrics as well.
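
The Kubernetes documentation describes the HPA’s core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance (0.1 by default). The sketch below reproduces only that calculation plus clamping to a minimum and maximum; the function name and default bounds are assumptions for illustration, and the real controller also handles things like missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band, leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas averaging 90% CPU against a 60% target: scale up.
print(desired_replicas(4, current_metric=90, target_metric=60))
# 4 replicas averaging 20% CPU against a 60% target: scale down.
print(desired_replicas(4, current_metric=20, target_metric=60))
```

Because CPU utilization here is measured relative to each pod’s CPU request, a badly chosen request skews the ratio and therefore the scaling decision.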

#5: Have observability into your applications

If you’re monitoring just Kubernetes and not gaining observability into the applications running there, then you’re only seeing half the picture. In addition to understanding the amount of CPU, memory, network and storage that each pod is using, you also want the underlying application metrics that matter to your organization, whether that’s requests per second, new users per minute, or something else.

HPAs can be configured to operate off of custom collected metrics like these. This can be useful if you need to scale on a pattern of utilization that isn’t captured by directly exposed metrics like CPU or memory. Application metrics also allow your organization to make more informed decisions regarding cluster size and variability down the road. Other avenues of observability relate directly to your cloud provider: keeping an eye on spend related to your Kubernetes cluster, setting alerts for nodes added to the cluster, and so on.
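
As a small example of reasoning from an application metric, the sketch below estimates how many replicas a service needs from its total requests per second and a per-pod target. The function name, the target of 100 requests per second per pod, and the replica bounds are all hypothetical; the per-pod figure is the sort of number your load tests would supply.

```python
import math


def replicas_for_rps(total_rps, target_rps_per_pod, min_replicas=1, max_replicas=20):
    """Estimate replicas so that each pod serves at most the target rate."""
    if target_rps_per_pod <= 0:
        raise ValueError("target_rps_per_pod must be positive")
    wanted = math.ceil(total_rps / target_rps_per_pod)
    return max(min_replicas, min(max_replicas, wanted))


# 450 requests/second with pods that each handle about 100.
print(replicas_for_rps(450, 100))
# Overnight traffic near zero still keeps the minimum replica count.
print(replicas_for_rps(0, 100))
```

Running a calculation like this against historical traffic gives a rough picture of how much the replica count, and with it the node count, is likely to vary over a day or a week.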

Whether you’re thinking about deploying Kubernetes, have just started, or have been on your journey for a while, setting these guidelines across your organization can help optimize both your Kubernetes performance and costs. By implementing a couple of these tips to start, you can experience significant cost savings quickly and more accurately predict costs to avoid those surprise bills. While it may take some work to start, the efforts here can pay off substantially.