Manage costs


To manage the cost of cloud resources that your Kubernetes infrastructure is consuming, you need insight into the costs generated throughout your fleet. Kubernetes Monitoring uses an OpenCost integration along with Grafana’s experience in managing Kubernetes-related costs to calculate cost data. OpenCost uses information about your Kubernetes components to provide metrics that measure infrastructure costs in real time.

You can use cost information to make data-driven decisions about resource allocation, scaling strategies, and technology investments. Out of the box, you can:

Observe cost per resource and infrastructure type.
View historical and projected costs.
Identify savings you could achieve by removing or reducing unused CPU, RAM, or storage.
See the cost of network egress and GPU usage.
Compare savings and cost trends.

Cost analysis path

To see cost information, switch the view to Cost on the Clusters, Namespaces, Workloads, or Nodes tab under Cluster navigation.

You can start at any level to examine cost. For example, the list of Clusters on the Clusters tab shows:

Current compute: The cost of allocated compute (CPU and memory) for the time range selected
Projected compute: The cost of allocated compute (CPU and memory) if it remains the same for the next 30 days
Current CPU idle: The cost of the CPU idleness for the time range selected
Projected CPU idle: The cost of CPU idleness if it remains at the same level for the next 30 days
Current memory idle: The cost of the memory idleness for the time range selected
Projected memory idle: The cost of memory idleness if it remains at the same level for the next 30 days
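
As a rough illustration of how a projected value relates to a current one, the following Python sketch extrapolates the cost observed over the selected time range to a 30-day window at the same rate. The function name and the simple linear extrapolation are assumptions made for illustration; the exact projection used by Kubernetes Monitoring may differ.

```python
def project_30_day_cost(current_cost: float, range_hours: float) -> float:
    """Extrapolate a cost observed over `range_hours` to 30 days at the same rate.

    Hypothetical helper: assumes the hourly rate stays constant.
    """
    if range_hours <= 0:
        raise ValueError("range_hours must be positive")
    hourly_rate = current_cost / range_hours
    return hourly_rate * 24 * 30


# Example: 12.00 of allocated compute observed over a 48-hour time range.
projected = project_30_day_cost(12.0, 48)
print(f"Projected 30-day compute cost: {projected:.2f}")
```

The same calculation applies to the idle columns: replace the compute cost with the CPU idle or memory idle cost for the selected range.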

Sort the list of Clusters by CPU idle cost, highest first, and begin drilling into the data.

Click the Cluster with the highest idle cost to open the Cluster details page. The cost of the Cluster is a known cost. This page provides:

Aggregated data from all Nodes in the Cluster
The list of Nodes with the cost of each Node, which is a known cost

Sort the Node list to see the Nodes with the highest idle CPU. When you click a Node in the list, the Node detail page appears, showing:

Aggregated data from all Pods in the Node
The list of Pods in the Node and the cost of each Pod, which is calculated with an allocation function

When you click a Pod in the list, the Pod detail page appears, showing:

Aggregated data from all containers in the Pod
The list of containers, and the cost of each container, which is calculated with an allocation function

Sort the container list by idle CPU, then select the container with the highest value to open the container details page. Here you can view recommendations for sizing. Refer to CPU requests and limits for containers for additional details.

Costs overview

The Costs page is accessible from the main menu, and provides:

An Overview tab that shows the cost (calculated within the selected time range) of each cloud service provider you are using, along with the total cost of all providers, a 30-day projected cost of idle CPU cores, and the percentage of unclaimed Persistent Volumes.
A Savings tab that shows cost compared to savings for CPU, RAM, storage, and GPU if unused resources had been deprovisioned within the selected time range. This page also provides the cost of network egress.

Hover over the circled i icon for more information on each calculation.

Use cost monitoring strategically

Here are some strategies you can use to manage the costs of your Kubernetes infrastructure.

Examine historical costs

Examine costs over a time period on the detail pages:

Use the time range selector to set a time range.
Analyze the costs of CPU and memory usage.

Look at projected costs

You can use the 30-day projected cost of any resource to prioritize where to make adjustments and save.

Compare costs

On the Costs page, use the Cluster filter to compare costs across different cloud providers or Kubernetes distributions to gain insight per cloud provider and per Cluster.


Verify scaling

Assess the cost implications of scaling resources by using Kubernetes autoscaling mechanisms. With horizontal and vertical autoscalers, you can dynamically adjust resource allocation based on demand. Then verify your adjustments are reducing costs by using Kubernetes Monitoring to compare the past and present resource allocation.

Optimize resource usage

Costs can be affected by incorrectly configured CPU and memory requests and limits on containers. Review these settings against actual usage, for example with the sizing recommendations on the container details page.

Refine and monitor practices and policies

Establish cost governance practices and policies based on the cost data available in Kubernetes Monitoring.

Refine cost estimates

Costs shown in Kubernetes Monitoring are estimates of your infrastructure costs based on the node type, size, region, and public pricing lists. However, the default pricing doesn’t include cost adjustments that you receive from the vendor, such as discounts. To further refine the estimate to include any negotiated pricing that is specific to your account, visit these vendor links for instructions:

AWS
Azure
GCP
- Enable the Cloud Billing API.
- Create an API key, and optionally edit the key and restrict to the Cloud Billing API.
- Edit the OpenCost Deployment on the Kubernetes Cluster, and set the CLOUD_PROVIDER_API_KEY environment variable to the newly created API key.

Customize cost estimates

If you use a Cluster provider other than AWS, Azure, or GCP, you can configure custom pricing using the OpenCost Helm chart.

Configure or upgrade

Cost monitoring is an option you can switch on or off when you configure Kubernetes Monitoring with the Grafana Kubernetes Monitoring Helm chart. If you have already deployed Kubernetes Monitoring using Agent or Agent Operator, follow the instructions to upgrade Kubernetes Monitoring.

How costs are calculated

Keep in mind that cost calculations depend on the type of object within Kubernetes.

Nodes, Clusters, and known cost

Cloud provider costs are gathered per Node, so each Node (and therefore each Cluster) has a known cost. OpenCost estimates what proportion of the total Node cost can be associated with CPU and memory.
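
One way to picture this split is sketched below in Python. It divides a known hourly Node cost into a per-core and a per-GiB rate, so that the two parts add back up to the Node cost. The function name and the fixed CPU-to-RAM price ratio are assumptions made for illustration only; this is not a description of the exact method OpenCost uses.

```python
def split_node_cost(total_hourly_cost: float, cpu_cores: float,
                    ram_gib: float, cpu_to_ram_price_ratio: float):
    """Return (cost per core-hour, cost per GiB-hour) for one Node.

    Assumption: one core costs `cpu_to_ram_price_ratio` times as much as
    one GiB of memory, and CPU plus memory account for the whole Node cost.
    """
    # Solve: cpu_cores * (ratio * ram_rate) + ram_gib * ram_rate = total
    ram_rate = total_hourly_cost / (cpu_cores * cpu_to_ram_price_ratio + ram_gib)
    cpu_rate = ram_rate * cpu_to_ram_price_ratio
    return cpu_rate, ram_rate


# Example: a Node that costs 0.40 per hour with 4 cores and 16 GiB of memory.
cpu_rate, ram_rate = split_node_cost(0.40, cpu_cores=4, ram_gib=16,
                                     cpu_to_ram_price_ratio=4.0)
print(f"CPU: {cpu_rate:.4f} per core-hour, memory: {ram_rate:.4f} per GiB-hour")
```

Whatever ratio is assumed, the CPU and memory portions always sum to the known Node cost, which is why Node and Cluster totals are the most reliable figures.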

Allocated cost

For the cost of any resource not at the Node or Cluster level (containers, Pods, namespaces, and workloads), Kubernetes Monitoring uses an allocation algorithm in combination with OpenCost hourly cost estimates for CPU and memory proportional costs.

For each resource, allocation is the greater of the actual usage and the requested amount. In each hour, the CPU allocation and the memory allocation are multiplied by the corresponding hourly CPU and memory cost estimated by OpenCost, and the results are added together.
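
The allocation rule can be sketched in Python as follows. The `HourSample` type, its field names, and the units (cores and GiB) are hypothetical and chosen for illustration; the hourly rates stand in for the OpenCost estimates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class HourSample:
    """One hour of data for a container, Pod, namespace, or workload."""
    cpu_usage_cores: float
    cpu_request_cores: float
    mem_usage_gib: float
    mem_request_gib: float
    cpu_hourly_cost: float  # estimated cost per core-hour
    mem_hourly_cost: float  # estimated cost per GiB-hour


def allocated_cost(samples: Iterable[HourSample]) -> float:
    """Sum the hourly allocated cost over all samples."""
    total = 0.0
    for s in samples:
        # Allocation is the greater of actual usage and the request.
        cpu_alloc = max(s.cpu_usage_cores, s.cpu_request_cores)
        mem_alloc = max(s.mem_usage_gib, s.mem_request_gib)
        total += cpu_alloc * s.cpu_hourly_cost + mem_alloc * s.mem_hourly_cost
    return total


samples = [
    # Hour 1: CPU usage is below the request, memory usage is above it.
    HourSample(0.2, 0.5, 1.5, 1.0, 0.03, 0.004),
    # Hour 2: CPU usage is above the request, memory usage is below it.
    HourSample(0.8, 0.5, 0.5, 1.0, 0.03, 0.004),
]
print(f"Allocated cost: {allocated_cost(samples):.3f}")
```

In the first hour the container is charged for its CPU request even though it used less, which is why oversized requests raise the allocated cost.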

Idle costs

For Nodes and Clusters, idle cost is calculated from the difference between physical capacity and usage. For resources not at the Node or Cluster level, idle cost is calculated from the difference between requests and usage. Requests act as reserved resources, so unused requests can’t be used by other objects in Kubernetes.
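
The two idle calculations can be sketched in Python as follows. The function names are hypothetical, and clamping negative differences to zero (for example, when usage exceeds the request) is an assumption made for this example.

```python
def node_idle_cost(capacity: float, usage: float,
                   hourly_unit_cost: float, hours: float = 1.0) -> float:
    """Idle cost for a Node or Cluster: capacity that was paid for but not used."""
    idle_amount = max(capacity - usage, 0.0)
    return idle_amount * hourly_unit_cost * hours


def workload_idle_cost(request: float, usage: float,
                       hourly_unit_cost: float, hours: float = 1.0) -> float:
    """Idle cost below the Node level: requested (reserved) but unused amount."""
    idle_amount = max(request - usage, 0.0)
    return idle_amount * hourly_unit_cost * hours


# Example: an 8-core Node using 3 cores, at 0.03 per core-hour.
print(f"Node CPU idle: {node_idle_cost(8, 3, 0.03):.3f}")
# Example: a container requesting 0.5 cores but using 0.2 cores.
print(f"Container CPU idle: {workload_idle_cost(0.5, 0.2, 0.03):.3f}")
```

The same functions apply to memory by passing GiB values and the hourly memory rate.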

Comparison with vendor costs

If you want to compare the vendor costs with the cost estimates in Kubernetes Monitoring, it’s best to compare at the Node or Cluster level.

Cardinality with OpenCost

To understand the impact of using OpenCost on cardinality, refer to Available Prometheus Metrics for OpenCost.
