Key Kubernetes Cluster Metrics

What Are Cluster Metrics?

Kubernetes monitoring is a form of reporting that supports the proactive management of clusters by helping you identify issues early. Monitoring a Kubernetes cluster eases the management of containerized infrastructure by tracking the cluster's current health and resource utilization, including memory, CPU, storage, performance metrics, and resource counts, and by giving you a top-level overview of what is happening inside your Kubernetes cluster.

Why is it important?

The explosive growth of containers in enterprise-level businesses has brought many advantages to developers, DevSecOps, and IT teams worldwide. However, the flexibility and scalability that Kubernetes brings to deploying containerized applications also present new challenges. Monitoring helps teams address those challenges, and it also helps with cost control. A full picture of resource usage and availability lets you make sure that pods, individual containers, and namespaces use underlying resources efficiently.

Top 6 Metrics

The Kubernetes Metrics Server aggregates resource usage data collected from the kubelet on each node and exposes it through the Metrics API, where it can be combined with a number of visualization tools. Some key metrics to consider tracking include:

CPU/Memory Quota & Utilization

Tracking cluster metrics helps you understand whether pods are being launched, maintained, and prepared correctly. Understanding resource utilization helps inform decisions to increase or decrease the size or number of nodes in a cluster and to resize the overall application. Monitoring these metrics shows how much CPU and memory the Kubernetes cluster uses in each measurement period, and how that usage compares with the configured requests and limits.
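To make the comparison between usage, requests, and limits concrete, here is a minimal Python sketch. It assumes you already have the values as Kubernetes quantity strings (for example "250m" for CPU or "128Mi" for memory). The function names `parse_cpu`, `parse_memory`, and `utilization` are our own for illustration and are not part of any Kubernetes library. Only the common suffixes are handled.

```python
from decimal import Decimal
from typing import Optional

# Multipliers for the common Kubernetes quantity suffixes (not exhaustive).
_CPU_SUFFIXES = {"n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3")}
_MEM_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary suffixes
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,      # decimal suffixes
}


def parse_cpu(quantity: str) -> Decimal:
    """Convert a CPU quantity such as '250m' or '2' into cores."""
    for suffix, factor in _CPU_SUFFIXES.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * factor
    return Decimal(quantity)


def parse_memory(quantity: str) -> Decimal:
    """Convert a memory quantity such as '128Mi' or '1G' into bytes."""
    for suffix, factor in _MEM_SUFFIXES.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * factor
    return Decimal(quantity)


def utilization(usage: Decimal, reference: Decimal) -> Optional[float]:
    """Return usage as a percentage of a request or limit.

    Returns None when the reference is zero (no request or limit set).
    """
    if reference == 0:
        return None
    return float(usage / reference * 100)
```

For example, `utilization(parse_cpu("125m"), parse_cpu("250m"))` evaluates to 50.0, meaning the container uses half of its CPU request. Usage that stays well below the request suggests the workload is over-provisioned, while usage close to the limit suggests it may be throttled or killed.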

Pods Pending

If a pod is stuck in the Pending state, something is going wrong at the pod level. There are three likely causes: scheduling issues, image issues, or dependency issues. Scheduling issues occur when a pod cannot be scheduled to any node because of problems with the nodes. Image issues occur when the container image for the app cannot be downloaded properly to the pod. Dependency issues occur when something the pod needs from the cluster is missing, e.g. a volume, secret, or config map.
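The three causes above can often be told apart from the pod's own status. The following is a rough Python sketch, assuming the pod is available as a dictionary in the shape returned by `kubectl get pod <name> -o json`. The function name `classify_pending` and the category labels are hypothetical, and the reason sets are a starting point rather than a complete list.

```python
# Container waiting reasons that point at the image (not exhaustive).
IMAGE_REASONS = {"ErrImagePull", "ImagePullBackOff", "InvalidImageName"}
# Waiting reasons that point at a missing config map or secret reference.
DEPENDENCY_REASONS = {"CreateContainerConfigError"}


def classify_pending(pod: dict) -> str:
    """Classify a pod as 'scheduling', 'image', 'dependency', 'unknown',
    or 'not-pending' based on its status block."""
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return "not-pending"

    # A pod the scheduler could not place reports PodScheduled=False.
    for cond in status.get("conditions", []):
        if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
            return "scheduling"

    # Otherwise look at why each (init) container is still waiting.
    container_statuses = (
        status.get("initContainerStatuses", [])
        + status.get("containerStatuses", [])
    )
    for cs in container_statuses:
        reason = cs.get("state", {}).get("waiting", {}).get("reason", "")
        if reason in IMAGE_REASONS:
            return "image"
        if reason in DEPENDENCY_REASONS:
            return "dependency"

    # Missing volumes usually surface as FailedMount events rather than in
    # the pod status, so they end up here; check the pod's events for those.
    return "unknown"
```

A pod that returns 'unknown' is not necessarily healthy. It only means the status block alone does not explain the delay, and the pod's events are the next place to look.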

Pods Failed

Tracking pod failures can help you understand whether the available nodes are sufficient to handle current workloads relative to the total number of pods you run. It allows you to pinpoint misconfigured launch manifests or resource problems on your nodes. Watching the rate of events from your cluster can be an excellent early warning indicator. If the rate of failed pods changes suddenly or significantly, it may indicate that something is going wrong with your application.
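One simple way to notice a sudden change in the failure rate is to compare the latest interval against a recent baseline. The Python sketch below assumes you already collect the number of failed pods per interval (for example, per minute). The function name `failure_rate_spike` and the default values for `window`, `factor`, and `min_failures` are illustrative assumptions to tune for your own cluster, not recommended settings.

```python
from statistics import mean
from typing import Sequence


def failure_rate_spike(
    failed_counts: Sequence[int],
    window: int = 5,
    factor: float = 3.0,
    min_failures: int = 3,
) -> bool:
    """Return True when the latest interval looks like a spike.

    failed_counts holds failed pods per interval, oldest first. The latest
    value is compared with the mean of the preceding `window` intervals.
    """
    if len(failed_counts) < window + 1:
        return False  # not enough history to form a baseline

    baseline = mean(failed_counts[-(window + 1):-1])
    latest = failed_counts[-1]

    # Use a floor of 1.0 so a quiet baseline of zero does not make a
    # single failure look like a spike.
    return latest >= min_failures and latest > factor * max(baseline, 1.0)
```

With the defaults, the series `[1, 0, 1, 2, 1, 9]` is flagged, because 9 failures is well above three times the baseline of 1, while `[2, 3, 2, 3, 2, 3]` is not. In practice a monitoring tool would apply this kind of rule for you, but the logic is the same.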

Performance Monitoring Tools

AppDynamics

For more information, see AppDynamics' documentation on the cluster metrics it monitors and retrieves through the Kubernetes Agent: AppDynamics Cluster Metrics

Grafana

Grafana can import dashboard templates, and you can find more templates for various metrics and visuals through Grafana Dashboards.

Conclusion

There are many cluster metrics to monitor, but we believe these top 6 metrics are important to developers. By knowing the Kubernetes cluster, node, and pod metrics, you will be able to determine whether both the cluster and the application are running and healthy.

While AppDynamics and Grafana each have their own share of advantages and disadvantages, the choice boils down to what you need. AppDynamics develops application performance management (APM) solutions that deliver problem resolution for highly distributed applications through transaction flow monitoring and deep diagnostics. Grafana, an open-source platform for analytics and metric visualization, includes four dashboards: Cluster, Node, Pod/Container, and Deployment. Kubernetes admins often install Grafana and leverage the Prometheus data source to create information-rich dashboards.