Key Metrics to Monitor in a Starter Kubernetes Setup

Once a base Kubernetes cluster is up, the next question is what happens when something goes wrong or an incident occurs. Running a containerized application in an orchestrated environment is step one; monitoring the application and the cluster is just as important. That holds even on a managed Kubernetes service like GKE, where we don’t need to worry about master nodes, autoscaling, auto-healing, and so on: things can still go south. In this post, I will share the base-level monitoring we need to set up, assuming we are using Prometheus for metrics.

image source: platform9.com

For a quick monitoring setup, we can use the kube-prometheus-stack Helm chart from the Prometheus community, which is a Swiss Army knife for a starter setup. It combines Kubernetes manifests, Grafana dashboards, and Prometheus rules.

Let’s dive into the key metrics and the Prometheus queries behind them. We can build a Grafana dashboard from these queries and integrate alerting from there, which makes things much easier.
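Before putting a query on a dashboard, it can help to run it against Prometheus directly and look at the raw result. Below is a minimal sketch that uses Prometheus’ instant-query HTTP endpoint (`/api/v1/query`). The address `http://localhost:9090` is an assumption (for example, a port-forward to the Prometheus service), and the helper names `build_query_url`, `parse_instant_vector`, and `run_query` are made up for illustration; they are not part of any library.

```python
from __future__ import annotations

import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address, e.g. reached through a port-forward to the Prometheus service.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Build the URL for Prometheus' instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload: dict) -> list[tuple[dict, float]]:
    """Turn an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample looks like {"metric": {labels}, "value": [timestamp, "value-as-string"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


def run_query(promql: str, base_url: str = PROMETHEUS_URL) -> list[tuple[dict, float]]:
    """Run a PromQL instant query and return the parsed samples."""
    with urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return parse_instant_vector(json.load(resp))
```

Calling `run_query("up")` against a reachable Prometheus should return one entry per scrape target, with a value of 1.0 for targets that are up and 0.0 for those that are down. The same PromQL string can then be pasted into a Grafana panel.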


