Version: 1.22

Kubeadm ServiceMonitor

This package provides monitoring for the following Kubernetes components:

  • kubelet
  • coredns
  • api-server
  • kube-controller-manager
  • kube-scheduler
  • etcd

These components are required for a functioning Kubernetes cluster. To learn more about them, see the official Kubernetes documentation.

⚠️ This package is guaranteed to work only on clusters created with kubeadm. For managed clusters, see the aks-sm, eks-sm and gke-sm packages.

Requirements

Configuration

Prometheus scrapes Kubernetes component metrics on the port named metrics at the following intervals:

  • kube-controller-manager: 30s
  • coredns: 15s
  • etcd: 15s
  • api-server: 30s
  • kubelet: 30s
  • kube-scheduler: 30s

Dashboards shipped:

  • coredns: CoreDNS < 1.7.0
  • api-server: Kubernetes / API server
  • cluster-total: Kubernetes / Networking / Cluster
  • kubelet: Kubernetes / Kubelet
  • namespace-by-pod: Kubernetes / Networking / Namespace (Pods)
  • namespace-by-workload: Kubernetes / Networking / Namespace (Workload)
  • persistent-volumes-usage: Kubernetes / Persistent Volumes
  • pod-total: Kubernetes / Networking / Pod
  • workload-total: Kubernetes / Networking / Workload
  • controller-manager: Kubernetes / Controller Manager
  • etcd: Etcd
  • scheduler: Kubernetes / Scheduler
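
For orientation, the sketch below shows how one of the scrape settings listed under Configuration might look as a Prometheus Operator ServiceMonitor. It is not copied from the package: the resource name, the namespaces, and the label selector are assumptions. Only the port name (metrics) and the 30s kube-scheduler interval come from the list above.

```yaml
# Illustrative sketch only; not the manifest shipped by this package.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-scheduler          # hypothetical resource name
  namespace: monitoring         # assumed namespace of the monitoring stack
spec:
  namespaceSelector:
    matchNames:
      - kube-system             # assumed namespace of the scraped Service
  selector:
    matchLabels:
      k8s-app: kube-scheduler   # assumed Service label
  endpoints:
    - port: metrics             # named Service port, as stated in Configuration
      interval: 30s             # kube-scheduler scrape interval from the list
```

A component with a 15s interval, such as coredns or etcd, would differ only in the interval value and the selector.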

Alerts

The following alerts are already defined for this package.

kubernetes-absent-kubeadm

Parameter | Description | Severity | Interval
KubeControllerManagerDown | This alert fires if Prometheus target discovery was not able to reach the kube-controller-manager in the last 15 minutes. | critical | 15m
KubeSchedulerDown | This alert fires if Prometheus target discovery was not able to reach the kube-scheduler in the last 15 minutes. | critical | 15m
KubeClientCertificateExpiration | This alert fires when the Kubernetes API client certificate is expiring in less than 30 days. | warning | -
KubeClientCertificateExpiration | This alert fires when the Kubernetes API client certificate is expiring in less than 7 days. | critical | -
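
As a hedged illustration of how an absent-target alert such as KubeSchedulerDown could be written, the sketch below uses the Prometheus Operator PrometheusRule resource. It assumes that the Interval column corresponds to the rule's `for` duration and that the scrape job is labelled kube-scheduler. The resource name and namespace are hypothetical, and the expression shipped with the package may differ.

```yaml
# Illustrative sketch only; not the rule shipped by this package.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubernetes-absent-kubeadm-example   # hypothetical resource name
  namespace: monitoring                     # assumed namespace
spec:
  groups:
    - name: kubernetes-absent-kubeadm
      rules:
        - alert: KubeSchedulerDown
          # absent() returns 1 when no healthy target with this job label
          # exists; the job label value is an assumption.
          expr: absent(up{job="kube-scheduler"} == 1)
          for: 15m                          # the condition must hold for 15 minutes
          labels:
            severity: critical
```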

coredns

Parameter | Description | Severity | Interval
CoreDNSPanic | This alert fires if CoreDNS total panic count increased by at least 1 in the last 10 minutes. | warning | -
CoreDNSRequestsLatency | This alert fires if CoreDNS 99th percentile requests latency was higher than 100ms in the last 10 minutes. | warning | 10m
CoreDNSHealthRequestsLatency | This alert fires if CoreDNS 99th percentile health requests latency was higher than 10ms in the last 10 minutes. | warning | 10m
CoreDNSProxyRequestsLatency | This alert fires if CoreDNS 99th percentile proxy requests latency was higher than 500ms in the last 10 minutes. | warning | 10m
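
A percentile-latency alert such as CoreDNSRequestsLatency is typically built on histogram_quantile. The sketch below is one possible formulation, not the package's own rule: it assumes the CoreDNS histogram coredns_dns_request_duration_seconds is being scraped, expresses the 100ms threshold as 0.1 seconds, and uses a hypothetical resource name and namespace.

```yaml
# Illustrative sketch only; not the rule shipped by this package.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: coredns-latency-example   # hypothetical resource name
  namespace: monitoring           # assumed namespace
spec:
  groups:
    - name: coredns
      rules:
        - alert: CoreDNSRequestsLatency
          # 99th percentile of request duration over a 10-minute window,
          # aggregated across all CoreDNS instances.
          expr: |
            histogram_quantile(
              0.99,
              sum(rate(coredns_dns_request_duration_seconds_bucket[10m])) by (le)
            ) > 0.1
          for: 10m
          labels:
            severity: warning
```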

etcd3

Parameter | Description | Severity | Interval
EtcdInsufficientMembers | This alert fires if fewer than half of Etcd cluster members were online in the last 3 minutes. | critical | 3m
EtcdNoLeader | This alert fires if the Etcd cluster had no leader in the last minute. | critical | 1m
EtcdHighNumberOfLeaderChanges | This alert fires if the Etcd cluster changed leader more than 3 times in the last hour. | warning | -
EtcdHighNumberOfFailedProposals | This alert fires if there were more than 5 proposal failures in the last hour. | warning | -
EtcdHighFsyncDurations | This alert fires if the WAL fsync 99th percentile latency was higher than 0.5s in the last 10 minutes. | warning | 10m
EtcdHighCommitDurations | This alert fires if the backend commit 99th percentile latency was higher than 0.25s in the last 10 minutes. | warning | 10m
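
To make two of the etcd alerts more concrete, the sketch below shows how EtcdNoLeader and EtcdHighFsyncDurations might be expressed using the standard etcd metrics etcd_server_has_leader and etcd_disk_wal_fsync_duration_seconds. The resource name, namespace, and the 5-minute rate window are assumptions, and the expressions shipped with the package may differ.

```yaml
# Illustrative sketch only; not the rules shipped by this package.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: etcd3-example     # hypothetical resource name
  namespace: monitoring   # assumed namespace
spec:
  groups:
    - name: etcd3
      rules:
        - alert: EtcdNoLeader
          # etcd_server_has_leader is 1 while a member sees a leader, 0 otherwise.
          expr: etcd_server_has_leader == 0
          for: 1m
          labels:
            severity: critical
        - alert: EtcdHighFsyncDurations
          # 99th percentile WAL fsync latency above 0.5 seconds.
          expr: |
            histogram_quantile(
              0.99,
              rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])
            ) > 0.5
          for: 10m
          labels:
            severity: warning
```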