Deploying Prometheus with kube-prometheus to Monitor a Kubernetes Cluster

1. Background

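Prometheus itself provides no API for managing its configuration (in particular, for managing scrape targets and alerting rules), and it offers no convenient way to manage multiple instances, so this part usually ends up being handled with custom code or scripts. To reduce the complexity of managing applications like this, CoreOS introduced the Operator concept. It first released the Etcd Operator for running and managing etcd on Kubernetes, and later released the Prometheus Operator.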

prometheus-operator repository: https://github.com/prometheus-operator/prometheus-operator
kube-prometheus repository: https://github.com/prometheus-operator/kube-prometheus

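kube-prometheus packages the following components:

The Prometheus Operator: defines the CRDs (custom resource objects) and manages them
Highly available Prometheus: a highly available Prometheus deployment
Highly available Alertmanager: a highly available alerting component
Prometheus node-exporter: host-level monitoring
Prometheus Adapter for Kubernetes Metrics APIs: exposes custom metrics (for example, autoscaling an application based on nginx request counts)
kube-state-metrics: state metrics for Kubernetes resource objects
Grafana: dashboards and visualization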

Unlike a conventional deployment, the Operator abstracts the components of the Prometheus ecosystem as follows:

Operator resource    Description
Prometheus           Abstraction of the Prometheus server
ServiceMonitor       Abstraction of an exporter (scrape target)
Alertmanager         Abstraction of Prometheus Alertmanager
PrometheusRule       Defines alerting rules
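
To make the ServiceMonitor abstraction more concrete, the following sketch writes a minimal ServiceMonitor manifest to a temporary file. The API group and kind (monitoring.coreos.com/v1, ServiceMonitor) come from the Prometheus Operator; everything else (the example-app name and label, the port name web, the 30s interval, and the output path) is a hypothetical placeholder, not something taken from kube-prometheus itself.

```shell
# Sketch only: the application name, label, port name, and interval below
# are assumptions chosen for illustration.
manifest="$(mktemp -d)/example-app-serviceMonitor.yaml"

cat > "$manifest" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app          # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app       # must match the labels on the target Service
  endpoints:
    - port: web              # name of the Service port that serves metrics
      interval: 30s
EOF

echo "wrote $manifest"
# To apply it to a cluster where the Operator CRDs are installed:
# kubectl apply -f "$manifest"
```

With a manifest like this, the scrape target is declared as a Kubernetes object rather than edited into prometheus.yml by hand, and the Operator generates the corresponding scrape configuration.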

2. Deployment

Deploy using the kube-prometheus project:
cd kube-prometheus/manifests
mkdir -p serviceMonitor prometheus adapter node-exporter kube-state-metrics grafana alertmanager operator other
Sort the YAML files into these directories by component, then check the result:
tree .

├── adapter
│   ├── prometheus-adapter-apiService.yaml
│   ├── prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
│   ├── prometheus-adapter-clusterRoleBindingDelegator.yaml
│   ├── prometheus-adapter-clusterRoleBinding.yaml
│   ├── prometheus-adapter-clusterRoleServerResources.yaml
│   ├── prometheus-adapter-clusterRole.yaml
│   ├── prometheus-adapter-configMap.yaml
│   ├── prometheus-adapter-deployment.yaml
│   ├── prometheus-adapter-roleBindingAuthReader.yaml
│   ├── prometheus-adapter-serviceAccount.yaml
│   └── prometheus-adapter-service.yaml
├── alertmanager
│   ├── alertmanager-alertmanager.yaml
│   ├── alertmanager-secret.yaml
│   ├── alertmanager-serviceAccount.yaml
│   └── alertmanager-service.yaml
├── grafana
│   ├── grafana-dashboardDatasources.yaml
│   ├── grafana-dashboardDefinitions.yaml
│   ├── grafana-dashboardSources.yaml
│   ├── grafana-deployment.yaml
│   ├── grafana-serviceAccount.yaml
│   └── grafana-service.yaml
├── kube-state-metrics
│   ├── kube-state-metrics-clusterRoleBinding.yaml
│   ├── kube-state-metrics-clusterRole.yaml
│   ├── kube-state-metrics-deployment.yaml
│   ├── kube-state-metrics-serviceAccount.yaml
│   └── kube-state-metrics-service.yaml
├── node-exporter
│   ├── node-exporter-clusterRoleBinding.yaml
│   ├── node-exporter-clusterRole.yaml
│   ├── node-exporter-daemonset.yaml
│   ├── node-exporter-serviceAccount.yaml
│   └── node-exporter-service.yaml
├── operator
│   ├── 0namespace-namespace.yaml
│   ├── prometheus-operator-0alertmanagerConfigCustomResourceDefinition.yaml
│   ├── prometheus-operator-0alertmanagerCustomResourceDefinition.yaml
│   ├── prometheus-operator-0podmonitorCustomResourceDefinition.yaml
│   ├── prometheus-operator-0probeCustomResourceDefinition.yaml
│   ├── prometheus-operator-0prometheusCustomResourceDefinition.yaml
│   ├── prometheus-operator-0prometheusruleCustomResourceDefinition.yaml
│   ├── prometheus-operator-0servicemonitorCustomResourceDefinition.yaml
│   ├── prometheus-operator-0thanosrulerCustomResourceDefinition.yaml
│   ├── prometheus-operator-clusterRoleBinding.yaml
│   ├── prometheus-operator-clusterRole.yaml
│   ├── prometheus-operator-deployment.yaml
│   ├── prometheus-operator-serviceAccount.yaml
│   └── prometheus-operator-service.yaml
├── other
├── prometheus
│   ├── prometheus-clusterRoleBinding.yaml
│   ├── prometheus-clusterRole.yaml
│   ├── prometheus-prometheus.yaml
│   ├── prometheus-roleBindingConfig.yaml
│   ├── prometheus-roleBindingSpecificNamespaces.yaml
│   ├── prometheus-roleConfig.yaml
│   ├── prometheus-roleSpecificNamespaces.yaml
│   ├── prometheus-rules.yaml
│   ├── prometheus-serviceAccount.yaml
│   └── prometheus-service.yaml
└── serviceMonitor
    ├── alertmanager-serviceMonitor.yaml
    ├── grafana-serviceMonitor.yaml
    ├── kube-state-metrics-serviceMonitor.yaml
    ├── node-exporter-serviceMonitor.yaml
    ├── prometheus-adapter-serviceMonitor.yaml
    ├── prometheus-operator-serviceMonitor.yaml
    ├── prometheus-serviceMonitorApiserver.yaml
    ├── prometheus-serviceMonitorCoreDNS.yaml
    ├── prometheus-serviceMonitorKubeControllerManager.yaml
    ├── prometheus-serviceMonitorKubelet.yaml
    ├── prometheus-serviceMonitorKubeScheduler.yaml
    └── prometheus-serviceMonitor.yaml
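
The sorting step that produces the layout above can be scripted rather than done by hand. The following is a rough sketch, assuming the filename prefixes visible in the tree listing; the function name classify_manifests and the pattern order are illustrative choices, and filenames may differ between kube-prometheus releases.

```shell
# Sketch: move each *.yaml in a directory into a per-component subdirectory.
# The patterns are inferred from the tree listing above (an assumption).
# Order matters: more specific patterns must come before "prometheus-*".
# The function body runs in a subshell so its variables do not leak.
classify_manifests() (
  dir="${1:-.}"
  for f in "$dir"/*.yaml; do
    [ -f "$f" ] || continue            # skip if the glob matched nothing
    name=$(basename "$f")
    case "$name" in
      *serviceMonitor*)                   target=serviceMonitor ;;
      0namespace-*|prometheus-operator-*) target=operator ;;
      prometheus-adapter-*)               target=adapter ;;
      alertmanager-*)                     target=alertmanager ;;
      grafana-*)                          target=grafana ;;
      kube-state-metrics-*)               target=kube-state-metrics ;;
      node-exporter-*)                    target=node-exporter ;;
      prometheus-*)                       target=prometheus ;;
      *)                                  target=other ;;
    esac
    mkdir -p "$dir/$target"
    mv "$f" "$dir/$target/"
  done
)

# Usage (run from the repository root):
# classify_manifests kube-prometheus/manifests
```

The serviceMonitor pattern is checked first because files such as prometheus-operator-serviceMonitor.yaml would otherwise land in the operator directory. The match is case-sensitive, so the lowercase servicemonitor CRD file still goes to operator.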