1. A Brief Look at the Monitoring Approach
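Heapster is a tool for monitoring cluster resources such as compute, storage, and network. It uses cAdvisor, which is built into k8s, as its data source to collect cluster information, and aggregates that information into useful performance data (metrics): CPU, memory, network, filesystem, and so on. It then writes the data to an external store (backend) such as InfluxDB, and finally a UI such as Grafana visualizes it. Heapster's data sources and external stores are both pluggable, so many monitoring stacks can be assembled flexibly, for example Heapster + ElasticSearch + Kibana.
[Figure: Heapster overall architecture]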
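The pluggable source/sink design described above can be sketched in a few lines. This is only an illustration, not Heapster's own code: the `Endpoint` class and the `heapster_command` helper are hypothetical names, while the `--source` and `--sink` flags and their `kind:url` shape are the ones used in the Heapster Deployment later in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    """A pluggable Heapster source or sink (hypothetical helper type)."""
    kind: str  # e.g. "kubernetes" for a source, "influxdb" for a sink
    url: str   # the address that source or sink talks to


def heapster_command(source: Endpoint, sink: Endpoint) -> List[str]:
    """Build the Heapster container command for one source and one sink."""
    return [
        "/heapster",
        f"--source={source.kind}:{source.url}",
        f"--sink={sink.kind}:{sink.url}",
    ]


# The combination used in this article: Kubernetes as source, InfluxDB as sink.
command = heapster_command(
    Endpoint("kubernetes", "https://kubernetes.default"),
    Endpoint("influxdb", "http://150.109.39.33:31001"),
)
print(command)
```

Swapping the backend then only means passing a different sink `Endpoint`; the source side stays untouched.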
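2. Deployment
In this article we put the Heapster + InfluxDB + Grafana monitoring stack into practice. The official yml files have a few small problems when used as-is; refer to the changes and notes below:
2.1 Creating the InfluxDB resource objects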
apiVersion: apps/v1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      task: monitoring
      k8s-app: influxdb
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: k8s.gcr.io/heapster-influxdb-amd64:v1.3.3
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - nodePort: 31001
    port: 8086
    targetPort: 8086
  selector:
    k8s-app: influxdb
2.2 Creating the Grafana resource objects
apiVersion: apps/v1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      task: monitoring
      k8s-app: grafana
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: k8s.gcr.io/heapster-grafana-amd64:v4.4.3
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: ca-certificates
          readOnly: true
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: GF_SERVER_HTTP_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          # value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          value: /
      volumes:
      - name: ca-certificates
        hostPath:
          path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  type: NodePort
  ports:
  - nodePort: 30108
    port: 80
    targetPort: 3000
  selector:
    k8s-app: grafana
2.3 Creating the Heapster resource objects
apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system
---
# apps/v1 is used here to match the InfluxDB and Grafana Deployments above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      task: monitoring
      k8s-app: heapster
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: k8s.gcr.io/heapster-amd64:v1.4.2
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        # Fill in the InfluxDB server address recorded earlier.
        - --sink=influxdb:http://150.109.39.33:31001
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: heapster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system
When creating the Heapster resources, simply append this snippet, which binds the heapster ServiceAccount to the cluster-admin ClusterRole, and that is all it takes.
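The `--sink` flag in the Heapster Deployment above hard-codes a node IP plus the InfluxDB NodePort. As a rough sketch of the two ways that address can be formed, the helpers below build the flag either from a node IP and NodePort or from the Service's in-cluster DNS name. Both function names are hypothetical, and the `<service>.<namespace>.svc` form assumes the cluster's DNS add-on is running; as far as we know, the upstream Heapster example manifest uses that DNS form with port 8086.

```python
def nodeport_sink(node_ip: str, node_port: int) -> str:
    """Sink flag that reaches InfluxDB from outside via a node IP and NodePort."""
    return f"--sink=influxdb:http://{node_ip}:{node_port}"


def cluster_dns_sink(service: str, namespace: str, port: int) -> str:
    """Sink flag that reaches InfluxDB through the Service's in-cluster DNS name.

    Assumes cluster DNS resolves <service>.<namespace>.svc.
    """
    return f"--sink=influxdb:http://{service}.{namespace}.svc:{port}"


# Address used in this article (node IP + NodePort 31001).
print(nodeport_sink("150.109.39.33", 31001))
# In-cluster alternative built from the Service defined in section 2.1.
print(cluster_dns_sink("monitoring-influxdb", "kube-system", 8086))
```

The DNS form keeps working when a node's IP changes, whereas the NodePort form is also reachable from outside the cluster.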
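3. Viewing Application Performance Metrics from Different Dimensions
In a k8s cluster, application performance metrics need to be aggregated along different dimensions (containers, pods, services, and whole clusters), so that users can understand in depth how their applications are running and where bottlenecks may arise.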
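To make the idea of aggregating along dimensions concrete, here is a minimal sketch that rolls container-level CPU readings up to pod, namespace, and cluster level. The `Sample` type, the `total_by` helper, and all the sample values are made up for illustration; they are not Heapster's data model, and the service dimension is left out for brevity.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, NamedTuple


class Sample(NamedTuple):
    """One container-level CPU reading (hypothetical, illustrative only)."""
    namespace: str
    pod: str
    container: str
    cpu_millicores: int


def total_by(samples: Iterable[Sample],
             key: Callable[[Sample], Hashable]) -> Dict[Hashable, int]:
    """Sum CPU usage over all samples that share the same key."""
    totals: Dict[Hashable, int] = defaultdict(int)
    for sample in samples:
        totals[key(sample)] += sample.cpu_millicores
    return dict(totals)


# Made-up readings for four containers in two namespaces.
SAMPLES = [
    Sample("kube-system", "heapster-abc", "heapster", 20),
    Sample("kube-system", "monitoring-grafana-xyz", "grafana", 15),
    Sample("default", "web-1", "app", 100),
    Sample("default", "web-1", "sidecar", 10),
]

per_pod = total_by(SAMPLES, lambda s: (s.namespace, s.pod))
per_namespace = total_by(SAMPLES, lambda s: s.namespace)
cluster_total = sum(s.cpu_millicores for s in SAMPLES)

print(per_pod[("default", "web-1")])  # 110
print(per_namespace)                  # {'kube-system': 35, 'default': 110}
print(cluster_total)                  # 145
```

The same readings answer different questions depending on the grouping key, which is why the dashboards below offer several granularities.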
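3.1 Viewing a cluster overview through the dashboard
[Figure: dashboard cluster overview]
Once the whole monitoring stack has been deployed successfully, as the figure above shows, the dashboard presents the CPU and memory usage of each object at different granularities/dimensions.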
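3.2 Viewing cluster details through Grafana (cpu, memory, filesystem, network)
Grafana lets us view the full resource usage of any Node or Pod, including cluster nodes and individual Pods in different namespaces. Some screenshots are shown below:
[Figures: Grafana screenshots of node and pod resource usage]
As the screenshots show, Heapster integrates seamlessly with Grafana and presents the data in a clear, intuitive, and user-friendly way. We can also learn to build custom Dashboards that look better and meet specific business needs.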
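When building a custom dashboard, it can help to look at the raw data Grafana reads from InfluxDB. The sketch below builds a request URL for the InfluxDB 1.x HTTP `/query` endpoint. The helper names are hypothetical, and the database name `k8s`, the measurement `cpu/usage_rate`, and the tags `type` and `nodename` are assumptions based on Heapster's default InfluxDB schema; running `SHOW MEASUREMENTS` against your own instance is the way to confirm them.

```python
from urllib.parse import urlencode


def influx_query_url(base_url: str, database: str, influxql: str) -> str:
    """Build a GET URL for the InfluxDB 1.x /query endpoint."""
    params = urlencode({"db": database, "q": influxql})
    return f"{base_url.rstrip('/')}/query?{params}"


def node_cpu_query(window: str = "5m") -> str:
    """InfluxQL for mean CPU usage rate per node over a recent time window.

    Measurement and tag names are assumed from Heapster's default schema.
    """
    return (
        'SELECT mean("value") FROM "cpu/usage_rate" '
        f"WHERE \"type\" = 'node' AND time > now() - {window} "
        'GROUP BY "nodename"'
    )


# InfluxDB address from this article (node IP + NodePort 31001).
BASE_URL = "http://150.109.39.33:31001"

# List the measurements Heapster has written, then query per-node CPU.
print(influx_query_url(BASE_URL, "k8s", "SHOW MEASUREMENTS"))
print(influx_query_url(BASE_URL, "k8s", node_cpu_query()))
```

The printed URLs can be opened with any HTTP client; the same InfluxQL can be pasted into a Grafana panel's query editor.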
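4. Summary
In this article we walked through k8s's native monitoring approach in detail. It mainly monitors pods and nodes, and it falls short when it comes to monitoring the other Kubernetes components (API Server, Scheduler, Controller Manager, and so on). Prometheus (an open-source combination of monitoring, alerting, and a time-series database) is more full-featured, and I will cover it hands-on when time permits.
Monitoring is a very big topic. The purpose of monitoring is alerting, and the purpose of alerting is to guide the system's self-healing. Only when all three stages, monitoring => alerting => self-healing, are in place, so that application performance and fault management are automated, can a system be called a true application performance management (APM) system. This series will keep working toward that goal, so please stay tuned. If you have any good ideas, feel free to share them in the comments.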
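If you found this article helpful, thank you for recommending it.
If you are interested in Kubernetes, feel free to follow me; I regularly share what I learn on this blog.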