Last updated: 2024.04.11 20:47:27
First published: 2024.02.29 15:55:40
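The Exporter validates and corrects its configuration. At startup, the process prints both the original configuration and the corrected one. If something behaves unexpectedly, check the corrected configuration.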
Check the error and info logs.
Set -log-level to debug to enable debug logs.
Set -enable-self-metrics to true (this is already the default) to enable self-monitoring metrics.
The /metrics endpoint returns the monitoring metrics:

    // Self-monitoring metrics exposed on /metrics.
    var (
        buckets = prometheus.ExponentialBuckets(1e3, 10, 5)

        metaLoadDurationUs = prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Name:    "metric_meta_load_duration_us",
            Help:    "metricMeta load duration us",
            Buckets: buckets,
        }, []string{})

        metaLoadError = prometheus.NewCounterVec(prometheus.CounterOpts{
            Name: "metric_meta_load_error",
            Help: "metricMeta load error",
        }, []string{})

        monitorObjectsLoadDurationUs = prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Name:    "monitor_objects_load_duration_us",
            Help:    "metric monitorObjects load duration us",
            Buckets: buckets,
        }, []string{})

        monitorObjectsCount = prometheus.NewGaugeVec(prometheus.GaugeOpts{
            Name: "monitor_objects_count",
            Help: "monitor objects count of a namespace-subNamespace",
        }, []string{"namespace", "sub_namespace"})

        dataLoadDurationUs = prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Name:    "data_load_duration_us",
            Help:    "metric data load duration us",
            Buckets: buckets,
        }, []string{})

        dataSeries = prometheus.NewGaugeVec(prometheus.GaugeOpts{
            Name: "data_series",
            Help: "total series",
        }, []string{})

        limitDurationUs = prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Name:    "limit_duration_us",
            Help:    "named limiter wait duration us",
            Buckets: buckets,
        }, []string{"name"})

        reqDataDurationUs = prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Name:    "req_data_duration_us",
            Help:    "request data duration us",
            Buckets: buckets,
        }, []string{"namespace", "sub_namespace", "name"})

        reqMonitorObjectDurationUs = prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Name:    "req_monitor_objects_duration_us",
            Help:    "request monitorObject duration us",
            Buckets: buckets,
        }, []string{"namespace", "sub_namespace"})

        reqDataError = prometheus.NewCounterVec(prometheus.CounterOpts{
            Name: "req_data_error",
            Help: "request data error",
        }, []string{"namespace", "sub_namespace", "name"})

        reqMonitorObjectError = prometheus.NewCounterVec(prometheus.CounterOpts{
            Name: "req_monitor_objects_error",
            Help: "request monitorObjects count",
        }, []string{"namespace", "sub_namespace"})
    )
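The buckets variable above is built with prometheus.ExponentialBuckets(1e3, 10, 5), and every histogram name ends in _us, so the bounds are in microseconds. As a minimal sketch, the standard-library function below reproduces the documented behavior of that call (count bounds, starting at start, each multiplied by factor) to show which latency ranges the histograms can distinguish. The helper name exponentialBuckets is hypothetical and is not part of the Exporter.

```go
package main

import "fmt"

// exponentialBuckets mirrors the documented behavior of
// prometheus.ExponentialBuckets(start, factor, count): it returns count
// upper bounds, beginning at start and multiplying by factor each step.
// This is an illustrative re-implementation, not the library code.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, 0, count)
	bound := start
	for i := 0; i < count; i++ {
		buckets = append(buckets, bound)
		bound *= factor
	}
	return buckets
}

func main() {
	// Same arguments as the buckets variable; values are microseconds.
	for _, us := range exponentialBuckets(1e3, 10, 5) {
		fmt.Printf("le=%.0fus (%.0f ms)\n", us, us/1e3)
	}
}
```

The five bounds correspond to 1 ms, 10 ms, 100 ms, 1 s, and 10 s. Anything slower than 10 s lands only in the implicit +Inf bucket that Prometheus histograms always include.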
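/meta shows the current list of metrics.
/monitor_objects shows the current list of monitored objects.
If the volume of exported metrics is too large, consider narrowing the range of metrics that are exported.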
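Use the /meta endpoint to confirm whether the metric exists. If it does not, some metrics may have been filtered out by fields such as Namespaces and SubNamespaces.
Use the /monitor_objects endpoint to confirm whether the relevant instances exist. Without instances, no metrics are produced.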
Use the /meta endpoint to get the Namespace and SubNamespace of the metric.
In the response of the /monitor_objects endpoint, check whether the namespace/subNamespace key has corresponding data. For example:

    {
      "VCM_RDS_MySQL/deploy_monitor": [
        {
          "Dimensions": [
            { "Name": "ResourceID", "Value": "mysql-68781f23efcc" }
          ],
          "Instances": [
            { "Name": "ResourceID", "Value": "mysql-68781f23efcc" }
          ]
        }
      ]
    }
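The check on the /monitor_objects response can also be scripted. The sketch below decodes a response body of the shape shown in the sample and counts the entries under a "Namespace/SubNamespace" key. The type and function names (dimension, monitorObject, countObjects) are hypothetical, and the struct layout is inferred only from that one sample, so real responses may carry additional fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dimension is one Name/Value pair as it appears in the sample response.
type dimension struct {
	Name  string
	Value string
}

// monitorObject matches one element of the per-namespace array.
type monitorObject struct {
	Dimensions []dimension
	Instances  []dimension
}

// countObjects decodes a /monitor_objects response body and returns how many
// monitored objects are listed under the "namespace/subNamespace" key.
// A missing key yields 0, which means no metrics will be exported for it.
func countObjects(body []byte, namespace, subNamespace string) (int, error) {
	var resp map[string][]monitorObject
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, fmt.Errorf("decode monitor objects: %w", err)
	}
	return len(resp[namespace+"/"+subNamespace]), nil
}

func main() {
	// Body copied from the sample response; in practice it would be read
	// from the Exporter's /monitor_objects endpoint.
	body := []byte(`{"VCM_RDS_MySQL/deploy_monitor":[{"Dimensions":[{"Name":"ResourceID","Value":"mysql-68781f23efcc"}],"Instances":[{"Name":"ResourceID","Value":"mysql-68781f23efcc"}]}]}`)
	n, err := countObjects(body, "VCM_RDS_MySQL", "deploy_monitor")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("monitored objects:", n)
}
```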
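The allowed metric delay defaults to 120s, which normally causes no problems. If you see anomalies, increase DelaySeconds in the configuration file as appropriate.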
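When checking whether the values exported by the Exporter match the Cloud Monitor page, align the aggregation granularity first.
This is configured through the DataFreshIntervalSeconds parameter, which defaults to 60s. With the default, Exporter data therefore matches what Cloud Monitor shows for a 1-day time range, and a mismatch with data queried over a 1-hour time range is expected.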
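The Cloud Monitor API quota is 10 QPS. The QPS of the Exporter's API requests is controlled by the LimitQPSGetMetricData parameter, which defaults to 5.
If requests exceed the quota, lower LimitQPSGetMetricData.
LimitQPSGetMetricData may need to be at most 3; if the quota is still exceeded, lower the parameter further.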
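One situation in which a ceiling such as 3 could arise is several Exporter instances sharing the same 10 QPS quota. That is an assumption made here for illustration, not something the text above states. Under it, a rough per-instance budget is the quota divided by the instance count, rounded down. The function name perInstanceQPS is hypothetical.

```go
package main

import "fmt"

// perInstanceQPS splits an API quota evenly across Exporter instances,
// rounding down so the combined request rate never exceeds the quota.
// It returns 0 when instances is not positive.
// Assumption: all instances draw from one shared quota.
func perInstanceQPS(quotaQPS, instances int) int {
	if instances <= 0 {
		return 0
	}
	return quotaQPS / instances
}

func main() {
	const quotaQPS = 10 // Cloud Monitor API quota stated in the text
	for n := 1; n <= 4; n++ {
		fmt.Printf("%d instance(s): LimitQPSGetMetricData <= %d\n",
			n, perInstanceQPS(quotaQPS, n))
	}
}
```

Any other client that calls the same API under the same account would reduce the budget further, which is consistent with the advice to keep lowering the parameter if the quota is still exceeded.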