Scraping custom metrics
Let groundcover automatically scrape your custom metrics
groundcover can scrape your custom metrics by deploying a metrics scraper (vmagent by VictoriaMetrics) that automatically scrapes Prometheus targets.
The following Helm override enables custom metrics scraping:
custom-metrics:
enabled: true
Scrape your custom metrics using the groundcover CLI (with the default scrape jobs):
groundcover deploy --custom-metrics
Or apply the override using Helm:
helm upgrade groundcover groundcover/groundcover -n groundcover -f <overrides.yaml> --reuse-values
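For example, assuming groundcover was installed from the groundcover Helm repository (the groundcover/groundcover chart reference is an assumption; adjust it to match your installation), the full flow might look like this:
# Write an override file that enables the custom metrics scraper
cat > custom-metrics-values.yaml <<EOF
custom-metrics:
  enabled: true
EOF

# Apply it to the existing groundcover release, keeping all other values
helm upgrade groundcover groundcover/groundcover \
  -n groundcover \
  -f custom-metrics-values.yaml \
  --reuse-values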
To adjust data retention (including the metrics retention period), either create a new custom-values.yaml or edit your existing groundcover values.yaml:
global:
logs:
retention: 3d # {amount}[h(ours), d(ays), w(eeks), y(ears)]
traces:
retention: 24h # {amount}[h(ours), d(ays), w(eeks), y(ears)]
victoria-metrics-single:
server:
# -- Data retention period for metrics, {amount}[h(ours), d(ays), w(eeks), y(ears)], default is 7d
retentionPeriod: 7d
To enable scraping, ensure that the Kubernetes resources exposing your Prometheus exporters are deployed with the following annotations:
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "<port>"
# optional, only required if the path is not /metrics
prometheus.io/path: "<metrics-path>"
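For example, a Service in front of a Prometheus exporter could be annotated like this (the name, namespace, and port below are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-exporter            # illustrative name
  namespace: my-namespace      # illustrative namespace
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9100"
    prometheus.io/path: "/metrics"   # optional, /metrics is the default
spec:
  selector:
    app: my-exporter
  ports:
    - name: metrics
      port: 9100
      targetPort: 9100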
By default, the following scrape jobs are deployed:
custom-metrics:
enabled: true
config:
scrape_configs:
## COPY from Prometheus helm chart https://github.com/helm/charts/blob/master/stable/prometheus/values.yaml
# Scrape config for API servers.
#
# Kubernetes exposes API servers as endpoints to the default/kubernetes
# service so this uses `endpoints` role and uses relabelling to only keep
# the endpoints associated with the default/kubernetes service using the
# default named port `https`. This works for single API server deployments as
# well as HA API server deployments.
- job_name: "kubernetes-apiservers"
honor_labels: true
kubernetes_sd_configs:
- role: endpointslices
# Default to scraping over https. If required, just disable this or change to
# `http`.
scheme: https
# This TLS & bearer token file config is used to connect to the actual scrape
# endpoints for cluster components. This is separate to discovery auth
# configuration because discovery & scraping are two separate concerns in
# Prometheus. The discovery auth config is automatic if Prometheus runs inside
# the cluster. Otherwise, more config options have to be provided within the
# <kubernetes_sd_config>.
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# If your node certificates are self-signed or use a different CA to the
# master CA, then disable certificate verification below. Note that
# certificate verification is an integral part of a secure infrastructure
# so this should only be disabled in a controlled environment. You can
# disable certificate verification by uncommenting the line below.
#
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
# Keep only the default/kubernetes service endpoints for the https port. This
# will add targets for each API server which Kubernetes adds an endpoint to
# the default/kubernetes service.
relabel_configs:
- action: drop
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_part_of
regex: groundcover
- source_labels:
[
__meta_kubernetes_namespace,
__meta_kubernetes_service_name,
__meta_kubernetes_endpoint_port_name,
]
action: keep
regex: default;kubernetes;https
- job_name: "kubernetes-nodes"
honor_labels: true
# Default to scraping over https. If required, just disable this or change to
# `http`.
scheme: https
# This TLS & bearer token file config is used to connect to the actual scrape
# endpoints for cluster components. This is separate to discovery auth
# configuration because discovery & scraping are two separate concerns in
# Prometheus. The discovery auth config is automatic if Prometheus runs inside
# the cluster. Otherwise, more config options have to be provided within the
# <kubernetes_sd_config>.
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# If your node certificates are self-signed or use a different CA to the
# master CA, then disable certificate verification below. Note that
# certificate verification is an integral part of a secure infrastructure
# so this should only be disabled in a controlled environment. You can
# disable certificate verification by uncommenting the line below.
#
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: drop
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_part_of
regex: groundcover
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/$1/proxy/metrics
# Scrape config for service endpoints.
#
# The relabeling allows the actual service scrape endpoint to be configured
# via the following annotations:
#
# * `prometheus.io/scrape`: Only scrape services that have a value of `true`
# * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
# to set this to `https` & most likely set the `tls_config` of the scrape config.
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: If the metrics are exposed on a different port to the
# service then set this appropriately.
- job_name: "kubernetes-service-endpoints"
honor_labels: true
kubernetes_sd_configs:
- role: endpointslices
relabel_configs:
- action: drop
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_part_of
regex: groundcover
- action: drop
source_labels: [__meta_kubernetes_pod_container_init]
regex: true
- action: keep_if_equal
source_labels:
[
__meta_kubernetes_service_annotation_prometheus_io_port,
__meta_kubernetes_pod_container_port_number,
]
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels:
[
__address__,
__meta_kubernetes_service_annotation_prometheus_io_port,
]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_pod_name]
target_label: pod
- source_labels: [__meta_kubernetes_pod_container_name]
target_label: container
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: service
- source_labels: [__meta_kubernetes_service_name]
target_label: job
replacement: ${1}
- source_labels: [__meta_kubernetes_pod_node_name]
action: replace
target_label: node
# Scrape config for slow service endpoints; same as above, but with a larger
# timeout and a larger interval
#
# The relabeling allows the actual service scrape endpoint to be configured
# via the following annotations:
#
# * `prometheus.io/scrape-slow`: Only scrape services that have a value of `true`
# * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
# to set this to `https` & most likely set the `tls_config` of the scrape config.
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: If the metrics are exposed on a different port to the
# service then set this appropriately.
- job_name: "kubernetes-service-endpoints-slow"
honor_labels: true
scrape_interval: 5m
scrape_timeout: 30s
kubernetes_sd_configs:
- role: endpointslices
relabel_configs:
- action: drop
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_part_of
regex: groundcover
- action: drop
source_labels: [__meta_kubernetes_pod_container_init]
regex: true
- action: keep_if_equal
source_labels:
[
__meta_kubernetes_service_annotation_prometheus_io_port,
__meta_kubernetes_pod_container_port_number,
]
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
action: keep
regex: true
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels:
[
__address__,
__meta_kubernetes_service_annotation_prometheus_io_port,
]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_pod_name]
target_label: pod
- source_labels: [__meta_kubernetes_pod_container_name]
target_label: container
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: service
- source_labels: [__meta_kubernetes_service_name]
target_label: job
replacement: ${1}
- source_labels: [__meta_kubernetes_pod_node_name]
action: replace
target_label: node
# Example scrape config for probing services via the Blackbox Exporter.
#
# The relabeling allows the actual service scrape endpoint to be configured
# via the following annotations:
#
# * `prometheus.io/probe`: Only probe services that have a value of `true`
- job_name: "kubernetes-services"
honor_labels: true
metrics_path: /probe
params:
module: [http_2xx]
kubernetes_sd_configs:
- role: service
relabel_configs:
- action: drop
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_part_of
regex: groundcover
- source_labels:
[__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: service
# Example scrape config for pods
#
# The relabeling allows the actual pod scrape endpoint to be configured via the
# following annotations:
#
# * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`.
- job_name: "kubernetes-pods"
honor_labels: true
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: drop
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_part_of
regex: groundcover
- action: drop
source_labels: [__meta_kubernetes_pod_container_init]
regex: true
- action: keep_if_equal
source_labels:
[
__meta_kubernetes_pod_annotation_prometheus_io_port,
__meta_kubernetes_pod_container_port_number,
]
- source_labels:
[__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels:
[__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- source_labels: [__meta_kubernetes_pod_name]
target_label: pod
- source_labels: [__meta_kubernetes_pod_container_name]
target_label: container
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: service
- source_labels: [__meta_kubernetes_service_name]
target_label: job
replacement: ${1}
- source_labels: [__meta_kubernetes_pod_node_name]
action: replace
target_label: node
To disable the autodiscovery scrape jobs, provide the following override:
custom-metrics:
enabled: true
config:
scrape_configs: []
To deploy custom scrape jobs, create or add the following override:
custom-metrics:
enabled: true
extraScrapeConfigs:
- job_name: "custom-scrape-job"
...
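For instance, a static scrape job pointing at a hypothetical in-cluster exporter could be defined like this (the job name, target, and interval are illustrative):
custom-metrics:
  enabled: true
  extraScrapeConfigs:
    - job_name: "my-exporter"            # illustrative job name
      scrape_interval: 30s
      metrics_path: /metrics
      static_configs:
        - targets: ["my-exporter.my-namespace.svc:9100"]   # illustrative in-cluster target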
To safeguard groundcover's performance, default limits on metrics ingestion are in place:
remoteWrite.maxDailySeries: "1000000" # limiting daily churn rate.
remoteWrite.maxHourlySeries: "100000" # limiting the number of active time series.
To increase these limits and allow higher metrics cardinality, apply the following overrides:
Increasing the cardinality parameters will increase memory and CPU consumption and might cause OOMKills or CPU throttling. Use with caution, and increase the custom metrics agent and metrics server resources accordingly.
custom-metrics:
enabled: true
extraArgs:
remoteWrite.maxDailySeries: "<custom value>"
remoteWrite.maxHourlySeries: "<custom value>"
To increase the metrics server or custom metrics agent resources, use the following overrides:
custom-metrics:
resources:
limits:
cpu: <>
memory: <>
requests:
cpu: <>
memory: <>
victoria-metrics-single:
server:
resources:
limits:
cpu: <>
memory: <>
requests:
cpu: <>
memory: <>
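For example, a combined override that raises the cardinality limits and the resources together might look like this (all values are illustrative; size them to your workload):
custom-metrics:
  enabled: true
  extraArgs:
    remoteWrite.maxDailySeries: "2000000"    # illustrative value
    remoteWrite.maxHourlySeries: "200000"    # illustrative value
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
victoria-metrics-single:
  server:
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "2"
        memory: 4Gi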
Once the custom metrics module is up and running, you can access your metrics from groundcover's dashboards:
- Create a new dashboard
- Create a new panel
- Select the cluster with custom-metrics enabled
- Custom metrics will be available
