Scraping custom metrics
Let groundcover automatically scrape your custom metrics
groundcover can scrape your custom metrics by deploying a metrics scraper (vmagent by VictoriaMetrics) that automatically scrapes Prometheus targets.
vmagent is fully compatible with the Prometheus scrape job syntax; see the vmagent documentation for more details.
Enabling custom metrics scraping
The following Helm override enables custom metrics scraping:
custom-metrics:
  enabled: true
Using CLI
Scrape your custom metrics using groundcover CLI (using default scrape jobs):
groundcover deploy --custom-metrics
Using Helm
Using Helm (Upgrading existing installation)
helm upgrade groundcover -n groundcover -f <overrides.yaml> --reuse-values
Either create a new custom-values.yaml or edit your existing groundcover values.yaml:
global:
  logs:
    retention: 3d # {amount}[h(ours), d(ays), w(eeks), y(ears)]
  traces:
    retention: 24h # {amount}[h(ours), d(ays), w(eeks), y(ears)]
victoria-metrics-single:
  server:
    # -- Data retention period for metrics, {amount}[h(ours), d(ays), w(eeks), y(ears)], default is 7d
    retentionPeriod: 7d
Autodiscovery
Ensure that the Kubernetes resources that contain your Prometheus exporters are deployed with the following annotations to enable scraping:
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "<port>"
  prometheus.io/path: "<metrics-path>" # Optional, if the path is not /metrics
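As a rough sketch of where these annotations go: the kubernetes-pods scrape job reads pod annotations, so for a Deployment they belong in the pod template metadata rather than on the Deployment object itself. The name `my-exporter`, the image, and port `9100` below are hypothetical placeholders, not values taken from groundcover.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-exporter                 # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-exporter
  template:
    metadata:
      labels:
        app: my-exporter
      annotations:                  # pod-level annotations, read by the scraper
        prometheus.io/scrape: "true"
        prometheus.io/port: "9100"        # hypothetical port
        prometheus.io/path: "/metrics"    # optional, shown for completeness
    spec:
      containers:
        - name: exporter
          image: example.com/my-exporter:latest   # hypothetical image
          ports:
            - containerPort: 9100
```

Annotation values must be quoted strings, which is why the port is written as "9100" rather than a bare number.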
By default, the following scrape jobs are deployed when enabling custom-metrics:
Starting November 25th, 2024, the kubernetes-pods scrape job is the only scrape job enabled out of the box when custom metrics scraping is activated.
You can add the legacy scrape jobs back under the extraScrapeConfigs section, as described in Using custom scrape jobs.
custom-metrics:
  enabled: true
  config:
    scrape_configs:
      - job_name: "kubernetes-apiservers"
        honor_labels: true
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - source_labels:
              [
                __meta_kubernetes_namespace,
                __meta_kubernetes_service_name,
                __meta_kubernetes_endpoint_port_name,
              ]
            action: keep
            regex: default;kubernetes;https
      - job_name: "kubernetes-nodes"
        honor_labels: true
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/$1/proxy/metrics
      - job_name: "kubernetes-service-endpoints"
        honor_labels: true
        kubernetes_sd_configs:
          - role: endpointslices
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: drop
            source_labels: [__meta_kubernetes_pod_container_init]
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels:
              [
                __address__,
                __meta_kubernetes_service_annotation_prometheus_io_port,
              ]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
          - source_labels: [__meta_kubernetes_service_name]
            target_label: job
            replacement: ${1}
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node
      - job_name: "kubernetes-service-endpoints-slow"
        honor_labels: true
        scrape_interval: 5m
        scrape_timeout: 30s
        kubernetes_sd_configs:
          - role: endpointslices
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: drop
            source_labels: [__meta_kubernetes_pod_container_init]
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
            action: keep
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels:
              [
                __address__,
                __meta_kubernetes_service_annotation_prometheus_io_port,
              ]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
          - source_labels: [__meta_kubernetes_service_name]
            target_label: job
            replacement: ${1}
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node
      - job_name: "kubernetes-services"
        honor_labels: true
        metrics_path: /probe
        params:
          module: [http_2xx]
        kubernetes_sd_configs:
          - role: service
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
      - job_name: "kubernetes-pods"
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: drop
            source_labels: [__meta_kubernetes_pod_container_init]
            regex: true
          - source_labels:
              [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels:
              [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
          - source_labels: [__meta_kubernetes_service_name]
            target_label: job
            replacement: ${1}
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node
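Unlike kubernetes-pods, the legacy kubernetes-service-endpoints job keys on annotations of the Service object (the `__meta_kubernetes_service_annotation_prometheus_io_*` labels in its relabel rules). A minimal sketch of a Service that job would pick up, assuming the job has been added back; the name, selector, and port are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-exporter                 # hypothetical name
  annotations:
    prometheus.io/scrape: "true"    # matched by the "keep" rule
    prometheus.io/port: "9100"      # rewritten into __address__ (hypothetical port)
    prometheus.io/scheme: "http"    # optional; must match the regex (https?)
spec:
  selector:
    app: my-exporter
  ports:
    - name: metrics
      port: 9100
      targetPort: 9100
```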
Disable autodiscovery scrape jobs
To disable the autodiscovery scrape jobs, provide the override below.
Disabling custom-metrics scrape jobs allows you to scale the custom-metrics deployment horizontally.
custom-metrics:
  enabled: true
  config:
    scrape_configs: []
Using custom scrape jobs
To deploy custom scrape jobs, create or add the following override:
custom-metrics:
  enabled: true
  extraScrapeConfigs:
    - job_name: "custom-scrape-job"
      ...
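For illustration only, a filled-in job might look like the sketch below. It uses `static_configs` from the standard Prometheus scrape job syntax; the job name, target address, and interval are hypothetical and are not groundcover defaults.

```yaml
custom-metrics:
  enabled: true
  extraScrapeConfigs:
    - job_name: "my-static-exporter"    # hypothetical job name
      scrape_interval: 30s              # illustrative value
      metrics_path: /metrics
      static_configs:
        - targets:
            # hypothetical in-cluster address of an exporter
            - "my-exporter.my-namespace.svc.cluster.local:9100"
```

The same section is where a legacy job such as kubernetes-service-endpoints could be pasted back in, by copying its full definition from the list of scrape jobs.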
Cardinality limits
To safeguard groundcover's performance, default limits on metrics ingestion are in place:
remoteWrite.maxDailySeries: "1000000" # limiting daily churn rate.
remoteWrite.maxHourlySeries: "100000" # limiting the number of active time series.
Increasing metrics cardinality
To raise the cardinality limits, apply the overrides below.
Increasing the cardinality parameters increases memory and CPU consumption and might cause OOMKills or CPU throttling.
Use with caution, and increase the resources of the custom metrics agent and the metrics server accordingly.
custom-metrics:
  enabled: true
  extraArgs:
    remoteWrite.maxDailySeries: "<custom value>"
    remoteWrite.maxHourlySeries: "<custom value>"
To increase the resources of the metrics server or the custom metrics agent, use the following overrides:
custom-metrics:
  resources:
    limits:
      cpu: <>
      memory: <>
    requests:
      cpu: <>
      memory: <>
victoria-metrics-single:
  server:
    resources:
      limits:
        cpu: <>
        memory: <>
      requests:
        cpu: <>
        memory: <>
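The `<>` placeholders take standard Kubernetes resource quantities. The sketch below only shows the expected format; the numbers are illustrative assumptions, not sizing recommendations from groundcover, and should be chosen based on the observed usage of your own workloads.

```yaml
custom-metrics:
  resources:
    requests:
      cpu: 500m       # illustrative: half a CPU core
      memory: 1Gi     # illustrative
    limits:
      cpu: "1"        # illustrative: one full core
      memory: 2Gi     # illustrative
```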
Accessing the custom metrics
Once the custom metrics module is up and running, you can access the metrics through groundcover's dashboard panel.
Select the cluster with custom-metrics enabled.
The custom metrics will then be available.