Scraping custom metrics

Let groundcover automatically scrape your custom metrics

groundcover can scrape your custom metrics by deploying a metrics scraper (vmagent by Victoria Metrics) that will automatically scrape prometheus targets.

vmagent is fully compatible with prometheus scrape job syntax - more can be found here.

Enabling custom metrics scraping

The following helm override enables custom metrics scraping

custom-metrics:
  enabled: true

Using CLI

Scrape your custom metrics using groundcover CLI (using default scrape jobs):

groundcover deploy --custom-metrics

Using Helm

Using Helm (Upgrading existing installation)

Either create a new custom-values.yaml or edit your existing groundcover values.yaml

helm upgrade groundcover groundcover/groundcover -n groundcover -f values.yaml --reuse-values

Autodiscovery

Ensure that the Kubernetes resources that contain your Prometheus exporters have been deployed with the following annotations to enable scraping

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "<port>"
  prometheus.io/path: "<metrics-path>" // Optional, if path is not /metrics

By default, the following scrape jobs are deployed when enabling custom-metrics:

custom-metrics:
  enabled: true
  config:
    scrape_configs:
      - job_name: "kubernetes-apiservers"
        honor_labels: true  
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - source_labels:
              [
                __meta_kubernetes_namespace,
                __meta_kubernetes_service_name,
                __meta_kubernetes_endpoint_port_name,
              ]
            action: keep
            regex: default;kubernetes;https
      - job_name: "kubernetes-nodes"
        honor_labels: true
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/$1/proxy/metrics
      - job_name: "kubernetes-service-endpoints"
        honor_labels: true
        kubernetes_sd_configs:
          - role: endpointslices
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: drop
            source_labels: [__meta_kubernetes_pod_container_init]
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels:
              [
                __address__,
                __meta_kubernetes_service_annotation_prometheus_io_port,
              ]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
          - source_labels: [__meta_kubernetes_service_name]
            target_label: job
            replacement: ${1}
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node
      - job_name: "kubernetes-service-endpoints-slow"
        honor_labels: true
        scrape_interval: 5m
        scrape_timeout: 30s
        kubernetes_sd_configs:
          - role: endpointslices
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: drop
            source_labels: [__meta_kubernetes_pod_container_init]
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
            action: keep
            regex: true
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels:
              [
                __address__,
                __meta_kubernetes_service_annotation_prometheus_io_port,
              ]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
          - source_labels: [__meta_kubernetes_service_name]
            target_label: job
            replacement: ${1}
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node
      - job_name: "kubernetes-services"
        honor_labels: true
        metrics_path: /probe
        params:
          module: [http_2xx]
        kubernetes_sd_configs:
          - role: service
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - source_labels:
              [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
      - job_name: "kubernetes-pods"
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - action: drop
            source_labels:
              - __meta_kubernetes_pod_label_app_kubernetes_io_part_of
            regex: groundcover
          - action: drop
            source_labels: [__meta_kubernetes_pod_container_init]
            regex: true
          - source_labels:
              [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels:
              [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: service
          - source_labels: [__meta_kubernetes_service_name]
            target_label: job
            replacement: ${1}
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node

Disable autodiscovery scrape jobs

In case you're interested in disabling autodiscovery scrape jobs, provide the below override

Disabling custom-metrics scrape jobs allows you to scale the custom-metrics deployment horizontally.

custom-metrics:
  enabled: true
  config:
    scrape_configs: []

Using custom scrape jobs

in case you're interested in deploying custom scrape jobs, create/add the following override

custom-metrics:
  enabled: true
  extraScrapeConfigs:
  - job_name: "custom-scrape-job"
  ...

Cardinality limits

In order to safeguard groundcover's performance, there are default limitations on metrics ingestion in place.

remoteWrite.maxDailySeries: "1000000" # limiting daily churn rate.
remoteWrite.maxHourlySeries: "100000" # limiting the number of active time series.

Increasing metrics cardinality

In order to increase metrics resolution, you can implement the following overrides

custom-metrics:
  enabled: true
  extraArgs:
    remoteWrite.maxDailySeries: "<custom value>"
    remoteWrite.maxHourlySeries: "<custom value>"

In case you wish to increase metrics server / custom metrics resources, use the following overrides:

custom-metrics:
  resources:
    limits:
      cpu: <>
      memory: <>
    requests:
      cpu: <>
      memory: <>

Last updated