Metrics Aggregation
What is Metrics Aggregation?
Metrics aggregation is a real-time data processing feature that automatically transforms and summarizes metrics as they are ingested into the system. Instead of storing every individual metric sample with all its labels, aggregation rules combine multiple metric samples into aggregated values, reducing storage requirements and improving query performance.
groundcover applies aggregation rules to metrics in real time as they flow through the system. This allows you to:
Reduce storage costs: By aggregating metrics and removing unnecessary labels, you store fewer unique time series.
Improve query performance: Aggregated metrics are pre-computed, making queries faster.
Simplify metric names: Transform complex metric names into cleaner, more consistent formats.
Remove unneeded granularity: Drop labels that aren't needed for your use case (e.g., node-level labels when you only care about cluster-level metrics).
To manage the metrics aggregation rules, go to the dedicated page in settings.
Metrics aggregation rules can only be edited by account Admins.
Metrics aggregation rule format
Each aggregation rule specifies:
Which metrics to match: Using label selectors and metric name patterns
Which labels to remove: Using the without parameter to drop specific labels
How to aggregate: Using output functions like total_prometheus, avg, last, or count_series
How often to aggregate: Using the interval parameter to control aggregation frequency
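The four parts of a rule can be pictured as a small data structure. The following is a minimal Python sketch, not groundcover's configuration syntax: the class name AggregationRule, the field names match, outputs, and interval_seconds, and the validate helper are all hypothetical. Only the without and interval parameters and the four output function names come from the description above.

```python
from dataclasses import dataclass, field

# Output function names listed in the rule format description.
KNOWN_OUTPUTS = {"total_prometheus", "avg", "last", "count_series"}


@dataclass
class AggregationRule:
    """Hypothetical in-memory model of one aggregation rule."""

    match: list[str]                                   # which metrics to match (assumed field name)
    without: list[str] = field(default_factory=list)   # labels to remove
    outputs: list[str] = field(default_factory=list)   # how to aggregate (assumed field name)
    interval_seconds: int = 60                         # how often to aggregate (illustrative default)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the rule looks sane."""
        problems = []
        if not self.match:
            problems.append("rule must match at least one metric")
        if not self.outputs:
            problems.append("rule needs at least one output function")
        unknown = [o for o in self.outputs if o not in KNOWN_OUTPUTS]
        if unknown:
            problems.append(f"unknown output functions: {unknown}")
        if self.interval_seconds <= 0:
            problems.append("interval must be positive")
        return problems


rule = AggregationRule(
    match=["groundcover_network.+"],
    without=["node_name", "node"],
    outputs=["total_prometheus"],
    interval_seconds=30,
)
print(rule.validate())
```

Checking a rule before saving it, as validate does here, is one way to catch a misspelled output function early. The real settings page may validate rules differently.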
A full reference of the configuration options can be found here.
groundcover has a set of defaults installed. You can always revert to the defaults provided by groundcover by clicking on Restore Defaults.
Walkthrough: Counter Metrics Aggregation Example
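Let's walk through one of the default aggregation rules to understand how it works. The rule matches a set of groundcover counter and network metrics, removes the node_name and node labels, applies the total_prometheus output function every 30 seconds, and then cleans up the resulting metric name.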
Breaking Down the Rule
1. Match Criteria
This rule matches metrics where:
The metric name matches one of these patterns:
groundcover_.+_.+_counter - Any metric ending with _counter that has the pattern groundcover_*_*_counter
groundcover_unavailable_count - The specific unavailable count metric
groundcover_network.+ - Any metric starting with groundcover_network
The metric has a node_name label (the node_name!="" condition)
Example metrics that would match:
groundcover_http_request_counter{node_name="node-1", namespace="default", service="api"}
groundcover_unavailable_count{node_name="node-2", namespace="production"}
groundcover_network_bytes_sent{node_name="node-3", pod="web-123"}
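The match criteria can be checked by hand with a short Python sketch. It assumes the name patterns are fully anchored regular expressions, as in Prometheus-style label matchers, which is why it uses fullmatch. The function name rule_matches is hypothetical.

```python
import re

# Name patterns taken from the match criteria above.
NAME_PATTERNS = [
    r"groundcover_.+_.+_counter",
    r"groundcover_unavailable_count",
    r"groundcover_network.+",
]
# Combine the patterns into one alternation; fullmatch anchors both ends.
NAME_RE = re.compile("|".join(f"(?:{p})" for p in NAME_PATTERNS))


def rule_matches(name: str, labels: dict[str, str]) -> bool:
    """True if the metric name fits a pattern and node_name is non-empty."""
    name_ok = NAME_RE.fullmatch(name) is not None
    has_node = labels.get("node_name", "") != ""   # the node_name!="" condition
    return name_ok and has_node


examples = [
    ("groundcover_http_request_counter", {"node_name": "node-1", "namespace": "default"}),
    ("groundcover_unavailable_count", {"node_name": "node-2", "namespace": "production"}),
    ("groundcover_network_bytes_sent", {"node_name": "node-3", "pod": "web-123"}),
    ("groundcover_http_request_counter", {"namespace": "default"}),  # no node_name label
]
for name, labels in examples:
    print(name, rule_matches(name, labels))
```

The first three examples match. The last one does not, because it has no node_name label.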
2. Label Removal
The without parameter removes the node_name and node labels from the aggregated metrics. This means:
Before aggregation: You might have separate time series for each node:
groundcover_http_request_counter{node_name="node-1", namespace="default", service="api"} = 100
groundcover_http_request_counter{node_name="node-2", namespace="default", service="api"} = 150
groundcover_http_request_counter{node_name="node-3", namespace="default", service="api"} = 75
After aggregation: These are combined into a single time series without node labels:
groundcover_http_request_counter{namespace="default", service="api"} = 325 (100 + 150 + 75)
This is useful when you don't need node-level granularity and want to see cluster-wide or namespace-wide metrics.
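The effect of dropping labels can be reproduced in a few lines of Python. This is an illustrative sketch only: the function name drop_labels_and_sum is hypothetical, and it models a single snapshot of values rather than a stream of samples.

```python
from collections import defaultdict


def drop_labels_and_sum(samples, without):
    """Group samples by their remaining labels and sum the values per group."""
    totals = defaultdict(float)
    for labels, value in samples:
        # Series that differ only in dropped labels collapse to the same key.
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in without))
        totals[kept] += value
    return dict(totals)


samples = [
    ({"node_name": "node-1", "namespace": "default", "service": "api"}, 100),
    ({"node_name": "node-2", "namespace": "default", "service": "api"}, 150),
    ({"node_name": "node-3", "namespace": "default", "service": "api"}, 75),
]
result = drop_labels_and_sum(samples, without={"node_name", "node"})
for key, total in result.items():
    print(dict(key), total)
```

Three input series become one output series, because the only label that distinguished them was node_name.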
3. Aggregation Interval
The aggregation runs every 30 seconds. This means:
Metrics are collected for 30 seconds
At the end of each 30-second window, the aggregation function is applied
The aggregated result is stored as a new metric
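The windowing described above can be sketched as follows. The sketch assumes windows are aligned to multiples of the interval, which the text does not state. Both helper names are hypothetical.

```python
def window_start(timestamp: float, interval: float = 30.0) -> float:
    """Start of the window containing timestamp (assumes aligned windows)."""
    return timestamp - (timestamp % interval)


def bucket_by_window(samples, interval: float = 30.0):
    """Group (timestamp, value) samples by the window they fall into."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(window_start(ts, interval), []).append(value)
    return buckets


# Timestamps are seconds; values are arbitrary sample values.
samples = [(0, 1), (12, 2), (29.9, 3), (30, 4), (61, 5)]
buckets = bucket_by_window(samples)
for start, values in sorted(buckets.items()):
    print(start, values)
```

Each bucket holds the samples collected during one 30-second window. The aggregation function would then run once per bucket.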
4. Output Function
The total_prometheus output function:
For counter metrics: Sums all the counter values across the matching time series
Creates a new aggregated metric with the combined value
Maintains Prometheus-compatible counter behavior
Example: If you have three nodes each reporting:
Node 1: groundcover_http_request_counter{node_name="node-1", namespace="api"} = 1000
Node 2: groundcover_http_request_counter{node_name="node-2", namespace="api"} = 2000
Node 3: groundcover_http_request_counter{node_name="node-3", namespace="api"} = 1500
After aggregation, you get:
groundcover_http_request_counter{namespace="api"} = 4500 (1000 + 2000 + 1500)
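To compare the output functions named in this guide, here is a deliberately simplified Python sketch over the samples of one window. The semantics are assumptions made for illustration: total_prometheus is modeled as the sum of the latest value of each series, matching the example above. The sketch does not attempt to model counter resets or any state kept between windows. The function name aggregate is hypothetical.

```python
def aggregate(output: str, samples: list[tuple[str, float]]) -> float:
    """Apply a simplified output function to (series_id, value) samples in arrival order."""
    if not samples:
        raise ValueError("no samples in window")
    if output == "total_prometheus":
        # Simplification: sum the latest value reported by each series.
        latest = {}
        for series_id, value in samples:
            latest[series_id] = value
        return sum(latest.values())
    if output == "avg":
        return sum(value for _, value in samples) / len(samples)
    if output == "last":
        return samples[-1][1]
    if output == "count_series":
        return len({series_id for series_id, _ in samples})
    raise ValueError(f"unsupported output function: {output}")


window = [("node-1", 1000), ("node-2", 2000), ("node-3", 1500)]
for name in ("total_prometheus", "avg", "last", "count_series"):
    print(name, aggregate(name, window))
```

For the three-node example, the simplified total is 4500, the average is 1500, and the series count is 3.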
5. Metric Name Transformation
The relabeling step of this rule cleans up the metric name by:
Matching metric names that contain a colon (:)
Extracting everything before the colon
Replacing the metric name with just the prefix
Example:
Original: groundcover_http_request_counter:total_prometheus
After relabeling: groundcover_http_request_counter
This ensures the aggregated metric has a clean, consistent name without the aggregation function suffix.
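The name cleanup can be sketched with a regular expression in Python. The exact expression used by the default rule is not shown here, so the pattern below is an assumption that reproduces the described behavior. The function name strip_output_suffix is hypothetical.

```python
import re

# Assumed pattern: capture everything before the first colon, require a suffix after it.
SUFFIX_RE = re.compile(r"([^:]+):.+")


def strip_output_suffix(name: str) -> str:
    """Return the part before the colon, or the name unchanged if it has no colon."""
    match = SUFFIX_RE.fullmatch(name)
    return match.group(1) if match else name


print(strip_output_suffix("groundcover_http_request_counter:total_prometheus"))
print(strip_output_suffix("groundcover_unavailable_count"))
```

Names without a colon pass through untouched, so the step is safe to apply to every aggregated metric.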
