Monitor YAML structure

While we strongly suggest building Monitors using our Wizard or Catalog, groundcover also supports building and editing Monitors using YAML. If you choose to do so, the definitions below provide what you need.

Monitor fields explained

In this section, you'll find a breakdown of the key fields used to define and configure Monitors within the groundcover platform. Each field plays a critical role in how a Monitor behaves, what data it tracks, and how it responds to specific conditions. Understanding these fields will help you set up effective Monitors to track performance, detect issues, and provide timely alerts.

Below is a detailed explanation of each field, along with examples to illustrate their usage, ensuring your team can manage and respond to incidents efficiently.

Each entry below gives the field name, followed by its explanation and, where available, an example.

Title

A string that defines the human-readable name of the Monitor. The title is what you will see in the list of all existing Monitors in the Monitors section.

Description

Additional information about the Monitor.

Severity

The severity level shown on the Monitor's issues when it is triggered. You can set any severity you want here.

s1 for Critical

s2 for High

s3 for Medium

s4 for Low

Header

This is the header of the generated issues from the Monitor.

A short string describing the condition that is being monitored. You can also use it as a pattern that includes labels from your query.

"HTTP API Error {{ alert.labels.return_code }}"
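To make the pattern behavior concrete, here is a minimal Python sketch of how a header pattern could be expanded with label values from a query result. It is an illustration only, not groundcover's implementation: the function name `render_header` is hypothetical, and the choice to render missing labels as an empty string is an assumption.

```python
import re

# Matches placeholders such as {{ alert.labels.return_code }},
# tolerating optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*alert\.labels\.(\w+)\s*\}\}")


def render_header(pattern: str, labels: dict[str, str]) -> str:
    """Replace each alert.labels placeholder with the matching label value.

    Assumption: a label missing from `labels` renders as an empty string.
    The platform's actual behavior for missing labels may differ.
    """
    return PLACEHOLDER.sub(lambda match: labels.get(match.group(1), ""), pattern)


header = render_header(
    "HTTP API Error {{ alert.labels.return_code }}",
    {"return_code": "500"},
)
print(header)  # HTTP API Error 500
```

With this pattern, each distinct `return_code` value returned by the query would produce its own issue header.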

ResourceHeaderLabels

A list of labels that help you identify the resources related to the Monitor. These appear as a secondary header in all Issues tables across the platform.

["span_name", "kind"] for monitors on protocol issues.

ContextHeaderLabels

A list of contextual labels that help you identify the location of the issue. These appear as a subset of the Issue’s labels and are displayed in all Issues tables across the platform.

["cluster", "namespace", "pod_name"]

Labels

A set of predefined labels that are attached to Issues generated by the Monitor. Labels can be static, or dynamic, taking their values from the Monitor's query results.

team: sre_team

ExecutionErrorState

Defines the actions that take place when a Monitor encounters query execution errors.

Valid options are Alerting, OK and Error.

  • When Alerting is set, query execution errors will result in a firing issue.

  • When Error is set, query execution errors will result in an error state.

  • When OK is set, query execution errors will do neither of the above. This is the default setting.

NoDataState

This defines what happens when queries in the Monitor return empty datasets.

Valid options are NoData, Alerting and OK.

  • When NoData is set, the monitor instance's state will be No Data.

  • When Alerting is set, the monitor instance's state will be Pending, and will change to Alerting once the Monitor's pending period ends.

  • When OK is set, the monitor instance's state will be Normal. This is the default setting.
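The three options above can be summarized as a small lookup. The following Python sketch restates the described behavior only; the function name `state_on_empty_result` and its `pending_period_over` parameter are hypothetical and are not part of the Monitor YAML.

```python
def state_on_empty_result(no_data_state: str = "OK",
                          pending_period_over: bool = False) -> str:
    """Return the monitor instance state when a query returns no data.

    Mirrors the option descriptions above; "OK" is the default setting.
    """
    if no_data_state == "NoData":
        return "No Data"
    if no_data_state == "Alerting":
        # Pending first, then Alerting once the pending period has ended.
        return "Alerting" if pending_period_over else "Pending"
    if no_data_state == "OK":
        return "Normal"
    raise ValueError(f"unknown NoDataState option: {no_data_state!r}")


print(state_on_empty_result("NoData"))                              # No Data
print(state_on_empty_result("Alerting"))                            # Pending
print(state_on_empty_result("Alerting", pending_period_over=True))  # Alerting
print(state_on_empty_result())                                      # Normal
```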

Interval

Defines how frequently the Monitor evaluates the conditions. Common intervals could be 1m, 5m, etc.

PendingFor

Defines how long the threshold condition must hold, across consecutive evaluation intervals, before the alert is triggered.
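As a rough illustration of how Interval and PendingFor interact, the Python sketch below walks through a series of evaluations and reports a state for each one. It is a simplified model under stated assumptions, not the platform's scheduler: the function name `evaluate_states` is hypothetical, durations are given in seconds, and a breach is assumed to reset as soon as one evaluation does not meet the condition.

```python
def evaluate_states(breaches: list[bool], interval_s: int, pending_for_s: int) -> list[str]:
    """Return one state per evaluation: Normal, Pending, or Alerting.

    breaches[i] is True when the threshold condition is met at evaluation i.
    Assumption: the alert fires once the condition has held continuously
    for at least pending_for_s seconds.
    """
    states = []
    held_for_s = None  # how long the condition has held without interruption
    for breached in breaches:
        if not breached:
            held_for_s = None  # a single non-breaching evaluation resets the count
            states.append("Normal")
            continue
        held_for_s = 0 if held_for_s is None else held_for_s + interval_s
        states.append("Alerting" if held_for_s >= pending_for_s else "Pending")
    return states


# Interval of 1m, PendingFor of 2m: the alert fires on the third consecutive breach.
print(evaluate_states([True, True, True, False, True], interval_s=60, pending_for_s=120))
# ['Pending', 'Pending', 'Alerting', 'Normal', 'Pending']
```

A longer PendingFor filters out short spikes at the cost of slower alerting.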

Trigger

Defines the condition under which the Monitor fires. It sets the Monitor's threshold using op (the operator) and value.

op: gt, value: 5

Model

Describes the queries, thresholds and data processing of the Monitor. It can have the following fields:

  • Queries: A list of one or more queries to run. Each query can be SQL over ClickHouse, PromQL over VictoriaMetrics, or SqlPipeline, and has a name by which it is referenced in the Monitor.

  • Thresholds: The thresholds of your Monitor. Each threshold has a name, an inputName that identifies its data input, an operator (one of gt, lt, within_range, outside_range), and an array of values holding the threshold values.
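The four threshold operators can be read as simple comparisons against the values array. The Python sketch below shows one plausible interpretation. The function name `threshold_met` is hypothetical, and two details are assumptions rather than documented behavior: gt and lt use the first element of values, and the range operators take a two-element [low, high] array with inclusive bounds.

```python
def threshold_met(operator: str, values: list[float], sample: float) -> bool:
    """Check a query result (sample) against a threshold definition.

    Assumptions: gt/lt compare against values[0]; within_range and
    outside_range expect values == [low, high], with inclusive bounds.
    """
    if operator == "gt":
        return sample > values[0]
    if operator == "lt":
        return sample < values[0]
    if operator in ("within_range", "outside_range"):
        low, high = values
        inside = low <= sample <= high
        return inside if operator == "within_range" else not inside
    raise ValueError(f"unknown operator: {operator!r}")


print(threshold_met("gt", [5], 6))                   # True
print(threshold_met("lt", [5], 6))                   # False
print(threshold_met("within_range", [10, 20], 15))   # True
print(threshold_met("outside_range", [10, 20], 15))  # False
```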

measurementType

Describes how issues of this Monitor are presented. Some Monitors count events, while others track a state, and the two are displayed differently in our dashboards.

  • state - Issues are presented in a line chart.

  • event - Issues are presented in a bar chart that counts events.

Monitor YAML Examples

Traces Based Monitors

MySQL Query Errors Monitor

gRPC API Errors Monitor

Log Based Monitors

High Error Log Rate Monitor
