# Monitor YAML structure

While we strongly suggest building monitors using our [Wizard](/use-groundcover/monitors/create-a-new-monitor.md#using-the-monitor-wizard) or [Catalog](/use-groundcover/monitors/monitor-catalog-page.md), groundcover also supports building and editing monitors directly in YAML. This page documents the current schema.

For the query language used inside monitor queries, see the [gcQL Reference](/use-groundcover/querying-your-groundcover-data/groundcover-query-language/groundcover-query-language-gcql-reference.md). For monitors that use the ClickHouse SQL escape hatch, see [SQL Based Monitors](/use-groundcover/monitors/sql-based-monitors.md).

## Top-level fields

<table data-full-width="true"><thead><tr><th width="220">Field</th><th>Description</th><th width="180">Allowed values</th></tr></thead><tbody><tr><td><strong>title</strong> <em>(required)</em></td><td>Human-readable name of the monitor. Shown in the Monitor List.</td><td>string</td></tr><tr><td><strong>display</strong></td><td>Display settings controlling how issues from this monitor are rendered. See <a href="#display">Display</a>.</td><td>object</td></tr><tr><td><strong>severity</strong></td><td>Severity reported on firing issues.</td><td><code>S1</code>, <code>S2</code>, <code>S3</code>, <code>S4</code></td></tr><tr><td><strong>measurementType</strong></td><td>Controls how issues are visualized. <code>state</code> renders as a line chart; <code>event</code> renders as a bar chart counting events.</td><td><code>state</code>, <code>event</code></td></tr><tr><td><strong>model</strong> <em>(required)</em></td><td>Queries, reducers, and thresholds that define what the monitor evaluates. See <a href="#model">Model</a>.</td><td>object</td></tr><tr><td><strong>labels</strong></td><td>Static or templated labels attached to the issue. Values can reference query results via <code>{{ $values.&#x3C;threshold_name>.Labels.&#x3C;key> }}</code>.</td><td><code>map&#x3C;string,string></code></td></tr><tr><td><strong>annotations</strong></td><td>Annotations attached to the alert, often used to wire monitors into workflows.</td><td><code>map&#x3C;string,string></code></td></tr><tr><td><strong>category</strong></td><td>Free-form category used for grouping in the Monitor List.</td><td>string</td></tr><tr><td><strong>executionErrorState</strong></td><td>State when query execution fails. <code>OK</code> suppresses; <code>Error</code> raises an error state; <code>Alerting</code> fires an issue.</td><td><code>OK</code>, <code>Error</code>, <code>Alerting</code></td></tr><tr><td><strong>noDataState</strong></td><td>State when the query returns no rows. 
<code>OK</code> stays normal; <code>NoData</code> enters a No Data state; <code>Alerting</code> fires.</td><td><code>OK</code>, <code>NoData</code>, <code>Alerting</code></td></tr><tr><td><strong>evaluationInterval</strong></td><td>Evaluation cadence and pending window. See <a href="#evaluationinterval">EvaluationInterval</a>.</td><td>object</td></tr><tr><td><strong>notificationSettings</strong></td><td>How alerts are delivered. See <a href="#notificationsettings">NotificationSettings</a>.</td><td>object</td></tr><tr><td><strong>isPaused</strong></td><td>When true, the monitor is defined but not evaluated.</td><td>boolean</td></tr></tbody></table>
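
As a rough illustration of how the top-level fields fit together, here is a minimal sketch of a complete monitor. The field names come from the table above; the values (category, label keys, annotation key) are hypothetical placeholders, and the templated label assumes the query groups by `workload` and that the threshold is named `threshold_1`.

```yaml
title: Example Monitor
severity: S3
measurementType: event
category: Example Category            # hypothetical category name
labels:
  team: platform                      # static label (hypothetical)
  # Templated label: reads the `workload` label from the threshold named threshold_1
  workload: "{{ $values.threshold_1.Labels.workload }}"
annotations:
  example_key: example_value          # hypothetical annotation
executionErrorState: Error            # raise an error state if the query fails
noDataState: NoData                   # enter a No Data state when no rows return
isPaused: true                        # defined but not evaluated
model:
  queries:
    - name: threshold_input_query
      dataType: logs
      expression: >
        container:sensor level:in(panic, fatal)
        | stats by (env, cluster, namespace, workload, pod) count() count_all_result
      instantRollup: 5 minutes
  thresholds:
    - name: threshold_1
      inputName: threshold_input_query
      operator: gt
      values:
        - 0
evaluationInterval:
  interval: 5m
  pendingFor: 0s
```

Templated values that begin with `{{` are quoted so that YAML does not parse them as a flow mapping.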

## display

<table data-full-width="true"><thead><tr><th width="260">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>header</strong></td><td>Template for the issue header. Supports alert label substitution, e.g. <code>"gRPC API Error {{ labels.status_code }}"</code>.</td></tr><tr><td><strong>description</strong></td><td>Template for the issue description. Supports the same substitutions as <code>header</code>.</td></tr><tr><td><strong>resourceHeaderLabels</strong></td><td>List of labels identifying the <em>resource</em> the issue relates to. Rendered as the secondary header across Issues tables. Example: <code>["span_name", "role"]</code>.</td></tr><tr><td><strong>contextHeaderLabels</strong></td><td>List of labels identifying the <em>location</em> of the issue. Rendered as a subset of the issue's labels. Example: <code>["cluster", "namespace", "workload"]</code>.</td></tr><tr><td><strong>templateLanguage</strong></td><td>Template engine for <code>header</code> and <code>description</code>. Set to <code>jinja2</code> to opt into Jinja2 syntax (enables blocks and filters). Omit for the default Go-template syntax.</td></tr></tbody></table>
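
The fragment below is a small sketch of a `display` block that opts into Jinja2 and uses two standard Jinja2 filters (`upper` and `default`). It assumes the monitor's query groups by `env`, `cluster`, and `workload`, so that those labels exist on the issue; the header wording is illustrative only.

```yaml
display:
  templateLanguage: jinja2
  # Quoted because the value starts with "{{"
  header: "{{ labels.workload | upper }} error spike"
  description: |-
    Errors detected for this workload.
    Environment: {{ labels.env | default("unknown") }}
    Cluster: {{ labels.cluster }}
  resourceHeaderLabels:
    - workload
  contextHeaderLabels:
    - env
    - cluster
```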

## model

<table data-full-width="true"><thead><tr><th width="200">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>queries</strong> <em>(required)</em></td><td>One or more queries that produce the data the monitor evaluates. See <a href="#modelqueries">model.queries</a>.</td></tr><tr><td><strong>reducers</strong></td><td>Aggregations applied on top of queries before thresholds run. See <a href="#modelreducers">model.reducers</a>.</td></tr><tr><td><strong>thresholds</strong></td><td>Conditions evaluated against a query or reducer output. See <a href="#modelthresholds">model.thresholds</a>.</td></tr></tbody></table>

### model.queries

Each query describes one data source and one expression. The combination of `dataType` and the query body determines which engine runs the query.

<table data-full-width="true"><thead><tr><th width="220">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>name</strong> <em>(required)</em></td><td>Identifier used by reducers and thresholds to reference this query's output.</td></tr><tr><td><strong>dataType</strong></td><td>Data source for a gcQL query. One of <code>logs</code>, <code>traces</code>, <code>events</code>. <strong>Omit <code>dataType</code> for MetricsQL queries</strong>; see the <a href="#query-engine-field-compatibility">field compatibility</a> note.</td></tr><tr><td><strong>expression</strong></td><td><p>The query itself. The language depends on <code>dataType</code>:</p><ul><li><strong>gcQL</strong> for <code>logs</code>, <code>traces</code>, <code>events</code>. See the <a href="/use-groundcover/querying-your-groundcover-data/groundcover-query-language/groundcover-query-language-gcql-reference.md">gcQL Reference</a>.</li><li><a href="https://docs.victoriametrics.com/metricsql/"><strong>MetricsQL</strong></a> when <code>dataType</code> is omitted. MetricsQL is VictoriaMetrics' query language and is backwards-compatible with PromQL, with extra functions (<code>topk_last</code>, <code>rollup_rate</code>, etc.). It does <em>not</em> use the pipe (<code>|</code>) operator; combine operations with nested functions or arithmetic.</li></ul></td></tr><tr><td><strong>datasourceType</strong></td><td>Required for MetricsQL queries. Set to <code>prometheus</code> (the name refers to the Prometheus-compatible API served by the metrics backend).</td></tr><tr><td><strong>queryType</strong></td><td>Required for MetricsQL queries. Use <code>instant</code>.</td></tr><tr><td><strong>filters</strong></td><td>Optional standalone gcQL filter expression, used alongside <code>sqlPipeline</code>. For modern monitors, filters belong directly inside <code>expression</code>.</td></tr><tr><td><strong>relativeTimerange</strong></td><td>Time window relative to evaluation time. Object with <code>from</code> and optional <code>to</code> durations (e.g. <code>from: 5m</code>).</td></tr><tr><td><strong>instantRollup</strong></td><td>Bucket size for <strong>gcQL</strong> queries (logs/traces/events), e.g. <code>1 minutes</code>, <code>5 minutes</code>. Controls the time granularity the monitor evaluates over.</td></tr><tr><td><strong>rollup</strong></td><td><strong>Required</strong> for <strong>MetricsQL</strong> queries (whenever <code>datasourceType: prometheus</code> is set). Server-side rollup. Object with <code>function</code> (<code>avg</code>, <code>max</code>, <code>min</code>, <code>sum</code>, <code>count</code>, <code>stddev</code>, <code>stdvar</code>, <code>last</code>) and <code>time</code> (duration). Not interchangeable with <code>instantRollup</code>.</td></tr><tr><td><strong>sqlPipeline</strong></td><td><em>Legacy.</em> Structured SQL pipeline object. Still accepted for backwards compatibility but new monitors should use <code>expression</code> with gcQL.</td></tr></tbody></table>
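
For illustration, a gcQL query that sets an explicit `relativeTimerange` might look like the sketch below. The query name and expression are placeholders; only `from` is set, since `to` is optional.

```yaml
model:
  queries:
    - name: threshold_input_query
      dataType: traces
      expression: >
        workload:checkout status_code:>=500
        | stats by (env, cluster, namespace, workload) count() errors_total
      instantRollup: 1 minutes
      relativeTimerange:
        from: 5m        # window starts 5 minutes before evaluation time
```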

#### Query-engine field compatibility

The query fields are tied to the query engine, and several of them must appear **together**. Mixing them across engines is the most common cause of a rejected monitor.

<table data-header-hidden data-full-width="true"><thead><tr><th></th><th></th><th></th></tr></thead><tbody><tr><td>Engine</td><td>Required fields</td><td>Must <em>not</em> be set</td></tr><tr><td><strong>gcQL</strong><br>(<code>dataType</code> = <code>logs</code>, <code>traces</code>, <code>events</code>)</td><td><code>dataType</code>, <code>expression</code>, <code>instantRollup</code></td><td><code>datasourceType</code>, <code>queryType</code>, <code>rollup</code></td></tr><tr><td><strong>MetricsQL</strong><br>(no <code>dataType</code>)</td><td><code>expression</code>, <code>datasourceType: prometheus</code>, <code>queryType: instant</code>, <code>rollup</code></td><td><code>dataType</code>, <code>instantRollup</code></td></tr></tbody></table>

Key rules the backend enforces:

* `datasourceType: prometheus` **requires** `rollup` (object with `function` + `time`). Omitting it is rejected with *"rollup is required for prometheus datasource type"*.
* For MetricsQL, **omit `dataType`** — `datasourceType: prometheus` together with the MetricsQL `expression` and `rollup` is all you need.
* `rollup` (MetricsQL) and `instantRollup` (gcQL) are **not** interchangeable — pick the one for your engine.
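
The rules above can be sketched as two query entries, one per engine. They appear in the same list only for comparison; this page does not state whether a single monitor may mix both engines, so treat that as an open assumption. Query names are hypothetical.

```yaml
model:
  queries:
    # gcQL engine: dataType + expression + instantRollup.
    # datasourceType, queryType, and rollup must NOT be set.
    - name: grpc_errors
      dataType: traces
      expression: >
        span_type:grpc status_code:!=0 status:error source:ebpf
        | stats by (env, cluster, namespace, workload) count() errors_total
      instantRollup: 1 minutes

    # MetricsQL engine: no dataType.
    # expression + datasourceType + queryType + rollup are all required;
    # instantRollup must NOT be set.
    - name: crash_looping
      expression: >
        avg by (env, cluster, namespace, workload) (
          groundcover_kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"}
        )
      datasourceType: prometheus
      queryType: instant
      rollup:
        function: avg
        time: 5m
```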

### model.reducers

Reducers aggregate a query's output into a single value (or per-group value) before thresholds run. This is how you turn a timeseries into a single number to compare against a threshold.

<table data-full-width="true"><thead><tr><th width="200">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>name</strong> <em>(required)</em></td><td>Identifier used by thresholds.</td></tr><tr><td><strong>inputName</strong></td><td>Name of the query (or another reducer) to read from. Required unless <code>type: math</code>.</td></tr><tr><td><strong>type</strong> <em>(required)</em></td><td>One of <code>last</code>, <code>min</code>, <code>max</code>, <code>mean</code>, <code>sum</code>, <code>count</code>, <code>math</code>.</td></tr><tr><td><strong>expression</strong></td><td>Required when <code>type: math</code>. An arithmetic expression over reducer outputs, e.g. <code>$errors / $total * 100</code>.</td></tr><tr><td><strong>relativeTimerange</strong></td><td>Optional time window specific to this reducer.</td></tr></tbody></table>
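
As a sketch of how a `math` reducer could combine other reducers, the fragment below computes an error percentage from two queries. It assumes that `$errors` and `$total` refer to reducers by their `name`, which is inferred from the example expression in the table; the query names, expressions, and the 5% threshold are illustrative.

```yaml
model:
  queries:
    - name: errors_query
      dataType: traces
      expression: >
        workload:checkout status_code:>=500
        | stats by (cluster, workload) count() errors_total
      instantRollup: 1 minutes
    - name: total_query
      dataType: traces
      expression: >
        workload:checkout
        | stats by (cluster, workload) count() requests_total
      instantRollup: 1 minutes
  reducers:
    - name: errors
      inputName: errors_query
      type: sum
    - name: total
      inputName: total_query
      type: sum
    - name: error_rate_pct
      type: math                      # no inputName needed for math reducers
      expression: $errors / $total * 100
  thresholds:
    - name: threshold_1
      inputName: error_rate_pct
      operator: gt
      values:
        - 5
```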

### model.thresholds

Thresholds are the final condition that determines whether the monitor fires.

<table data-full-width="true"><thead><tr><th width="200">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>name</strong> <em>(required)</em></td><td>Identifier for this threshold.</td></tr><tr><td><strong>inputName</strong> <em>(required)</em></td><td>Name of the query or reducer this threshold evaluates.</td></tr><tr><td><strong>operator</strong> <em>(required)</em></td><td>One of <code>gt</code>, <code>lt</code>, <code>gte</code>, <code>lte</code>, <code>eq</code>, <code>neq</code>, <code>within_range</code>, <code>outside_range</code>, <code>within_range_included</code>, <code>outside_range_included</code>.</td></tr><tr><td><strong>values</strong> <em>(required)</em></td><td>Array of numbers. One value for comparison operators; two values for <code>within_range</code> / <code>outside_range</code> and their inclusive variants.</td></tr><tr><td><strong>relativeTimerange</strong></td><td>Optional time window specific to this threshold.</td></tr></tbody></table>
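
Range operators take two values instead of one. The sketch below assumes the values are given as lower bound first, then upper bound (the ordering is not stated on this page), and that a reducer named `latency_mean` exists; both are hypothetical.

```yaml
model:
  thresholds:
    - name: latency_out_of_band
      inputName: latency_mean         # hypothetical reducer name
      operator: outside_range
      values:
        - 100                         # assumed lower bound
        - 500                         # assumed upper bound
```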

## evaluationInterval

<table data-full-width="true"><thead><tr><th width="200">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>interval</strong></td><td>How often the monitor is evaluated, e.g. <code>1m</code>, <code>5m</code>.</td></tr><tr><td><strong>pendingFor</strong></td><td>Duration the threshold must hold before the issue transitions from Pending to Alerting. Use <code>0s</code> to fire immediately.</td></tr></tbody></table>

## notificationSettings

<table data-full-width="true"><thead><tr><th width="240">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>method</strong></td><td>How notifications are delivered. <code>notificationRoutes</code> uses the matching routes defined in <a href="/pages/iJQsKADR1EaNDH0bsmEp">Notification Routes</a>; <code>connectedApps</code> sends directly to the apps listed in <code>connectedApps</code>, bypassing routes; <code>noNotifications</code> suppresses all notifications for this monitor.</td></tr><tr><td><strong>connectedApps</strong></td><td>List of Destination IDs. Used with <code>method: connectedApps</code>. (The YAML field name <code>connectedApps</code> is retained for backward compatibility; in the UI these are called <a href="/pages/fo2WdWlqpxCCjttUSWFa">Destinations</a>.)</td></tr><tr><td><strong>connectedAppParams</strong></td><td>Per-destination delivery options keyed by Destination ID. For Slack App connectors, use <code>channels</code> to set the target channels — see <a href="/pages/Y2tO4Ei086NA8DhVmBzS">Slack</a>. Example: <code>{ "&#x3C;slack-app-id>": { "channels": [{ "id": "C123456", "name": "#alerts" }] } }</code>. Used with <code>method: connectedApps</code>.</td></tr><tr><td><strong>renotificationInterval</strong></td><td>Duration between repeat notifications while the issue remains firing, e.g. <code>4h</code>.</td></tr><tr><td><strong>disableRenotification</strong></td><td>When true, suppresses repeat notifications.</td></tr><tr><td><strong>statusFilters</strong></td><td>List of issue statuses that trigger notifications. Allowed values: <code>Alerting</code>, <code>Resolved</code>. Used with <code>method: connectedApps</code>. Note: the monitor wizard labels these as <strong>Firing</strong> and <strong>Resolved</strong> — <code>Firing</code> in the UI corresponds to <code>Alerting</code> in YAML.</td></tr></tbody></table>
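
The two methods not tied to specific destinations can be sketched as follows. The blocks are alternatives, separated by a YAML document marker, and the renotification value is illustrative.

```yaml
# Option A: deliver through the matching Notification Routes
notificationSettings:
  method: notificationRoutes
  renotificationInterval: 4h
---
# Option B: suppress all notifications for this monitor
notificationSettings:
  method: noNotifications
```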

## Examples

### Traces monitor (gcQL)

Fires when gRPC traces return a non-zero status code.

```yaml
title: gRPC API Errors Monitor
display:
  header: gRPC API Error {{ labels.status_code }}
  description: |-
    This monitor detects gRPC API errors by identifying responses with a status code indicating failure.
    Cluster: {{ labels.cluster }}
    Namespace: {{ labels.namespace }}
    Workload: {{ labels.workload }}
    Span: {{ labels.span_name }}
  resourceHeaderLabels:
    - span_name
    - role
  contextHeaderLabels:
    - env
    - cluster
    - namespace
    - workload
  templateLanguage: jinja2
severity: S3
measurementType: event
model:
  queries:
    - name: threshold_input_query
      dataType: traces
      expression: >
        span_type:grpc status_code:!=0 status:error source:ebpf
        | stats by (env, cluster, namespace, workload, status_code, span_name, role) count() errors_total
      instantRollup: 1 minutes
  thresholds:
    - name: threshold_1
      inputName: threshold_input_query
      operator: gt
      values:
        - 0
executionErrorState: OK
noDataState: OK
evaluationInterval:
  interval: 1m
  pendingFor: 0s
```

### Logs monitor (gcQL)

Fires when sensor logs contain panic or fatal errors.

```yaml
title: Sensor Panic / Fatal Errors
display:
  header: Sensor Panic or Fatal Errors
  description: |-
    Detects panic or fatal errors in sensor logs.
    Cluster: {{ labels.cluster }}
    Namespace: {{ labels.namespace }}
    Pod: {{ labels.pod }}
  contextHeaderLabels:
    - pod
    - workload
    - cluster
    - env
    - namespace
severity: S2
measurementType: event
model:
  queries:
    - name: threshold_input_query
      dataType: logs
      expression: >
        container:sensor level:in(panic, fatal)
        | stats by (env, cluster, namespace, workload, pod) count() count_all_result
      instantRollup: 5 minutes
  thresholds:
    - name: threshold_1
      inputName: threshold_input_query
      operator: gt
      values:
        - 0
executionErrorState: OK
noDataState: OK
evaluationInterval:
  interval: 5m
  pendingFor: 0s
notificationSettings:
  renotificationInterval: 4h
```

### Monitor that delivers directly to Slack channels

Bypasses notification routes and sends issues from this monitor straight to two Slack channels via the [Slack](/use-groundcover/connectors/slack.md) Destination. Replace both occurrences of the `<slack-app-id>` placeholder with the ID of your Slack App Destination, and replace the channel `id` values with the IDs of the Slack channels (e.g., `C0123456789`) you want to deliver to.

```yaml
title: Checkout Service 5xx Spike
display:
  header: Checkout 5xx Spike in {{ labels.cluster }}
  description: |-
    HTTP 5xx responses from the checkout service are above threshold.
    Cluster: {{ labels.cluster }}
    Namespace: {{ labels.namespace }}
    Workload: {{ labels.workload }}
severity: S2
measurementType: event
model:
  queries:
    - name: threshold_input_query
      dataType: traces
      expression: >
        workload:checkout status_code:>=500
        | stats by (env, cluster, namespace, workload) count() errors_total
      instantRollup: 1 minutes
  thresholds:
    - name: threshold_1
      inputName: threshold_input_query
      operator: gt
      values:
        - 10
executionErrorState: OK
noDataState: OK
evaluationInterval:
  interval: 1m
  pendingFor: 0s
notificationSettings:
  method: connectedApps
  connectedApps:
    - <slack-app-id>
  connectedAppParams:
    <slack-app-id>:
      channels:
        - id: C0123456789
          name: "#checkout-alerts"
        - id: C0987654321
          name: "#oncall-prod"
  statusFilters:
    - Alerting
    - Resolved
  renotificationInterval: 1h
```

{% hint style="info" %}
Channel `id` is the canonical Slack channel ID (it stays stable if the channel is renamed); `name` is optional and only used for display. You can find the channel ID in Slack via **Channel name → View channel details → About** (the ID is shown at the bottom).
{% endhint %}

### Metrics monitor (MetricsQL)

Fires when a Kubernetes pod is in `CrashLoopBackOff` for more than 5 minutes. Uses a reducer to collapse the timeseries before the threshold runs.

{% hint style="info" %}
Metrics queries use [MetricsQL](https://docs.victoriametrics.com/metricsql/) (PromQL-compatible). kube-state-metrics names are prefixed with `groundcover_`, and the node identity label is `node_name` (not `node`).
{% endhint %}

```yaml
title: K8s Pod Crash Looping Monitor
display:
  header: K8s Pod Crash Looping
  description: Kubernetes pod has been in CrashLoopBackOff for more than 5 minutes.
  resourceHeaderLabels:
    - workload
  contextHeaderLabels:
    - env
    - cluster
    - namespace
severity: S2
measurementType: state
model:
  queries:
    - name: crash_looping_query
      expression: >
        avg_over_time(
          avg by (env, cluster, namespace, workload) (
            groundcover_kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"}
          )[5m]
        )
      datasourceType: prometheus
      queryType: instant
      rollup:
        function: avg
        time: 5m
  reducers:
    - name: crash_looping_mean
      inputName: crash_looping_query
      type: mean
  thresholds:
    - name: threshold_1
      inputName: crash_looping_mean
      operator: gt
      values:
        - 0
executionErrorState: OK
noDataState: OK
evaluationInterval:
  interval: 1m
  pendingFor: 5m
```

### ClickHouse SQL monitor

For advanced cases that need joins, CTEs, or comparisons across time windows, you can drop to ClickHouse SQL. See [SQL Based Monitors](/use-groundcover/monitors/sql-based-monitors.md) for details and examples.
