# Create a new Monitor

Learn how to create and configure monitors using the Wizard, Monitor Catalog, or Import options. This guide walks you through setting up queries, thresholds, and alert routing for effective monitoring.

> You can create monitors in the web application by following this guide, via our API (see [monitors](https://docs.groundcover.com/use-groundcover/monitors "mention")), or with our Terraform provider (see [groundcover-terraform-provider](https://docs.groundcover.com/use-groundcover/groundcover-terraform-provider "mention")).

In the Monitors section (left navigation bar), navigate to the **Issues** page or the **Monitor List** page to create a new Monitor. Click on the “Create Monitor” button at the top right and select one of the following options from the dropdown:

<table data-view="cards"><thead><tr><th></th><th data-hidden data-card-cover data-type="files"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><a href="#using-the-monitor-wizard"><strong>Monitor Wizard</strong></a></td><td></td><td><a href="https://docs.groundcover.com/use-groundcover/monitors/create-a-new-monitor#using-the-monitor-wizard">https://docs.groundcover.com/use-groundcover/monitors/create-a-new-monitor#using-the-monitor-wizard</a></td></tr><tr><td><a href="monitor-catalog-page"><strong>Monitor Catalog</strong></a></td><td></td><td><a href="monitor-catalog-page">monitor-catalog-page</a></td></tr><tr><td><a href="#using-the-import-option"><strong>Import</strong></a></td><td></td><td><a href="https://docs.groundcover.com/use-groundcover/monitors/create-a-new-monitor#using-the-import-option">https://docs.groundcover.com/use-groundcover/monitors/create-a-new-monitor#using-the-import-option</a></td></tr></tbody></table>

## Using the Monitor Wizard

### Overview

The Monitor Wizard is a guided, user-friendly approach to creating and configuring monitors tailored to your observability needs. By breaking down the process into simple steps, it ensures consistency and accuracy.

### Section 1: Query

Select the data source, build the query and define thresholds for the monitor.

{% hint style="info" %}
If you're unfamiliar with query building in groundcover, refer to the [Query Builder section](https://docs.groundcover.com/use-groundcover/querying-your-groundcover-data/explore-and-monitors-query-builder) for full details on the different components.
{% endhint %}

* **Data Source (Required):**
  * Select the type of data (Metrics, Logs, Traces, or Events).
* **Query Functions:**
  * Choose how to process the data (e.g., average, count).
  * Add aggregation (group by) clauses if applicable. You **must** use aggregations if you want to add labels to your issues.
  * **Important**: The labels used for aggregation (group by) may also be used for notification routes and in the issue summary & description.
  * **Examples:** `cluster`, `node`, `container_name`
* **Time Window (Required):**
  * Specify the period over which data is aggregated (the look-behind window).
  * **Example:** “Over the last 5 minutes.”
* **Window Aggregation (Required):**
  * Specify the aggregation function to be used on the selected time window.
  * **Example:** "`avg` over the last 5m"
* **Threshold Conditions (Required):**
  * Define when a monitor should trigger an Issue. You can use:
    * Greater Than - Trigger when the value exceeds X.
    * Lower Than - Trigger when the value falls below X.
    * Within Range - Trigger when the value is between X and Y.
    * Outside Range - Trigger when the value is not between X and Y.
  * **Important:** The threshold must be expressed in the same units the query uses.
    * For metrics queries, the threshold should match the unit the metric is measured in.
    * For logs, traces, and events, the threshold is a plain count.
  * **Example:** “Trigger if disk space usage is greater than 10%.”
* **Preview Settings (Optional):**

  * Preview data using Stacked Bar or Line Chart for better clarity while building the monitor.
  * Choose the Y axis units.
  * Choose the rollup to present.
  * **Important**: These configurations only affect the preview graph, not the monitor's evaluation.

  <figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2FnFEgZ7J7hfg9PALojODZ%2Fimage.png?alt=media&#x26;token=1f3eb0da-3e2c-425c-8755-935d19dc5321" alt=""><figcaption></figcaption></figure>
* **Advanced (Optional):**

  * **Evaluation Interval:**
    * Specify how often the monitor evaluates the query.
    * Example: “Evaluate every 1 minute.”
  * **Pending Period:**
    * Specify how many consecutive evaluations must pass the threshold in order to trigger an Issue.
    * A monitor that has entered the pending period (the first evaluation passed the threshold) is in the 'Pending' state. Only after all consecutive evaluations pass the threshold does the monitor become 'Firing' and an Issue get created. If even one of the evaluations does not pass the threshold, the monitor is set right back to 'Normal'.
    * **Example**: “With an Evaluation Interval of 5m, setting this to 2 (10m) ensures the condition must pass 3 consecutive evaluations before the monitor fires.”
      * Evaluation #1 at 0m
      * Evaluation #2 at 5m
      * Evaluation #3 at 10m -> If all 3 passed the threshold, the monitor will 'Fire'
    * **Note**: This prevents transient conditions from triggering alerts, reducing false positives and smoothing out sudden, unwanted spikes.
    * **Important**: The default is 0, which means the monitor triggers an Issue immediately once an evaluation passes the threshold.
  * **Treat No Data As**:
    * Choose whether the monitor should treat missing data as an Issue or as Normal behavior.
    * **Example**: "I want to be notified if the metric has a gap for the entire look-behind window of the query, so I will set it to 'Firing'"

  <figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2Fvje5ASzjg6bH5jBtV3Js%2Fimage.png?alt=media&#x26;token=35336123-5e55-4d86-95cf-78dd432e1127" alt=""><figcaption></figcaption></figure>
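The Query settings above map onto the monitor YAML that the Import option accepts. As a rough sketch, following the field names from the Import example further down this page (the metric name, labels, and threshold value here are illustrative, not a real groundcover metric):

```yaml
# Sketch only - structure follows the monitor YAML used by Import;
# example_disk_usage_percent and the values are hypothetical.
model:
  queries:
    - name: threshold_input_query
      # "avg over the last 5m", grouped by cluster and node
      expression: avg_over_time(avg(example_disk_usage_percent{}) by (cluster, node)[5m])
      queryType: instant
      datasourceType: prometheus
  thresholds:
    - name: threshold_1
      inputName: threshold_input_query
      operator: gt          # Threshold Condition: "Greater Than"
      values:
        - 10                # same units as the query (percent here)
noDataState: OK             # "Treat No Data As"
evaluationInterval:
  interval: 5m              # Evaluation Interval
  pendingFor: 10m           # Pending Period: 2 more evaluations must pass
```

See [Using the Import option](#using-the-import-option) for the full structure.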

### Section 2: Monitor Details

Set up the basic information for the monitor.

* **Monitor Name (Required):**
  * Add a title for the monitor. The title will appear in notifications and in the [monitor-list-page](https://docs.groundcover.com/use-groundcover/monitors/monitor-list-page "mention").
  * Give the Monitor a clear, short name that describes its function **at a high level**.
  * **Examples:**
    * `“Workload High API Error Rate”`
    * `“Node High Memory”`

{% hint style="info" %}
The title will appear in the monitors page table and be accessible in notification routes.
{% endhint %}

* **Severity (required):**
  * Use severity to categorize alerts by importance.
  * Select a severity level (S1-S4).
  * **Important**: For connected apps (OpsGenie, PagerDuty) that require specific severities such as P1-P4 or Critical-Info, we automatically translate to the respective severity.
* **Custom Labels (formerly called 'metadata labels'):**

  * Add custom labels (key:value) that will be attached to all Issues generated by this monitor.
  * Example: To create a notification route for your team's issues, add "Team:Infra" and use it in the notification route's scope.

  <figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2Fcaq46v6YRvMMN7MIm5C2%2Fimage.png?alt=media&#x26;token=aa342992-b081-4556-9bc3-6974c78c38c5" alt=""><figcaption></figcaption></figure>

### Section 3: Issue Details

Customize how the Monitor’s Issues will appear and what content will be sent in its notifications. This section also includes a live preview of how the Issue will appear in the notification.

{% hint style="info" %}
Ensure that the labels you wish to use dynamically (e.g., `cluster`, `workload`) or statically (e.g. `team:infra`) are defined in the query and monitor details.
{% endhint %}

* **Issue Summary (required):**

  * Define a short title for the issues this Monitor will raise. It's useful to include variables that are informative at first glance.
  * **Example:** Adding `{{ labels.status_code }}` to the summary injects the status code into the issue's name - this is especially useful when one Monitor raises multiple issues and you want to understand their content at a glance without opening each one.
    * `“HTTP API Error {{ labels.status_code }}”` -> `HTTP API Error 500`
    * `“Workload {{ labels.workload }} Pod Restart”` -> `Workload frontend Pod Restart`
    * `“{{ labels.team }} APIs High Latency”` -> `Infra APIs High Latency`
  * **Note:** Autocomplete is supported to show which variables are usable in the Issue and helps ensure you insert them correctly.

  <figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2FOnW9OhVrbsRwXfg0FLP1%2Fimage.png?alt=media&#x26;token=43e354f5-28e1-4118-8e91-5ffc9fdf302f" alt="" width="375"><figcaption></figcaption></figure>

{% hint style="warning" %}
The new format for templating variables is `{{ variable }}` or `{{ labels.<label> }}`, but the previous format `{{ alert.labels.statusCode }}` used for Keep is still supported.
{% endhint %}

* **Description:**
  * Used as the body of the Issue's notification message.
  * The templating format is Jinja2. Variables can be used just as in the Issue Summary, along with more advanced functions.
    * **Example**: Labels to be shown in the Slack message's body are inserted here using `{{ labels.<label> }}`; you can also add the severity with `{{ severity }}`, the monitor's name with `{{ monitor_name }}`, and many more.
  * URLs can be rendered using `<url link|url title>`.
    * **Example:** `<www.groundcover.com/{{labels.env}}|text>` injects the issue's env label into the URL and renders the link as the text 'text'.

{% hint style="info" %}
Any pongo2 functionality, such as `{% if %}` evaluations, is supported and can be used to render different descriptions for different conditions.
{% endhint %}

<figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2F6RrgUDfB6ZTnrmh2UR9b%2Fimage.png?alt=media&#x26;token=9e98df8b-1d2d-462d-aba4-1cc3106d0770" alt=""><figcaption></figcaption></figure>
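Putting the pieces above together, a complete description template might look like the following sketch. The label names (`cluster`, `workload`, `env`), the condition, and the URL are illustrative - use the labels your own query aggregates by:

```
Severity: {{ severity }} | Monitor: {{ monitor_name }}
Cluster: {{ labels.cluster }} | Workload: {{ labels.workload }}
{% if labels.env == "prod" %}Production environment - escalate immediately.{% endif %}
<www.groundcover.com/{{labels.env}}|Open environment page>
```

Each `{{ ... }}` variable is expanded per issue, and the `{% if %}` block renders its text only when the condition holds.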

* **Advanced (Optional)**
  * **Display Labels (formerly called 'context labels'):**

    * These labels are displayed and filterable in the Monitors > Issues page.
    * This list gets automatically populated based on the labels used in the aggregation function in the Query.
    * **Note:** You can remove labels from this list if you do not wish to see them in the Issues page.

    <figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2FvW6JLHUB4zKH8M1MkeTr%2Fimage.png?alt=media&#x26;token=468bd18d-47d8-446b-82aa-ef31260ffb8c" alt=""><figcaption></figcaption></figure>

### Section 4: Notifications

Set up notification behavior for issues from this monitor.

{% hint style="info" %}
Workflows (used for Keep) and Notification Routes may work in parallel and do not affect each other.
{% endhint %}

* **Based on matching notification routes (Required)**
  * Issues generated by this monitor will be evaluated against each Notification Route's scopes and rules, and notifications will be sent accordingly.
  * **Note**: Use the 'Preview' to see which notification routes may match this monitor's future issues.
* **Routing (Workflows) (Optional)**
  * **Select Workflow:**
    * Route alerts to the selected workflows **only**; other workflows will not process them. Use this, for example, to send alerts for a critical application to a specific destination such as Slack or PagerDuty.
  * **No Routing:**
    * This means that any workflow (without filters) will process the issue.

{% hint style="warning" %}
Configure Routing for Keep Workflows only
{% endhint %}

<figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2F83FBomDgXiuvNpQhPqhM%2Fimage.png?alt=media&#x26;token=b9e6a23e-83c8-4bc3-bc7d-5c020d2c1a6f" alt=""><figcaption></figcaption></figure>

* **Advanced (Optional)**

  * **Override Renotify Interval**
    * Overrides the interval configured on the Notification Route, so a specific monitor's issues send repeat notifications at a different interval.
    * **Example**: If set to 1m while the evaluation interval is 1m, a notification is sent with every firing evaluation. If set to 2d, even though the monitor evaluates every 1m, a notification is sent only once every 2 days.
    * **Note**: If the Issue stops firing and starts firing again, a new notification is sent; this is not considered 'renotification'.
    * **Important**: The minimum interval is the evaluation interval.

  <figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2FE8fVDM6nf8RXJYy6ozdb%2Fimage.png?alt=media&#x26;token=b70413a4-465e-43dc-8cfc-728d6d5a7ee7" alt=""><figcaption></figcaption></figure>

## Using the Import option

{% hint style="warning" %}
This is an advanced feature, please use it with caution.
{% endhint %}

<figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2Fgit-blob-2a5641c4ef9ff6e9d71f4800d8a801437ac9839f%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>

In the "Import Bulk Monitors" dialog you can add multiple monitors at once using an array of Monitors that follows the [monitor-yaml-structure](https://docs.groundcover.com/use-groundcover/monitors/monitor-yaml-structure "mention").

Example of importing multiple monitors:

```yaml
monitors:
- title: K8s Cluster High Memory Requests Monitor
  display:
    header: K8s Cluster High Memory Requests
    description: Alerts when a K8s Cluster's total Container Memory Requests exceeds 90% of the Allocatable Memory of all the Nodes for 5 minutes    
    contextHeaderLabels:
      - env
      - cluster
  severity: S1
  measurementType: state
  model:
    queries:
      - name: threshold_input_query
        expression: avg_over_time( (((sum(groundcover_node_rt_mem_requests_bytes{}) by (cluster, env)) / (sum(groundcover_node_rt_allocatable_mem_bytes{}) by (cluster, env))) * 100)[5m] )
        queryType: instant
        datasourceType: prometheus
    thresholds:
      - name: threshold_1
        inputName: threshold_input_query
        operator: gt
        values:
          - 90
  noDataState: OK
  evaluationInterval:
    interval: 1m
    pendingFor: 0s
- title: K8s PVC Pending For 5 Minutes Monitor
  display:
    header: K8s PVC Pending Over 5 Minutes
    description: This monitor triggers an alert when a PVC remains in a Pending state for more than 5 minutes.
    contextHeaderLabels:
      - cluster
      - namespace
      - persistentvolumeclaim
  severity: S2
  measurementType: state
  model:
    queries:
      - name: threshold_input_query
        expression: last_over_time(max(groundcover_kube_persistentvolumeclaim_status_phase{phase="Pending"}) by (cluster, namespace, persistentvolumeclaim)[1m])
        queryType: instant
        datasourceType: prometheus
    thresholds:
      - name: threshold_1
        inputName: threshold_input_query
        operator: gt
        values:
          - 0
  executionErrorState: OK
  noDataState: OK
  evaluationInterval:
    interval: 1m
    pendingFor: 5m
```

Click on "**Create Monitors**" to create them.

## Query Best Practices

### Performance Recommendations

To ensure reliable monitor evaluation and avoid timeouts:

* **Avoid excessively long time ranges** with free-text search queries. Maximum recommended range is 7 days for most query types.
* **Use attribute filters instead of free-text search** when possible - free-text searches across large time ranges are expensive.
* **Set appropriate dashboard refresh intervals** - avoid refreshing complex dashboards every 1-2 minutes with week-long queries.
* **Consider logs-to-metrics** for aggregation queries that would otherwise scan large log volumes.
* **Use parsing rules** to add attributes for frequently-filtered fields instead of relying on free-text search.

### PromQL Query Limitations

When using PromQL/MetricsQL in monitors:

**Regex Operators**: The regex not-match operator `!~` may have limitations with certain patterns:

* Pattern `.+` (one or more characters) may not work as expected in some cases
* Use `.*` (zero or more characters) as an alternative: `label!~".*Error"` instead of `label!~".+Error"`

**Switching between data sources**: Changing the data source type (e.g., from Metrics to Logs) will clear the current query.

### gcQL Query Limitations in Monitors

Some gcQL operations behave differently in monitors compared to the Data Explorer:

| Operation                          | Data Explorer | Monitors                                   |
| ---------------------------------- | ------------- | ------------------------------------------ |
| `grok` parsing                     | Full support  | Supported (fixed in recent versions)       |
| `join` operations                  | Full support  | Limited - complex joins may timeout        |
| Free-text search on message fields | Works         | May have limitations in Explore filter bar |

If your query works in Data Explorer but fails in monitors, try simplifying the query or breaking it into multiple monitors.

## Troubleshooting

### Monitor Not Firing

1. **Check the preview graph** - Verify the query returns data and exceeds the threshold
2. **Review pending period** - If set, the condition must be met for multiple consecutive evaluations
3. **Check "No Data" handling** - If data is intermittent, "Treat No Data As: Firing" may cause unexpected behavior
4. **Verify aggregation labels** - Ensure "group by" labels match what you expect

### Labels Not Appearing in Notifications

1. **Labels must be in "group by"** - Only labels used in aggregation are available as `{{ labels.<name> }}`
2. **Check variable syntax** - Use `{{ labels.workload }}` not `{{ alert.labels.workload }}` (though both are supported)
3. **Preview the notification** - Use the live preview to verify variable expansion

### Query Works in Explorer but Not in Monitor

1. **Time alignment** - Explorer and monitors may use different time window handling
2. **Aggregation function order** - For metrics like `sum(sum_over_time(...))`, ensure the step and aggregation window match to avoid inflated values
3. **gcQL limitations** - Some advanced operations may not be fully supported in monitor queries

### Dashboard Shows Different Values Than Monitor

This can occur due to:

* **Step vs aggregation window mismatch** - Using `sum_over_time(...[5m])` with step=1m creates a sliding window that can inflate values. Align step with aggregation window.
* **Different aggregation functions** - Verify both use the same aggregation
* **Time range differences** - Monitors use a specific look-behind window
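As a concrete illustration of the step mismatch, consider a hypothetical metric queried with a 5-minute window (the metric name is an example, not a groundcover metric):

```promql
# With step=1m, each 5m window overlaps the next by 4m, so every sample
# contributes to five graph points and dashboard totals look inflated
# compared to the monitor's single evaluation of the same expression.
sum(sum_over_time(example_error_count[5m]))

# Fix: set the dashboard/query step to 5m so the windows tile without
# overlap, and dashboard and monitor values agree.
```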
