Create a new Monitor
Learn how to create and configure monitors using the Wizard, Monitor Catalog, or Import options. The following guide will help you set up queries, thresholds, and alert routing for effective monitoring.
You can create monitors in our web application by following this guide, through our API (see: https://github.com/groundcover-com/docs/blob/main/use-groundcover/monitors/README.md), or with our Terraform provider (see: groundcover Terraform Provider).
In the Monitors section (left navigation bar), navigate to the Issues page or the Monitor List page to create a new Monitor. Click on the “Create Monitor” button at the top right and select one of the following options from the dropdown:
Using the Monitor Wizard
Overview
The Monitor Wizard is a guided, user-friendly approach to creating and configuring monitors tailored to your observability needs. By breaking down the process into simple steps, it ensures consistency and accuracy.
Section 1: Query
Select the data source, build the query and define thresholds for the monitor.
If you're unfamiliar with query building in groundcover, refer to the Query Builder section for full details on the different components.
Data Source (Required):
Select the type of data (Metrics, Logs, Traces, or Events).
Query Functions:
Choose how to process the data (e.g., average, count).
Add aggregation (group by) clauses if applicable. You MUST use aggregations if you want to add labels to your issues.
Important: The labels used for aggregation (group by) may also be used for notification routes and the issue summary & description.
Examples:
cluster,node,container_name
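To illustrate what a group-by clause does, here is a minimal Python sketch. The sample data and the `group_average` helper are hypothetical and not part of groundcover. The point is that each distinct combination of group-by labels becomes its own series, which is why those labels are then available on the resulting issues.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: each one carries labels and a value.
samples = [
    {"cluster": "prod", "node": "n1", "container_name": "api", "value": 0.25},
    {"cluster": "prod", "node": "n1", "container_name": "api", "value": 0.75},
    {"cluster": "prod", "node": "n2", "container_name": "db", "value": 0.5},
]

def group_average(samples, group_by):
    """Average the values per distinct combination of the group-by labels."""
    groups = defaultdict(list)
    for sample in samples:
        key = tuple(sample[label] for label in group_by)
        groups[key].append(sample["value"])
    return {key: mean(values) for key, values in groups.items()}

# One entry per (cluster, node, container_name) combination.
result = group_average(samples, ["cluster", "node", "container_name"])
```

Here `result` holds two series, one for `("prod", "n1", "api")` and one for `("prod", "n2", "db")`.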
Time Window (Required):
Specify the period over which data is aggregated (the look-behind window).
Example: “Over the last 5 minutes.”
Window Aggregation (Required):
Specify the aggregation function to be used on the selected time window.
Example: "avg over the last 5m"
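As a rough illustration of how the time window and the window aggregation combine, the Python sketch below applies a function to the points that fall inside a look-behind window. The `aggregate_window` helper, the sample points, and the choice to include the window's end but not its start are assumptions made for the example, not a description of groundcover's evaluation engine.

```python
from datetime import datetime, timedelta
from statistics import mean

def aggregate_window(points, now, window, func):
    """Apply func to the values whose timestamp falls in (now - window, now]."""
    start = now - window
    values = [value for ts, value in points if start < ts <= now]
    # Return None when the window holds no data at all.
    return func(values) if values else None

now = datetime(2024, 1, 1, 12, 0)
points = [
    (now - timedelta(minutes=7), 90.0),  # outside the 5m window, ignored
    (now - timedelta(minutes=4), 10.0),
    (now - timedelta(minutes=2), 20.0),
    (now, 30.0),
]

# "avg over the last 5m"
avg_5m = aggregate_window(points, now, timedelta(minutes=5), mean)
```

Swapping `mean` for `max`, `min`, or `sum` corresponds to choosing a different window aggregation function.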
Threshold Conditions (Required):
Define when a monitor should trigger an Issue. You can use:
Greater Than - Trigger when the value exceeds X.
Lower Than - Trigger when the value falls below X.
Within Range - Trigger when the value is between X and Y.
Outside Range - Trigger when the value is not between X and Y.
Important: The threshold must be expressed in the same units as the query result.
For metrics queries the threshold should match the unit the metric is measured in.
For logs, traces, and events, it's just a number.
Example: “Trigger if disk space usage is greater than 10%.”
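The four threshold conditions can be sketched as a small Python function. The condition names and the `breaches` helper are hypothetical, and treating the range bounds as inclusive is an assumption; the guide does not state how boundary values are handled.

```python
def breaches(value, condition, x, y=None):
    """Return True when value meets the given threshold condition.

    Assumption: range bounds are inclusive for 'within_range'.
    """
    if condition == "greater_than":
        return value > x
    if condition == "lower_than":
        return value < x
    if condition == "within_range":
        return x <= value <= y
    if condition == "outside_range":
        return value < x or value > y
    raise ValueError(f"unknown condition: {condition}")

# "Trigger if disk space usage is greater than 10%."
disk_usage_percent = 12
should_trigger = breaches(disk_usage_percent, "greater_than", 10)
```

Note that the value and the threshold are both in percent here, matching the rule that the threshold uses the same units as the query.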
Preview Settings (Optional):
Preview data using Stacked Bar or Line Chart for better clarity while building the monitor.
Choose the Y axis units.
Choose the rollup to present.
Important: These configurations only affect the preview graph, not the monitor's evaluation.

Advanced (Optional):
Evaluation Interval:
Specify how often the monitor evaluates the query.
Example: “Evaluate every 1 minute.”
Pending Period:
Specify how many consecutive evaluations need to pass the threshold in order to trigger an Issue.
Monitors that have entered the pending period (the first evaluation passed the threshold) will be in the 'Pending' state. Only after all consecutive evaluations have passed the threshold will the monitor be 'Firing' and an Issue be created. If even one of the evaluations does not pass the threshold, the Monitor is set right back to 'Normal'.
Example: With an Evaluation Interval of 5m, setting this to 2 (10m) ensures the condition must be evaluated 3 times before the monitor will fire:
Evaluation #1 at 0m
Evaluation #2 at 5m
Evaluation #3 at 10m -> If all 3 passed the threshold, the monitor will 'Fire'
Note: This ensures that transient conditions do not trigger alerts, reducing false positives or smoothing sudden unwanted spikes.
Important: The default configuration is 0, which means the monitor will trigger an Issue as soon as a single evaluation passes the threshold.
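The Normal, Pending, and Firing transitions described above can be sketched as a small simulation in Python. The `monitor_states` helper is hypothetical and only mirrors the behavior described in this section: the first breaching evaluation enters 'Pending', the pending period counts the additional consecutive breaches required, and any non-breaching evaluation resets the monitor to 'Normal'.

```python
def monitor_states(breaches, pending_period):
    """Return the monitor state after each evaluation.

    breaches: list of booleans, one per evaluation (True = threshold passed).
    pending_period: additional consecutive breaching evaluations required
    after the first one before the monitor fires (0 = fire immediately).
    """
    consecutive = 0
    states = []
    for breached in breaches:
        # A single non-breaching evaluation resets the streak.
        consecutive = consecutive + 1 if breached else 0
        if consecutive == 0:
            states.append("Normal")
        elif consecutive > pending_period:
            states.append("Firing")
        else:
            states.append("Pending")
    return states

# Evaluation interval 5m, pending period 2: evaluations at 0m, 5m, 10m.
states = monitor_states([True, True, True], pending_period=2)
```

With a pending period of 2, three consecutive breaching evaluations give `["Pending", "Pending", "Firing"]`, while a pending period of 0 fires on the first breach.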
Treat No Data As:
Choose whether the monitor should treat no data as an Issue or as Normal behavior.
Example: "I want to be notified if the metric has a gap for the entire look-behind window of the query, so I will set it to 'Firing'"
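A minimal Python sketch of this setting, under simplifying assumptions: the `evaluate` helper is hypothetical, it uses an average with a "greater than" threshold, and it ignores the pending period. An empty list stands for a gap covering the entire look-behind window.

```python
def evaluate(values, threshold, no_data_as="Normal"):
    """Evaluate one look-behind window.

    values: samples found in the window; empty means no data for the
    entire window. no_data_as: state to report when there is no data.
    """
    if not values:
        return no_data_as
    average = sum(values) / len(values)
    return "Firing" if average > threshold else "Normal"

# The metric has a gap for the whole window and we want to be notified.
state = evaluate([], threshold=10, no_data_as="Firing")
```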

Section 2: Monitor Details
Set up the basic information for the monitor.
Monitor Name (Required):
Add a title for the monitor. The title will appear in notifications and in the Monitor List page.
Give the Monitor a clear, short name that describes its function at a high level.
Examples:
“Workload High API Error Rate”
“Node High Memory”
The title will appear in the monitors page table and be accessible in notification routes.
Severity (Required):
Use severity to categorize alerts by importance.
Select a severity level (S1-S4).
Important: For connected apps (OpsGenie, PagerDuty) that require specific severities such as P1-P4 or Critical-Info, we automatically translate to the corresponding severity.
Custom Labels (formerly called 'metadata labels'):
Add custom labels (key:value) that will be added to all Issues generated by this monitor.
Example: To create a notification route for my team's issues, add "Team:Infra" and use it in the notification route's scope.
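The following Python sketch shows the idea behind using a custom label in a notification route's scope. Both helpers are hypothetical, and the exact matching rules of notification routes are not described in this guide; the sketch assumes a scope matches when every key:value pair in it is present on the issue, and that custom labels win if a key collides with a query label.

```python
def issue_labels(query_labels, custom_labels):
    """Combine labels from the query's group-by with the monitor's custom labels.

    Assumption: custom labels take precedence on a key collision.
    """
    return {**query_labels, **custom_labels}

def matches_scope(labels, scope):
    """Hypothetical scope check: every key:value pair in scope must be on the issue."""
    return all(labels.get(key) == value for key, value in scope.items())

labels = issue_labels(
    {"cluster": "prod", "workload": "frontend"},  # from the group-by clause
    {"Team": "Infra"},                            # custom label on the monitor
)
routed_to_infra = matches_scope(labels, {"Team": "Infra"})
```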

Section 3: Issue Details
Customize how the Monitor’s Issues will appear and what content will be sent in its notifications. This section also includes a live preview of how the Issue will appear in the notification.
Ensure that the labels you wish to use dynamically (e.g., cluster, workload) or statically (e.g. team:infra) are defined in the query and monitor details.
Issue Summary (Required):
Define a short title for issues that this Monitor will raise. It's useful to include variables that can be informative at first glance.
Example: Adding {{ labels.statusCode }} to the header will inject the status code into the name of the issue. This becomes especially useful when one Monitor raises multiple issues and you want to quickly understand their content without having to open each one.
“HTTP API Error {{ labels.status_code }}” -> HTTP API Error 500
“Workload {{ labels.workload }} Pod Restart” -> Workload frontend Pod Restart
“{{ labels.team }} APIs High Latency” -> Infra APIs High Latency
Note: Autocomplete shows which variables are usable in the Issue and helps ensure you enter them correctly.

The new format for templating variables is {{ variable }} or {{ labels.<label> }}, but the previous format {{ alert.labels.statusCode }}, used for Keep, is still supported.
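To make the variable substitution concrete, here is a small Python stand-in that resolves {{ ... }} placeholders with a regular expression. It is not the templating engine groundcover uses and it supports nothing beyond dotted variable lookups; the `render` helper and the sample label values are assumptions for illustration. The labels are exposed under both `labels` and `alert.labels` to mirror the two formats mentioned above.

```python
import re

# Matches {{ name }} or {{ a.b.c }} with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")

def render(template, context):
    """Replace each placeholder by walking nested dicts in context."""
    def lookup(match):
        value = context
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)
    return _PLACEHOLDER.sub(lookup, template)

labels = {"status_code": 500, "workload": "frontend", "team": "Infra"}
context = {"labels": labels, "alert": {"labels": labels}, "severity": "S2"}

summary = render("HTTP API Error {{ labels.status_code }}", context)
```

Because one Monitor can raise many issues, each issue would be rendered with its own label values, producing a distinct summary per issue.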
Description:
Used as the body of the message for the Issue.
The templating format is Jinja2. It can be used with variables, as in the Issue Summary, and with various more advanced functions.
Example: To show labels in the Slack message's body, insert them here using {{labels.<label>}}. You can also add the severity {{severity}}, the monitor's name {{monitor_name}}, and many more.
Any pongo2 functionality, such as {% if %} evaluations, is supported and can be used to render different descriptions for different conditions.

Advanced (Optional)
Display Labels (formerly called 'context labels'):
These Labels will be displayed and filterable in the Monitors > Issues page.
This list gets automatically populated based on the labels used in the aggregation function in the Query.
Note: You can remove labels from this list if you do not wish to see them in the Issues page.

Section 4: Notifications
Set up notification behavior for issues from this monitor.
Workflows (used for Keep) and Notification Routes may work in parallel and do not affect each other.
Based on matching notification routes (Required)
The issues generated by this monitor will be evaluated against the Notification Routes' scopes and rules, and notifications will be sent accordingly.
Note: Use the 'Preview' to check which notification routes may match this monitor's future issues.
Routing (Workflows) (Optional)
Select Workflow:
Route alerts to the selected existing workflows only. Other workflows will not process them. Use this to send alerts for a critical application to a destination such as Slack or PagerDuty.
No Routing:
This means that any workflow without filters will process the issue.
Configure Routing for Keep Workflows only

Advanced (Optional)
Override Renotify Interval
Used to override the interval configured on the Notification Route, so that this monitor's issues send repeat notifications at a different interval.
Example: If it's set to 1m while the evaluation interval is 1m, a notification will be sent with every firing evaluation. If it's set to 2d, even if the monitor evaluates every 1m, a notification will be sent once every 2 days.
Note: If the Issue stops firing and starts firing again, a new notification will be sent. This is not considered 'renotification'.
Important: The minimum interval is the evaluation interval.
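The renotification rules above can be sketched as a short Python simulation. The `notification_times` helper is hypothetical and works in minutes; it assumes a notification is sent on the first firing evaluation, again whenever the renotify interval has elapsed while the issue keeps firing, and that an interval shorter than the evaluation interval is raised to the evaluation interval.

```python
def notification_times(firing, eval_interval, renotify_interval):
    """Return the minutes at which notifications are sent.

    firing: list of booleans, one per evaluation (True = issue is firing).
    eval_interval, renotify_interval: durations in minutes.
    """
    # The renotify interval cannot be shorter than the evaluation interval.
    interval = max(renotify_interval, eval_interval)
    sent = []
    last_sent = None
    was_firing = False
    for index, is_firing in enumerate(firing):
        now = index * eval_interval
        if is_firing:
            # A fresh firing always notifies; otherwise wait out the interval.
            if not was_firing or now - last_sent >= interval:
                sent.append(now)
                last_sent = now
        was_firing = is_firing
    return sent

# Renotify interval 1m with a 1m evaluation interval: every firing evaluation notifies.
every_minute = notification_times([True, True, True, True], 1, 1)
```

With a 2d interval (2880 minutes), an issue that stops and then starts firing again still produces a new notification, since that is a new firing rather than a renotification.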

Using the Import option
This is an advanced feature. Please use it with caution.

In "Import Bulk Monitors", you can add multiple monitors using an array of Monitors that follows the Monitor YAML structure.
Example of importing multiple monitors
Click on "Create Monitors" to create them.
