Log Pipelines

Overview

groundcover supports the configuration of log pipelines to process and customize your logs before ingestion. With log parsing pipelines, you gain full flexibility to transform data as it flows into the platform.

Log parsing in groundcover is powered by OpenTelemetry Transformation Language (OTTL), a powerful and flexible language designed for transforming telemetry data. OTTL provides a rich set of functions and operators that allow you to parse, filter, enrich, and transform your logs with precision.

What You Can Do

Log parsing pipelines give you a structured way to:

  • Parse - Extract structured data from unstructured log messages (JSON, GROK patterns, key-value pairs)

  • Filter - Drop noisy or irrelevant logs to reduce costs and focus on what matters

  • Obfuscate - Mask or remove sensitive data to maintain privacy and compliance

  • Convert to Metrics - Transform logs into time-series metrics for long-term monitoring and alerting

Each pipeline is made up of transformation steps—each step defines a specific operation like parsing JSON, extracting key-value pairs, or modifying attributes. You can configure these transformations directly in your groundcover deployment.
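
For illustration, a single-step pipeline that parses JSON-formatted logs into attributes might look like the following sketch (the rule name is illustrative; the full structure is covered under Pipeline Structure below):

ottlRules:
  - ruleName: "parse_json_body"
    conditions:
      - 'format == "json"'
    statements:
      # Parse the JSON body and add its fields as attributes
      - 'merge_maps(attributes, ParseJSON(body), "insert")'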

Parsing Playground

The Parsing Playground is an interactive testing environment where you can develop and validate parsing rules before deploying them to production.

Accessing the Playground

To access the Parsing Playground:

  1. Navigate to the Logs view

  2. Click on any log entry to view its details

  3. Click the Parsing Playground button in the top right corner of the log detail view

The playground opens with the selected log pre-loaded, allowing you to write and test transformation rules in real-time.

Using the Playground

In the Parsing Playground, you can:

  • Write transformation rules - Create and edit parsing statements

  • Test in real-time - See immediate results as you modify your rules

  • View extracted attributes - See exactly what fields are being extracted

  • Validate syntax - Get instant feedback on rule syntax and errors

  • Add rules - Once satisfied, add the rule to deploy it in your pipelines

Writing Parsing Rules

Available Fields and Keys

When writing parsing rules, you have access to various fields that represent different aspects of the log. Understanding which fields are writable and which are read-only is crucial for effective rule creation.

Writable Fields

These fields can be modified in your parsing rules:

| Field | Type | Description |
| --- | --- | --- |
| cache | Map | Temporary storage for intermediate values. Access sub-keys with cache["key"] |
| l2m | Map | Logs-to-metrics data. Access sub-keys with l2m["key"] |
| time | Time | Timestamp of the log |
| body | String | The main log message content |
| attributes | Map | Custom attributes extracted from the log. Access sub-keys with attributes["key"] |
| trace_id | String | Trace identifier for distributed tracing correlation |
| span_id | String | Span identifier for distributed tracing |
| level | String | Log severity level (info, error, debug, etc.) |
| format | String | Log format (json, clf, unknown, etc.) |
| drop | Boolean | Set to true to drop/filter the log |
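
For illustration, the statements below touch a few of these writable fields; the attribute and cache keys are hypothetical:

statements:
  # Normalize the severity level based on the message content
  - 'set(level, "error") where IsMatch(body, "(?i)exception")'
  # Promote a value extracted earlier into cache to a permanent attribute
  - 'set(attributes["request_id"], cache["request_id"])'
  # Record the detected format
  - 'set(format, "json")'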

Read-Only Fields

These fields provide context but cannot be modified. Attempting to set them will result in an error:

| Field | Type | Description |
| --- | --- | --- |
| workload | String | Name of the Kubernetes workload (deployment, statefulset, etc.) |
| source | String | Source of the log data |
| cluster | String | Kubernetes cluster name |
| env | String | Environment name |
| container_name | String | Container that generated the log |
| namespace | String | Kubernetes namespace |
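
Although read-only fields cannot be set, they are useful for scoping statements and conditions. A minimal sketch, assuming a hypothetical dev namespace and workload name:

statements:
  # Drop noisy logs from a specific workload in the dev namespace
  - 'set(drop, true) where namespace == "dev" and workload == "load-generator"'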

Special Fields

cache - Temporary Storage

The cache field is particularly useful for multi-step transformations:

statements:
  # Extract to cache first
  - 'set(cache, ExtractGrokPatterns(body, "pattern"))'
  # Process cached values
  - 'replace_pattern(cache["field"], "old", "new")'
  # Merge into attributes
  - 'merge_maps(attributes, cache, "insert")'

attributes - Custom Fields

The attributes map is where you store extracted structured data:

statements:
  - 'set(attributes["user_id"], "12345")'
  - 'set(attributes["action"], "login")'
  - 'set(attributes["ip"], "192.168.1.1")'

drop - Log Filtering

The drop field is a special boolean flag for filtering logs:

statements:
  # Drop unconditionally
  - 'set(drop, true)'
  
  # Drop conditionally
  - 'set(drop, true) where IsMatch(body, "/healthz")'

Setting Conditions

Use conditions to apply transformations only when specific attributes match. This ensures your pipeline runs efficiently and only on relevant logs.

Common fields you can use (the full list appears above; a short example follows this list):

  • workload – Name of the service or app

  • container_name – Container where the log originated

  • level – Log severity (e.g., info, error, debug)

  • format – Log format (e.g., JSON, CLF, unknown)

  • namespace – Kubernetes namespace

  • pod – Pod name
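
For example, the following rule only runs on error logs from a specific service and parses key=value pairs from the body (the workload name is illustrative):

ottlRules:
  - ruleName: "parse_checkout_errors"
    conditions:
      - 'workload == "checkout" and level == "error"'
    statements:
      # Extract key=value pairs into attributes
      - 'merge_maps(attributes, ParseKeyValue(body), "insert")'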

Common Functions

Some commonly used functions in groundcover pipelines:

  • ExtractGrokPatterns – Extract structured data using GROK patterns

  • ParseJSON – Parse JSON strings into structured attributes

  • ParseKeyValue – Parse key=value formatted strings

  • replace_pattern – Replace text patterns (useful for obfuscation)

  • delete_key – Remove specific attributes

  • merge_maps – Merge extracted data into attributes

  • set – Set attribute values

  • IsMatch – Match patterns in log fields

For a complete list of available OTTL functions, see the OTTL Functions Reference.
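
A short sketch combining several of these functions (the field names are illustrative):

statements:
  # Parse the JSON body into temporary storage
  - 'set(cache["parsed"], ParseJSON(body))'
  # Promote the parsed fields into attributes
  - 'merge_maps(attributes, cache["parsed"], "upsert")'
  # Obfuscate the email domain
  - 'replace_pattern(attributes["email"], "@.*", "@****")'
  # Remove an attribute that is not needed downstream
  - 'delete_key(attributes, "internal_debug")'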

Automatically Generate Parsing Rules

From a Single Log

When using the Parsing Playground, you can click Suggest Configuration to have the platform attempt to generate a parsing rule for the log currently loaded in the playground.

The platform will suggest a rule and apply it, showing the extracted attributes. The rule can then be added to the pipeline via Add Rule.

From Multiple Patterns

groundcover can automatically generate parsing rules for you based on Log Patterns. This powerful feature analyzes the structure of your logs and creates optimized parsing rules automatically.

How It Works

The Generate Parsing Rules feature:

  1. Analyzes log patterns in your current view

  2. Identifies common structures and fields

  3. Automatically generates parsing rules for each pattern

  4. Creates rules optimized for your specific log formats

Requirements

To use Generate Parsing Rules:

  • Your logs must have fewer than 10 distinct patterns in the filtered view

  • Patterns must be clearly identifiable and consistent

If you have 10 or more patterns, apply filters (workload, namespace, level, etc.) to narrow down your logs before generating rules.

How To Generate Rules

  1. In the Logs view, click the Actions dropdown in the top right

  2. Select Generate Parsing Rules

  3. Review the generated rules

  4. Make any necessary adjustments

  5. Deploy the rules to your pipeline

The generated rules are automatically added to your pipeline configuration and will start processing logs immediately.

The Rules Pipeline

In groundcover, all parsing rules are part of a single pipeline that processes your logs. Each rule you create becomes a step in this pipeline, executed sequentially from top to bottom. This unified pipeline approach ensures consistent processing and makes it easy to understand the flow of transformations applied to your logs.

Viewing Your Pipelines

After you create parsing rules, they are deployed to Settings → Pipelines (https://app.groundcover.com/settings/pipelines), where you can monitor their performance and effectiveness.

The Pipelines page provides a comprehensive view of all your log parsing rules with real-time metrics:

Pipeline Metrics

For each rule in the Pipelines page, you'll see four key metrics:

Process Rate

The throughput of logs being processed by this rule, measured in Logs/s (logs per second). This shows how many logs are actively being transformed or evaluated by the rule.

Drop Rate

The rate at which logs are being filtered out by this rule, measured in Logs/s. This metric helps you understand the effectiveness of your filtering rules in reducing log volume.

Error Rate

The percentage of logs that encounter errors during processing, shown as a percentage (%). This includes both condition evaluation errors and transformation execution errors. A healthy rule should have an error rate close to 0%.

Processing Latency

The average time it takes to process a single log through this rule, typically displayed in nanoseconds (ns) or microseconds (μs). Lower values indicate better performance.

Understanding the Metrics

The Pipelines page shows real-time statistics to help you monitor the health and performance of your parsing rules.

What Good Metrics Look Like

Process Rate:

  • Should match your expected log volume for rules that apply to all logs

  • Will be lower for rules with specific conditions

  • Zero process rate may indicate the rule conditions never match

Drop Rate:

  • For filtering rules: Should show significant volume if working correctly

  • For parsing rules: Should typically be 0 (unless explicitly dropping logs)

  • High unexpected drop rates may indicate rule errors

Error Rate:

  • Should be 0% or very close to it

  • Any error rate above 1% indicates a problem that needs investigation

Processing Latency:

  • Typically ranges from 100ns to 1ms per log

  • Complex parsing operations may take longer

  • Values consistently above 10ms indicate performance issues

If you see high processing latency, consider simplifying regex patterns or splitting complex rules into multiple steps.
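
One common way to reduce latency is to gate an expensive parse behind a cheap condition so it only runs on the logs it can actually match; a sketch (the format value and GROK pattern are illustrative):

ottlRules:
  # Cheap condition first: only CLF-formatted logs reach the GROK step
  - ruleName: "parse_access_logs"
    conditions:
      - 'format == "clf"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "%{COMMONAPACHELOG}"))'
      - 'merge_maps(attributes, cache, "insert")'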

Rule Order Matters

Why Rule Order Is Important

  1. Earlier rules process more logs - A rule at position 1 sees ALL logs, while a rule at position 10 only sees logs that weren't dropped by rules 1-9

  2. Dropping affects downstream rules - If rule #3 drops 80% of logs, rule #4 will only process the remaining 20%

  3. Transformations are cumulative - Fields extracted in rule #1 are available to rule #2 and beyond

  4. Performance optimization - Place expensive parsing operations AFTER filter rules to process fewer logs

Best Practices for Rule Ordering

Recommended order:

  1. Drop rules first - Filter out noisy logs before expensive parsing

  2. Quick parsing - Fast, simple extractions

  3. Complex parsing - Resource-intensive transformations

  4. Obfuscation rules last - Mask sensitive data after all useful fields are extracted

Example of good ordering:

ottlRules:
  # 1. Drop health checks (filters out 60% of logs)
  - ruleName: "drop_health_checks"
    statements:
      - 'set(drop, true) where IsMatch(body, "/healthz")'
  
  # 2. Drop debug logs (filters another 20%)
  - ruleName: "drop_debug"
    conditions:
      - 'level == "debug"'
    statements:
      - 'set(drop, true)'
  
  # 3. Fast parsing for remaining 20%
  - ruleName: "extract_basic_fields"
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "simple pattern"))'
  
  # 4. Complex parsing (now only processing 20% of original volume)
  - ruleName: "parse_json_payload"
    statements:
      - 'set(cache["parsed"], ParseJSON(body))'
  
  # 5. Obfuscate sensitive data
  - ruleName: "mask_emails"
    statements:
      - 'replace_pattern(attributes["email"], "pattern", "****")'

Pipeline Structure

Rules are defined as a list of steps that are executed one after another. Admins can view and edit the rules in Settings → Pipelines (https://app.groundcover.com/settings/pipelines).

The rules run in groundcover's sensors. This is ideal for cost savings, as the original, unprocessed logs are not sent to or stored in the backend. This is particularly useful when the pipeline is used to drop logs.

For the rules to take effect, groundcover sensors must be configured with Ingestion Keys that allow them to pull remote configuration from the Fleet Manager.

The pipeline is stored in YAML format and can be edited in the UI. The resulting YAML can be exported for use with groundcover's Terraform provider if you prefer to manage it in a version control system (e.g., git). Note that the pipeline is a singleton resource, so your Terraform configuration must define only one.

Each rule must have a unique and non-empty ruleName

Basic Structure

ottlRules:
  - ruleName: "rule1"
    conditions:
      - 'workload == "service1" or workload == "service2"'
    statements:
      - statement1
      - statement2
  - ruleName: "rule2"
    conditions:
      - 'level == "debug" or container_name == "test"'
    statements:
      - statement1
      - statement2

Required Attributes

To define a pipeline rule, make sure to include the following fields (a complete example follows the list):

  • ruleName – Unique identifier for the rule (required)

  • statements – List of transformations to apply

  • conditions – Logic for when the rule should trigger

  • statementsErrorMode – How to handle errors (e.g., skip, propagate, fail)

  • conditionLogicOperator – Used when you define multiple conditions (and, or)
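
Putting these together, a rule using all of the fields above might look like the sketch below (the rule name, field values, and statement are illustrative):

ottlRules:
  - ruleName: "parse_payment_errors"
    conditionLogicOperator: "and"
    conditions:
      - 'workload == "payments"'
      - 'level == "error"'
    statementsErrorMode: "propagate"
    statements:
      # Parse the JSON body and add its fields as attributes
      - 'merge_maps(attributes, ParseJSON(body), "insert")'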

Troubleshooting Rule Issues

High error rates:

  • Test the rule in Parsing Playground with sample logs

  • Check for regex syntax errors

  • Verify field names exist in your logs

High processing latency:

  • Simplify complex regex patterns

  • Split multi-step rules into separate rules

  • Place drop rules earlier to reduce processing volume

Low condition met percentage:

  • Verify your conditions match actual log attributes

  • Check if upstream rules are dropping logs unexpectedly

  • Review filter conditions for typos

Unexpected drop rates:

  • Review the rule logic and conditions

  • Check if drop statements have unintended where clauses

  • Verify the rule order isn't causing issues
