Log Parsing with OpenTelemetry Pipelines

Configure custom log transformations in groundcover using the OpenTelemetry Transformation Language (OTTL). Tailor your logs with structured pipelines for parsing, filtering, and enriching data before ingestion.

Overview

groundcover supports the configuration of log pipelines using OpenTelemetry Transformation Language (OTTL) to process and customize your logs. With OTTL, you gain full flexibility to transform data as it flows into the platform.

Transforming Data with OTTL

groundcover uses OTTL to enrich and shape log data inside your monitored environments. OTTL pipelines give you a structured way to parse, filter, and modify logs before ingestion.

Each pipeline is made up of transformation steps—each step defines a specific operation (like parsing JSON, extracting key-value pairs, or modifying attributes). You can configure these transformations directly in your groundcover deployment.

To test your logic before going live, we recommend using our Parsing Playground, which you can open from the top-right corner when viewing a specific log.

Required Attributes

To define an OTTL pipeline, make sure to include the following fields, shown together in the sketch after the list:

  • statements – List of transformations to apply.

  • conditions – Logic for when the rule should trigger.

  • statementsErrorMode – How to handle errors (e.g., skip, fail).

  • conditionLogicOperator – Used when you define multiple conditions.
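
Here is a minimal sketch of a single rule that sets all four fields. The rule name, condition, statement, and the specific statementsErrorMode and conditionLogicOperator values are illustrative assumptions rather than prescribed values:

ottlRules:
  - ruleName: "enrich-payment-errors"   # any unique, non-empty name
    statementsErrorMode: "skip"         # assumed value: skip statements that fail
    conditionLogicOperator: "and"       # assumed value: require all conditions to match
    conditions:
      - 'workload == "service1"'
      - 'level == "error"'
    statements:
      - 'set(attributes["team"], "payments")'   # hypothetical enrichment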

Deploying OTTL in groundcover

Rules are defined as a list of steps which are executed one after another. The rules can be viewed and edited by admins in the settings tab, under "Pipelines".

The rules run in groundcover's sensors. This is ideal for cost savings, as the original logs are not sent to or stored in the backend, which is particularly useful when the pipeline is used to drop logs.

For the rules to take effect, groundcover sensors need to be configured with Ingestion Keys that allow them to pull remote configuration from the Fleet Manager.

The pipeline is stored in YAML format and can be edited in the UI. The resulting YAML can be exported for use with groundcover's Terraform provider if you prefer to manage it in a version control system (e.g. git). Note that the pipeline is a singleton resource, so your Terraform configuration must define only a single one.

Each rule must have a unique and non-empty ruleName

Example Structure

ottlRules:
  - ruleName: "rule1"
    conditions:
      - 'workload == "service1" or workload == "service2"'
    statements:
      - statement1
      - statement2
  - ruleName: "rule2"
    conditions:
      - 'level == "debug" or container_name == "test"'
    statements:
      - statement1
      - statement2

Setting Conditions

Use conditions to apply transformations only when specific attributes match. This ensures your pipeline runs efficiently and only on relevant logs.

Common fields you can use, with example conditions after the list:

  • workload – Name of the service or app.

  • container_name – Container where the log originated.

  • level – Log severity (e.g., info, error).

  • format – Log format (e.g., JSON, CLF, unknown).
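
For illustration, the conditions from the structure example above use these fields directly; how multiple conditions are combined (AND vs. OR) is controlled by conditionLogicOperator:

conditions:
  - 'workload == "service1" or workload == "service2"'
  - 'level == "debug" or container_name == "test"'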

Writing OTTL Statements

Some commonly used functions in groundcover, with a short sketch after the list:

  • ExtractGrokPatterns

  • ParseJSON

  • replace_pattern

  • delete_key

  • ToLowerCase

  • Concat

  • ParseKeyValue
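
Most of these appear in the examples below. For the two that do not, here is a short sketch of standalone statements; the attribute names are hypothetical:

# Drop a sensitive attribute from the log
- 'delete_key(attributes, "password")'
# Join two attributes into one with a delimiter
- 'set(attributes["full_location"], Concat([attributes["region"], attributes["zone"]], "-"))'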

Examples

Simple GROK Pattern Extraction

Log

{
  "body": "2025-03-23 10:30:45 INFO User login attempt from 192.168.1.100"
}

Statements

- 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}User login attempt from %{IP:source_ip}"))'
- 'merge_maps(attributes, cache, "insert")'

Results

{
  "timestamp": "2025-03-23 10:30:45",
  "level": "INFO",
  "source_ip": "192.168.1.100"
}

Grok + Replace + ParseKeyValue

Log

{
  "body": "2025-03-23 15:20:12,512 - EventProcessor - DEBUG - Completed event processing [analyzer_name=disk-space-check] [node_id=7f5e9aa8412d4c0003a7b2c5] [service_id=813dd10298f77700029d54e3] [sensor_id=3] [tracking_code=19fd5b6e72c7e94088a9ff3d] [log_id=b'67acfe0c92d43000'] [instance_id=microservice-7894563210]"
}

Statements

- 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}-%{SPACE}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{LOGLEVEL:level}%{DATA}(?<kv>\\[%{GREEDYDATA})"))'
- 'replace_pattern(cache["kv"], "[\\[\\]]", "")'
- 'merge_maps(attributes, ParseKeyValue(cache["kv"]), "insert")'
- 'set(attributes["timestamp"], cache["timestamp"])'

Results

{
  "instance_id": "microservice-7894563210",
  "analyzer_name": "disk-space-check",
  "node_id": "7f5e9aa8412d4c0003a7b2c5",
  "service_id": "813dd10298f77700029d54e3",
  "sensor_id": "3",
  "tracking_code": "19fd5b6e72c7e94088a9ff3d",
  "log_id": "b67acfe0c92d43000",
  "timestamp": "2025-03-23 15:20:12,512"
}

Grok + ToLowerCase + ParseJSON

Log

{
  "body": "2025-03-23 14:55:12,456 ERROR {\"event\":\"user_login\",\"user_id\":12345,\"status\":\"failed\",\"ip\":\"192.168.1.10\"}"
}

Statements

- 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}(?<json_body>\\{.*\\})"))'
- 'set(attributes["timestamp"], cache["timestamp"])'
- 'set("level", ToLowerCase(cache["level"]))'
- 'set(cache["parsed_json"], ParseJSON(cache["json_body"]))'
- 'merge_maps(attributes, cache["parsed_json"], "insert")'

Results

{
  "timestamp": "2025-03-23 14:55:12,456",
  "level": "error",
  "event": "user_login",
  "user_id": 12345,
  "status": "failed",
  "ip": "192.168.1.10"
}
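
Embedding Statements in a Complete Rule

As a final sketch, statements like those in the last example can be embedded in a full rule. The rule name, conditions, and the statementsErrorMode and conditionLogicOperator values below are illustrative assumptions:

ottlRules:
  - ruleName: "parse-embedded-json"
    statementsErrorMode: "skip"
    conditionLogicOperator: "and"
    conditions:
      - 'workload == "service1"'
      - 'level == "error"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}(?<json_body>\\{.*\\})"))'
      - 'set(attributes["timestamp"], cache["timestamp"])'
      - 'set(attributes["level"], ToLowerCase(cache["level"]))'
      - 'set(cache["parsed_json"], ParseJSON(cache["json_body"]))'
      - 'merge_maps(attributes, cache["parsed_json"], "insert")'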
