# Traces Pipeline

## Overview

The Traces pipeline is powered by the [OpenTelemetry Transformation Language (OTTL)](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl), a flexible language designed for transforming telemetry data. OTTL provides a rich set of functions and operators that let you filter, enrich, and transform your traces with precision.

Here are some common use cases:

* **Transform attributes** - Rename or reshape span attributes so they match your naming or schema.
* **Filter traces** - Drop spans you don’t need (e.g. health checks) to reduce storage and noise.
* **Enrich data** - Parse or derive new attributes from existing fields for better filtering in the UI.
* **Obfuscation** - Remove sensitive information from traces.

## Setting up pipelines

{% hint style="info" %}
Only Admins can edit traces pipelines.
{% endhint %}

Go to [Settings > Data Pipelines > Traces Pipeline](https://app.groundcover.com/settings/traces-pipeline?backendId=groundcover\&tenantUUID=f038dbeb-8971-42fa-aede-b50ad4734d86\&duration=Last+5+minutes) to see your existing pipelines.

<figure><img src="/files/iFlCfw6Mjn6MutBAQQYx" alt=""><figcaption></figcaption></figure>

To edit your rules, click **Edit Pipeline** at the top right.

## Writing Pipeline Rules

#### Configuration rule format

The snippet below shows the format of a rule:

```yaml
ottlRules:
  - ruleName: <rule name>
    ruleDisabled: false
    statements:
      - <list of statements>
    conditions:
      - <list of conditions>
    statementsErrorMode: ignore
    conditionLogicOperator: and
```

* `ruleName` – Use a meaningful name for the rule. The name appears on the main screen and lets you track the rule's performance.
* `ruleDisabled` – Use this parameter to temporarily disable a rule.
* `statements` – Use one or more [functions](#common-functions) to describe the transformation to apply. For example, use `replace_all_matches` to locate a specific value and replace it with another.
* `conditions` – Use this optional parameter to control when the statements run. Define the list of conditions using the [OTTL Comparison Rules](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/LANGUAGE.md#comparison-rules).
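
As an illustration of how these parameters fit together, here is a minimal sketch of a filled-in rule. The rule name, workload, and namespace values are hypothetical, and the sketch assumes that `conditionLogicOperator: and` requires every listed condition to match before the statements run.

```yaml
ottlRules:
  - ruleName: tag-checkout-spans          # hypothetical rule name
    ruleDisabled: false
    conditions:
      - 'workload == "checkout"'          # hypothetical workload name
      - 'namespace == "production"'       # hypothetical namespace
    statements:
      # Tag every matching span with the owning team
      - 'set(attributes["team"], "payments")'
    statementsErrorMode: ignore
    conditionLogicOperator: and           # assumed: all conditions must match
```

Keeping the selection logic in `conditions` and the transformation in `statements` makes it easier to see at a glance which spans a rule touches.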

### **Common Functions**

Some commonly used functions in groundcover traces pipelines:

* `set` – Set attribute values
* `delete_key` – Remove specific attributes
* `merge_maps` – Merge extracted data into attributes
* `replace_pattern` – Replace text patterns (useful for obfuscation)
* `ParseJSON` – Parse JSON strings into structured attributes
* `IsMatch` – Match patterns in span fields
* `Concat` – Concatenate multiple values
* `Substring` – Extract portion of a string
* `obfuscate_pii` – Automatically detect and redact sensitive data

For a complete list of available OTTL functions, see the [OTTL Functions Reference](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/ottlfuncs/README.md).
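
The following sketch shows how a few of these functions might be combined in one `statements` list. The attribute names (`service_path`, `trace_prefix`, `debug_info`) are hypothetical; the function signatures follow the upstream OTTL reference.

```yaml
statements:
  # Concat(values, delimiter): join two read-only fields into one attribute
  - 'set(attributes["service_path"], Concat([namespace, workload], "/"))'
  # Substring(target, start, length): keep the first 8 characters of the trace ID
  - 'set(attributes["trace_prefix"], Substring(trace_id, 0, 8))'
  # delete_key(map, key): remove an attribute that is no longer needed
  - 'delete_key(attributes, "debug_info")'
```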

**drop** - Span Filtering

The `drop` field is a special boolean flag for filtering spans:

```yaml
statements:
  # Drop unconditionally
  - 'set(drop, true)'

  # Drop conditionally
  - 'set(drop, true) where IsMatch(span_name, "/health")'
```

### **Setting Conditions**

Use `conditions` to apply transformations only when specific criteria match. This ensures your pipeline runs efficiently and only on relevant spans.

Common fields you can use (the full list is under Available Fields and Keys):

* `workload` – Name of the service or app
* `namespace` – Kubernetes namespace
* `protocol_type` – Protocol type (e.g. `http`, `grpc`)
* `source` – Data source (`ebpf`, `otel`)
* `span_name` – Name of the span operation
* `kind` – Span kind (`Client`, `Server`, etc.)
* `status` – Span status code
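
A rough sketch of how several of these fields could be combined in one rule fragment is shown here. The span name pattern and the `api_call` attribute are hypothetical, and the sketch assumes that `conditionLogicOperator: and` means all conditions must hold.

```yaml
conditions:
  - 'source == "ebpf"'
  - 'protocol_type == "http"'
  # Hypothetical pattern; adjust it to how your span names are actually formed
  - 'IsMatch(span_name, "^GET /api/.*")'
conditionLogicOperator: and
statements:
  # Runs only for spans that satisfy every condition above
  - 'set(attributes["api_call"], true)'
```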

### Available Fields and Keys

When writing pipeline rules, you have access to various fields that represent different aspects of the span. Understanding which fields are writable and which are read-only is crucial for effective rule creation.

**Writable Fields**

These fields can be modified in your pipeline rules:

| Field              | Type    | Description                                                                    |
| ------------------ | ------- | ------------------------------------------------------------------------------ |
| `cache`            | Map     | Temporary storage for intermediate values. Access sub-keys with `cache["key"]` |
| `attributes`       | Map     | Custom span attributes. Access sub-keys with `attributes["key"]`               |
| `drop`             | Boolean | Set to `true` to drop/filter the span                                          |
| `span_name`        | String  | The operation name of the span                                                 |
| `trace_id`         | String  | Trace identifier (32 hex characters)                                           |
| `span_id`          | String  | Span identifier (16 hex characters)                                            |
| `parent_id`        | String  | Parent span identifier (16 hex characters)                                     |
| `start_time`       | Time    | Start timestamp of the span                                                    |
| `end_time`         | Time    | End timestamp of the span                                                      |
| `request_body`     | String  | HTTP request payload content                                                   |
| `response_body`    | String  | HTTP response payload content                                                  |
| `request_headers`  | Map     | HTTP request headers. Access sub-keys with `request_headers["header-name"]`    |
| `response_headers` | Map     | HTTP response headers. Access sub-keys with `response_headers["header-name"]`  |
| `is_pii`           | Boolean | Mark span as containing PII. Automatically set by `obfuscate_pii`              |

**Read-Only Fields**

These fields provide context but **cannot be modified**. Attempting to set them will result in an error:

| Field           | Type     | Description                                               |
| --------------- | -------- | --------------------------------------------------------- |
| `workload`      | String   | Name of the Kubernetes workload                           |
| `source`        | String   | Source of the trace data                                  |
| `cluster`       | String   | Kubernetes cluster name                                   |
| `env`           | String   | Environment name                                          |
| `namespace`     | String   | Kubernetes namespace                                      |
| `protocol_type` | String   | Protocol type (e.g. `http`, `grpc`, `redis`)              |
| `kind`          | String   | Span kind (`Client`, `Server`, `Internal`, etc.)          |
| `status`        | String   | Span status code (`Ok`, `Error`, `Unset`)                 |
| `duration`      | Duration | Span duration (computed from `start_time` and `end_time`) |
| `tags`          | Map      | Span tags (read-only). Access sub-keys with `tags["key"]` |
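
The difference between the two tables can be sketched as follows: read-only fields may be read as values or used in `where` clauses, while only writable fields may be the target of `set`. The attribute names are hypothetical.

```yaml
statements:
  # Allowed: read a read-only field and store it in a writable one
  - 'set(attributes["origin_cluster"], cluster)'
  # Allowed: use a read-only field in a where clause
  - 'set(attributes["failed"], true) where status == "Error"'
  # Not allowed: writing to a read-only field results in an error
  # - 'set(workload, "renamed")'
```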

**Special Fields**

**cache** - Temporary Storage

The `cache` field is useful for multi-step transformations:

```yaml
statements:
  # Extract to cache first
  - 'merge_maps(cache, ParseJSON(response_body), "insert") where response_body != nil'
  # Process cached values
  - 'set(attributes["user_id"], cache["id"]) where cache["id"] != nil'
```

**attributes** - Custom Span Attributes

The `attributes` map is where you store extracted or enriched structured data:

```yaml
statements:
  - 'set(attributes["team"], "payments")'
  - 'set(attributes["priority"], "high")'
```

**request\_body / response\_body** - Payload Access

Access HTTP payloads for obfuscation or inspection:

```yaml
statements:
  - 'replace_pattern(request_body, "password=([^&]+)", "password=[REDACTED]")'
  - 'set(response_body, "<REDACTED>") where is_pii == true'
```

**request\_headers / response\_headers** - HTTP Headers

Access and modify HTTP headers:

```yaml
statements:
  - 'set(response_headers["set-cookie"], "[REDACTED]")'
  - 'set(attributes["user_agent"], request_headers["user-agent"])'
```

### **Troubleshooting Rule Issues**

**High error rates:**

* Review the rule syntax for typos
* Check that field names exist on your spans
* Verify regex patterns are valid

**High processing latency:**

* Simplify complex regex patterns
* Split multi-step rules into separate rules
* Place drop rules earlier to reduce processing volume
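
One possible way to apply the last two suggestions is sketched here, assuming rules are evaluated in the order they are listed. The rule names and the `user_id` attribute are hypothetical.

```yaml
ottlRules:
  # Listed first: discard health-check spans before any heavier processing
  - ruleName: drop-health-checks          # hypothetical rule name
    ruleDisabled: false
    statements:
      - 'set(drop, true) where IsMatch(span_name, "/health")'
    statementsErrorMode: ignore
    conditionLogicOperator: and
  # Listed second: JSON parsing is kept in its own, separate rule
  - ruleName: extract-user-id             # hypothetical rule name
    ruleDisabled: false
    conditions:
      - 'response_body != nil'
    statements:
      - 'merge_maps(cache, ParseJSON(response_body), "insert")'
      - 'set(attributes["user_id"], cache["id"]) where cache["id"] != nil'
    statementsErrorMode: ignore
    conditionLogicOperator: and
```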

**Low condition met percentage:**

* Verify your conditions match actual span attributes
* Check if upstream rules are dropping spans unexpectedly
* Review filter conditions for typos

**Unexpected drop rates:**

* Review the rule logic and conditions
* Check if drop statements have unintended `where` clauses
* Verify the rule order isn't causing issues

