# Privacy Controls

With groundcover's [BYOC (Bring Your Own Cloud)](/architecture/byoc.md) deployment, all AI telemetry — including prompts, responses, and model metadata — stays within your own infrastructure. groundcover processes AI data exclusively in your environment; nothing is sent to external services.

For teams that need additional control, groundcover supports two approaches: **disable collection entirely** — a hard guarantee that no AI spans reach storage — or **metadata-only mode**, which keeps performance metadata (tokens, cost, latency, model) while stripping prompts and responses (best-effort; see scope note below). Both use [Traces Pipeline](/use-groundcover/data-pipelines/traces-pipeline.md) rules.

***

### Disable GenAI Data Collection

To prevent all GenAI spans from reaching storage, add the following rule to your traces pipeline. Non-GenAI traffic (HTTP, gRPC, DB, etc.) is completely unaffected.

1. Open [**Traces Pipeline Settings**](https://app.groundcover.com/settings/traces-pipeline)
2. Switch to **YAML mode**
3. Add the rule below
4. Save. Changes take effect on new spans immediately; no sensor restart is required.

{% code title="Traces Pipeline YAML" %}

```yaml
ottlRules:
  - ruleName: gc-genai-off
    conditions:
      - 'protocol_type == "gen_ai"'
    statements:
      - 'set(drop, true)'
```

{% endcode %}

This drops all GenAI spans — both eBPF-captured and SDK-instrumented — before they reach storage.

To re-enable GenAI data collection, remove the rule and save.

#### Custom Providers

groundcover auto-detects **OpenAI**, **Anthropic**, and **AWS Bedrock** traffic. If you use a provider that isn't auto-detected (e.g., Cohere, Gemini, or a self-hosted model), those calls appear as regular HTTP spans and are not affected by the rule above.

To drop spans from custom providers as well, add a second rule that matches their hostnames:

{% code title="Traces Pipeline YAML" %}

```yaml
ottlRules:
  - ruleName: gc-genai-custom-providers
    conditions:
      - 'attributes["http.host"] == "api.cohere.com"'
      - 'attributes["net.peer.name"] == "api.cohere.com"'
      - 'attributes["http.host"] == "llm.internal.yourcompany.com"'
      - 'attributes["net.peer.name"] == "llm.internal.yourcompany.com"'
    statements:
      - 'set(drop, true)'
    conditionLogicOperator: or
```

{% endcode %}

Replace the hostnames with your provider endpoints. Each hostname needs both an `http.host` and a `net.peer.name` condition because different instrumentation sources use different attribute names for the same host. With `conditionLogicOperator: or`, a span is dropped when any one of the conditions matches.

***

### Metadata-Only Mode

To keep performance metadata (tokens, cost, latency, model) while stripping prompts and responses from storage, combine two layers: content-capture settings at the SDK, and a pipeline rule for eBPF-captured spans.

#### SDK Spans

Most OTel GenAI instrumentation libraries support disabling content capture at the source. The standard environment variable is:

```bash
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=false
```

Check your SDK's documentation — some libraries use different configuration keys. When content capture is off, spans arrive with all performance metadata intact and no prompt or response text.
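As an illustration of where this variable might be set, the sketch below adds it to a Kubernetes Deployment. It assumes the instrumented service runs as a Deployment; the names `llm-service` and `app` and the image reference are hypothetical placeholders, and only the environment variable itself comes from the text above.

```yaml
# Hypothetical Deployment: names and image are placeholders, not groundcover defaults.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-service
  template:
    metadata:
      labels:
        app: llm-service
    spec:
      containers:
        - name: app
          image: registry.example.com/llm-service:latest
          env:
            # Environment values must be strings, so "false" is quoted.
            - name: OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
              value: "false"
```

If your SDK uses a different configuration key, the same `env` entry pattern applies with that key instead.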

#### eBPF Spans

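Add this pipeline rule to clear content fields from eBPF-captured GenAI spans. It deletes the content attributes of each attribute convention: Era 3 (`gen_ai.input.messages`, `gen_ai.output.messages`, `gen_ai.system_instructions`, `gen_ai.tool.definitions`), Era 1 (`gen_ai.prompt`, `gen_ai.completion`), and the indexed variants emitted by Traceloop/OpenLLMetry SDKs (keys such as `gen_ai.prompt.0.content`). It also blanks the captured `request_body` and `response_body`.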

{% code title="Traces Pipeline YAML" %}

```yaml
ottlRules:
  - ruleName: gc-genai-reduce-content
    conditions:
      - 'protocol_type == "gen_ai"'
    statements:
      - 'delete_key(attributes, "gen_ai.input.messages")'
      - 'delete_key(attributes, "gen_ai.output.messages")'
      - 'delete_key(attributes, "gen_ai.system_instructions")'
      - 'delete_key(attributes, "gen_ai.tool.definitions")'
      - 'delete_key(attributes, "gen_ai.prompt")'
      - 'delete_key(attributes, "gen_ai.completion")'
      - 'delete_matching_keys(attributes, "^gen_ai\\.(prompt|completion)\\.[0-9]+\\.")'
      - 'set(request_body, "")'
      - 'set(response_body, "")'
    statementsErrorMode: ignore
```

{% endcode %}

If you already have an `ottlRules:` block in your pipeline YAML, add only the `- ruleName:` entry to the existing list — do not add a second `ottlRules:` key.
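For example, a pipeline that already contains the custom-provider drop rule and then gains the content-stripping rule might look like the sketch below. Both rules are taken from this page; which rule is treated as the "existing" one is only an assumption for illustration.

{% code title="Traces Pipeline YAML" %}

```yaml
ottlRules:
  # Rule assumed to be already present in the pipeline.
  - ruleName: gc-genai-custom-providers
    conditions:
      - 'attributes["http.host"] == "api.cohere.com"'
      - 'attributes["net.peer.name"] == "api.cohere.com"'
    statements:
      - 'set(drop, true)'
    conditionLogicOperator: or
  # New rule appended to the same list, at the same indentation.
  - ruleName: gc-genai-reduce-content
    conditions:
      - 'protocol_type == "gen_ai"'
    statements:
      - 'delete_key(attributes, "gen_ai.input.messages")'
      - 'delete_key(attributes, "gen_ai.output.messages")'
      - 'delete_key(attributes, "gen_ai.system_instructions")'
      - 'delete_key(attributes, "gen_ai.tool.definitions")'
      - 'delete_key(attributes, "gen_ai.prompt")'
      - 'delete_key(attributes, "gen_ai.completion")'
      - 'delete_matching_keys(attributes, "^gen_ai\\.(prompt|completion)\\.[0-9]+\\.")'
      - 'set(request_body, "")'
      - 'set(response_body, "")'
    statementsErrorMode: ignore
```

{% endcode %}

There is one top-level `ottlRules:` key, and each rule is a separate `- ruleName:` item beneath it.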

{% hint style="warning" %}
**Scope of pipeline-side stripping:** This rule covers content fields for auto-detected providers across all SDK convention versions. It does not guarantee zero content in storage: tool call arguments, error messages that echo input, and spans from providers not yet auto-detected are not covered. For a hard guarantee, use the drop rule under [Disable GenAI Data Collection](#disable-genai-data-collection) above. For SDK spans, disabling content capture at the source is the definitive control.
{% endhint %}
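For providers that are not auto-detected, one possible way to narrow the gap is a hostname-matched rule that blanks the captured bodies instead of dropping the span. The following is a sketch, not a verified configuration: it assumes regular HTTP spans expose `request_body` and `response_body` the same way GenAI spans do, and the rule name `custom-provider-strip-bodies` is hypothetical. Check the result against a test span before relying on it.

{% code title="Traces Pipeline YAML" %}

```yaml
ottlRules:
  - ruleName: custom-provider-strip-bodies   # hypothetical rule name
    conditions:
      # Same hostname matching as the custom-provider drop rule.
      - 'attributes["http.host"] == "llm.internal.yourcompany.com"'
      - 'attributes["net.peer.name"] == "llm.internal.yourcompany.com"'
    statements:
      # Assumption: these fields are populated on plain HTTP spans.
      - 'set(request_body, "")'
      - 'set(response_body, "")'
    conditionLogicOperator: or
    statementsErrorMode: ignore
```

{% endcode %}

Like the rule for auto-detected providers, this is best-effort and does not replace the drop rule when a hard guarantee is required.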

For more surgical control — replacing specific JSON keys within request/response bodies with a placeholder rather than stripping entire fields — see [Obfuscate Traces](https://docs.groundcover.com/use-groundcover/data-pipelines/traces-pipeline/obfuscate-traces).

{% hint style="info" %}
Pipeline rules apply to **new spans only**. Existing data in storage is not affected.
{% endhint %}

