Obfuscate Logs

Overview

Protect sensitive data in your logs by masking or removing it before ingestion. By integrating data obfuscation directly into your log processing pipelines, you maintain privacy and meet compliance requirements while still retaining the necessary operational details.

Why Obfuscate Logs?

Logs often contain sensitive information that needs to be protected:

Personally Identifiable Information (PII) - emails, names, addresses
Credentials - API keys, tokens, passwords
Financial data - credit card numbers, account numbers
Internal system details - internal IPs, service IDs

Obfuscating this data helps you:

Meet compliance requirements (GDPR, PCI-DSS, HIPAA, etc.)
Protect customer privacy
Reduce security risks from leaked credentials
Maintain audit trails while removing sensitive details

Obfuscation happens in the sensor before logs are sent to storage, ensuring sensitive data never leaves your cluster.

Obfuscation Approaches

There are two primary approaches to obfuscating sensitive data:

1. Masking with replace_pattern

Replace parts of a string with a masking token (e.g., replacing email characters with asterisks). Use this when you want to preserve the field structure while hiding the sensitive value.

Best for: Emails, credit cards, partial IPs, phone numbers

2. Removing with delete_key

Remove fields that contain sensitive data entirely. Use this when the field is not required for downstream processing or analysis.

Best for: API keys, passwords, tokens, unnecessary PII
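The difference between the two approaches can be sketched in plain Go. This is an illustration only: the attribute map, the key names, and the obfuscate helper are hypothetical stand-ins for a log record, and the regex behavior assumes replace_pattern follows Go's regexp package, as upstream OTTL does.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailPattern keeps the first two characters and the domain of an address.
var emailPattern = regexp.MustCompile(`(.{2}).+(@.*)`)

// obfuscate applies both approaches to a hypothetical attribute map.
func obfuscate(attrs map[string]string) {
	// Masking: the key survives, the value is partially hidden.
	if v, ok := attrs["email"]; ok {
		attrs["email"] = emailPattern.ReplaceAllString(v, "$1****$2")
	}
	// Removing: the key and its value are dropped entirely.
	delete(attrs, "api_key")
}

func main() {
	attrs := map[string]string{
		"email":   "user@example.com",
		"api_key": "example-key",
		"path":    "/checkout",
	}
	obfuscate(attrs)
	fmt.Println(attrs) // map[email:us****@example.com path:/checkout]
}
```

After masking, downstream queries can still filter on the email field; after removal, nothing can reference api_key at all.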

Best Practices

Apply obfuscation early - Process at the sensor level before data is stored
Be specific with patterns - Avoid over-matching by using precise regex patterns
Test thoroughly - Use the Parsing Playground to verify obfuscation rules
Document your rules - Use clear ruleName values to explain what each rule protects
Balance utility and privacy - Mask data in a way that preserves operational value
Use conditions wisely - Only apply obfuscation where necessary to minimize overhead
Combine with dropping - Consider dropping entire logs containing sensitive data when appropriate
Audit regularly - Periodically review stored logs to confirm obfuscation is working correctly

Common Use Cases

Masking Email Addresses

Preserve email structure while hiding most characters.

ottlRules:
  - ruleName: "mask_email"
    conditions:
      - 'attributes["email"] != nil'
    statements:
      - 'replace_pattern(attributes["email"], "(.{2}).+(@.*)", "$1****$2")'

💡 Example: user@example.com → us****@example.com
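To check the pattern outside the sensor, here is a minimal Go sketch. It assumes replace_pattern follows Go's regexp syntax, as upstream OTTL does; maskEmail is a hypothetical helper, and the pattern is written with single backslash escaping because Go raw strings need no extra escaping. It also shows one edge case: the pattern needs at least three characters before the @, so shorter local parts do not match.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the mask_email rule.
var emailPattern = regexp.MustCompile(`(.{2}).+(@.*)`)

// maskEmail mirrors the replace_pattern statement of the rule.
func maskEmail(s string) string {
	return emailPattern.ReplaceAllString(s, "$1****$2")
}

func main() {
	for _, in := range []string{"user@example.com", "ab@example.com"} {
		fmt.Printf("%s -> %s\n", in, maskEmail(in))
	}
	// user@example.com -> us****@example.com
	// ab@example.com -> ab@example.com (no match, value left unchanged)
}
```

If short local parts occur in your data, verify in the Parsing Playground how you want them handled.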

Obfuscating Credit Card Numbers

Hide credit card digits while keeping the format recognizable.

ottlRules:
  - ruleName: "obfuscate_credit_card"
    conditions:
      - 'container_name == "payment-service"'
    statements:
      - 'replace_pattern(body, "(credit card:)[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}", "$1****-****-****-****")'

💡 Example: credit card:1234-5678-9012-3456 → credit card:****-****-****-****
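The pattern is deliberately narrow: it only fires when the digits follow the literal prefix with no space. The Go sketch below illustrates that trade-off, assuming replace_pattern follows Go's regexp syntax as upstream OTTL does; maskCard is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the obfuscate_credit_card rule.
var cardPattern = regexp.MustCompile(`(credit card:)[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}`)

// maskCard mirrors the replace_pattern statement of the rule.
func maskCard(body string) string {
	return cardPattern.ReplaceAllString(body, "$1****-****-****-****")
}

func main() {
	fmt.Println(maskCard("credit card:1234-5678-9012-3456"))
	// credit card:****-****-****-****

	// A space after the colon means the pattern does not match.
	fmt.Println(maskCard("credit card: 1234-5678-9012-3456"))
	// credit card: 1234-5678-9012-3456
}
```

If your service logs the number in more than one format, each format needs its own pattern or a more tolerant one.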

Removing API Keys

Completely remove API key fields from logs.

ottlRules:
  - ruleName: "remove_api_key"
    conditions:
      - 'attributes["api_key"] != nil'
    statements:
      - 'delete_key(attributes, "api_key")'

💡 What it does: Removes the entire api_key field from the log attributes.

Masking IP Addresses

Partially mask IP addresses for privacy while maintaining usefulness.

ottlRules:
  - ruleName: "mask_ip"
    conditions:
      - 'attributes["client_ip"] != nil'
    statements:
      - 'replace_pattern(attributes["client_ip"], "(\\d+\\.\\d+\\.)(\\d+\\.\\d+)", "${1}xxx.xxx")'

💡 Example: 192.168.1.100 → 192.168.xxx.xxx
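One detail worth checking when a group reference is followed directly by letters, digits, or underscores: in Go's regexp package, whose replacement syntax upstream OTTL uses, the name after $ is taken to be as long as possible. That makes $1xxx a reference to a group named 1xxx, which does not exist and expands to nothing, while ${1}xxx is group 1 followed by literal text. The sketch below assumes the sensor behaves the same way; maskIP is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the mask_ip rule.
var ipPattern = regexp.MustCompile(`(\d+\.\d+\.)(\d+\.\d+)`)

// maskIP applies the pattern with the given replacement template.
func maskIP(ip, replacement string) string {
	return ipPattern.ReplaceAllString(ip, replacement)
}

func main() {
	// Braces end the group name before the literal text.
	fmt.Println(maskIP("192.168.1.100", "${1}xxx.xxx")) // 192.168.xxx.xxx

	// Without braces, "$1xxx" is read as one (unknown) group name.
	fmt.Println(maskIP("192.168.1.100", "$1xxx.xxx")) // .xxx
}
```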

Obfuscating Passwords in Logs

Redact password values in log messages.

ottlRules:
  - ruleName: "remove_passwords"
    conditions:
      - 'IsMatch(body, "password")'
    statements:
      - 'replace_pattern(body, "(password[\"\\s:=]+)([^\\s,}\"]+)", "$1[REDACTED]")'

💡 Example: {"username": "john", "password": "secret123"} → {"username": "john", "password": "[REDACTED]"}
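For JSON bodies, it matters whether the character class for the value excludes the double quote. The Go sketch below compares two variants of that class on the example body, assuming replace_pattern follows Go's regexp syntax as upstream OTTL does. The variable and function names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Value class stops at whitespace, comma, or closing brace.
	loose = regexp.MustCompile(`(password["\s:=]+)([^\s,}]+)`)
	// Value class additionally stops at a double quote.
	strict = regexp.MustCompile(`(password["\s:=]+)([^\s,}"]+)`)
)

// redact replaces the matched value and keeps the captured prefix.
func redact(re *regexp.Regexp, body string) string {
	return re.ReplaceAllString(body, "$1[REDACTED]")
}

func main() {
	body := `{"username": "john", "password": "secret123"}`

	// The loose class swallows the closing quote of the value.
	fmt.Println(redact(loose, body))
	// {"username": "john", "password": "[REDACTED]}

	// The strict class leaves the closing quote in place.
	fmt.Println(redact(strict, body))
	// {"username": "john", "password": "[REDACTED]"}
}
```

Keeping the closing quote matters when downstream tools parse the body as JSON.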

Masking Social Security Numbers

Mask SSN while keeping the last 4 digits.

ottlRules:
  - ruleName: "mask_ssn"
    conditions:
      - 'workload == "user-service"'
    statements:
      - 'replace_pattern(body, "\\b(\\d{3}-\\d{2}-)(\\d{4})\\b", "***-**-$2")'

💡 Example: 123-45-6789 → ***-**-6789
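The word boundaries (\b) keep the pattern from firing inside longer digit runs. A small Go sketch of that behavior follows, assuming replace_pattern follows Go's regexp syntax as upstream OTTL does; maskSSN is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the mask_ssn rule.
var ssnPattern = regexp.MustCompile(`\b(\d{3}-\d{2}-)(\d{4})\b`)

// maskSSN hides the first five digits and keeps group 2 (the last four).
func maskSSN(body string) string {
	return ssnPattern.ReplaceAllString(body, "***-**-$2")
}

func main() {
	fmt.Println(maskSSN("ssn=123-45-6789")) // ssn=***-**-6789

	// Extra digits on either side break the word boundary, so nothing matches.
	fmt.Println(maskSSN("order 1123-45-67890")) // order 1123-45-67890
}
```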

Removing Authentication Tokens

Strip bearer tokens and authorization headers.

ottlRules:
  - ruleName: "remove_auth_tokens"
    conditions:
      - 'IsMatch(body, "Bearer") or IsMatch(body, "Authorization")'
    statements:
      - 'replace_pattern(body, "(Bearer\\s+)[A-Za-z0-9\\-._~+/]+=*", "$1[REDACTED]")'
      - 'replace_pattern(body, "(Authorization[\"\\s:=]+)[^\\s,}]+", "$1[REDACTED]")'

💡 What it does: Replaces token values while preserving the field names.
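Because the two statements run one after the other, it helps to trace what each one leaves behind. The Go sketch below does that for three hypothetical log lines, assuming replace_pattern follows Go's regexp syntax as upstream OTTL does and that statements run top to bottom; redactAuth is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Same patterns as the two statements of the remove_auth_tokens rule.
	bearer     = regexp.MustCompile(`(Bearer\s+)[A-Za-z0-9\-._~+/]+=*`)
	authHeader = regexp.MustCompile(`(Authorization["\s:=]+)[^\s,}]+`)
)

// redactAuth applies both statements in rule order.
func redactAuth(body string) string {
	body = bearer.ReplaceAllString(body, "$1[REDACTED]")
	return authHeader.ReplaceAllString(body, "$1[REDACTED]")
}

func main() {
	// Only the first statement matches.
	fmt.Println(redactAuth("sent Bearer abc123== to upstream"))
	// sent Bearer [REDACTED] to upstream

	// Both statements match: the second replaces the word "Bearer".
	fmt.Println(redactAuth("Authorization: Bearer abc.def-123"))
	// Authorization: [REDACTED] [REDACTED]

	// Non-bearer scheme: the second statement stops at the first space,
	// so only the scheme name is replaced.
	fmt.Println(redactAuth("Authorization: Basic dXNlcjpwYXNz"))
	// Authorization: [REDACTED] dXNlcjpwYXNz
}
```

The last case shows why sample lines for every authentication scheme your services log belong in your Parsing Playground tests.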

Comprehensive PII Protection

Combine multiple obfuscation rules to cover several kinds of PII in one pass.

ottlRules:
  - ruleName: "protect_pii"
    conditions:
      - 'workload == "user-service"'
    statements:
      # Mask emails
      - 'replace_pattern(body, "([a-zA-Z0-9._%+-]{2})[a-zA-Z0-9._%+-]+(@[a-zA-Z0-9.-]+)", "$1****$2")'
      # Mask credit cards before phone numbers so the phone pattern cannot consume part of an unseparated card number
      - 'replace_pattern(body, "\\b\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}\\b", "****-****-****-****")'
      # Mask phone numbers
      - 'replace_pattern(body, "(\\+?\\d{1,3}[-.\\s]?)(\\d{3})[-.\\s]?(\\d{3})[-.\\s]?(\\d{4})", "$1***-***-$4")'

💡 What it does: Applies multiple obfuscation patterns in a single rule.
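When several patterns share one rule, the order of the statements can change the result, because an earlier pattern may consume text that a later one was meant to catch. The Go sketch below compares both orders of the phone and credit card patterns on an unseparated 16-digit number. It assumes statements run top to bottom and that replace_pattern follows Go's regexp syntax, as in upstream OTTL; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	phone = regexp.MustCompile(`(\+?\d{1,3}[-.\s]?)(\d{3})[-.\s]?(\d{3})[-.\s]?(\d{4})`)
	card  = regexp.MustCompile(`\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b`)
)

func maskPhone(s string) string { return phone.ReplaceAllString(s, "$1***-***-$4") }
func maskCard(s string) string  { return card.ReplaceAllString(s, "****-****-****-****") }

func main() {
	body := "card 1234567890123456"

	// Phone first: the phone pattern matches the first 13 digits,
	// and the card pattern no longer finds 16 digits to mask.
	fmt.Println(maskCard(maskPhone(body))) // card 123***-***-0123456

	// Card first: the whole number is masked before the phone pattern runs.
	fmt.Println(maskPhone(maskCard(body))) // card ****-****-****-****
}
```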

Key Functions

replace_pattern

Replaces text matching a pattern with a replacement string.

Syntax:

- 'replace_pattern(target_field, "pattern", "replacement")'

With capture groups:

- 'replace_pattern(target_field, "(keep_this)remove_this(keep_this)", "$1****$2")'

Common patterns:

.+ - Match one or more characters (greedy)
.* - Match zero or more characters (greedy)
[A-Za-z0-9]+ - Match alphanumeric characters
\\d+ - Match digits
\\s+ - Match whitespace
[^\\s]+ - Match non-whitespace
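The greedy patterns deserve care, since .+ runs to the end of the line and can remove more than intended. A minimal Go sketch of the difference between .+ and a bounded class follows, assuming replace_pattern follows Go's regexp syntax as upstream OTTL does. The token= field and the helper name are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	greedy  = regexp.MustCompile(`(token=).+`)     // everything to end of line
	bounded = regexp.MustCompile(`(token=)[^\s]+`) // stops at whitespace
)

// redact keeps the captured prefix and replaces the rest of the match.
func redact(re *regexp.Regexp, line string) string {
	return re.ReplaceAllString(line, "$1[REDACTED]")
}

func main() {
	line := "token=abc123 user=bob"
	fmt.Println(redact(greedy, line))  // token=[REDACTED]
	fmt.Println(redact(bounded, line)) // token=[REDACTED] user=bob
}
```

The bounded form hides the same secret while keeping the user field available for troubleshooting.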

delete_key

Completely removes a key and its value from a map such as attributes.

Syntax:

- 'delete_key(attributes, "field_name")'

Example:

- 'delete_key(attributes, "api_key")'
- 'delete_key(attributes, "password")'
- 'delete_key(cache, "temporary_field")'

Regular Expression Tips

Capture Groups

Use parentheses to capture parts you want to keep:

# Pattern with capture groups
"(prefix)(middle)(suffix)"

# Replacement using $1, $2, $3
"$1[MASKED]$3"

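Groups are numbered by their opening parenthesis, left to right, and group 0 is the whole match. The Go sketch below prints the groups for a hypothetical id=...; fragment and then drops the middle one, assuming replace_pattern follows Go's regexp syntax as upstream OTTL does.

```go
package main

import (
	"fmt"
	"regexp"
)

// Three groups: prefix, middle, suffix.
var idPattern = regexp.MustCompile(`(id=)(\d+)(;)`)

// maskMiddle keeps groups 1 and 3 and replaces group 2.
func maskMiddle(s string) string {
	return idPattern.ReplaceAllString(s, "$1[MASKED]$3")
}

func main() {
	// Index 0 is the full match, 1..3 are the capture groups.
	fmt.Printf("%q\n", idPattern.FindStringSubmatch("id=12345;"))
	// ["id=12345;" "id=" "12345" ";"]

	fmt.Println(maskMiddle("id=12345;")) // id=[MASKED];
}
```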
Common Regex Patterns

Email:

"([a-zA-Z0-9._%+-]{2})[a-zA-Z0-9._%+-]+(@[a-zA-Z0-9.-]+)"

Credit Card:

"\\b\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}\\b"

IP Address:

"\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\b"

Phone Number:

"\\+?\\d{1,3}[-.\\s]?\\d{3}[-.\\s]?\\d{3}[-.\\s]?\\d{4}"

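Before putting these patterns into a rule, you can run them against sample values locally. The Go sketch below does that with hypothetical sample inputs, assuming the sensor uses Go's regexp syntax as upstream OTTL does. Patterns use single backslash escaping because Go raw strings need no extra escaping.

```go
package main

import (
	"fmt"
	"regexp"
)

type sample struct{ name, pattern, input string }

// The four patterns listed above, each paired with a sample value.
var samples = []sample{
	{"email", `([a-zA-Z0-9._%+-]{2})[a-zA-Z0-9._%+-]+(@[a-zA-Z0-9.-]+)`, "user@example.com"},
	{"credit card", `\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b`, "1234 5678 9012 3456"},
	{"ip address", `\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b`, "999.999.999.999"},
	{"phone", `\+?\d{1,3}[-.\s]?\d{3}[-.\s]?\d{3}[-.\s]?\d{4}`, "+1 555-123-4567"},
}

// matches reports whether the pattern finds a match anywhere in input.
func matches(pattern, input string) bool {
	return regexp.MustCompile(pattern).MatchString(input)
}

func main() {
	// Every sample is expected to print match=true.
	for _, s := range samples {
		fmt.Printf("%-12s %-20s match=%v\n", s.name, s.input, matches(s.pattern, s.input))
	}
}
```

Note that the IP pattern checks shape only, so it also accepts out-of-range values such as 999.999.999.999. For masking that is usually harmless.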