Parsing Logs

Overview

Transform unstructured log messages into structured, queryable data. By extracting fields from log messages, you can unlock powerful filtering, searching, and analysis capabilities in groundcover.

Why Parse Logs?

Raw log messages often contain valuable information buried in unstructured text. Parsing allows you to:

  • Extract meaningful fields from log messages for better searchability

  • Structure your data to enable powerful filtering and querying

  • Enrich logs with additional context and metadata

  • Standardize formats across different services and applications

Best Practices

  1. Use conditions effectively - Only parse logs from relevant workloads to minimize processing overhead

  2. Test in the Parsing Playground - Always test your parsing rules before deploying

  3. Cache intermediate results - Use the cache variable for temporary storage during multi-step transformations (see the sketch after this list)

  4. Be specific with patterns - More specific GROK patterns perform better and are less likely to cause false matches
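
As a minimal sketch of practices 1 and 3, the following rule scopes parsing to a single workload and stages GROK results in cache before merging them into attributes. The workload name, pattern, and field names are illustrative assumptions, not part of any real configuration:

ottlRules:
  - ruleName: "scoped_multi_step_parse"
    conditions:
      - 'workload == "checkout-service"'  # practice 1: parse only the relevant workload (hypothetical name)
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}%{GREEDYDATA:message}"))'  # practice 3: stage results in cache
      - 'keep_matching_keys(cache, "^[a-z_]+$")'  # keep only the snake_case keys before merging
      - 'merge_maps(attributes, cache, "insert")'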

Common Parsing Use Cases

Parsing JSON Logs

Extract fields from JSON-formatted log messages.

Log

{
  "body": "2025-03-23 14:55:12,456 ERROR {\"event\":\"user_login\",\"user_id\":12345,\"status\":\"failed\",\"ip\":\"192.168.1.10\"}"
}

Rule

ottlRules:
  - ruleName: "parse_json_logs"
    conditions:
      - 'workload == "auth-service"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL}%{SPACE}(?<json_body>\\{.*\\})"))'
      - 'set(cache["parsed_json"], ParseJSON(cache["json_body"]))'
      - 'merge_maps(attributes, cache["parsed_json"], "insert")'

Result

{
  "event": "user_login",
  "user_id": 12345,
  "status": "failed",
  "ip": "192.168.1.10"
}

Extracting Structured Data with GROK Patterns

Use GROK patterns to extract specific fields from formatted log messages.

Log

{
  "body": "2025-03-23 10:30:45 INFO User login attempt from 192.168.1.100"
}

Rule

ottlRules:
  - ruleName: "extract_login_info"
    conditions:
      - 'workload == "auth-service"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}User login attempt from %{IP:source_ip}"))'
      - 'keep_matching_keys(cache, "^[a-z_]+$")'
      - 'merge_maps(attributes, cache, "insert")'

Result

{
  "timestamp": "2025-03-23 10:30:45",
  "level": "INFO",
  "source_ip": "192.168.1.100"
}

Parsing Key-Value Pairs

Extract multiple key-value pairs from bracketed log sections.

Log

{
  "body": "2025-03-23 15:20:12,512 - EventProcessor - DEBUG - Completed event processing [analyzer_name=disk-space-check] [node_id=7f5e9aa8412d4c0003a7b2c5] [service_id=813dd10298f77700029d54e3] [sensor_id=3] [tracking_code=19fd5b6e72c7e94088a9ff3d] [log_id=b'67acfe0c92d43000'] [instance_id=microservice-7894563210]"
}

Rule

ottlRules:
  - ruleName: "parse_event_processor"
    conditions:
      - 'workload == "event-processor"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}-%{SPACE}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{LOGLEVEL:level}%{DATA}(?<kv>\\[%{GREEDYDATA})"))'
      - 'replace_pattern(cache["kv"], "[\\[\\]'']", "")'  # strip brackets and the stray quote in log_id
      - 'merge_maps(attributes, ParseKeyValue(cache["kv"]), "insert")'
      - 'set(attributes["timestamp"], cache["timestamp"])'

Result

{
  "instance_id": "microservice-7894563210",
  "analyzer_name": "disk-space-check",
  "node_id": "7f5e9aa8412d4c0003a7b2c5",
  "service_id": "813dd10298f77700029d54e3",
  "sensor_id": "3",
  "tracking_code": "19fd5b6e72c7e94088a9ff3d",
  "log_id": "b67acfe0c92d43000",
  "timestamp": "2025-03-23 15:20:12,512"
}

Common GROK Patterns

Here are some commonly used GROK patterns; a combined example follows the list:

  • %{TIMESTAMP_ISO8601} - ISO8601 timestamps (2025-03-23T10:30:45)

  • %{LOGLEVEL} - Log levels (INFO, ERROR, DEBUG, etc.)

  • %{IP} - IP addresses

  • %{NUMBER} - Numeric values

  • %{WORD} - Single words

  • %{NOTSPACE} - Non-whitespace characters

  • %{GREEDYDATA} - Matches as much text as possible (greedy)

  • %{DATA} - Matches as little text as possible (non-greedy)

  • %{SPACE} - Whitespace characters
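
For illustration, here is a hedged sketch composing several of these patterns in a single rule. The workload name and the assumed log shape (e.g. "2025-03-23 10:30:45 INFO OrderHandler took 142ms processing order") are made up for this example:

ottlRules:
  - ruleName: "combine_grok_patterns"
    conditions:
      - 'workload == "orders-api"'  # hypothetical workload, for illustration only
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}%{WORD:component} took %{NUMBER:duration_ms}ms%{SPACE}%{GREEDYDATA:message}"))'
      - 'merge_maps(attributes, cache, "insert")'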

Key Functions

ExtractGrokPatterns

Extracts structured data using GROK patterns.

Basic Usage:

- 'set(cache, ExtractGrokPatterns(body, "pattern"))'

Example - Extract timestamp and error code:

ottlRules:
  - ruleName: "extract_error_details"
    conditions:
      - 'level == "error"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "^%{TIMESTAMP_ISO8601:timestamp}.*Error %{NUMBER:error_code}:"))'
      - 'merge_maps(attributes, cache, "insert")'

ParseJSON

Parses a JSON string into a structured map.

Basic Usage:

- 'set(cache["parsed"], ParseJSON(cache["json_string"]))'

Example - Parse embedded JSON:

ottlRules:
  - ruleName: "parse_json_payload"
    conditions:
      - 'format == "json"'
    statements:
      - 'set(attributes["parsed_data"], ParseJSON(body))'
      - 'set(attributes["user_id"], attributes["parsed_data"]["user"]["id"])'

ParseKeyValue

Parses key=value formatted strings.

Basic Usage:

- 'merge_maps(attributes, ParseKeyValue(cache["kv"]), "insert")'

Example - Parse query parameters:

ottlRules:
  - ruleName: "parse_query_params"
    conditions:
      - 'IsMatch(body, "params=")'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "params=(?<params>.+)$"))'
      - 'merge_maps(attributes, ParseKeyValue(cache["params"], "=", "&"), "insert")'

merge_maps

Merges extracted data into the attributes map.

Basic Usage:

- 'merge_maps(attributes, cache, "insert")'

Modes:

  • insert - Only add keys that don't exist

  • update - Only update keys that exist

  • upsert - Add keys that don't exist and update keys that already do

Example:

ottlRules:
  - ruleName: "merge_extracted_fields"
    conditions:
      - 'workload == "api-gateway"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "user=(?<user>\\w+) action=(?<action>\\w+)"))'
      - 'merge_maps(attributes, cache, "upsert")'

keep_matching_keys

Filters a map to keep only keys matching a regex pattern.

Basic Usage:

- 'keep_matching_keys(cache, "^[a-z_]+$")'

Example - Keep only lowercase field names:

ottlRules:
  - ruleName: "filter_extracted_fields"
    conditions:
      - 'workload == "data-processor"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "(?<TIMESTAMP>\\d+) (?<user_id>\\d+) (?<action>\\w+)"))'
      - 'keep_matching_keys(cache, "^[a-z_]+$")'  # Keeps user_id and action, drops TIMESTAMP
      - 'merge_maps(attributes, cache, "insert")'

set

Sets a field or attribute to a given value.

Basic Usage:

- 'set(attributes["field"], "value")'
- 'set(cache["temp"], body)'

Example - Set computed fields:

ottlRules:
  - ruleName: "add_computed_fields"
    conditions:
      - 'workload == "payment-service"'
    statements:
      - 'set(attributes["environment"], "production")'
      - 'set(attributes["service_type"], "payment")'
      - 'set(attributes["processed_at"], UnixNano())'

replace_pattern

Replaces parts of a string matching a regex pattern.

Basic Usage:

- 'replace_pattern(target_field, "pattern", "replacement")'

Example - Clean up log messages:

ottlRules:
  - ruleName: "clean_log_format"
    conditions:
      - 'container_name == "nginx"'
    statements:
      - 'replace_pattern(body, "\\[|\\]", "")'  # Remove brackets
      - 'replace_pattern(body, "\\s+", " ")'    # Normalize whitespace

delete_key

Removes a specific key from a map.

Basic Usage:

- 'delete_key(attributes, "field_name")'

Example - Remove temporary fields:

ottlRules:
  - ruleName: "cleanup_fields"
    conditions:
      - 'workload == "processor"'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "(?<temp_field>\\w+) (?<user_id>\\d+)"))'
      - 'merge_maps(attributes, cache, "insert")'
      - 'delete_key(attributes, "temp_field")'

Concat

Concatenates multiple values into a single string, joined by a delimiter.

Basic Usage:

- 'set(attributes["combined"], Concat([value1, value2, value3], " - "))'

Example - Create composite fields:

ottlRules:
  - ruleName: "create_composite_id"
    conditions:
      - 'attributes["user_id"] != nil'
    statements:
      - 'set(attributes["request_id"], Concat([attributes["user_id"], attributes["timestamp"]], "_"))'
      - 'set(attributes["full_path"], Concat([attributes["namespace"], "/", attributes["workload"]], ""))'

ToLowerCase / ToUpperCase

Converts strings to lowercase or uppercase.

Basic Usage:

- 'set(attributes["normalized_level"], ToLowerCase(attributes["level"]))'
- 'set(attributes["region"], ToUpperCase(attributes["region"]))'

Example - Normalize field values:

ottlRules:
  - ruleName: "normalize_values"
    conditions:
      - 'attributes["level"] != nil'
    statements:
      - 'set(attributes["level"], ToLowerCase(attributes["level"]))'
      - 'set(attributes["env"], ToUpperCase(attributes["env"]))'

Substring

Extracts a portion of a string.

Basic Usage:

- 'set(attributes["short_id"], Substring(attributes["request_id"], 0, 8))'

Example - Extract prefixes and suffixes:

ottlRules:
  - ruleName: "extract_id_prefix"
    conditions:
      - 'attributes["transaction_id"] != nil'
    statements:
      - 'set(attributes["tx_prefix"], Substring(attributes["transaction_id"], 0, 6))'
      - 'set(attributes["tx_type"], Substring(attributes["transaction_id"], 0, 3))'

Split

Splits a string into an array based on a delimiter.

Basic Usage:

- 'set(cache["parts"], Split(body, ","))'

Example - Parse comma-separated values:

ottlRules:
  - ruleName: "parse_csv_log"
    conditions:
      - 'format == "csv"'
    statements:
      - 'set(cache["fields"], Split(body, ","))'
      - 'set(attributes["user_id"], cache["fields"][0])'
      - 'set(attributes["action"], cache["fields"][1])'
      - 'set(attributes["timestamp"], cache["fields"][2])'

Len

Returns the length of a string or array.

Basic Usage:

- 'set(attributes["message_length"], Len(body))'

Example - Conditional processing based on length:

ottlRules:
  - ruleName: "truncate_long_messages"
    conditions:
      - 'Len(body) > 1000'
    statements:
      - 'set(attributes["original_length"], Len(body))'
      - 'set(body, Substring(body, 0, 1000))'
      - 'set(attributes["truncated"], true)'

Int / Double / String

Type conversion functions.

Basic Usage:

- 'set(attributes["status_code_int"], Int(attributes["status_code"]))'
- 'set(attributes["response_time_ms"], Double(attributes["response_time"]))'
- 'set(attributes["user_id_str"], String(attributes["user_id"]))'

Example - Convert and compute:

ottlRules:
  - ruleName: "convert_and_calculate"
    conditions:
      - 'attributes["duration"] != nil'
    statements:
      - 'set(attributes["duration_ms"], Int(attributes["duration"]))'
      - 'set(attributes["duration_sec"], Double(attributes["duration"]) / 1000.0)'
      - 'set(attributes["is_slow"], attributes["duration_ms"] > 5000)'

IsMatch

Checks if a field matches a regex pattern.

Basic Usage:

- 'set(attributes["is_error"], IsMatch(body, "ERROR|FATAL"))'

Example - Conditional field extraction:

ottlRules:
  - ruleName: "extract_http_status"
    conditions:
      - 'IsMatch(body, "HTTP/\\d\\.\\d\\s+\\d{3}")'
    statements:
      - 'set(cache, ExtractGrokPatterns(body, "HTTP/%{NUMBER:http_version}\\s+%{NUMBER:status_code}"))'
      - 'merge_maps(attributes, cache, "insert")'
      - 'set(attributes["is_error"], Int(attributes["status_code"]) >= 400)'
