Merging Multiline Logs
Overview
Many applications write multiline logs: stack traces, pretty-printed JSON, or human-readable errors split across several stdout/stderr lines. Kubernetes and most log collectors treat each line as a separate log record by default, which breaks search, correlation, and parsing.
Use the OTTL function multiline_merge in the groundcover logs pipeline to reassemble continuation lines into a single log record whose body contains embedded newlines.
For the broader parsing workflow and other OTTL examples, see Parsing Logs. This page focuses on multiline behavior, common first-line patterns, and how to scope merge rules.
What Are Multiline Logs?
A multiline log is one logical event that spans multiple physical lines. Common cases:
Stack traces (Java, Python, Node.js, Go)
Pretty-printed or indented JSON
System messages where only the first line carries a timestamp
Each physical line is valid text on its own; the problem is where the collector draws record boundaries, not how the application writes its output.
Why Merge Multiline Logs?
Without merging:
Stack frames appear as unrelated log rows
Only the first line may match time or level parsers
JSON split across lines cannot be parsed as a single object
Noise increases (many low-value rows per incident)
After multiline_merge, one record represents the full event, so downstream parsing, ExtractGrokPatterns, JSON parsing, and log-based metrics behave as intended.
How multiline_merge Works
multiline_merge takes a first-line regex. Any line matching the pattern starts a new buffered block. Lines that do not match are treated as continuations of the current block until the buffer flushes (see Parsing Logs: multiline_merge).
Important
Put multiline_merge first in the rule’s statements list so later statements see the merged body when the buffer flushes.
Conditions must include continuation lines: for example, scope by workload or namespace, not by a pattern that matches only the first line of a stack trace. See Parsing Logs.
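The buffering rule described above can be sketched in a few lines of Python. This is only an illustration of the first-line-regex idea, not the pipeline's implementation: the function name merge_multiline and the sample log lines are hypothetical, Python's re module stands in for the pipeline's regex engine, and the flush limits (max_lines, max_time) are not modeled.

```python
import re
from typing import Iterable, List


def merge_multiline(lines: Iterable[str], first_line_pattern: str) -> List[str]:
    """Group physical lines into records.

    A line matching first_line_pattern starts a new record; any other
    line is appended to the record currently being built.
    """
    first_line = re.compile(first_line_pattern)
    records: List[str] = []
    block: List[str] = []
    for line in lines:
        # A new first line closes the block collected so far.
        if first_line.search(line) and block:
            records.append("\n".join(block))
            block = []
        block.append(line)
    if block:
        records.append("\n".join(block))
    return records


# Hypothetical sample input, invented for illustration.
sample = [
    "2024-05-01 12:00:00 ERROR Request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Handler.handle(Handler.java:42)",
    "2024-05-01 12:00:01 INFO Recovered",
]
merged = merge_multiline(sample, r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
```

With this input, merged holds two records: the first contains the error line plus its two continuation lines joined by newlines, and the second is the INFO line on its own.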
Arguments:
first_line_pattern: Regex for the start of a new block. Required.
max_lines: Flush after this many lines in one block. Default: 128.
max_time: Maximum wait for a continuation line (e.g. "800ms"). Default: "800ms".
Common First-Line Patterns
The exact regex depends on how the first line of each logical event is formatted. Below are common anchors. Escape backslashes in YAML strings (e.g. ^\\d{4}).
Java Stack Traces
Often the first line includes a timestamp and level, then the exception:
Before merge (three separate ingested lines)
Example rule (same style as Parsing Logs)
After merge (one record)
If the first line is ISO-8601 with T, tighten the pattern, for example:
'multiline_merge("^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}")'
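One way to sanity-check a first-line pattern before deploying it is to run it against a handful of representative lines. The sketch below does that in Python for the ISO-8601 pattern above, written with single backslashes as a raw string. The log lines are invented for illustration, and Python's re module is only a stand-in for the pipeline's regex engine, which may differ in details.

```python
import re

# The ISO-8601 pattern from the text, as a Python raw string.
ISO_FIRST_LINE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")

# Hypothetical Java log lines, invented for illustration.
lines = [
    "2024-05-01T12:00:00.123Z ERROR c.e.OrderService - Unhandled exception",
    "java.lang.NullPointerException: order is null",
    "\tat com.example.OrderService.process(OrderService.java:57)",
    "Caused by: java.lang.IllegalArgumentException: bad id",
    "2024-05-01 12:00:01 INFO space-separated timestamp",
]

# True where a line would start a new block.
starts_block = [bool(ISO_FIRST_LINE.search(line)) for line in lines]
```

Only the first line starts a block here. The last line shows the cost of tightening: a timestamp that uses a space instead of T no longer counts as a first line.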
Python Tracebacks
The block usually starts with Traceback (most recent call last)::
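A minimal Python check of an anchor on that header line is sketched below. The regex and the sample traceback are assumptions for illustration, not a rule taken from the product, and Python's re module stands in for the pipeline's regex engine. The sample includes a chained exception to show how a second Traceback header behaves.

```python
import re

# Assumed first-line anchor: the literal traceback header, parentheses escaped.
TRACEBACK_FIRST_LINE = re.compile(r"^Traceback \(most recent call last\):")

# Hypothetical output of a chained exception, invented for illustration.
lines = [
    "Traceback (most recent call last):",
    '  File "app.py", line 10, in <module>',
    "    main()",
    "ValueError: invalid literal for int() with base 10: 'x'",
    "",
    "During handling of the above exception, another exception occurred:",
    "",
    "Traceback (most recent call last):",
    '  File "app.py", line 14, in <module>',
    "RuntimeError: wrapper failed",
]

# Indexes of the lines that would start a new block.
first_line_indexes = [
    i for i, line in enumerate(lines) if TRACEBACK_FIRST_LINE.search(line)
]
```

Two lines match, at indexes 0 and 7, so a chained exception would likely be split into two records under this pattern. Ordinary log lines that follow a traceback also fail to match, so under the buffering rule they would presumably be appended to the block until it flushes; scoping and the flush limits matter for this pattern.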
Go Panics
Typical first line: panic: ... or fatal error: ...:
Adjust if your runtime prints a different leader (for example a timestamp prefix on the same line).
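The sketch below checks two candidate anchors in Python: a plain one for the leaders named above, and a variant that tolerates a timestamp prefix on the same line. Both regexes, the timestamp layout, and the sample lines are assumptions for illustration; Python's re module stands in for the pipeline's regex engine.

```python
import re

# Assumed anchor for the two leaders mentioned in the text.
GO_FIRST_LINE = re.compile(r"^(panic:|fatal error:)")

# Hypothetical variant allowing an optional "YYYY/MM/DD HH:MM:SS " prefix.
GO_FIRST_LINE_TS = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} )?(panic:|fatal error:)"
)

# Hypothetical panic output, invented for illustration.
lines = [
    "panic: runtime error: index out of range [5] with length 3",
    "",
    "goroutine 1 [running]:",
    "main.main()",
    "\t/app/main.go:12 +0x1d",
    "fatal error: all goroutines are asleep - deadlock!",
]
plain_matches = [bool(GO_FIRST_LINE.search(line)) for line in lines]

prefixed = "2024/05/01 12:00:00 panic: boom"
prefixed_plain = bool(GO_FIRST_LINE.search(prefixed))
prefixed_ts = bool(GO_FIRST_LINE_TS.search(prefixed))
```

In this sample only the panic: and fatal error: lines start a block; the goroutine header and frames are continuations. The prefixed line is missed by the plain anchor and caught by the variant.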
Node.js Error Stacks
Often Error: or TypeError: on the first line, then at ... lines:
Validate in the Parsing Playground: some stacks start with UnhandledPromiseRejectionWarning or custom prefixes — prefer the most stable prefix your app emits.
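As a rough offline check alongside the playground, the Python sketch below tries one possible anchor for names ending in Error. The regex and the sample lines are assumptions for illustration, and Python's re module stands in for the pipeline's regex engine.

```python
import re

# Assumed anchor: any identifier ending in "Error" followed by a colon,
# which covers "Error:", "TypeError:", "RangeError:", and similar.
NODE_FIRST_LINE = re.compile(r"^\w*Error:")

# Hypothetical Node.js output, invented for illustration.
lines = [
    "TypeError: Cannot read properties of undefined (reading 'id')",
    "    at getUser (/app/src/users.js:14:22)",
    "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
    "Error: connect ECONNREFUSED 127.0.0.1:5432",
    "(node:4242) UnhandledPromiseRejectionWarning: Error: boom",
]
matches = [bool(NODE_FIRST_LINE.search(line)) for line in lines]
```

The two error headers match and the at ... frames do not. The final line does not match either, because the prefix pushes Error: away from the start of the line; that is the case the paragraph above warns about.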
Multiline JSON
If each event starts with { or [ on its own line and continuation lines are indented or additional JSON lines, a simple anchor is the opening brace at column 1:
This can collide with single-line JSON logs that also start with {. Narrow with workload, namespace, or a more specific prefix (for example a known {"level": header).
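The Python sketch below illustrates both points: grouping on an opening brace or bracket yields blocks that parse as JSON, and a single-line JSON log matches the same anchor. The regex, the inline grouping loop, and the sample lines are assumptions for illustration; the loop is a simplified stand-in for the merge and ignores flush limits.

```python
import json
import re

# Assumed anchor: "{" or "[" in the first column.
JSON_FIRST_LINE = re.compile(r"^[{\[]")

# Hypothetical input: one pretty-printed object, then a single-line object.
lines = [
    "{",
    '  "level": "error",',
    '  "msg": "boom"',
    "}",
    '{"level": "info", "msg": "single line"}',
]

# Simplified grouping: a matching line closes the previous block.
blocks, current = [], []
for line in lines:
    if JSON_FIRST_LINE.search(line) and current:
        blocks.append("\n".join(current))
        current = []
    current.append(line)
if current:
    blocks.append("\n".join(current))

# Each merged block is now a complete JSON document.
parsed = [json.loads(block) for block in blocks]
```

The four pretty-printed lines become one parseable object, while none of them parses alone. The single-line log also starts a block, so the anchor by itself cannot tell the two formats apart.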
Timestamp-Prefixed Logs
Any scheme where only new events begin with a timestamp works well:
Tune for comma versus dot milliseconds and timezone tokens.
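The effect of that tuning can be seen by comparing a strict pattern with a more tolerant one. Both regexes and the sample lines below are assumptions for illustration, and Python's re module stands in for the pipeline's regex engine.

```python
import re

# Strict: space separator and dot milliseconds only.
STRICT = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")

# Tolerant: space or "T" separator, dot or comma milliseconds.
TOLERANT = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[.,]\d{3}")

# Hypothetical log lines, invented for illustration.
lines = [
    "2024-05-01 12:00:00.123 INFO dot millis",
    "2024-05-01 12:00:00,123 INFO comma millis",
    "2024-05-01T12:00:00.123+02:00 INFO ISO with offset",
    "    at com.example.Foo.bar(Foo.java:1)",
]
strict_matches = [bool(STRICT.search(line)) for line in lines]
tolerant_matches = [bool(TOLERANT.search(line)) for line in lines]
```

The strict pattern recognizes only the first line, so the comma and ISO variants would be treated as continuations. The tolerant pattern recognizes all three timestamped lines and still rejects the stack frame. Because the pattern is anchored only at the start, a trailing timezone token does not need to be matched for the line to count as a first line.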
Bracket-Prefixed Leaders
Some services print a tag line (for example [INFO], [ERROR], === summary) and continue with indented lines that are not timestamp-prefixed. A timestamp-only multiline_merge pattern never treats those tag lines as a new block, so each continuation line becomes its own log row.
Use a first-line pattern that matches those leaders, for example:
Escape backslashes as required by your YAML layer — the pipeline editor and the API-returned YAML often show doubled backslashes before \d, \[, and similar tokens.
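The sketch below shows one possible leader pattern and the relationship between its doubled-backslash form and the regex that is finally evaluated. The pattern, the sample lines, and the single replace call used to undo one escaping layer are all assumptions for illustration; the actual unescaping is done by the YAML and OTTL layers, and Python's re module stands in for the pipeline's regex engine.

```python
import re

# Pattern as it might appear with doubled backslashes in YAML.
escaped = r"^(\\d{4}-\\d{2}-\\d{2}|\\[(INFO|WARN|ERROR)\\]|=== )"

# Simplified stand-in for one unescaping layer: "\\" becomes "\".
effective = escaped.replace("\\\\", "\\")

LEADER = re.compile(effective)

# Hypothetical log lines, invented for illustration.
lines = [
    "[ERROR] payment failed",
    "    order_id=42",
    "    reason=card declined",
    "=== summary",
    "    processed=10",
    "2024-05-01 12:00:00 INFO timestamped line",
]
matches = [bool(LEADER.search(line)) for line in lines]
```

After unescaping, the effective regex has single backslashes before d and the square brackets. The tag lines, the === leader, and the timestamped line each start a block, while the indented lines are continuations.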
Mixed Multiline Types in One Workload
If one process emits Java stacks, Python tracebacks, and JSON blobs, a single regex rarely fits all first lines. Practical options:
Emit an explicit block header line: for example, a fixed prefix such as ^\d{4}-\d{2}-\d{2}T... \[my-app\] channel= before each multiline block, then use one multiline_merge scoped to that workload.
Split by container: different container_name values with one rule each.
Multiple rules with disjoint conditions: only if each log line can match at most one rule’s condition set for the whole merge window (avoid conditions that only match the first line).
