
Parse structured messages when normalizing log events #130333

@eyalkoren

Description

It's common for the message of a log event to itself be a JSON-encoded document. Specifically, there are shippers that produce ECS-JSON.
We would like to handle this automatically through the normalize_for_stream ingest processor. The idea is to add an additional step, so that the processor does the following (a rough sketch of the flow follows the list below):

  1. If it's OTel data: use it as is
  2. If it's not OTel:
    1. apply a cheap check of whether the message is a JSON-encoded object (e.g. it starts with { and ends with })
    2. If yes:
      1. Parse the message as JSON*
      2. Apply a cheap check of whether the resulting object is ECS (e.g. it contains a @timestamp key)
      3. If it's ECS: merge the resulting object back into the root of the document*
      4. If not: add the resulting object as-is as the value of the body.structured field
    3. Proceed with namespacing/normalization as before

* JSON parsing and merge should behave exactly as defined in logs@json-pipeline
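
For illustration, here is a minimal, self-contained sketch of that flow. The helper names (`isOtel`, `namespaceAndNormalize`, `tryParse`) are hypothetical, Jackson stands in for whatever JSON parsing the real processor would use, and the simple `putAll`/dotted-field handling is a placeholder for the exact merge semantics of logs@json-pipeline:

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Map;

final class StructuredMessageSketch {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    static void normalize(Map<String, Object> doc) {
        if (isOtel(doc)) {
            return; // 1. OTel data: use as is
        }
        Object message = doc.get("message");
        // 2.1 cheap check: does the message look like a JSON object?
        if (message instanceof String s) {
            String trimmed = s.trim();
            if (trimmed.startsWith("{") && trimmed.endsWith("}")) {
                Map<String, Object> parsed = tryParse(trimmed);
                if (parsed != null) {
                    // 2.2.2 cheap ECS check: presence of a @timestamp key
                    if (parsed.containsKey("@timestamp")) {
                        // 2.2.3 merge into the root (simplified; the real merge
                        // must behave as logs@json-pipeline does)
                        doc.putAll(parsed);
                    } else {
                        // 2.2.4 keep the parsed object under body.structured
                        doc.put("body.structured", parsed);
                    }
                }
            }
        }
        namespaceAndNormalize(doc); // 2.3 existing namespacing/normalization
    }

    private static Map<String, Object> tryParse(String json) {
        try {
            return MAPPER.readValue(json, new TypeReference<Map<String, Object>>() {});
        } catch (JsonProcessingException e) {
            return null; // not valid JSON after all; leave the message untouched
        }
    }

    private static boolean isOtel(Map<String, Object> doc) {
        // hypothetical stand-in for the processor's real OTel detection
        return doc.containsKey("resource") && doc.containsKey("scope");
    }

    private static void namespaceAndNormalize(Map<String, Object> doc) {
        // placeholder for the existing normalize_for_stream logic
    }
}
```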

@dakrone is it a problem to have dependencies between one processor and another? More specifically, is it possible for the NormalizeForStreamProcessor to use JsonProcessor#apply?
