otlpjsonfilereceiver

Version: v0.154.0
Published: Jun 9, 2026 | License: Apache-2.0 | Imports: 15 | Imported by: 10

README

OTLP JSON File Receiver

This receiver reads pipeline data from JSON files. The data is written in the Protobuf JSON encoding of the OpenTelemetry Protocol (OTLP).

Status
Stability: development (profiles), alpha (traces, metrics, logs)
Distributions: contrib
Code Owners: @atoulme (seeking more code owners)

The receiver watches the paths matched by include and reads the matching files. If a file is added or updated, the receiver reads it again in its entirety.

The data is serialized according to the OpenTelemetry Protocol File Exporter specification.

Getting Started

The following settings are required:

  • include: a list of glob patterns matching the files to include in data collection

Example:

receivers:
  otlp_json_file:
    include:
      - "/var/log/*.log"
    exclude:
      - "/var/log/example.log"

Note: The deprecated component type otlpjsonfile (without the underscores) can still be used as an alias, but it logs a deprecation warning.
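
A receiver only runs once a pipeline references it. The following is a minimal sketch of a complete collector configuration. The debug exporter and the choice of a logs pipeline are assumptions made for illustration and are not part of this README.

```yaml
receivers:
  otlp_json_file:
    include:
      - "/var/log/*.log"

exporters:
  # Assumption: the debug exporter is included in your collector distribution.
  debug:

service:
  pipelines:
    # The same receiver can also be listed under traces and metrics pipelines.
    logs:
      receivers: [otlp_json_file]
      exporters: [debug]
```

Replace the exporter with whichever backend you actually send data to.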

Configuration

| Field | Default | Description |
| --- | --- | --- |
| include | required | A list of file glob patterns that match the file paths to be read. |
| exclude | [] | A list of file glob patterns to exclude from reading. This is applied against the paths matched by include. |
| exclude_older_than | | Exclude files whose modification time is older than the specified age. |
| start_at | end | At startup, where to start reading logs from the file. Options are beginning or end. |
| multiline | | A multiline configuration block. See the File Log Receiver documentation for details. |
| force_flush_period | 500ms | Time since new data was last found in the file, after which a partial log at the end of the file may be emitted. |
| encoding | utf-8 | The encoding of the file being read. See the list of supported encodings below for available options. |
| preserve_leading_whitespaces | false | Whether to preserve leading whitespaces. |
| preserve_trailing_whitespaces | false | Whether to preserve trailing whitespaces. |
| include_file_name | true | Whether to add the file name as the attribute log.file.name. |
| include_file_path | false | Whether to add the file path as the attribute log.file.path. |
| include_file_name_resolved | false | Whether to add the file name after symlink resolution as the attribute log.file.name_resolved. |
| include_file_path_resolved | false | Whether to add the file path after symlink resolution as the attribute log.file.path_resolved. |
| include_file_owner_name | false | Whether to add the file owner name as the attribute log.file.owner.name. Not supported on Windows. |
| include_file_owner_group_name | false | Whether to add the file group name as the attribute log.file.owner.group.name. Not supported on Windows. |
| include_file_permissions | false | Whether to add the file permissions as the attribute log.file.permissions in 3-digit octal format (e.g., 755). Not supported on Windows. |
| include_file_record_number | false | Whether to add the record number in the file as the attribute log.file.record_number. |
| include_file_record_offset | false | Whether to add the record offset in the file as the attribute log.file.record_offset. |
| poll_interval | 200ms | The duration between filesystem polls. |
| fingerprint_size | 1000 | The number of bytes, read from the start of a file, used to uniquely identify it. Must be at least 16. Decreasing this value will trigger re-ingestion of files larger than the new fingerprint size. |
| initial_buffer_size | 16KiB | The initial size of the read buffer for headers and logs; the buffer grows as necessary. Larger values may lead to unnecessarily large buffer allocations, and smaller values may lead to many copies while the buffer grows. |
| max_log_size | 1MiB | The maximum size of a log entry to read. The behavior for oversized log entries is controlled by max_log_size_behavior. Protects against reading large amounts of data into memory. |
| max_log_size_behavior | split | Behavior when a log entry exceeds max_log_size. Options are split (default), which splits oversized entries into multiple log entries, or truncate, which truncates the entry and drops the remainder. |
| max_concurrent_files | 1024 | The maximum number of log files from which logs will be read concurrently. If the number of files matched by the include pattern exceeds this number, files are processed in batches. |
| max_batches | 0 | Only applicable when files must be batched in order to respect max_concurrent_files. This value limits the number of batches that will be processed during a single poll interval. A value of 0 indicates no limit. |
| delete_after_read | false | If true, each log file will be read and then immediately deleted. Requires that the filelog.allowFileDeletion feature gate is enabled. Must be false when start_at is set to end. |
| acquire_fs_lock | false | Whether to attempt to acquire a filesystem lock before reading a file (Unix only). |
| file_cache_advise | false | Hints the operating system to release cached file pages after they are read, helping reduce page cache usage for large sequential workloads (Linux only). |
| storage | none | The ID of a storage extension to be used to store file offsets. File offsets allow the receiver to pick up where it left off in the case of a collector restart. If no storage extension is used, the receiver will manage offsets in memory only. |
| header | nil | Specifies options for parsing header metadata. Requires that the filelog.allowHeaderMetadataParsing feature gate is enabled. Must not be set when start_at is set to end. Note: because this receiver does not run a stanza pipeline, attributes produced by header.metadata_operators are not propagated to emitted records; the header block is accepted only to allow header lines to be consumed without being parsed as OTLP JSON. |
| header.pattern | required for header metadata parsing | A regex that matches every header line. |
| header.metadata_operators | required for header metadata parsing | A list of operators used to parse metadata from the header. |
| ordering_criteria.regex | | Regular expression used for sorting. Should contain the named capture groups that are used in regex_key. |
| ordering_criteria.group_by | | Regular expression used for grouping, which is done before sorting. Should contain named capture groups. |
| ordering_criteria.top_n | 1 | The number of files to track when using file ordering. The top N files are tracked after applying the ordering criteria. |
| ordering_criteria.sort_by.regex_key | | Named capture group, defined in ordering_criteria.regex, to use for sorting. |
| ordering_criteria.sort_by.sort_type | | Type of sorting to be performed (e.g., numeric, alphabetical, timestamp, mtime). |
| ordering_criteria.sort_by.location | | Relevant if sort_type is set to timestamp. Defines the location of the timestamp of the file. |
| ordering_criteria.sort_by.format | | Relevant if sort_type is set to timestamp. Defines the strptime format of the timestamp being sorted. |
| ordering_criteria.sort_by.ascending | | Sort direction. |
| compression | | Indicates the compression format of input files. If set, files are read using a reader that decompresses the file before scanning its content. Options are "" (empty), gzip, or auto. auto detects the file compression type automatically. Currently, gzip files are the only compressed files that are auto-detected, based on their headers (see RFC 1952). The auto option is useful when ingesting a mix of compressed and uncompressed files with the same receiver. |
| polls_to_archive | 0 | Controls the number of poll cycles to store on disk rather than discard. By default, the receiver purges the record of readers that have existed for 3 generations. Refer to archiving and polling in the File Log Receiver documentation for more details. Note: This feature is experimental. |
| on_truncate | ignore | Behavior when a file with the same fingerprint is detected but with a smaller size (indicating a copytruncate rotation). Options are ignore, read_whole_file, or read_new. See Handling Copytruncate Rotation. |
| replay_file | false | If true, the receiver will not track file offsets and will re-read files from the beginning on every poll. |
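
Several of the fields above are typically combined. As a hedged sketch, the fragment below persists file offsets across collector restarts and reads existing files from the beginning. It assumes the file_storage extension is available in your distribution; the directory and include paths are placeholders, and the pipelines section is omitted.

```yaml
extensions:
  # Assumption: the file_storage extension is available; the directory is a placeholder.
  file_storage:
    directory: /var/lib/otelcol/file_storage

receivers:
  otlp_json_file:
    include:
      - /var/log/otlp/*.json   # placeholder path
    start_at: beginning        # read existing content on first start
    storage: file_storage      # ID of the storage extension that holds offsets
    exclude_older_than: 24h    # skip files not modified within the last day
    poll_interval: 1s

service:
  extensions: [file_storage]
```

Without the storage field, offsets are kept in memory only, so a restart starts again from the position given by start_at.
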

Supported encodings

| Key | Description |
| --- | --- |
| nop | No encoding validation. Treats the file as a stream of raw bytes. |
| utf-8 | UTF-8 encoding |
| utf-8-raw | UTF-8 encoding without replacing invalid UTF-8 bytes |
| utf-16le | UTF-16 encoding with little-endian byte order |
| utf-16be | UTF-16 encoding with big-endian byte order |
| ascii | ASCII encoding |
| big5 | The Big5 Chinese character encoding |

Other less common encodings are supported on a best-effort basis. See https://www.iana.org/assignments/character-sets/character-sets.xhtml for other encodings available.
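
For example, a file written as UTF-16 with little-endian byte order could be read with a fragment like the one below. The include path is a placeholder.

```yaml
receivers:
  otlp_json_file:
    include:
      - /var/log/otlp/*.json   # placeholder path
    encoding: utf-16le         # one of the keys listed above
```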

Time parameters

All time parameters must specify a unit of time, e.g. 200ms, 1s, or 1m.
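
As a short illustration, each duration field below carries an explicit unit. The values and the include path are arbitrary placeholders, not recommendations.

```yaml
receivers:
  otlp_json_file:
    include:
      - /var/log/otlp/*.json   # placeholder path
    poll_interval: 1s          # seconds
    force_flush_period: 500ms  # milliseconds
    exclude_older_than: 1h     # hours
    # Avoid bare numbers such as 200; the unit is required.
```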

Handling Copytruncate Rotation

When log files are rotated using the copytruncate strategy (where the file is copied and then truncated in place), the receiver can detect when a file has been truncated by comparing the stored offset with the current file size. The on_truncate setting controls how the receiver behaves when truncation is detected:

  • ignore (default): The receiver keeps the original offset and will not read any data until the file grows past the original offset. This prevents duplicate log ingestion when a file is rotated.
  • read_whole_file: The receiver resets the offset to 0 and reads the entire file from the beginning. Use this mode when you want to ensure no data loss, even if it means potentially re-reading some logs.
  • read_new: The receiver updates the offset to the current file size (the position after truncation). This allows reading new data that is written after the truncation without re-reading existing content.

Example configuration:

receivers:
  otlp_json_file:
    include:
      - /var/log/otlp/*.json
    on_truncate: read_whole_file  # Read entire file after copytruncate rotation

Documentation

Overview

Package otlpjsonfilereceiver implements a receiver that can be used by the OpenTelemetry Collector to receive logs, traces, and metrics from files. See https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/file-exporter.md#json-file-serialization

Functions

func NewFactory

func NewFactory() receiver.Factory

NewFactory creates a factory for the OTLP JSON file receiver.

Types

type Config

type Config struct {
	fileconsumer.Config `mapstructure:",squash"`
	StorageID           *component.ID `mapstructure:"storage"`
	ReplayFile          bool          `mapstructure:"replay_file"`
}
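
To illustrate how this struct maps to YAML: because fileconsumer.Config is squashed, its fields (such as include and start_at) sit at the same level as storage and replay_file. The sketch below uses placeholder values.

```yaml
receivers:
  otlp_json_file:
    # Fields from the squashed fileconsumer.Config
    include:
      - /var/log/otlp/*.json   # placeholder path
    start_at: beginning
    # Config.ReplayFile (mapstructure:"replay_file")
    replay_file: true
    # Config.StorageID (mapstructure:"storage") is left unset here
```

With replay_file set to true, offsets are not tracked, so the storage field is left out of this sketch.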

Directories

| Path | Synopsis |
| --- | --- |
| internal/metadata | Package metadata contains the autogenerated telemetry and build information for the receiver/otlp_json_file component. |
