Reduce

Example Configuration

Merge Ruby exceptions

Config

[transforms.my_transform_id]
type = "reduce"
starts_when = "match(.message, /^[^\\s]/)"
group_by = [ "host", "pid", "tid" ]

  [transforms.my_transform_id.merge_strategies]
  message = "concat_newline"

Input

[
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.223543Z",
      "message": "foobar.rb:6:in `/': divided by 0 (ZeroDivisionError)",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.223543Z",
      "message": " from foobar.rb:6:in `bar'",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.223543Z",
      "message": " from foobar.rb:2:in `foo'",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.223543Z",
      "message": " from foobar.rb:9:in `<main>'",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:22.123528Z",
      "message": "Hello world, I am a new log",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  }
]

Output

[
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.223543Z",
      "message": "foobar.rb:6:in `/': divided by 0 (ZeroDivisionError)\n from foobar.rb:6:in `bar'\n from foobar.rb:2:in `foo'\n from foobar.rb:9:in `<main>'",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:22.123528Z",
      "message": "Hello world, I am a new log",
      "host": "host-1.hostname.com",
      "pid": 1234,
      "tid": 5678
    }
  }
]

Reduce Rails logs into a single transaction

Config

[transforms.my_transform_id]
type = "reduce"

Input

[
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.223543Z",
      "message": "Received GET /path",
      "request_id": "abcd1234",
      "request_path": "/path",
      "request_params": {
        "key": "val"
      }
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:21.832345Z",
      "message": "Executed query in 5.2ms",
      "request_id": "abcd1234",
      "query": "SELECT * FROM table",
      "query_duration_ms": 5.2
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:22.457423Z",
      "message": "Rendered partial _partial.erb in 2.3ms",
      "request_id": "abcd1234",
      "template": "_partial.erb",
      "render_duration_ms": 2.3
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:22.543323Z",
      "message": "Executed query in 7.8ms",
      "request_id": "abcd1234",
      "query": "SELECT * FROM table",
      "query_duration_ms": 7.8
    }
  },
  {
    "log": {
      "timestamp": "2020-10-07T12:33:22.742322Z",
      "message": "Sent 200 in 15.2ms",
      "request_id": "abcd1234",
      "response_status": 200,
      "response_duration_ms": 5.2
    }
  }
]

Output

{
  "log": {
    "timestamp": "2020-10-07T12:33:21.223543Z",
    "timestamp_end": "2020-10-07T12:33:22.742322Z",
    "request_id": "abcd1234",
    "request_path": "/path",
    "request_params": {
      "key": "val"
    },
    "query_duration_ms": 13,
    "render_duration_ms": 2.3,
    "status": 200,
    "response_duration_ms": 5.2
  }
}

Configuration Options

Required Options

inputs (required)

A list of upstream source or transform IDs. Wildcards (*) are supported.

See configuration for more info.

Type: array
Syntax: literal
Example: ["my-source-or-transform-id", "prefix-*"]

type (required)

The component type. This is a required field for all components and tells Vector which component to use.

Type: string
Syntax: literal
Example: "reduce"

Advanced Options

ends_when (optional)

A condition used to distinguish the final event of a transaction. If this condition resolves to true for an event, the current transaction is immediately flushed with this event.

Type: string
Syntax: literal
Example: ".status_code != 200 && !includes([\"info\", \"debug\"], .severity)"
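As a rough illustration, the Rails example above could close each transaction on the event that carries a response status. This is a sketch rather than a tested configuration: the component ID `rails_transactions`, the upstream ID `rails_logs`, and the assumption that only the final event of a request has a `response_status` field are all hypothetical.

```toml
# Hypothetical component and input IDs; adjust to your own pipeline.
[transforms.rails_transactions]
type = "reduce"
inputs = ["rails_logs"]
group_by = ["request_id"]
# Assumes only the last event of a request carries `response_status`.
# When the condition is true, the group is flushed including this event.
ends_when = "exists(.response_status)"
```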

expire_after_ms (optional)

The maximum time, in milliseconds, to wait after the last event is received before a combined event is considered complete.

Type: uint
Default: 30000

flush_period_ms (optional)

Controls how often, in milliseconds, Vector checks for (and flushes) expired events.

Type: uint
Default: 1000
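The two timing options are usually tuned together. The fragment below is a minimal sketch with illustrative values, not recommendations; the `inputs` ID and the `request_id` grouping field are placeholders borrowed from the examples on this page.

```toml
[transforms.my_transform_id]
type = "reduce"
inputs = ["my-source-or-transform-id"]
group_by = ["request_id"]
# Consider a group complete after 10 s without new events (default: 30000).
expire_after_ms = 10000
# Check for expired groups every 500 ms (default: 1000).
flush_period_ms = 500
```

Because expiry is only detected when the periodic check runs, an idle group under these settings would presumably be emitted somewhere between 10 and about 10.5 seconds after its last event.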

group_by (optional)

An ordered list of fields by which to group events. Each group is combined independently, allowing you to keep independent events separate. When no fields are specified, all events will be combined in a single group. Events missing a specified field will be combined in their own group.

Type: array
Syntax: literal
Example: ["request_id", "user_id", "transaction_id"]

merge_strategies (optional)

A map of field names to custom merge strategies. For each field specified, the given strategy is used to combine events instead of the default behavior.

The default behavior is as follows:

  1. The first value of a string field is kept, subsequent values are discarded.
  2. For timestamp fields the first is kept and a new field [field-name]_end is added with the last received timestamp value.
  3. Numeric values are summed.

Type: hash
Syntax: literal
Example: {"method": "discard", "path": "discard", "duration_ms": "sum", "query": "array"}
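In TOML, the hash is written as a sub-table of the transform, as in the Ruby example at the top of this page. The sketch below applies the strategy names from the example row to the fields of the Rails example; the choice of fields and the `inputs` ID are illustrative assumptions, and the comments describe each strategy only as far as its name and the defaults listed above suggest.

```toml
[transforms.my_transform_id]
type = "reduce"
inputs = ["my-source-or-transform-id"]
group_by = ["request_id"]

  [transforms.my_transform_id.merge_strategies]
  # Collect every query string into an array instead of keeping only the first.
  query = "array"
  # Sum the per-query durations (this matches the default for numeric fields).
  query_duration_ms = "sum"
  # Keep the first message and drop the rest.
  message = "discard"
```

Fields not listed in the sub-table continue to follow the default behavior.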

starts_when (optional)

A condition used to distinguish the first event of a transaction. If this condition resolves to true for an event, the previous transaction is flushed (without this event) and a new transaction is started.

Type: string
Syntax: literal
Example: ".status_code != 200 && !includes([\"info\", \"debug\"], .severity)"

How it Works

State

This component is stateful, meaning its behavior changes based on previous inputs (events). State is not preserved across restarts, so state-dependent behavior resets on each restart and depends only on the inputs (events) received since the most recent restart.