Performance-Tuning enhancement proposal.
alanconway committed Feb 21, 2024
1 parent 9469001 commit e8e62c1
Showing 1 changed file with 274 additions and 0 deletions.
274 changes: 274 additions & 0 deletions enhancements/cluster-logging/performance-tuning.md
@@ -0,0 +1,274 @@
---
title: performance-tuning
authors:
- "@alanconway"
reviewers:
- "@jcantrill"
approvers:
- "@jcantrill"
api-approvers:
- "@jcantrill"
creation-date: 2023-12-15
last-updated: 2023-12-15
tracking-link:
- https://issues.redhat.com/browse/OBSDA-549
see-also:
replaces:
superseded-by:
---

# Performance tuning

## Summary

A _performance tuning_ API to control performance, reliability and special protocol features of an output,
without exposing the complexity of the underlying collector configuration.

**Note**
- Only Vector will be supported initially; there are no current plans to back-port to fluentd.
- The existing `outputs[].limit` rate limiting feature is separate from this proposal. The implementations may interact.

## Motivation

Performance and reliability tuning in the underlying collector configuration is complex and error prone.
We want to expose a sufficient set of controls for realistic use-cases,
but we don't want to expose the full configuration surface of Vector.

### User Stories

#### As a cluster logging administrator I want to minimize log loss.

``` yaml
outputs:
- name: minimize-log-loss
  tuning:
    delivery: AtLeastOnce
```
#### As a cluster logging administrator I want to maximize throughput.
``` yaml
outputs:
- name: maximize-throughput
  tuning:
    delivery: AtMostOnce
```
#### As a cluster logging administrator I want to tweak features of a specific output.
``` yaml
outputs:
- name: detailed-tweaks
  tuning:
    compression: "zlib" # Enable protocol-specific compression
    reconnect:
      maxDelay: 1s # Wait max of 1 second between reconnect attempts
    batch:
      maxBytes: 50K # Send max 50K of data in each write.
```
### Goals
- Accommodate users with different performance and reliability trade-offs.
- Allow users to tune output parameters that we cannot optimize automatically.
- Expose a simple, general purpose, collector-neutral tuning API
- Use vector's end-to-end acknowledgements for more efficient reliability.
### Non-Goals
- Not a full end-to-end delivery guarantee: we are limited by the reliability of the source and sink.
- Not exposing underlying vector configuration details.
- No rate limiting - already provided by `outputs[].limit` field.
- No plans to support fluentd.

## Proposal

### Workflow Description

The **logging administrator** is a human user responsible for setting output tuning parameters as needed.
The logging administrator needs to tune log collection performance and reliability to suit local requirements.

### API Extensions

New field `ClusterLogForwarder.outputs[].tuning` with the following sub-fields:

#### delivery: AtMostOnce|AtLeastOnce|AtLeastOncePersistent

##### `AtMostOnce`

Logs may be lost if the collector restarts or other faults occur.
Logs will not be duplicated _by the collector_.
They may be duplicated in protocol exchanges with the source or sink.
This mode does no persistent storage and no acknowledgement book-keeping, so has the lowest CPU and disk use.

Use `AtMostOnce` when:
- It is acceptable to lose some logs due to restarts or faults (network outage, remote store failures)
- It is important to minimize memory, CPU, disk and network costs.
- The logging system is expected to run near capacity and is likely to be overloaded.

##### `AtLeastOnce`

Logs _read by the collector_ will not be lost _by the collector_ due to restarts or faults.
Logs may still be lost:
- Before the collector reads them: if the collector cannot keep up with the incoming log rate, e.g. the log file rotation rate.
- After the collector sends them: if the output protocol is unreliable, or the target store can drop logs.
- If rate limiting is enabled: logs may be dropped by the collector to enforce rate limits.

Use `AtLeastOnce` when:
- It is important to avoid log loss due to restarts or failures.
- The collector is properly resourced (memory, CPU, disk) for the maximum log throughput at expected peak loads.
- Log rates are not expected to exceed the collection rate in normal operation.

`AtLeastOnce` uses end-to-end acknowledgements without persistent buffering if possible.
Acknowledgements give similar reliability to persistent buffering but are more efficient.
A persistent buffer is used if acknowledgements are not available.

##### `AtLeastOncePersistent`

Like `AtLeastOnce`, but forces the use of a persistent buffer, possibly in addition to acknowledgements.
Using buffer and acknowledgements together gives better reliability than buffering alone.

Use `AtLeastOncePersistent` when:
- You have a special situation where end-to-end acknowledgements alone do not work.
- You want to force the use of an extra-large buffer to work around large overload spikes.
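
As a minimal sketch in the style of the user stories, an output that forces persistent buffering might look like this. The output name `force-disk-buffer` is hypothetical; only the `tuning.delivery` field and its value come from this proposal.

``` yaml
outputs:
- name: force-disk-buffer # hypothetical output name
  tuning:
    delivery: AtLeastOncePersistent # disk buffer, plus acknowledgements where available
```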

**Note**: Very large buffers _cannot fix long-term overload_, they can only work around temporary spikes.
The _long-term_ average throughput _must_ be within the collector's capacity, otherwise the system cannot "catch up".
Once the buffer fills, logs will be lost as if the buffer wasn't there,
and logs that are delivered will have high latency from waiting around in the buffer.

**Note**: The log collector _cannot_ guarantee fully reliable end-to-end delivery.
It has no control over the reliability or throughput of sources and sinks.
`AtLeastOnce` takes advantage of reliability features at the source and sink, but the end-to-end result is only as good as the weakest link.

#### compression: string

Enable compression, if the output supports it:
- Value indicates compression type (e.g. "zlib", "gzip"), valid values depend on the output type.
- Error if output does not support compression or does not recognize the value.

#### batch: object

A "batch" is the content of a single "write" or "send" operation.
Defaults are determined by the underlying sink.

- `maxRecords`: Max number of records to include in a batch.
- `maxBytes`: Max size in bytes of a batch
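
A minimal sketch combining both batch limits on one output. The output name and the record count are illustrative assumptions; the `50K` size notation follows the user story example.

``` yaml
outputs:
- name: batch-limits # hypothetical output name
  tuning:
    batch:
      maxRecords: 1000 # illustrative value: at most 1000 records per write
      maxBytes: 50K    # at most 50K of data per write
```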

#### reconnect: object

Controls connect and reconnect attempts.
Defaults are determined by the underlying sink.

- `minDelay`: minimum delay between attempts.
- `maxDelay`: maximum delay between attempts.

The delay is `minDelay` for the first attempt, then increases up to `maxDelay`, after which attempts repeat at that delay.
Back-off algorithm is determined by the underlying sink.
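
A minimal sketch setting both ends of the reconnect delay range. The output name and the specific durations are illustrative assumptions; the duration notation follows the `1s` used in the user story example.

``` yaml
outputs:
- name: reconnect-range # hypothetical output name
  tuning:
    reconnect:
      minDelay: 1s  # delay before the first reconnect attempt
      maxDelay: 30s # upper bound on the delay between attempts
```

How the delay grows between these two values is left to the underlying sink, so this fragment only bounds the range.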

### Implementation Details

#### Causes of log loss

1. *Overload*: Logs are produced faster than they can be processed.
2. *Faults*: Collector restarts, network errors, remote store problems etc.

Log loss can be avoided for _temporary_ faults or overloads.
Persistent buffers and/or acknowledgements can store or re-send lost in-memory data.

_Sustained overload_ lasting long enough to exceed buffering capacity _will_ cause data loss.
The only remedy is to ensure that the log collector and store can keep up with the rate of log production.

#### Acknowledgements

[End-to-end acknowledgement](https://vector.dev/docs/about/under-the-hood/architecture/end-to-end-acknowledgements)
means that source acknowledgements are _delayed_ until all relevant sinks have received the data.
After restart, the source can re-send data that did not reach all sinks - the source acts as a persistent buffer.

This is more efficient than duplicating the data again in a vector disk buffer,
but only works for sources that support acknowledgement and "at-least-once" reliable delivery.

Examples of acknowledgement sources:
- Kafka: Kafka protocol has acknowledgements and at-least-once delivery.
- HTTP: HTTP response can be used to implement at-least-once.
Not all HTTP clients do this, but REST clients for data streaming usually do.
- **File**: Vector's persistent "position" file can be used like an "acknowledgement" for file sources.
Reading from the persisted position after restart is equivalent to at-least-once delivery.
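
As a rough illustration in Vector's own YAML configuration (not the `ClusterLogForwarder` API), the sketch below enables acknowledgements on a sink fed by a Kafka source. The component names, broker address, topic and URI are placeholders, and the configuration actually generated by the operator may differ.

``` yaml
sources:
  app_logs: # placeholder source name
    type: kafka
    bootstrap_servers: "kafka.example.com:9092" # placeholder address
    group_id: "log-collector"                   # placeholder consumer group
    topics:
    - "app-logs"                                # placeholder topic
sinks:
  remote_store: # placeholder sink name
    type: http
    inputs:
    - app_logs
    uri: "https://logs.example.com/ingest" # placeholder URI
    encoding:
      codec: json
    acknowledgements:
      enabled: true # acknowledge to the source only after the sink has delivered the data
```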

#### Delivery policy implementation

`AtLeastOnce` always enables end-to-end acknowledgements on sources and sinks that allow it.
If a source does not allow it, then `AtLeastOnce` implements a persistent buffer using the
same default size as Vector's default in-memory buffer.

``` pseudo-code
for each `AtLeastOnce` output:
  set sink.acknowledgement=true for attached sink(s).
  for each source, of each input, of each pipeline to the output:
    if source can participate in acknowledgement:
      set source.acknowledgement=true
    else
      set output.buffer=disk
```

`AtLeastOncePersistent` always enables disk buffering on the output.
It _also_ enables end-to-end acknowledgements on sources and sinks that allow it.

Buffering with acknowledgement is more reliable than buffering alone.
Without acknowledgement, records are dropped from the buffer as soon as they are sent;
with acknowledgement, they are held until the remote acknowledges that they are safely stored.
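
A hedged sketch of what the combined case could look like in Vector's YAML configuration, assuming an HTTP sink. The sink name, input name, URI and buffer size are illustrative assumptions; check the size against Vector's documented limits for disk buffers.

``` yaml
sinks:
  remote_store: # placeholder sink name
    type: http
    inputs:
    - app_logs # placeholder: an upstream source or transform
    uri: "https://logs.example.com/ingest" # placeholder URI
    encoding:
      codec: json
    acknowledgements:
      enabled: true # hold records until the remote confirms receipt
    buffer:
      type: disk          # persist buffered records across collector restarts
      max_size: 268435488 # illustrative size in bytes
      when_full: block    # apply back-pressure rather than dropping new records
```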

**Note**: Rate limits set by `outputs[].limit` are still enforced with `AtLeastOnce*`, even though this
means deliberately dropping log records from the collector.
Review the existing limit code when implementing delivery policy so that the two work together properly.

### Risks and Mitigations

- Added complexity to forwarder.
- Support cost of customers abusing or misunderstanding the parameters.
- Increased customer demand for help "sizing" the logging stack: setting resources and predicting performance.

No new security risks are expected.

### Drawbacks
None.
## Design Details

### Open Questions
None.

### Test Plan
None yet.

May need improved tests to measure throughput and loss under sustained load.

### Graduation Criteria
None.

### Upgrade / Downgrade Strategy

Default configuration is backwards compatible.

### Version Skew Strategy
None.

### Operational Aspects of API Extensions

Backwards compatible extension to `ClusterLogForwarder` CR.

#### Failure Modes

Should make existing failure modes more predictable and reliable.

#### Support Procedures

None.


## Implementation History

None.

## Alternatives

Expose detailed vector configuration: fails our overall mission of simplified configuration.
