Introduce spooling to disk #6581

Merged 4 commits on Apr 5, 2018

Changes from all commits
2 changes: 1 addition & 1 deletion .travis.yml
@@ -68,7 +68,7 @@ jobs:
     go: $GO_VERSION
     stage: test
   - os: linux
-    env: TARGETS="-C libbeat stress-tests"
+    env: STRESS_TEST_OPTIONS="-timeout=20m -race -v -parallel 1" TARGETS="-C libbeat stress-tests"
     go: $GO_VERSION
     stage: test

1 change: 1 addition & 0 deletions CHANGELOG.asciidoc
@@ -207,6 +207,7 @@ https://github.com/elastic/beats/compare/v6.0.0-beta2...master[Check the HEAD diff]
- Add appender support to autodiscover {pull}6469[6469]
- Add add_host_metadata processor {pull}5968[5968]
- Retry configuration to load dashboards if Kibana is not reachable when the beat starts. {pull}6560[6560]
- Add support for spooling to disk to the beats event publishing pipeline. {pull}6581[6581]

*Auditbeat*

63 changes: 63 additions & 0 deletions NOTICE.txt
@@ -372,6 +372,16 @@ License type (autodetected): Apache-2.0
Apache License 2.0


--------------------------------------------------------------------
Dependency: github.com/elastic/go-txfile
Version: v0.0.1
Revision: 7e7e33cc236f30fff545f3ee2c35ada5b70b6b13
License type (autodetected): Apache-2.0
./vendor/github.com/elastic/go-txfile/LICENSE:
--------------------------------------------------------------------
Apache License 2.0


--------------------------------------------------------------------
Dependency: github.com/elastic/go-ucfg
Version: v0.5.1
@@ -2052,6 +2062,41 @@ DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

--------------------------------------------------------------------
Dependency: github.com/theckman/go-flock
Version: v0.4.0
Revision: b139a2487364247d91814e4a7c7b8fdc69e342b2
License type (autodetected): BSD-3-Clause
./vendor/github.com/theckman/go-flock/LICENSE:
--------------------------------------------------------------------
Copyright (c) 2015, Tim Heckman
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of linode-netint nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

--------------------------------------------------------------------
Dependency: github.com/tsg/gopacket
Revision: f289b3ea3e41a01b2822be9caf5f40c01fdda05c
@@ -2087,6 +2132,24 @@ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

--------------------------------------------------------------------
Dependency: github.com/urso/go-bin
Revision: 781c575c9f0eb3cb9dca94521bd7ad7d5aec7fd4
License type (autodetected): Apache-2.0
./vendor/github.com/urso/go-bin/LICENSE:
--------------------------------------------------------------------
Apache License 2.0


--------------------------------------------------------------------
Dependency: github.com/urso/qcgen
Revision: 0b059e7db4f40a062ca3d975b7500c6a0a968d87
License type (autodetected): Apache-2.0
./vendor/github.com/urso/qcgen/LICENSE:
--------------------------------------------------------------------
Apache License 2.0


--------------------------------------------------------------------
Dependency: github.com/vmware/govmomi
Revision: 2cad15190b417804d82edb4981e7b3e62907c4ee
60 changes: 60 additions & 0 deletions auditbeat/auditbeat.reference.yml
@@ -148,6 +148,66 @@ auditbeat.modules:
# if the number of events stored in the queue is < min_flush_events.
#flush.timeout: 1s

# The spool queue will store events in a local spool file, before
# forwarding the events to the outputs.
#
# Beta: spooling to disk is currently a beta feature. Use with care.
#
# The spool file is a circular buffer, which blocks once the file/buffer is full.
# Events are put into a write buffer and flushed once the write buffer
# is full or the flush_timeout is triggered.
# Once ACKed by the output, events are removed immediately from the queue,
# making space for new events to be persisted.
#spool:

Member: I assume only one queue can be used at the same time. Should we mention that in a comment on line 130?

Author (@urso, Mar 28, 2018): Do we need to? If you configure two queue types, Beats will report an error that only one is allowed. It's using the dictionary-style plugin config we have in quite a few places in Beats.

Member: I mainly brought it up because people keep trying to use two outputs in Beats, and they will try the same with queues. Having it documented makes it easy to point people to it, so they know it from the docs before running the Beat.
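To make the constraint discussed above concrete, here is a sketch of the dictionary-style queue config; the error wording in the comment is paraphrased, not an actual Beats log message:

# Invalid: two queue types configured at once; the Beat reports an error
# that only one queue type is allowed.
#queue:
#  mem:
#    events: 4096
#  spool:
#    file.path: "${path.data}/spool.dat"

# Valid: exactly one key under `queue` selects the queue type.
queue.spool:
  file.path: "${path.data}/spool.dat"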
Member: Why not call this file or disk? It would make it more obvious that it is persistent.

Contributor: We should probably align the naming/convention, when possible, with the Logstash persistent queue. There we use queue.type, accepting these two values: memory (default) and persisted.

Author (@urso, Mar 28, 2018): @ph @ruflin

Hmmm... introducing queue.type would be a backwards-compatibility-breaking change. As one can have only one queue type, the queue config follows the dictionary-style config pattern:

queue.<type>:

What's wrong with spool?

Right now this one implements a FIFO. For metricbeat use cases we might actually introduce another persistent LIFO queue type, which should get another type name. How would you name any of these now?

Contributor: Regarding breaking backwards compatibility, after looking at filebeat.reference.yml:

#queue:
  # Queue type by name (default 'mem')
  # The memory queue will present all available events (up to the outputs
  # bulk_max_size) to the output, the moment the output is ready to serve
  # another batch of events.
  #mem:
    # Max number of events the queue can buffer.

you are right; I actually think our config makes it cleaner. @ruflin @kvch I would vote to keep @urso's suggestion. Concerning the naming (spool or disk or file), I think spool is a really common name for this kind of setting.

Contributor: As for LIFO vs. FIFO, it's more like a priority. It could also become more complex and take the kind of event into consideration, e.g. on a storage system, disk usage vs. CPU (probably a bad example).

Member: I don't think we should introduce a type, for the reasons mentioned by @urso. There can only be one queue at a time.

For me, spooling can be to disk or to memory. Historically we had a spooler in filebeat which kept the data in memory. Another option would be to call it file?

+1 on the proposal from @ph about the priority. FIFO or LIFO is a config option of the queue. It can mean a completely different implementation in the background, but the user should not have to worry about that.

Author: Regarding priority, I was initially thinking the same. But if we have a priority setting, the user must be allowed to change it between restarts.

However, the async ACK behaviour of the queue makes the on-disk structure a little more complicated. When introducing stack-like functionality, we will end up with many holes; that is, freeing space will be somewhat more complicated in the LIFO case. I'd like to solve the LIFO case separately, potentially merging both cases into a common file format later in time. A priority-based queue using heaps might become even more complicated.

Contributor: @urso Looks like a good compromise.

@ruflin You have a point; disk or file, at the discretion of @urso.
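To contrast the two shapes discussed in this thread, a sketch follows; the queue.type form is hypothetical for Beats (it was rejected above), and only the dictionary form exists:

# Logstash-style type attribute (hypothetical; would break existing configs):
#queue:
#  type: spool
#
# Beats dictionary style: the key under `queue` names the type, and only
# one such key may be configured.
queue.spool:
  file.size: 100MiB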

# The file namespace configures the file path and the file creation settings.
# Once the file exists, the `size`, `page_size` and `prealloc` settings
# will have no effect anymore.
#file:
# Location of spool file. The default value is ${path.data}/spool.dat.
#path: "${path.data}/spool.dat"

# Configure file permissions if the file is created. The default value is 0600.
#permissions: 0600

# File size hint. The spool blocks once this limit is reached. The default value is 100 MiB.
#size: 100MiB

# The file's page size. A file is split into multiple pages of the same size. The default value is 4KiB.
#page_size: 4KiB

# If prealloc is set, the required space for the file is reserved using
# truncate. The default value is true.
#prealloc: true

# Spool writer settings.
# Events are serialized into a write buffer. The write buffer is flushed if:
# - The buffer size limit (buffer_size) has been reached.
# - The configured limit of buffered events (flush.events) is reached.
# - The flush timeout is triggered.
#write:
# Sets the write buffer size.
#buffer_size: 1MiB

# Maximum duration after which events are flushed, if the write buffer
# is not full yet. The default value is 1s.
#flush.timeout: 1s

# Maximum number of buffered events. The write buffer is flushed once this
# limit is reached.
#flush.events: 16384

# Configure the on-disk event encoding. The encoding can be changed
# between restarts.
# Valid encodings are: json, ubjson, and cbor.
#codec: cbor
#read:
# Reader flush timeout, waiting for more events to become available, so
# as to fill a complete batch, as required by the outputs.
# If flush_timeout is 0, all available events are forwarded to the
# outputs immediately.
# The default value is 0s.
#flush.timeout: 0s

# Sets the maximum number of CPUs that can be executing simultaneously. The
# default is the number of logical CPUs available in the system.
#max_procs:
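Pulling the commented defaults together, a minimal sketch of an enabled spool queue. Every value below is taken from the reference comments above and matches the documented defaults; the nesting (codec under write) follows the grouping of those comments:

queue.spool:
  file:
    path: "${path.data}/spool.dat"
    permissions: 0600
    size: 100MiB
    page_size: 4KiB
    prealloc: true
  write:
    buffer_size: 1MiB
    flush.timeout: 1s
    flush.events: 16384
    codec: cbor
  read:
    flush.timeout: 0s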
60 changes: 60 additions & 0 deletions filebeat/filebeat.reference.yml
@@ -629,6 +629,66 @@ filebeat.inputs:
# if the number of events stored in the queue is < min_flush_events.
#flush.timeout: 1s

# The spool queue will store events in a local spool file, before
# forwarding the events to the outputs.
#
# Beta: spooling to disk is currently a beta feature. Use with care.
#
# The spool file is a circular buffer, which blocks once the file/buffer is full.
# Events are put into a write buffer and flushed once the write buffer
# is full or the flush_timeout is triggered.
# Once ACKed by the output, events are removed immediately from the queue,
# making space for new events to be persisted.
#spool:
# The file namespace configures the file path and the file creation settings.
# Once the file exists, the `size`, `page_size` and `prealloc` settings
# will have no effect anymore.
#file:
# Location of spool file. The default value is ${path.data}/spool.dat.
#path: "${path.data}/spool.dat"

# Configure file permissions if the file is created. The default value is 0600.
#permissions: 0600

# File size hint. The spool blocks once this limit is reached. The default value is 100 MiB.
#size: 100MiB

# The file's page size. A file is split into multiple pages of the same size. The default value is 4KiB.
#page_size: 4KiB

# If prealloc is set, the required space for the file is reserved using
# truncate. The default value is true.
#prealloc: true

# Spool writer settings.
# Events are serialized into a write buffer. The write buffer is flushed if:
# - The buffer size limit (buffer_size) has been reached.
# - The configured limit of buffered events (flush.events) is reached.
# - The flush timeout is triggered.
#write:
# Sets the write buffer size.
#buffer_size: 1MiB

# Maximum duration after which events are flushed, if the write buffer
# is not full yet. The default value is 1s.
#flush.timeout: 1s

# Maximum number of buffered events. The write buffer is flushed once this
# limit is reached.
#flush.events: 16384

# Configure the on-disk event encoding. The encoding can be changed
# between restarts.
# Valid encodings are: json, ubjson, and cbor.
#codec: cbor
#read:
# Reader flush timeout, waiting for more events to become available, so
# as to fill a complete batch, as required by the outputs.
# If flush_timeout is 0, all available events are forwarded to the
# outputs immediately.
# The default value is 0s.
#flush.timeout: 0s

# Sets the maximum number of CPUs that can be executing simultaneously. The
# default is the number of logical CPUs available in the system.
#max_procs:
60 changes: 60 additions & 0 deletions heartbeat/heartbeat.reference.yml
@@ -257,6 +257,66 @@ heartbeat.scheduler:
# if the number of events stored in the queue is < min_flush_events.
#flush.timeout: 1s

# The spool queue will store events in a local spool file, before
# forwarding the events to the outputs.
#
# Beta: spooling to disk is currently a beta feature. Use with care.
#
# The spool file is a circular buffer, which blocks once the file/buffer is full.
# Events are put into a write buffer and flushed once the write buffer
# is full or the flush_timeout is triggered.
# Once ACKed by the output, events are removed immediately from the queue,
# making space for new events to be persisted.
#spool:
# The file namespace configures the file path and the file creation settings.
# Once the file exists, the `size`, `page_size` and `prealloc` settings
# will have no effect anymore.
#file:
# Location of spool file. The default value is ${path.data}/spool.dat.
#path: "${path.data}/spool.dat"

# Configure file permissions if the file is created. The default value is 0600.
#permissions: 0600

# File size hint. The spool blocks once this limit is reached. The default value is 100 MiB.
#size: 100MiB

# The file's page size. A file is split into multiple pages of the same size. The default value is 4KiB.
#page_size: 4KiB

# If prealloc is set, the required space for the file is reserved using
# truncate. The default value is true.
#prealloc: true

# Spool writer settings.
# Events are serialized into a write buffer. The write buffer is flushed if:
# - The buffer size limit (buffer_size) has been reached.
# - The configured limit of buffered events (flush.events) is reached.
# - The flush timeout is triggered.
#write:
# Sets the write buffer size.
#buffer_size: 1MiB

# Maximum duration after which events are flushed, if the write buffer
# is not full yet. The default value is 1s.
#flush.timeout: 1s

# Maximum number of buffered events. The write buffer is flushed once this
# limit is reached.
#flush.events: 16384

# Configure the on-disk event encoding. The encoding can be changed
# between restarts.
# Valid encodings are: json, ubjson, and cbor.
#codec: cbor
#read:
# Reader flush timeout, waiting for more events to become available, so
# as to fill a complete batch, as required by the outputs.
# If flush_timeout is 0, all available events are forwarded to the
# outputs immediately.
# The default value is 0s.
#flush.timeout: 0s

# Sets the maximum number of CPUs that can be executing simultaneously. The
# default is the number of logical CPUs available in the system.
#max_procs:
60 changes: 60 additions & 0 deletions libbeat/_meta/config.reference.yml
@@ -43,6 +43,66 @@
# if the number of events stored in the queue is < min_flush_events.
#flush.timeout: 1s

# The spool queue will store events in a local spool file, before
# forwarding the events to the outputs.
#
# Beta: spooling to disk is currently a beta feature. Use with care.
#
# The spool file is a circular buffer, which blocks once the file/buffer is full.
# Events are put into a write buffer and flushed once the write buffer
# is full or the flush_timeout is triggered.
# Once ACKed by the output, events are removed immediately from the queue,
# making space for new events to be persisted.
#spool:
Member: We should mark this as beta at first: in the code as a log message, here in the config, and in the docs.

Author: Done.

# The file namespace configures the file path and the file creation settings.
# Once the file exists, the `size`, `page_size` and `prealloc` settings
# will have no effect anymore.
#file:
# Location of spool file. The default value is ${path.data}/spool.dat.
#path: "${path.data}/spool.dat"

# Configure file permissions if the file is created. The default value is 0600.
#permissions: 0600

# File size hint. The spool blocks once this limit is reached. The default value is 100 MiB.
#size: 100MiB

# The file's page size. A file is split into multiple pages of the same size. The default value is 4KiB.
#page_size: 4KiB

# If prealloc is set, the required space for the file is reserved using
# truncate. The default value is true.
#prealloc: true

# Spool writer settings.
# Events are serialized into a write buffer. The write buffer is flushed if:
# - The buffer size limit (buffer_size) has been reached.
# - The configured limit of buffered events (flush.events) is reached.
# - The flush timeout is triggered.
#write:
# Sets the write buffer size.
#buffer_size: 1MiB

# Maximum duration after which events are flushed, if the write buffer
# is not full yet. The default value is 1s.
#flush.timeout: 1s

# Maximum number of buffered events. The write buffer is flushed once this
# limit is reached.
#flush.events: 16384

# Configure the on-disk event encoding. The encoding can be changed
# between restarts.
# Valid encodings are: json, ubjson, and cbor.
#codec: cbor
#read:
# Reader flush timeout, waiting for more events to become available, so
# as to fill a complete batch, as required by the outputs.
# If flush_timeout is 0, all available events are forwarded to the
# outputs immediately.
# The default value is 0s.
#flush.timeout: 0s

# Sets the maximum number of CPUs that can be executing simultaneously. The
# default is the number of logical CPUs available in the system.
#max_procs:
4 changes: 4 additions & 0 deletions libbeat/publisher/includes/includes.go
@@ -1,6 +1,10 @@
package includes

import (
// import queue types
_ "github.com/elastic/beats/libbeat/publisher/queue/memqueue"
_ "github.com/elastic/beats/libbeat/publisher/queue/spool"

// load supported output plugins
_ "github.com/elastic/beats/libbeat/outputs/console"
_ "github.com/elastic/beats/libbeat/outputs/elasticsearch"
Expand Down