zqd: Support pcap ingest for archive stores #1450

Merged
merged 10 commits into from
Oct 14, 2020

Conversation

mattnibs
Collaborator

@mattnibs mattnibs commented Oct 9, 2020

Because of differences in how an archive ingests data, implement a
new ingest.PcapOp (ingest.archivePcapOp) for archive-backed spaces.

In order to take advantage of the streaming nature of ingest, abandon
the brute-force snapshot approach taken with file stores. Instead,
temporary log files created by each pcap analyzer are tailed and
written to the archive store as new data arrives (see fs.TFile and
fs.DirWatcher). As a result, for pcap ingest on archive stores, new
data should constantly be arriving.

In order to continue pcap ingest support for soon-to-be deprecated
filestore spaces, keep around existing ingest.PcapOp (renamed to
ingest.legacyPcapOp).

closes #1132
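
For readers unfamiliar with the tailing approach described above, here is a minimal sketch of the idea, not the code in this PR. It assumes a hypothetical `tailReader` type (loosely modeled on what fs.TFile is described as doing) that polls at end of file rather than returning io.EOF, and that drains any remaining bytes once it is told to stop. The polling interval, file name, and record text are all made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"time"
)

// tailReader is a hypothetical, simplified stand-in for the idea behind
// fs.TFile: at end of file it waits for more data instead of returning io.EOF.
type tailReader struct {
	f       *os.File
	stop    chan struct{}
	stopped bool
	poll    time.Duration
}

func newTailReader(path string, poll time.Duration) (*tailReader, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	return &tailReader{f: f, stop: make(chan struct{}), poll: poll}, nil
}

// Read blocks at end of file until new bytes appear or Stop is called.
// After Stop it keeps reading until the file is truly exhausted, so data
// written just before Stop is not lost.
func (t *tailReader) Read(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	for {
		n, err := t.f.Read(p)
		if n > 0 {
			return n, nil
		}
		if err != nil && !errors.Is(err, io.EOF) {
			return 0, err
		}
		if t.stopped {
			return 0, io.EOF
		}
		select {
		case <-t.stop:
			t.stopped = true // loop once more to drain remaining data
		case <-time.After(t.poll):
		}
	}
}

// Stop signals that the writer is finished; call it at most once.
func (t *tailReader) Stop() { close(t.stop) }

// tailDemo tails a temporary log file while another goroutine appends to it,
// standing in for an analyzer process writing its log.
func tailDemo() (string, error) {
	dir, err := os.MkdirTemp("", "tail-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "conn.log")
	if err := os.WriteFile(path, nil, 0o644); err != nil {
		return "", err
	}
	tr, err := newTailReader(path, 5*time.Millisecond)
	if err != nil {
		return "", err
	}
	defer tr.f.Close()

	writeErr := make(chan error, 1)
	go func() {
		defer tr.Stop() // runs after the writer's file handle is closed
		w, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0o644)
		if err != nil {
			writeErr <- err
			return
		}
		defer w.Close()
		for i := 1; i <= 3; i++ {
			if _, err := fmt.Fprintf(w, "record %d\n", i); err != nil {
				writeErr <- err
				return
			}
			time.Sleep(20 * time.Millisecond)
		}
		writeErr <- nil
	}()

	data, err := io.ReadAll(tr)
	if err != nil {
		return "", err
	}
	if err := <-writeErr; err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	out, err := tailDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

The reader sees each record shortly after it is appended, which is the property that lets ingest write to the archive while the analyzer is still running.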

@mattnibs mattnibs force-pushed the archive-pcap-ingest branch 4 times, most recently from 5f8a9c2 to 661152b Compare October 9, 2020 16:13
@mattnibs mattnibs requested a review from a team October 9, 2020 16:14
@mattnibs
Collaborator Author

I'm noticing a race with fs.DirWatcher that appears to show up in Linux tests but not on Mac: DirWatcher.Stop() can be called before fsnotify has delivered the event for a created file. I'm going to add an additional poll call once Stop() is called to scan and emit any existing files that were not caught.
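
A rough sketch of the "scan once more on Stop" idea follows. It is an illustration under stated assumptions, not the fs.DirWatcher implementation: it uses a hypothetical polling-only `dirPoller` (no fsnotify) so that it stays self-contained, and the file names are invented. The polling interval is deliberately set to an hour so that only the final scan in Stop can discover the files, which mimics the case where the notification never arrived.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// dirPoller is a hypothetical polling-only directory watcher used to
// illustrate a final scan on Stop. It is not the fsnotify-based watcher.
type dirPoller struct {
	dir  string
	seen map[string]bool
	emit func(name string)
	stop chan struct{}
	done chan struct{}
}

func startDirPoller(dir string, interval time.Duration, emit func(string)) *dirPoller {
	d := &dirPoller{
		dir:  dir,
		seen: map[string]bool{},
		emit: emit,
		stop: make(chan struct{}),
		done: make(chan struct{}),
	}
	go d.run(interval)
	return d
}

// scan emits every directory entry that has not been emitted before.
func (d *dirPoller) scan() {
	entries, err := os.ReadDir(d.dir)
	if err != nil {
		return // sketch only: a real watcher would surface this error
	}
	for _, e := range entries {
		if !d.seen[e.Name()] {
			d.seen[e.Name()] = true
			d.emit(e.Name())
		}
	}
}

func (d *dirPoller) run(interval time.Duration) {
	defer close(d.done)
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			d.scan()
		case <-d.stop:
			// Final scan: pick up files created after the last poll (or,
			// in an event-driven watcher, files whose event has not arrived).
			d.scan()
			return
		}
	}
}

// Stop requests shutdown and waits for the final scan to complete.
func (d *dirPoller) Stop() {
	close(d.stop)
	<-d.done
}

// finalScanDemo creates files and stops the poller before any periodic
// poll has fired, then reports which files were emitted.
func finalScanDemo() (string, error) {
	dir, err := os.MkdirTemp("", "watch-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	var names []string
	w := startDirPoller(dir, time.Hour, func(name string) {
		names = append(names, name)
	})
	for _, name := range []string{"conn.log", "dns.log"} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
			w.Stop()
			return "", err
		}
	}
	w.Stop() // names is safe to read once Stop has returned
	return strings.Join(names, ","), nil
}

func main() {
	out, err := finalScanDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Because Stop waits for the final scan before returning, a caller that stops the watcher right after the analyzer exits still learns about every file that exists at that point.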

Contributor

@alfred-landrum alfred-landrum left a comment

Before this gets merged, I'd like to try this out with Brim, by temporarily making archive store the default.

return atomic.LoadInt64(&r.recordsRead)
}

// Snap for archivePcapOp is functionally useless. It is only here to satisfy
Contributor

When we can run Brim against this branch, I'm curious how this impacts the live updating. I thought Brim triggered updating its view during pcap by looking for new snapshots, but I haven't looked at that code in a while.

Collaborator Author

@mattnibs mattnibs Oct 13, 2020

As discussed offline, it is. Without an incrementing snapshot, Brim will not know new data is available to query. As discussed, this should really keep track of the record count (and bytes) written to the archive and return it in the status updates, and Brim should use this to establish whether new data is available. Will file an issue.
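
To make the suggestion above concrete, here is a small sketch of tracking records and bytes written and exposing them as a status value. All names (`countingWriter`, `ingestStatus`, `hasNewData`) are hypothetical and do not come from zqd or Brim; the only detail borrowed from the diff context is the use of atomic int64 counters.

```go
package main

import (
	"fmt"
	"io"
	"sync/atomic"
)

// ingestStatus is a hypothetical status payload a client could poll.
type ingestStatus struct {
	RecordsWritten int64
	BytesWritten   int64
}

// countingWriter wraps a destination writer and counts what reaches it.
// The int64 counters come first so they stay 64-bit aligned for atomics.
type countingWriter struct {
	records int64
	nbytes  int64
	w       io.Writer
}

// WriteRecord writes one record and updates the counters atomically, so
// Status can be called from another goroutine while ingest is running.
func (c *countingWriter) WriteRecord(rec []byte) error {
	n, err := c.w.Write(rec)
	atomic.AddInt64(&c.nbytes, int64(n))
	if err != nil {
		return err
	}
	atomic.AddInt64(&c.records, 1)
	return nil
}

// Status returns a snapshot of the counters.
func (c *countingWriter) Status() ingestStatus {
	return ingestStatus{
		RecordsWritten: atomic.LoadInt64(&c.records),
		BytesWritten:   atomic.LoadInt64(&c.nbytes),
	}
}

// hasNewData is how a client might decide to refresh its view: it compares
// the previous status it saw with the current one.
func hasNewData(prev, cur ingestStatus) bool {
	return cur.RecordsWritten > prev.RecordsWritten
}

// statusDemo writes two made-up records and returns the status before
// and after the writes.
func statusDemo() (ingestStatus, ingestStatus, error) {
	cw := &countingWriter{w: io.Discard}
	before := cw.Status()
	for _, rec := range []string{"conn record\n", "dns record\n"} {
		if err := cw.WriteRecord([]byte(rec)); err != nil {
			return before, cw.Status(), err
		}
	}
	return before, cw.Status(), nil
}

func main() {
	before, after, err := statusDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("records=%d bytes=%d new=%v\n",
		after.RecordsWritten, after.BytesWritten, hasNewData(before, after))
}
```

Comparing counters instead of snapshot numbers would let a client detect progress without the store having to produce snapshots at all, which appears to be the direction the comment is proposing.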

dr *logTailer
}

func TestLogTailer(t *testing.T) {
Contributor

Just want to register that I'm +1 on using testify/suite.

Collaborator Author

It is definitely very useful for certain circumstances.

mattnibs and others added 2 commits October 13, 2020 17:45
Co-authored-by: Noah Treuhaft
Co-authored-by: Noah Treuhaft
Contributor

@alfred-landrum alfred-landrum left a comment

I pulled this branch, merged to master, altered zqd to choose archive store by default, and then ran Brim against it. I then uploaded the 'all.pcap' file and watched the archive tsdirs get created during import (as noted, the live updates in Brim aren't working yet, but that will happen soon via #1488). I was able to verify that the resulting archive had the same count as doing the process with the filestore. So 👍 from me.

@mattnibs
Collaborator Author

Thanks for the thorough review @alfred-landrum

(as noted, the live updates in Brim aren't working yet, but that will happen soon via #1448 ).

You mean #1488, correct?

@alfred-landrum
Contributor

Thanks for the thorough review @alfred-landrum

(as noted, the live updates in Brim aren't working yet, but that will happen soon via #1448 ).

You mean #1488, correct?

oops, yes, fixed the comment!

@mattnibs mattnibs merged commit 5c0f1f5 into master Oct 14, 2020
@mattnibs mattnibs deleted the archive-pcap-ingest branch October 14, 2020 20:53
Successfully merging this pull request may close these issues.

zqd: Support archive space pcap ingest support