Create large-logs-dataset challenge #634

Closed

Conversation

kkrik-es
Contributor

@kkrik-es kkrik-es commented Aug 2, 2024

Cloned from @salvatore-campagna's #632.

Introduce a new large-logs-dataset challenge to the elastic/logs track that duplicates the indexed data by restoring a snapshot multiple times. The number of snapshot restore operations is controlled by the snapshot_restore_counts variable, which defaults to 100.
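
As a rough illustration of the mechanism (not the actual track operations), the sketch below restores the same snapshot repeatedly, renaming the restored indices so each restore adds another copy of the data. The repository, snapshot, and index names are assumptions for the example only:

```python
# Minimal sketch of the idea behind the challenge: restore one snapshot
# several times under different index names so the cluster ends up with
# snapshot_restore_counts duplicated copies of the original data.
# Repository, snapshot, and index patterns are illustrative assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

snapshot_restore_counts = 100  # challenge default

for i in range(snapshot_restore_counts):
    es.snapshot.restore(
        repository="logs-repository",   # assumed repository name
        snapshot="logs-snapshot",       # assumed snapshot name
        body={
            "indices": "logs-*",
            "rename_pattern": "(.+)",
            "rename_replacement": f"restored-{i}-$1",
        },
        wait_for_completion=True,
    )
```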

This results in indexing raw_data_volume_per_day bytes multiplied by snapshot_restore_counts. For example, if raw_data_volume_per_day is 50 GB, the index will hold about 5 TB of raw data. Note, however, that the index will contain duplicated data.
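
A back-of-the-envelope check of the numbers above (variable names mirror the track parameters; the GB units are added here for illustration):

```python
# 50 GB of raw data per day, duplicated by 100 snapshot restores,
# yields roughly 5 TB of raw data in the index (ignoring compression).
raw_data_volume_per_day_gb = 50
snapshot_restore_counts = 100

total_raw_data_gb = raw_data_volume_per_day_gb * snapshot_restore_counts
print(total_raw_data_gb, "GB ~=", total_raw_data_gb / 1000, "TB")  # 5000 GB ~= 5.0 TB
```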

This is meant as a fast way to increase the amount of data in an index, skipping the expensive data generation and indexing process.

Resolves #631

@kkrik-es kkrik-es self-assigned this Aug 2, 2024
@kkrik-es kkrik-es closed this Oct 22, 2024
@kkrik-es kkrik-es deleted the new-large-logs-dataset-challenge branch October 22, 2024 14:18