Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dataset Quality] Add fix it flow for field limit #195561

Open
wants to merge 59 commits into
base: main
Choose a base branch
from

Conversation

achyutjhunjhunwala
Copy link
Contributor

@achyutjhunjhunwala achyutjhunjhunwala commented Oct 9, 2024

Summary

Closes - #190330

This PR implements the logic to support

  • One click increasing of Field Limit for Field Limit Issues (applicable on for Integrations). For Non Integrations, only text is displayed as to how they can do it.
  • The One click increase updates the linked custom component template as well as the last backing Index
  • If Last Backing Index update fails due to any reason, it provides user an option to trigger a Rollover manually.

Demo

Not possible, to many things to display 😆

What's Pending ?

Tests

  • API tests
    • Settings API
    • Rollover API
    • Apply New limit API
  • FTR tests
    • Displaying of various issues for integrations and non integrations
    • Fix it Flow Good case, without Rollover
    • Fix it Flow Good case, with Rollover
    • Manual Mitigation - Click on Component Template shold navigate to proper logic based on Integration / Non
    • Manual Mitigation - Ingest Pipeline
    • Link for official Documentation

How to setup a local environment

We will be setting up 2 different data streams, one with integration and one without. Please follow the steps in the exact order

  1. Start Local ES and Local Kibana
  2. Install Nginx Integration 1st
  3. Ingest data as per script here - https://gist.github.com/achyutjhunjhunwala/03ea29190c6594544f584d2f0efa71e5
  4. Set the Limit for the 2 datasets
PUT logs-synth.3-default/_settings
{
    "mapping.total_fields.limit": 36
}

// Set the limit for Nginx
PUT logs-nginx.access-default/_settings
{
    "mapping.total_fields.limit": 52
}
  1. Now uncomment line number 59 from the synthtrace script to enable cloud.project.id field and run the scenario again
  2. Do a Rollover
POST logs-synth.3-default/_rollover
POST logs-nginx.access-default/_rollover
  1. Get last backing index for both dataset
GET _data_stream/logs-synth.3-default/
GET _data_stream/logs-nginx.access-default
  1. Increase the Limit by 1 but for last backing index
PUT .ds-logs-synth.3-default-2024.10.10-000002/_settings
{
    "mapping.total_fields.limit": 37
}

PUT .ds-logs-nginx.access-default-2024.10.10-000002/_settings
{
    "mapping.total_fields.limit": 53
}
  1. Run the same Synthtrace scenario again.

This setup will give you 3 fields for testings

  1. cloud.availability_zone - Which will show the character limit isue
  2. cloud.project - Which will show an obsolete error which happened in the past and now does not exists due to field limit
  3. cloud.project.id - A current field limit issue

@achyutjhunjhunwala achyutjhunjhunwala marked this pull request as ready for review October 21, 2024 11:58
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

Comment on lines 35 to 32
const { body } = await supertestWithoutAuth
const { body } = await roleScopedSupertestWithCookieCredentials
.post(`/api/fleet/epm/custom_integrations`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it wasn't intentional: Fleet API is public (no internal in route path) and should be called with API key as it was before change.

I think you need to revert it and pass API key from test. You can get it similar to cookie header:

      supertestAdminWithApiKey = await roleScopedSupertest.getSupertestWithRoleScope(
        'admin',
        {
          withInternalHeaders: true,
        }
      );

Keep in mind to invalidate it in after hook

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point. The reason i did that was for all our tests, which are internal API, i was already generating roleScopedSupertestWithCookieCredentials, so i though to use the same for Fleet packages.

But your point is valid as fleet api is public, i must authenticate using API key rather than header. I will get both API key (for Fleet) and headers for my test

@botelastic botelastic bot added the ci:project-deploy-observability Create an Observability project label Oct 21, 2024
Copy link
Contributor

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

| {
pipeline: string;
}
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use the Management plugin locator and not the Index Management/Ingest Pipeline locators? I'm concerned that we hardcode the paths and if we change app/section id's in the future, this locator would not work properly. For Index Management, we recently added a locator that can be enhanced for templates: #195299. Ingest pipelines already has an existing locator that can be used for this use case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ElenaStoeva ,
thank you for pointing towards the Ingest Pipeline locator, that's really great.
For Component Template and Index Template, do you propose to extend the IndexManagementLocator instead of Management Locator ?

I don't this understand how this will break the existing Management Locator. The Management Locator will still break if app/section id's are changed in the future. I just tried adding more options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Component Template and Index Template, do you propose to extend the IndexManagementLocator instead of Management Locator ?

Yes, that's correct.

I don't this understand how this will break the existing Management Locator. The Management Locator will still break if app/section id's are changed in the future. I just tried adding more options.

These changes won't break the Management Locator but there's a risk that the hardcoded paths that you added (e.g. /data/index_management/component_templates/) might not be valid at some point if we decide to change the index management section or app ID. Therefore, getLocation would return an incorrect path. That's why it's safer if we use the Index Management/Ingest Pipelines locators as they take care of using the correct path components.

@achyutjhunjhunwala achyutjhunjhunwala added the backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) label Oct 21, 2024
@elasticmachine
Copy link
Contributor

elasticmachine commented Oct 22, 2024

💔 Build Failed

  • Buildkite Build
  • Commit: 5f3c371
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-195561-5f3c3710ca7f

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #66 / apis Dataset quality Update field limit "after all" hook for "should update last backing index and custom component template"
  • [job] [logs] FTR Configs #66 / apis Dataset quality Update field limit "after all" hook for "should update last backing index and custom component template"
  • [job] [logs] FTR Configs #66 / apis Dataset quality Update field limit "before all" hook for "should handles failure gracefully when invalid datastream provided "
  • [job] [logs] FTR Configs #66 / apis Dataset quality Update field limit "before all" hook for "should handles failure gracefully when invalid datastream provided "
  • [job] [logs] FTR Configs #90 / Serverless Observability - Deployment-agnostic api integration tests Dataset quality Update field limit "after all" hook for "should update last backing index and custom component template"
  • [job] [logs] FTR Configs #90 / Serverless Observability - Deployment-agnostic api integration tests Dataset quality Update field limit "after all" hook for "should update last backing index and custom component template"
  • [job] [logs] FTR Configs #90 / Serverless Observability - Deployment-agnostic api integration tests Dataset quality Update field limit "before all" hook for "should handles failure gracefully when invalid datastream provided "
  • [job] [logs] FTR Configs #90 / Serverless Observability - Deployment-agnostic api integration tests Dataset quality Update field limit "before all" hook for "should handles failure gracefully when invalid datastream provided "

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
datasetQuality 225 235 +10

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
datasetQuality 229.0KB 246.9KB +17.9KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
management 10.8KB 11.0KB +252.0B

History

cc @achyutjhunjhunwala

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) ci:project-deploy-observability Create an Observability project Feature:Dataset Health release_note:feature Makes this part of the condensed release notes Team:obs-ux-logs Observability Logs User Experience Team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants