[RAC] Solidify schema (UUIDs per document) #112541

Closed
jasonrhodes opened this issue Sep 20, 2021 · 12 comments
Labels: refined, Team:Infra Monitoring UI - DEPRECATED, Theme: rac, v7.16.0

Comments

@jasonrhodes
Member

jasonrhodes commented Sep 20, 2021

Document Relationship Map

AC:

  • Fill out the (?) values in the following field and log a ticket or fix the actual mappings so that they represent this relationship mapping

Field source spreadsheet

Need to figure these out now

| Entity | Primary Key(s) | Foreign Keys? |
| --- | --- | --- |
| Rule Type | ID used when calling alerting.registerType. Note that rule types are registered when Kibana loads and are not persisted. | producer: app that registers the rule |
| Rule | Saved Object ID (1) | rule_type_id: string used to register the rule type; consumer: plugin that registered the rule; Alert type (2) |
| Execution | Not persisted. It exists as a foreign key in the alert, in kibana.alert.rule.execution.uuid | |
| Alert | _id and kibana.alert.uuid (same value, but two different fields); kibana.alert.instance.id, used to identify a pre-existing alert with the executor | kibana.alert.rule.rule_type_id: string used to register the rule type; kibana.alert.rule.uuid: rule instance (1); kibana.alert.rule.producer: the rule type producer; kibana.alert.rule.consumer: the rule consumer; kibana.alert.group.id or kibana.alert.rule.execution.uuid (3) (see #110135) |
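
As a rough illustration of the Alert row, the keys could be written down as a TypeScript type. This is only a sketch: the type name `AlertDocumentKeys` and the choice of which fields are optional are assumptions made for illustration, not the actual mappings.

```typescript
// Sketch of the key fields listed for the Alert entity.
// The type name and the optional markers are assumptions, not the real mapping.
interface AlertDocumentKeys {
  // Primary keys: two different fields holding the same value.
  _id: string;
  "kibana.alert.uuid": string;
  // Used to identify a pre-existing alert with the executor.
  "kibana.alert.instance.id": string;
  // Foreign keys.
  "kibana.alert.rule.rule_type_id": string; // string used to register the rule type
  "kibana.alert.rule.uuid": string; // rule instance, see footnote (1)
  "kibana.alert.rule.producer": string;
  "kibana.alert.rule.consumer": string;
  // Footnote (3): proposed but not materialized yet, so optional in this sketch.
  "kibana.alert.group.id"?: string;
  "kibana.alert.rule.execution.uuid"?: string;
}

// Made-up sample values, only to show the shape.
const exampleAlertKeys: AlertDocumentKeys = {
  _id: "alert-uuid-1",
  "kibana.alert.uuid": "alert-uuid-1",
  "kibana.alert.instance.id": "host-1",
  "kibana.alert.rule.rule_type_id": "example.rule.type",
  "kibana.alert.rule.uuid": "rule-so-id-1",
  "kibana.alert.rule.producer": "example-producer",
  "kibana.alert.rule.consumer": "example-consumer",
};
```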

We just need to be aware of these but we can figure them out for certain later

| Document | Primary Key(s) | Foreign Keys? |
| --- | --- | --- |
| Evaluation | ? | (Alert ID, Rule ID) |
| Log Event (?) | ? | ? |

footnotes

(1) Saved object IDs used as foreign keys might be affected by the planned ID changes

(2) Security needs some other way to identify an "alert type" that is not unique

(3) This field is proposed but hasn't materialized yet

@botelastic botelastic bot added the needs-team Issues missing a team label label Sep 20, 2021
@jasonrhodes jasonrhodes added Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services Theme: rac label obsolete labels Oct 5, 2021
@elasticmachine
Contributor

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Oct 5, 2021
@jasonrhodes jasonrhodes added needs-team Issues missing a team label v7.16.0 labels Oct 5, 2021
@botelastic botelastic bot removed the needs-team Issues missing a team label label Oct 5, 2021
@jasonrhodes jasonrhodes added the refined Issue refined, ready to work on label Oct 5, 2021
@jasonrhodes jasonrhodes changed the title [RAC] Solidify schema (with UUIDs, but only necessary fields for observability alerts table and cases) [RAC] Solidify schema (UUIDs per document) Oct 7, 2021
@gmmorris
Contributor

gmmorris commented Oct 12, 2021

I'm following the Alerts-as-data schema document when I say this, but my understanding is that we agreed on alerts using the following:

  1. kibana.alert.instance.id maps to the alert.id that is used in the framework. The framework uses this to group the alerts together, so if the rule reports the same alert.id across multiple successive executions we group these together into a single alert that is active for the span of these multiple executions. Once an execution omits that alert.id we consider this alert recovered and schedule the recovery actions. This id can repeat if the rule detects the same alert.id and the alert will go back to an active state.
  2. kibana.alert.uuid is a unique ID that alerts-as-data produces and should correspond with a unique span of executions during which the same alert.id was active (so in the example above you would end up with two uuids, one for each span). This doesn't exist at framework level yet and once alerts-as-data becomes framework level we'd expect this UUID to be available to actions so that customers can distinguish spans from one another and produce different groupings (for PagerDuty incidents, for example).

@afgomez
Contributor

afgomez commented Oct 14, 2021

This id can repeat if the rule detects the same alert.id and the alert will go back to an active state.

Does this mean we end up with two alert documents or that we modify the status of the original document back to active?

@weltenwort
Member

weltenwort commented Oct 15, 2021

Does this mean we end up with two alert documents or that we modify the status of the original document back to active?

I don't think so. In the alert executor's state we're tracking the mapping of kibana.alert.instance.id to kibana.alert.uuid. On evaluation the following happens:

  1. If an alert instance is created and the kibana.alert.instance.id is in the set of tracked alerts, we take the associated kibana.alert.uuid and use it to update the existing alert document. The ids remain in the set of tracked alerts. This means the alert stays active.
  2. If an alert instance is created and the kibana.alert.instance.id is not in the set of tracked alerts, we generate a new kibana.alert.uuid, index a new alert document and add the mapping of kibana.alert.instance.id to kibana.alert.uuid to the set of tracked alerts. This means a new alert has become active.
  3. If in the end there are kibana.alert.instance.ids in the set of tracked alerts, we take the associated kibana.alert.uuid and use it to update the existing alert document to become resolved. The ids are then removed from the set of tracked alerts, so subsequent occurrences of the kibana.alert.instance.id (e.g. the same hostname) will be considered to be new alerts and handled according to case 2 above.
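
The three cases above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the rule registry code: `evaluateOnce`, `AlertWrite`, and the injected `generateUuid` function are hypothetical names, and case 3 is read here as applying to tracked ids for which no alert instance was created in the current evaluation.

```typescript
// Executor state: kibana.alert.instance.id -> kibana.alert.uuid for active alerts.
type TrackedAlerts = Map<string, string>;

// Hypothetical description of one alert document write.
interface AlertWrite {
  uuid: string;
  instanceId: string;
  status: "active" | "recovered";
  isNew: boolean;
}

// Applies one evaluation to the tracked state (mutating it) and returns the writes.
function evaluateOnce(
  tracked: TrackedAlerts,
  reportedInstanceIds: string[],
  generateUuid: () => string
): AlertWrite[] {
  const writes: AlertWrite[] = [];
  const reported = new Set(reportedInstanceIds);

  reported.forEach((instanceId) => {
    const existingUuid = tracked.get(instanceId);
    if (existingUuid !== undefined) {
      // Case 1: already tracked, update the existing document; the alert stays active.
      writes.push({ uuid: existingUuid, instanceId, status: "active", isNew: false });
    } else {
      // Case 2: not tracked, generate a new uuid and index a new document.
      const uuid = generateUuid();
      tracked.set(instanceId, uuid);
      writes.push({ uuid, instanceId, status: "active", isNew: true });
    }
  });

  // Case 3 (assumed reading): tracked ids that were not reported this time are
  // marked recovered ("resolved" in the comment above) and removed from tracking.
  const trackedIds: string[] = [];
  tracked.forEach((_uuid, instanceId) => {
    trackedIds.push(instanceId);
  });
  for (const instanceId of trackedIds) {
    const uuid = tracked.get(instanceId);
    if (uuid !== undefined && !reported.has(instanceId)) {
      writes.push({ uuid, instanceId, status: "recovered", isNew: false });
      tracked.delete(instanceId);
    }
  }
  return writes;
}

// Usage: the same instance id ends up with two uuids, one per span of activity.
let counter = 0;
const nextUuid = (): string => `uuid-${++counter}`;
const trackedState: TrackedAlerts = new Map();
const first = evaluateOnce(trackedState, ["host-1"], nextUuid); // new alert
const second = evaluateOnce(trackedState, ["host-1"], nextUuid); // same alert, still active
const third = evaluateOnce(trackedState, [], nextUuid); // alert recovers
const fourth = evaluateOnce(trackedState, ["host-1"], nextUuid); // new alert, new uuid
```

Passing the uuid generator in as a parameter keeps the sketch deterministic; real code would presumably use a proper UUID library instead of a counter.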

@afgomez
Contributor

afgomez commented Oct 15, 2021

If I understand correctly we end up with two alert documents then, with a shared kibana.alert.instance.id.

In any case I think we can still consider it as a primary key of the alert document

@weltenwort
Member

If I understand correctly we end up with two alert documents then, with a shared kibana.alert.instance.id.

Yep.

In any case I think we can still consider it as a primary key of the alert document

Wouldn't "primary key" imply that it's unique?

@jasonrhodes
Member Author

On the "Alert" document, as a FK we have this:

kibana.alert.rule.uuid: rule instance(1)

We should specify exactly what type of ID we want to store here, whether it's a so-called "legacy" Saved Object ID or the newly generated type (or if this question even makes sense)?

@mikecote @gmmorris can you let us know about how the Rule saved objects are or will be affected by the saved object ID migration in 8.0?

@mikecote
Contributor

@mikecote @gmmorris can you let us know about how the Rule saved objects are or will be affected by the saved object ID migration in 8.0?

If you store the current rule SO ID regardless of the version and capture the release version somewhere alongside, you can build queries to filter data based on the legacy or new ID (example: (7.x && id:1) || (8.x && id:2)), as the framework knows the legacyId that was used in 7.x.

To do a lookup of the rule based on the alert document, you can use the SO resolve API unless there's a more robust way to do so (maybe rule lookup by legacyId & space? 🤔)
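
One way to express the `(7.x && id:1) || (8.x && id:2)` filter is an Elasticsearch bool query. The following is a sketch under assumptions: `buildAlertsForRuleQuery` is a hypothetical helper, the field names `kibana.version` and `kibana.alert.rule.uuid` are placeholders for wherever the release version and the rule SO ID end up being stored, and the range clauses assume the version field is mapped so that range comparisons are meaningful.

```typescript
interface RuleIds {
  id: string; // current saved object ID
  legacyId: string; // ID the rule had in 7.x
}

// Assumed field names; adjust to the fields actually used for version and rule ID.
const VERSION_FIELD = "kibana.version";
const RULE_ID_FIELD = "kibana.alert.rule.uuid";

// Builds a bool query matching alerts written for the given rule,
// using legacyId for pre-8.0 documents and id for 8.0+ documents.
function buildAlertsForRuleQuery(rule: RuleIds) {
  return {
    bool: {
      should: [
        {
          bool: {
            filter: [
              { range: { [VERSION_FIELD]: { lt: "8.0.0" } } },
              { term: { [RULE_ID_FIELD]: rule.legacyId } },
            ],
          },
        },
        {
          bool: {
            filter: [
              { range: { [VERSION_FIELD]: { gte: "8.0.0" } } },
              { term: { [RULE_ID_FIELD]: rule.id } },
            ],
          },
        },
      ],
      minimum_should_match: 1,
    },
  };
}

const alertsQuery = buildAlertsForRuleQuery({ id: "2", legacyId: "1" });
```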

@jasonrhodes
Member Author

jasonrhodes commented Oct 18, 2021

To do a lookup of the rule based on the alert document

Yes this is the use case I'm thinking about. So to clarify:

  1. Pre-8.0, the saved object ID we have access to is a "legacy SOID"
  2. Starting in 8.0, (all? | some?) saved objects will have a new form of SOID
  3. Rules maintain a reference to both IDs (id vs legacyId)
  4. Given a rule, you can query for its associated alerts similarly to this pseudocode:
     (alert.kibana.version < 8 && alert.kibana.alert.rule.id === rule.legacyId)
       || (alert.kibana.version >= 8 && alert.kibana.alert.rule.id === rule.id)
  5. Given an alert, you can query for its associated rule similarly to this pseudocode:
     if (alert.kibana.version < 8) {
       // Construct query with condition: rule.legacyId === alert.kibana.alert.rule.id
     } else {
       // Construct query with condition: rule.id === alert.kibana.alert.rule.id
     }

If the above is true, we should be ok to only store kibana.alert.rule.id and set it to whatever saved object ID is given to us for the rule at the time of indexing (rather than needing to store the legacy ID and pre-calculate the new ID stored separately).
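
A runnable version of the two lookups might look like the sketch below. It works on in-memory objects rather than building real queries, and the names `AlertRef`, `RuleRecord`, `alertBelongsToRule`, and `findRuleForAlert` are made up for illustration. It assumes the alert carries the Kibana version it was indexed with as a string such as "7.16.0".

```typescript
interface AlertRef {
  kibanaVersion: string; // e.g. "7.16.0", the version that indexed the alert
  ruleId: string; // value stored in kibana.alert.rule.id at indexing time
}

interface RuleRecord {
  id: string;
  legacyId: string;
}

// Extracts the major version; parseInt stops at the first ".".
function majorOf(version: string): number {
  const major = parseInt(version, 10);
  if (isNaN(major)) {
    throw new Error(`Unparseable version: ${version}`);
  }
  return major;
}

// Point 4: does this alert belong to the given rule?
function alertBelongsToRule(alertRef: AlertRef, rule: RuleRecord): boolean {
  return majorOf(alertRef.kibanaVersion) < 8
    ? alertRef.ruleId === rule.legacyId
    : alertRef.ruleId === rule.id;
}

// Point 5: find the rule an alert points at.
function findRuleForAlert(alertRef: AlertRef, rules: RuleRecord[]): RuleRecord | undefined {
  return rules.find((rule) => alertBelongsToRule(alertRef, rule));
}

// Sample data: a rule whose ID changed from "123" to "456".
const convertedRule: RuleRecord = { id: "456", legacyId: "123" };
const oldAlert: AlertRef = { kibanaVersion: "7.16.0", ruleId: "123" };
const newAlert: AlertRef = { kibanaVersion: "8.0.0", ruleId: "456" };
```

With this data, both `oldAlert` and `newAlert` resolve to `convertedRule`, while an 8.0 alert that stored "123" would not.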

@jportner would you mind glancing at this and letting me know if I'm making any sense?

@jportner
Contributor

  2. Starting in 8.0, (all? | some?) saved objects will have a new form of SOID

Here are the criteria for determining whether an existing object's ID will change in 8.0:

  1. The object type is converted to become "share-capable" (namespaceType: 'multiple-isolated'), and
  2. The object exists in a custom space (non-Default space)

The alert ("Rule") and action ("Connector") object types are both being converted in 8.0 👍
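
The two criteria can be written as a small predicate. This is a sketch only: `willObjectIdChange` and `ExistingObject` are hypothetical names, the set of converted types is limited to the two mentioned here, and the Default space is assumed to be identified by the string "default".

```typescript
interface ExistingObject {
  type: string; // saved object type, e.g. "alert" or "action"
  space: string; // space the object lives in; "default" is assumed to be the Default space
}

// Returns true when both criteria hold: the type is converted to
// share-capable AND the object lives in a custom (non-Default) space.
function willObjectIdChange(obj: ExistingObject, convertedTypes: ReadonlySet<string>): boolean {
  const typeIsConverted = convertedTypes.has(obj.type);
  const inCustomSpace = obj.space !== "default";
  return typeIsConverted && inCustomSpace;
}

// Only the two types named in this comment; the real list is longer.
const convertedIn8 = new Set<string>(["alert", "action"]);

const ruleInCustomSpace = willObjectIdChange({ type: "alert", space: "some-space" }, convertedIn8);
const ruleInDefaultSpace = willObjectIdChange({ type: "alert", space: "default" }, convertedIn8);
```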

  3. Rules maintain a reference to both IDs (id vs legacyId)

I'd prefer not to use the term "reference" here, it means something else when we are talking about saved objects.

But yes, Rule objects do maintain their legacy ID in their attributes (e.g., "type-level fields").

So, in 7.16, a Rule would have these fields:

{
  id: '123',
  attributes: {
    legacyId: '123'
  },
  namespace: 'some-space'
}

and after it is converted in 8.0, it would have these fields:

{
  id: '456',
  attributes: {
    legacyId: '123'
  },
  namespaces: ['some-space']
}

4. Given a rule, you can query for its associated alerts similarly to this pseudocode:

(alert.kibana.version < 8 && alert.kibana.alert.rule.id === rule.legacyId)
  || (alert.kibana.version >= 8 && alert.kibana.alert.rule.id === rule.id)

5. Given an alert, you can query for its associated rule similarly to this pseudocode:

if (alert.kibana.version < 8) {
  // Construct query with condition: rule.legacyId === alert.kibana.alert.rule.id
} else {
  // Construct query with condition: rule.id === alert.kibana.alert.rule.id
}

This point is tripping me up; I thought "Rule" is just a new name for the alert object type?

If the above is true, we should be ok to only store kibana.alert.rule.id and set it to whatever saved object ID is given to us for the rule at the time of indexing (rather than needing to store the legacy ID and pre-calculate the new ID stored separately).

I'd like to make sure I understand (4) and (5) above before attempting to answer this.

@jasonrhodes
Member Author

From @kobelb -- legacyIds can be the same for two different rules in two different spaces. Have we accounted for this?
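
To make the collision concrete, here is a small sketch with made-up data: two rules in different spaces share a legacyId, so a lookup by legacyId alone is ambiguous, while adding the space narrows it to one rule. `SpacedRule` and `findByLegacyId` are hypothetical names, and the sketch assumes the alert document would carry some space identifier to filter on.

```typescript
interface SpacedRule {
  id: string;
  legacyId: string;
  space: string;
}

// Looks up rules by legacyId, optionally narrowed to a single space.
function findByLegacyId(rules: SpacedRule[], legacyId: string, space?: string): SpacedRule[] {
  return rules.filter(
    (rule) => rule.legacyId === legacyId && (space === undefined || rule.space === space)
  );
}

// Two rules that had the same ID in 7.x because they lived in different spaces.
const sampleRules: SpacedRule[] = [
  { id: "456", legacyId: "123", space: "space-a" },
  { id: "789", legacyId: "123", space: "space-b" },
];

const ambiguousMatches = findByLegacyId(sampleRules, "123"); // both rules
const scopedMatches = findByLegacyId(sampleRules, "123", "space-a"); // only the space-a rule
```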

@jasonrhodes
Member Author

@afgomez thanks for all of your work on tracking this information down and documenting it.

Two conclusions from this work:

  1. We have marked the data generated by the ruleRegistry plugin as "experimental" and "subject to change", giving us the option to change the name of the backing indices and effectively "drop" all existing data at any release. That means that if we have old SO IDs causing us trouble, we'll drop the 7.16 data in 8.0.
  2. We need a full Schema RFC, which I will work with the Actionable Observability team and the RAC team to put together ASAP.


7 participants