This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Updates to the Room DAG concepts development document #12179

Merged 2 commits on Mar 10, 2022
1 change: 1 addition & 0 deletions changelog.d/12179.doc
@@ -0,0 +1 @@
Updates to the Room DAG concepts development document.
71 changes: 53 additions & 18 deletions docs/development/room-dag-concepts.md
@@ -30,37 +30,72 @@ rather than skipping any that arrived late; whereas if you're looking at a
historical section of timeline (i.e. `/messages`), you want to see the best
representation of the state of the room as others were seeing it at the time.

## Outliers

## Forward extremity
We mark an event as an `outlier` when we haven't figured out the state for the
room at that point in the DAG yet. They are "floating" events that we haven't
yet correlated to the DAG.

Most-recent-in-time events in the DAG which are not referenced by any other events' `prev_events` yet.
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we just grab the events in the state/auth chain,
without calculating the state at those events, or backfilling their
`prev_events`.
Comment on lines +39 to +42
Contributor

This sentence makes less practical sense to me than the previous one:

For example, when we fetch the event auth chain or state for a given event, we
mark all of those claimed auth events as outliers because we haven't done the
state calculation ourself.

Perhaps:

Suggested change
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we just grab the events in the state/auth chain,
without calculating the state at those events, or backfilling their
`prev_events`.
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we mark all of those claimed auth events as
outliers because we haven't done the state calculation ourself, or backfilled
their `prev_events`.

Member Author

I don't really understand what you're trying to get at with "we haven't done the state calculation ourself". There are other times where we don't calculate the state at an event and yet those events aren't outliers (eg: the join event when joining a room over federation).

Also: the fact that we haven't backfilled prev_events doesn't, in itself, make it an outlier.

Contributor @MadLittleMods, Mar 28, 2022

I think the big differentiator phrase to me is "mark all of those claimed auth events as outliers". Without something like that, it's not clear to me what we do after "we just grab the events in the state/auth chain".

The other phrasing is just notes on what Synapse is doing. By "ourself", I mean the local homeserver doing the work that we trust as a user on that server. Vs other servers who already calculated the auth_events on the outlier since it has auth_events and is available over federation but we don't know if that's absolutely correct.

Member Author

well, I don't disagree that the wording could be clarified, but I'm still struggling to make improvements without saying things that are actually wrong. For example,

mark all of those claimed auth events as outliers

isn't really true: we only do so for (claimed) auth events that we didn't already have.

How about just adding a sentence:

Suggested change
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we just grab the events in the state/auth chain,
without calculating the state at those events, or backfilling their
`prev_events`.
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we just grab the events in the state/auth chain,
without calculating the state at those events, or backfilling their
`prev_events`. Since we don't have the state at any events fetched in that
way, we mark them as outliers.

Contributor @MadLittleMods, Mar 29, 2022

That's better 👍

For some reason "claimed" clarifies the point that they can't be trusted for me which would be nice to include.

mark all of those claimed auth events as outliers

isn't really true: we only do so for (claimed) auth events that we didn't already have.

I feel like your suggestion also doesn't clarify this point but I feel like it doesn't matter whether this is clarified anyway. It's a separate fact that we don't re-outlier a persisted normal event. And the paragraph below slightly touches a bit on this point in a different way.

I'm not coming up with anything better than the following, but your suggestion is also a good improvement.

Suggested change
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we just grab the events in the state/auth chain,
without calculating the state at those events, or backfilling their
`prev_events`.
Outliers typically arise when we fetch the auth chain or state for a given
event. When that happens, we mark all of those claimed auth events that we
don't already have as outliers because we haven't done the state calculation
ourself, or backfilled their `prev_events`.

Contributor

Updated in #12345 with your suggestion 👍


The forward extremities of a room are used as the `prev_events` when the next event is sent.
So, typically, we won't have the `prev_events` of an `outlier` in the database,
(though it's entirely possible that we *might* have them for some other
reason). Other things that make outliers different from regular events:
Comment on lines +45 to +46
Member

Not saying we need to include them all here, but could you give a couple of examples for my own understanding? I imagine it is something like we have one, but we don't know the connection between the outlier and the previous events for some reason (maybe we previously left a room and were re-invited or something weird)?

Member Author

a couple of examples:

  • When you first join a room, you pull in the current state events in that room as outliers, which will normally include the first few events (m.room.create, m.room.member, m.room.power_levels, etc). Obviously, the m.room.create is the prev_event of the first m.room.member. So we have both in the database.
  • Suppose we received a chunk of timeline, and then went offline for a while. The first thing that happens to happen when we go offline is a state event (call it S), and then there's a load of other activity. Later, we come back online. We don't fill the entire gap in the DAG, but we do end up pulling in any state that changed - including S - as outliers. So S is an outlier, but it just so happens that we have its prev_events in the database.

I can stick this in the body of the doc if it would be helpful.

Member

Thanks! I don't know if we really need the examples, as you mentioned in chat it might be taken as a list of the only ways it happens.

richvdh marked this conversation as resolved.

* We don't have state for them, so there should be no entry in
`event_to_state_groups` for an outlier. (In practice this isn't always
the case, though I'm not sure why: see https:/matrix-org/synapse/issues/12201).

## Backward extremity
* We don't record entries for them in the `event_edges`,
`event_forward_extremities` or `event_backward_extremities` tables.

The current marker of where we have backfilled up to and will generally be the
`prev_events` of the oldest-in-time events we have in the DAG. This gives a starting point when
backfilling history.
Since outliers are not tied into the DAG, they do not normally form part of the
timeline sent down to clients via `/sync` or `/messages`; however there is an
exception:

When we persist a non-outlier event, we clear it as a backward extremity and set
all of its `prev_events` as the new backward extremities if they aren't already
persisted in the `events` table.
### Out-of-band membership events

A special case of outlier events are some membership events for federated rooms
that we aren't full members of. For example:

## Outliers
* invites received over federation, before we join the room
Member @clokep, Mar 9, 2022

If we have local users Alice and Bob; remote user Charlie. Charlie and Bob share a room. Charlie invites Alice to join them.

In this case, do we still end up with an outlier / out-of-band membership event? The homeserver will have the full auth chain, etc. from Bob so I don't think so?

(I think the below cases are similar.)

Member Author

Good question!

I think the invite still follows the same codepath, even though we share a room. So yeah, it probably still ends up stored as an out-of-band-membership event. I don't really know, without trying it out.

Member

Thanks for the answer! I suspect it isn't important to figure out (it should do the right thing?), but figured I'd ask!

richvdh marked this conversation as resolved.
* *rejections* for said invites
* knock events for rooms that we would like to join but have not yet joined.

We mark an event as an `outlier` when we haven't figured out the state for the
room at that point in the DAG yet.
In all the above cases, we don't have the state for the room, which is why they
are treated as outliers. They are a bit special though, in that they are
proactively sent to clients via `/sync`.
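
As a rough illustration of the rule described above (a sketch only, not Synapse's implementation): `StoredEvent`, its fields and `may_be_sent_to_clients` are hypothetical names chosen for this example, and the check deliberately ignores everything else that affects visibility, such as rejection or soft-failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredEvent:
    """Hypothetical, simplified event record; field names are illustrative."""

    event_id: str
    outlier: bool = False
    out_of_band_membership: bool = False


def may_be_sent_to_clients(event: StoredEvent) -> bool:
    """Apply only the outlier rule: outliers are not part of the timeline,
    except for out-of-band membership events."""
    if not event.outlier:
        # Regular events are tied into the DAG, so this rule does not exclude them.
        return True
    # Outliers are normally withheld; out-of-band memberships are the exception.
    return event.out_of_band_membership


message = StoredEvent("$message")
fetched_state = StoredEvent("$state", outlier=True)
invite = StoredEvent("$invite", outlier=True, out_of_band_membership=True)

for ev in (message, fetched_state, invite):
    print(ev.event_id, may_be_sent_to_clients(ev))
```

Under these assumptions, only the state event fetched as a plain outlier is withheld.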

We won't *necessarily* have the `prev_events` of an `outlier` in the database,
but it's entirely possible that we *might*.
## Forward extremity

Most-recent-in-time events in the DAG which are not referenced by any other
events' `prev_events` yet. (In this definition, outliers, rejected events, and
soft-failed events don't count.)

The forward extremities of a room (or at least, a subset of them, if there are
more than ten) are used as the `prev_events` when the next event is sent.

The "current state" of a room (ie: the state which would be used if we
generated a new event) is, therefore, the resolution of the room states
at each of the forward extremities.
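
The definition of forward extremities can be sketched on a toy in-memory DAG. This is a minimal illustration, not how Synapse computes them: the `Event` class and `forward_extremities` function are hypothetical, and it assumes one reading of the definition, namely that outliers, rejected events and soft-failed events are ignored both as candidates and as referencing events. It does not model the ten-event cap or state resolution.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event; not Synapse's event class."""

    event_id: str
    prev_events: Tuple[str, ...] = ()
    outlier: bool = False
    rejected: bool = False
    soft_failed: bool = False


def forward_extremities(events: Iterable[Event]) -> Set[str]:
    """Return the IDs of countable events that no countable event references."""
    # Assumption: outliers, rejected and soft-failed events "don't count" at all.
    countable = [
        e for e in events if not (e.outlier or e.rejected or e.soft_failed)
    ]
    referenced = {prev for e in countable for prev in e.prev_events}
    return {e.event_id for e in countable} - referenced


events = [
    Event("$A"),
    Event("$B", prev_events=("$A",)),
    Event("$C", prev_events=("$A",)),
    # Soft-failed, so it neither becomes an extremity nor "uses up" $B.
    Event("$D", prev_events=("$B",), soft_failed=True),
]
print(sorted(forward_extremities(events)))  # ['$B', '$C']
```

In this toy room, the next event sent would list `$B` and `$C` as its `prev_events`.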

## Backward extremity

The current marker of where we have backfilled up to and will generally be the
`prev_events` of the oldest-in-time events we have in the DAG. This gives a starting point when
backfilling history.

For example, when we fetch the event auth chain or state for a given event, we
mark all of those claimed auth events as outliers because we haven't done the
state calculation ourself.
Note that, unlike forward extremities, we typically don't have any backward
extremity events themselves in the database - or, if we do, they will be "outliers" (see
above). Either way, we don't expect to have the room state at a backward extremity.

When we persist a non-outlier event, if it was previously a backward extremity,
we clear it as a backward extremity and set all of its `prev_events` as the new
backward extremities if they aren't already persisted as non-outliers. This
therefore keeps the backward extremities up-to-date.
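
The bookkeeping described above can be sketched with two in-memory sets. This is an illustration under stated assumptions rather than Synapse's storage code: `persist_non_outlier` and both sets are hypothetical stand-ins for the relevant database tables, and the sketch records missing `prev_events` for every newly persisted non-outlier, whether or not that event was itself a backward extremity.

```python
from typing import Iterable, Set


def persist_non_outlier(
    event_id: str,
    prev_events: Iterable[str],
    persisted_non_outliers: Set[str],
    backward_extremities: Set[str],
) -> None:
    """Update both sets in place after persisting a non-outlier event."""
    persisted_non_outliers.add(event_id)
    # We now have the event properly, so it no longer marks a gap.
    backward_extremities.discard(event_id)
    for prev in prev_events:
        # Anything we reference but don't hold as a non-outlier becomes
        # the new edge of our known history.
        if prev not in persisted_non_outliers:
            backward_extremities.add(prev)


# We hold $C, whose prev_event $B is missing, so $B is a backward extremity.
persisted = {"$C"}
backward = {"$B"}

# Backfilling $B (whose prev_event is $A) moves the marker back to $A.
persist_non_outlier("$B", ["$A"], persisted, backward)
print(sorted(backward))  # ['$A']
```

Each backfilled event pushes the marker one step further back in the DAG, which is what keeps the backward extremities up to date.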

## State groups
