Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Op Event Changes #129

Closed
wants to merge 1 commit into from
Closed

Conversation

roark
Copy link

@roark roark commented Oct 8, 2016

Standardizes events so that they always work the same way, no exceptions. Changed op event to after op. Added before component and after component events.

Currently there are 2 events surrounding the application of an operation. They are:

  • before op - Emitted before op is applied
  • op - Emitted after op is applied

These behave in a standard way with one exception, when applying remote ops of json0 type. In this case the op is manually shattered, applied incrementally and the events are emitted before and after each "component op" has been applied.

the comments explain:

Ideally, we would not need this extra complexity. However, it is helpful for implementing bindings…

Indeed it can be very helpful to have events triggered on the application of each component. It may also be helpful on a different OT types that support it. Or even on local ops as well as remote ops, depending on the application.

That said, it also remains useful to have events that treat the op more atomically… before any portion of an op has been applied and after an op has been fully applied.

Unfortunately in its current implementation it’s either one of the other (atomic or incremental) and it’s not up to the user to decide. It’s only based on a hardcoded data type’s remote ops.

This creates several interesting issues, an important one in particular is that there is no way to run code only after an entire json0 remote op has been applied. Which is relevant, if your app operations are meant to be treated as ‘atomic’. Such that they change the doc in a way that only makes sense after the op has been applied in its entirety.

As an example: One might want to render after an entire op has been applied. However, when applying json0 remote ops they are split up and the op event would be run multiple times. The doc.data may only make sense to render once the entire op has been applied, but you aren’t sure which op event is the last one… Another cases could be needing to transform a complex cursor by an full operation.

Generally the issues are:

  1. No way to process json0 remote ops atomically.
  2. No way to process other OT types incrementally.
  3. No way to process local ops incrementally.
  4. No way to decide if ops should be applied atomically or incrementally.
  5. Events change meaning depending on type and source.

What would be nice is if we could listen to both op level and component level events. Have operations applied atomically or shattered if the type supports it. For both remote and local ops, if desired.

I'd like to propose that rather than changing the meaning of the before op and op events based on OT type and source. The two events should more appropriately be four separate events:

  • before op - emit before any portion of the op is applied.
  • before component - emit before each component op is applied.
  • after component - emit after each component op has been applied.
  • after op - emit after the entire op has been applied.

And whether ops are applied atomically or incrementally should be based upon…

  // if OT type implements .shatter() 
  doc.type.shatter !== undefined;

  // and upon two properties of the doc object
  doc.applyLocalOpsIncremental = false;
  doc.applyRemoteOpsIncremental = true;

Benefits of this design:

  • Allows any OT type to define .shatter() and emit granular events.
  • Enables after op to always emit entire ops, including remote json0 ops.
  • Enables before op to always emit entire ops, including remote json0 ops.
  • Enables ability to decide whether ops should be applied atomically or incrementally.
  • Events always work the same way, no confusion.

Design decisions

  • before component and after component will always fire. A single component op or an OT type that doesn’t support shatter, will emit the entire op. I did this to make transitioning op events to after component always work. However, I think the case could be made for not emitting these events if apply incrementally is disabled or shatter undefined.
  • applyLocalOpsIncremental defaults to false and applyRemoteOpsIncremental defaults to true. This was only chosen to remain consistent with ShareDBs current implementation. I think ultimately it would make more sense for both to either be true… or possibly false, but most importantly consistent.

I created this pull request to show the implementation of the ideas above. Please check it out and let me know your thoughts.

I realized that by recommending an update to the event system, it means developers implementing ShareDB would need to update their code. So I’d like to hear fellow developer’s thoughts and input. These changes would require developers to implement the following updates:

on(‘begin op’)    // to… either on(‘begin op’) or on(‘begin component’)
on(‘op’)          // to… either on(‘after op’) or on(‘after component`)

In most cases, begin op would stay the same and op would change to after op. Only when granular events are specifically needed would a developer change events to use before component or after component.

Thanks for everyone’s work on this awesome library, especially @nateps for continually improving it. It really is a hidden gem in the code universe.

Standardized events so that they always work the same way, no exceptions. Changed `op` event to `after op`. Added `before component` and `after component` events.
@coveralls
Copy link

coveralls commented Oct 8, 2016

Coverage Status

Coverage increased (+0.005%) to 96.111% when pulling cdefa91 on roark:standardize-op-events into bb11214 on share:master.

@@ -522,12 +525,15 @@ Doc.prototype._otApply = function(op, source) {
// that the snapshot only include updates from the particular op component
// at the time of emission. Eliminating this would require rethinking how
// such external bindings are implemented.
if (!source && this.type === types.defaultType && op.op.length > 1) {
if ( (this.type.shatter && op.op.length > 1) &&
Copy link
Author

@roark roark Oct 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering removing op.op.length > 1 check...

My reasons would be:

  1. Can operations for custom OT types be other objects than arrays? If so this length check would cause an error.
  2. Maybe an OT type operation has a length of one but due to its structure it can actually be shattered into smaller ops.

Removing this check would call op.shatter on ops of length 1 or 0. I don't believe this should be an issue since the shatter method would be extremely fast in these cases and OT types should support these cases. Thoughts?

@nateps
Copy link
Contributor

nateps commented Oct 11, 2016

Some good suggestions here

@roark
Copy link
Author

roark commented Oct 15, 2016

Thanks @nateps. Look forward to your input once you've been able to really mull everything over.

I had a couple of relevant ideas come to mind and wanted to put them into the ether.

  1. In order to make any transition smoother, we could continue to emit the op event as an alias to after op, along with a depreciation message. This would cover the majority of cases, except for those special cases which require updates of op to after component or before op to before component.

  2. Another thought... Instead of having the applyLocalOpsIncremental and applyLocalOpsIncremental flags, we could apply the op incrementally iff there is a listener on the doc for before component or after component.

    This would mean that if a user was listening to a component event, then the doc would apply both local and remote ops incrementally. Otherwise it would apply them atomically.

    This idea could keep things simpler, however it would mean the apply behavior changes based on subscription to an event. And it would also do away with the ability to apply remote and local ops differently.

@roark
Copy link
Author

roark commented Sep 14, 2017

Hi @nateps,

I know you keep quite busy, when you get a chance can you review this proposal and related pull request.

I'd like to know your thoughts on what changes may be required to merge these ideas.

Proposal TLDR Recap

  • Standardized application of ops for different ot-types (shatter / atomic)
  • Add additional of before component and after component events
  • Create after op -> op alias (consider op depreciation in favor of after op)

I think these additions, especially since some involve event names being integrated by end users, would be good to have in the official 1.0.0 release.

Thanks, keep up the great work.

@dlio
Copy link

dlio commented Feb 1, 2018

Any updates on this? Looks good to me.

@nateps
Copy link
Contributor

nateps commented Jul 24, 2018

Hey @roark!

Thanks for the well-thought out explanation and proposal! You've clarified many ways in which these event interfaces could be improved.

I do like the general direction of the proposal, and I'd like to implement this in a non-breaking way (all new events and options) in the current version of ShareDB along with deprecation of the prior inconsistent event names. In a future major version, we'd remove the deprecated events.

@ericyhwang and I have recently begun dedicating more time to ShareDB, and we're first prioritizing getting existing PRs merged in.

What are your thoughts around implementing a deprecation strategy for these changes?

Another question, should we have a way of specifying defaults? Perhaps you could configure a default per type and/or global default on the Connection?

@gkubisa
Copy link
Contributor

gkubisa commented Jul 24, 2018

I really like this proposal. 👍

As we're going to change the events, perhaps it's a good opportunity to make some more small improvements too:

  1. Pass in an Event object to the event handler as the only parameter, instead of multiple positional parameters (eg op and source). We could then pass in more information to the handlers in the future easily without affecting backward compatibility.
  2. Make the events always async. Related to Inconsistent callback execution - sync vs async #206.

@curran
Copy link
Contributor

curran commented Apr 15, 2019

It looks like the changes are generally good, but possibly breaking, so should be behind a deprecation strategy / flag. If that were added to this PR, retaining exactly the old behavior if the flag is not enabled, perhaps it could be merged.

Is anyone really interested in these changes? The win does not seem that big to be honest. If not, I suggest to close without merge due to lack of interest.

@curran
Copy link
Contributor

curran commented Dec 21, 2019

Suggest to close without merging due to lack of activity.

@curran curran mentioned this pull request Oct 21, 2020
alecgibson pushed a commit that referenced this pull request Nov 4, 2020
This change adds two events:

 - `before op batch` - fired once before any set of ops is applied.
 - `op batch` - fired once after any set of ops is applied. This may
   follow multiple `op` events if the op had multiple `json0` components

This is a non-breaking change that should allow clients to process ops
in their entirety.

There has already been some discussion around this:

 - #129
 - #396

This is a much simpler approach than the existing pull request. Here we
try not to change existing behaviour, and only add new, non-breaking
events.

Motivation for such an event would include clients applying some form of
validation logic, which doesn't make sense if an op is shattered.

For example, consider a client that wants to ensure a field
`mustBePresent` is always populated:

```js
doc.on('op', () => {
  if (!doc.data.mustBePresent) throw new Error('invalid');
});

remoteDoc.submitOp([
  {p: ['mustBePresent', 0], sd: 'existing value'},
  {p: ['mustBePresent', 0], si: 'new value'},
]);
```

In the above example, the submitted op is clearly attempting to perform
a replacement. However, the receiving `doc` only receives this
replacement in parts, so it looks like the document reaches an invalid
state, when actually the submitted op is perfectly valid.

In this case we `throw`, but we could have also attempted to populate
with a default value, which could interfere with the desired value.

This change fixes the above issue, because now we can just listen for
the `op batch` event, and consider the document once all the components
of a given op have been applied.
@alecgibson alecgibson mentioned this pull request Nov 4, 2020
alecgibson pushed a commit that referenced this pull request Nov 11, 2020
This change adds two events:

 - `before op batch` - fired once before any set of ops is applied.
 - `op batch` - fired once after any set of ops is applied. This may
   follow multiple `op` events if the op had multiple `json0` components

This is a non-breaking change that should allow clients to process ops
in their entirety.

There has already been some discussion around this:

 - #129
 - #396

This is a much simpler approach than the existing pull request. Here we
try not to change existing behaviour, and only add new, non-breaking
events.

Motivation for such an event would include clients applying some form of
validation logic, which doesn't make sense if an op is shattered.

For example, consider a client that wants to ensure a field
`mustBePresent` is always populated:

```js
doc.on('op', () => {
  if (!doc.data.mustBePresent) throw new Error('invalid');
});

remoteDoc.submitOp([
  {p: ['mustBePresent', 0], sd: 'existing value'},
  {p: ['mustBePresent', 0], si: 'new value'},
]);
```

In the above example, the submitted op is clearly attempting to perform
a replacement. However, the receiving `doc` only receives this
replacement in parts, so it looks like the document reaches an invalid
state, when actually the submitted op is perfectly valid.

In this case we `throw`, but we could have also attempted to populate
with a default value, which could interfere with the desired value.

This change fixes the above issue, because now we can just listen for
the `op batch` event, and consider the document once all the components
of a given op have been applied.
@alecgibson
Copy link
Collaborator

Closing this, as we've merged #401

The above Pull Request adds support for before op batch and op batch events as a non-breaking change (where this PR renamed the existing before op and op events.

Features that the above PR does not add, which have been proposed here (and could be proposed in future PRs):

  • Flags for toggling incremental op updates (we discussed this, and saw no harm in leaving them on all the time, and you just don't subscribe to the events if you don't care)
  • Support for any shatterable type, as opposed to just json0

@alecgibson alecgibson closed this Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants