Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

webrtc: race condition during connection negotiation #585

Open
achingbrain opened this issue Oct 2, 2023 · 5 comments
Open

webrtc: race condition during connection negotiation #585

achingbrain opened this issue Oct 2, 2023 · 5 comments

Comments

@achingbrain
Copy link
Member

When exchanging ICE candidates, the connectionStatus of one side becomes connected before the other.

At that point it closes the signalling stream (at the moment - see #580)

If this happens before it's sent its final ICE candidate, the remote's end of the stream can close mid-read and the remote's RTCPeerConnection may still have the connectionStatus of connecting.

This is handled in js by waiting for the connectionStatus of the RTCPeerConnection to change within a time out when the stream closes, even mid-read but is not reset. It works but it feels a bit hacky.

It's necessary in JS because the receiving of ICE candidates and the changes of connectionStatus on the local RTCPeerConnection are async and decoupled so may occur in any order.

Do we care about this or would it be better to add an additional, final message that can be sent over the signalling stream when one side's RTCPeerConnection becomes connected like SDP_COMPLETE or similar?

This message would mean 'this end is "connected", no more ICE candidates will be sent, the stream can/will be closed, and you should wait for your end to become "connected" or to time out'.

cc @sukunrt @mxinden @marten-seemann

@sukunrt
Copy link
Member

sukunrt commented Oct 2, 2023

Since one peer believes the connection is state is Connected, the lagging peer would have received the Candidate, is it fine to just end the read loop and wait for peerConnectionState update when an EOF is received on the relayed stream?

@achingbrain
Copy link
Member Author

achingbrain commented Oct 2, 2023

is it fine to just end the read loop and wait for peerConnectionState update when an EOF is received on the relayed stream?

This is how it's currently implemented but the EOF leaves an error message in the logs. Adding a flag that says "i'm connected, you should be too shortly" lets us handle the scenario gracefully?

@sukunrt
Copy link
Member

sukunrt commented Oct 3, 2023

I'm a bit indifferent to the asked for change since in go-libp2p it's easy to check EOF and not log anything in such cases and this behaviour is not improved by adding a final message. However if there's no way of achieving this in js, we should definitely add this.

Is there a backwards compatible way of handling this? Given only js-libp2p has webrtc(private-to-private) what would its behaviour be on receiving a message with a new message type?

Should we also mention in the specs how implementations should handle unknown message types given that we might add some other message type in the future?

@achingbrain
Copy link
Member Author

achingbrain commented Oct 4, 2023

it's easy to check EOF and not log anything

The thing EOF is that it can mean a few different things and the code is left guessing what the outcome was and if it was a failure. Adding an explicit flag makes node behaviour easier to reason about in the future.

Is there a backwards compatible way of handling this? Given only js-libp2p has webrtc(private-to-private) what would its behaviour be on receiving a message with a new message type?

It would ignore the field as it wouldn't be in it's protobuf definition.

Should we also mention in the specs how implementations should handle unknown message types given that we might add some other message type in the future?

This is handled by the protobuf spec, essentially you should always add new field indexes for new types and not reuse old ones.

@mxinden
Copy link
Member

mxinden commented Oct 4, 2023

it's easy to check EOF and not log anything

The thing EOF is that it can mean a few different things and the code is left guessing what the outcome was and if it was a failure. Adding an explicit flag makes node behaviour easier to reason about in the future.

What does EOF mean other than a graceful close of a stream? The faster peer should set the FIN flag on the Yamux stream on the relayed connection on the happy path. All other cases should result in a RST flag on the Yamux stream on the relayed connection. Am I missing something @achingbrain?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Triage
Development

No branches or pull requests

3 participants