Respond to requests from peers with invalid ENRs #265

Merged
merged 3 commits into from
Oct 8, 2024


AgeManning
Member

Peers can have invalid ENRs (i.e., ENRs that report socket addresses that do not match the source socket of the packet).

Previously, discovery would drop packets from these peers and log a warning. This PR changes that logic: we now respond to the source socket but keep the ENR out of our routing table.

We will not advertise these peers, but we will respond to them.

I think this change will improve overall connectivity.

There are some legitimate cases where an ENR can be invalid, such as the external IP address changing or some funky NATs.
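
A minimal sketch of the decision described above, not the actual code from this PR. The `Enr` struct, the `Action` enum, and `classify` are hypothetical stand-ins; treating an ENR that advertises no socket as "respond only" is also an assumption made for illustration.

```rust
use std::net::SocketAddr;

/// Hypothetical, simplified stand-in for an ENR: only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
pub struct Enr {
    pub seq: u64,
    /// The UDP socket the peer advertises, if any.
    pub udp_socket: Option<SocketAddr>,
}

/// What to do with a packet once its ENR has been compared to the packet source.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// The ENR matches the packet source: respond and consider it for the routing table.
    RespondAndTrack,
    /// The ENR does not match: still respond to the source socket,
    /// but keep the ENR out of the routing table so it is never advertised.
    RespondOnly,
}

/// Compare the socket advertised in the ENR with the socket the packet came from.
/// Assumption: an ENR without a socket is not contactable, so it is not tracked.
pub fn classify(enr: &Enr, src: SocketAddr) -> Action {
    match enr.udp_socket {
        Some(advertised) if advertised == src => Action::RespondAndTrack,
        _ => Action::RespondOnly,
    }
}
```

In both cases the response goes to `src`, the socket the packet actually arrived from; only the routing-table treatment differs.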

@ackintosh
Member

I'm testing this change with the ip-change test case. I'll let you know when it's done.

@AgeManning
Member Author

Awesome. Thanks @ackintosh 🙏

@ackintosh
Member

@AgeManning During the testing, I noticed that the FINDNODE request (which is initiated from here) is timing out. Please see the attached image below. What do you think?

image

@ackintosh
Member

For reference, the issue I mentioned above doesn’t happen in the current implementation, which uses a one-time session:

image

@AgeManning
Member Author

@ackintosh

Awesome!! Thanks for catching this!

I was thinking that during the transition it's probably best to simply remove the transitioning peer from the local routing table. There is likely some time between us noticing it has changed its IP and it actually adjusting its IP. In the meantime, we should probably prioritize better peers.

I also suspect that if we remove it from the table, then when it does eventually update, it gets a second chance of getting back in there.

I've made this change. I'm hoping it now fixes the issue you've found.

Thanks again!

@ackintosh
Member

@AgeManning I have tested it again. The updated ENR doesn't get back into Node B's routing table unless the session has expired. 🤔

image

@AgeManning
Member Author

Hey @ackintosh

Nice. Yeah, I'm trying to think of a better way to improve on this behaviour. There are a number of things we want and facts we have to deal with:

  • We want to maintain the connection and respond to peers with invalid ENRs, so we don't want to terminate the session while a peer is in the process of changing its IP.
  • While its ENR is invalid, we don't want to advertise it and therefore don't want it in the routing table.
  • There is some indeterminate time before the node updates its ENR (it might not even update it).

Therefore, I don't think we should:

  • Download every ENR we see that could be new on an established connection. If the peer is not in our routing table, we don't care about its ENR update.
  • Keep the ENR in the routing table while we know it's non-contactable.
  • Drop and re-establish connections in the hope that the ENR is now correct.

Given these, I can't think of an efficient and useful way that we can re-add this ENR to the table without the session expiring. We could keep track of expired node-ids and re-download ENRs if their sequence increases, but we could be downloading lots of ENRs with no benefit. Also the code complexity would increase a bit for this edge case.
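
For illustration only, the bookkeeping weighed up (and not adopted) in the paragraph above might look roughly like the following. `NodeId`, `RemovedPeers`, and its methods are hypothetical names; the advertised sequence number is assumed to come from something like the `enr-seq` field carried in PING/PONG messages.

```rust
use std::collections::HashMap;

/// Hypothetical 32-byte node identifier.
pub type NodeId = [u8; 32];

/// Tracks peers that were removed from the routing table because their ENR
/// was invalid, together with the ENR sequence number seen at removal time.
#[derive(Default)]
pub struct RemovedPeers {
    last_seen_seq: HashMap<NodeId, u64>,
}

impl RemovedPeers {
    /// Remember a peer we kicked out and the ENR sequence number it had.
    pub fn record_removal(&mut self, id: NodeId, seq: u64) {
        self.last_seen_seq.insert(id, seq);
    }

    /// Returns true if the peer was previously removed and now advertises a
    /// higher sequence number, i.e. its ENR may be worth downloading again.
    /// The stored value is bumped so the same sequence triggers only one refetch.
    pub fn should_refetch(&mut self, id: &NodeId, advertised_seq: u64) -> bool {
        if let Some(seq) = self.last_seen_seq.get_mut(id) {
            if advertised_seq > *seq {
                *seq = advertised_seq;
                return true;
            }
        }
        false
    }
}
```

Even in this small form, the map needs pruning and every refetch costs a round trip that may return another invalid ENR, which is the overhead the comment is wary of.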

I'm open to ideas on how to do this nicely, but currently I'm thinking of just leaving this behaviour. In a real DHT, we have lots of peers. If we respond to everyone now and kick out the non-contactable ones, it should make the whole thing cleaner, I'd imagine.

@ackintosh
Member

@AgeManning The idea that came to mind was expanding the use of one-time sessions not only for PING, but also for all other messages. If the ENR is invalid, then remove it from the routing table and respond to the peer using a one-time session. This gives the ENR a chance to get back to our routing table when the peer updates it, while still allowing us to respond to the peer.

However, this might increase the code complexity a bit, and yeah, this might not be a good solution, especially given that the peer might not update its ENR.
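
A rough sketch of the policy proposed above, under the assumption that the ENR check has already produced a boolean. `SessionKind`, `Decision`, and `decide` are hypothetical names for illustration, not types from the codebase.

```rust
/// How we talk to the peer for this exchange.
#[derive(Debug, PartialEq)]
pub enum SessionKind {
    /// A normal, cached session that is reused for later messages.
    Established,
    /// A session used to answer this one message and then discarded.
    OneTime,
}

/// Outcome of the policy: which session to use and whether the ENR stays tracked.
#[derive(Debug, PartialEq)]
pub struct Decision {
    pub session: SessionKind,
    pub keep_in_routing_table: bool,
}

/// If the ENR matches the packet source, behave as usual. Otherwise still
/// respond, but over a one-time session and without keeping the ENR in the
/// routing table, so a later handshake can carry an updated ENR.
pub fn decide(enr_matches_source: bool) -> Decision {
    if enr_matches_source {
        Decision {
            session: SessionKind::Established,
            keep_in_routing_table: true,
        }
    } else {
        Decision {
            session: SessionKind::OneTime,
            keep_in_routing_table: false,
        }
    }
}
```

The trade-off is a fresh handshake for every message from such a peer, which is wasted work if the peer never updates its ENR.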

Member

@ackintosh ackintosh left a comment

Looks good to me!

@AgeManning AgeManning merged commit 994a61b into master Oct 8, 2024
6 checks passed