Suggestions to simplify Session management #5

Merged
merged 3 commits into from
Jan 17, 2024

Conversation

AgeManning

I've made some changes which I think solve some bugs, simplify the code and make the diff smaller.

The main part was moving unreachable ENR tracking inside the LRUTimeCache in a generic way, by adding the concept of tagging. Elements can optionally be tagged when inserted into the cache. If we need to know how many tagged elements the cache holds, we can just call cache.tagged().
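
The tagging idea can be sketched roughly as follows. This is a minimal illustration, not the actual LRUTimeCache from this PR: the `TaggedTimeCache` type, its field layout and the explicit `now` parameter are all assumptions made for the example, and LRU capacity eviction is left out. The point is only that every path that drops an entry (overwrite, `remove()`, expiry) also adjusts the tagged counter.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for a time cache with optional tags.
/// Entries expire `ttl` after insertion; capacity-based LRU eviction is omitted.
struct TaggedTimeCache<K, V> {
    /// key -> (value, insertion time, tagged?)
    map: HashMap<K, (V, Instant, bool)>,
    ttl: Duration,
    /// Number of live entries that were inserted with a tag.
    tagged: usize,
}

impl<K: Hash + Eq, V> TaggedTimeCache<K, V> {
    fn new(ttl: Duration) -> Self {
        Self { map: HashMap::new(), ttl, tagged: 0 }
    }

    /// Inserts an entry, optionally tagging it. `now` is passed in explicitly
    /// (an assumption of this sketch) so that behaviour is deterministic.
    fn insert(&mut self, key: K, value: V, tag: bool, now: Instant) {
        self.purge_expired(now);
        if tag {
            self.tagged += 1;
        }
        // Overwriting a tagged entry must release its tag.
        if let Some((_, _, was_tagged)) = self.map.insert(key, (value, now, tag)) {
            if was_tagged {
                self.tagged -= 1;
            }
        }
    }

    /// Explicit removal updates the tagged counter as well.
    fn remove(&mut self, key: &K) -> Option<V> {
        let (value, _, was_tagged) = self.map.remove(key)?;
        if was_tagged {
            self.tagged -= 1;
        }
        Some(value)
    }

    /// Number of tagged entries that have not yet expired at `now`.
    fn tagged(&mut self, now: Instant) -> usize {
        self.purge_expired(now);
        self.tagged
    }

    /// Drops expired entries and releases the tags they held.
    fn purge_expired(&mut self, now: Instant) {
        let ttl = self.ttl;
        let mut dropped_tagged = 0;
        self.map.retain(|_, entry| {
            let alive = now.duration_since(entry.1) < ttl;
            if !alive && entry.2 {
                dropped_tagged += 1;
            }
            alive
        });
        self.tagged -= dropped_tagged;
    }
}
```

With something of this shape, a caller would tag a session on insert when its ENR is unreachable and later read the count with `tagged(...)`, without keeping a separate tracker in sync.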

This allows us to easily track the number of unreachable sessions, just by checking tagged. There are a number of reasons why I think this is an improvement:

  • The code becomes much simpler, the diff from the original is smaller, and I think we can remove a few dependencies.
  • I don't think the futures-based channel approach was doing what you expected. In the NAT hole puncher, we implemented a stream based on when elements would expire. I think we expected this stream to wake up when something expires, but the LRUTimeCache is not async, so it doesn't wake up when an element expires. Rather, it wakes up when something is sent to that channel, which only happened when get_mut() or len() was called on the cache. So elements could expire (time-wise), but the task would not be woken until one of these functions was called. More importantly, the channel does not get populated when remove() is called, so in the NATHolePuncher tracking case these removals were not being registered. I replaced this logic with a HashSetDelay, because it seemed we only cared about when elements were expiring. This is a true async struct and will fire exactly when things expire.
  • When sessions were removed via the remove() function, they were not being counted as removed from the tracker. I duplicated this logic in the nat_hole_puncher_tracker with a delay hashmap.
  • I think some sessions could also be entered without being tracked, from a WHOAREYOU. See here: https:/emhane/discv5/blob/nat-hole-punch-discv5.2/src/handler/mod.rs#L870. So grouping the tracking logic into the cache helps isolate it and removes the chance of missing cases like this.
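
The wake-up problem described in the second and third bullets can be reproduced in a small synchronous model. This is only a sketch of the behaviour as described above, not the code that was replaced: `LazyExpiryCache`, its `u32` keys and the explicit `now` argument are hypothetical. Expired keys reach the channel only while `len()` is being serviced, and `remove()` never reports anything.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::time::{Duration, Instant};

/// Hypothetical cache that reports expired keys over a channel, but only
/// discovers them lazily, while servicing `len()`.
struct LazyExpiryCache {
    entries: HashMap<u32, Instant>,
    ttl: Duration,
    expired_tx: Sender<u32>,
}

impl LazyExpiryCache {
    fn new(ttl: Duration) -> (Self, Receiver<u32>) {
        let (expired_tx, expired_rx) = channel();
        (Self { entries: HashMap::new(), ttl, expired_tx }, expired_rx)
    }

    fn insert(&mut self, key: u32, now: Instant) {
        self.entries.insert(key, now);
    }

    /// Purges expired entries and reports each one on the channel.
    /// Nothing is reported unless a caller happens to invoke this.
    fn len(&mut self, now: Instant) -> usize {
        let ttl = self.ttl;
        let tx = &self.expired_tx;
        self.entries.retain(|key, inserted| {
            let alive = now.duration_since(*inserted) < ttl;
            if !alive {
                // Ignore send errors: the receiver may have been dropped.
                let _ = tx.send(*key);
            }
            alive
        });
        self.entries.len()
    }

    /// Explicit removal bypasses the channel, so a listener never hears of it.
    fn remove(&mut self, key: u32) -> bool {
        self.entries.remove(&key).is_some()
    }
}
```

A listener on the receiving end sees nothing for a removed key, and sees an expired key only after some later `len()` call. A timer-driven structure avoids both gaps because it does not depend on unrelated calls to make progress.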

In addition, I made a few more changes:

  • I removed the Clone and Copy derive from Session and Keys. These are private keys and we don't want to copy them in memory for security reasons. For private keys, we try to keep one point in memory to store them and we zeroize that memory when it goes out of scope. I think the copy and clone were only needed for tests, from what I can tell.
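
As a rough illustration of the key-handling point, here is a hand-rolled sketch using only the standard library. `SecretKey` and its methods are hypothetical names, not types from this PR, and in practice a dedicated crate such as `zeroize` is the usual choice. The type deliberately derives neither `Clone` nor `Copy`, and it overwrites its bytes when dropped.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical key wrapper. No `Clone` or `Copy` derive, so the compiler
/// rejects accidental duplication of the secret through this type.
struct SecretKey([u8; 16]);

impl SecretKey {
    /// Takes ownership of the bytes. Note that the caller's own array is a
    /// separate copy and is not wiped by this type.
    fn new(bytes: [u8; 16]) -> Self {
        SecretKey(bytes)
    }

    /// Borrowing access only; callers cannot take the bytes out by value
    /// without copying them explicitly.
    fn expose(&self) -> &[u8; 16] {
        &self.0
    }

    /// Overwrites the key material with zeros.
    fn wipe(&mut self) {
        for byte in self.0.iter_mut() {
            // Volatile writes are not elided by the optimizer as dead stores.
            unsafe { ptr::write_volatile(byte, 0) };
        }
        // Prevent the compiler from reordering the wipe past this point.
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        self.wipe();
    }
}
```

This reduces, rather than eliminates, stray copies: moving a value in Rust can still leave bytes behind at the old location, which is one more reason to keep a single long-lived storage point for each key.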
  • I simplified the naming a bit

I think I have replicated all the logic, however. If there's a mismatch, let me know.


// Decide whether to establish this connection based on our appetite for unreachable
if enr_not_reachable
&& Some(self.sessions.tagged()) > self.nat_utils.unreachable_enr_limit
Owner

Suggested change
&& Some(self.sessions.tagged()) > self.nat_utils.unreachable_enr_limit
&& Some(self.sessions.tagged() + 1) > self.nat_utils.unreachable_enr_limit

Author

Can this just be >= ?
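
For reference, the three forms of the check discussed in this thread can be compared in isolation. The sketch assumes the limit is an `Option<usize>` (inferred from the comparison against `Some(...)`); the function names are made up for the example. Standard `Option` ordering treats `None` as less than any `Some`, so all three forms evaluate to true when the limit is `None`.

```rust
/// The check as originally written: true only once the count exceeds the limit.
fn exceeds(tagged: usize, limit: Option<usize>) -> bool {
    Some(tagged) > limit
}

/// The suggested change: accounts for the session about to be added.
/// Note that `tagged + 1` overflows if `tagged == usize::MAX`.
fn exceeds_with_new(tagged: usize, limit: Option<usize>) -> bool {
    Some(tagged + 1) > limit
}

/// The `>=` form: same result as `exceeds_with_new`, without the addition.
fn at_or_over(tagged: usize, limit: Option<usize>) -> bool {
    Some(tagged) >= limit
}
```

With a limit of `Some(2)` and two tagged sessions already present, `exceeds` returns false while the other two return true, which is the off-by-one the suggestion addresses.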

Owner

@emhane emhane left a comment

lgtm, not so good to clippy. Left some suggestions; if you commit those I'm happy with it.

}
/// Determines if an ENR is reachable or not based on its assigned keys.
pub fn is_enr_reachable(enr: &Enr) -> bool {
enr.udp4_socket().is_some() || enr.udp6_socket().is_some()

@jxs jxs Jan 15, 2024

nit:

Suggested change
enr.udp4_socket().is_some() || enr.udp6_socket().is_some()
enr.udp4_socket().or(enr.udp6_socket()).is_some()

Author

Don't think we can do this, because the types are different: SocketAddrV4 vs SocketAddrV6.
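
To make the type mismatch concrete: `Option::or` needs both sides to be the same `Option<T>`, so an `Option<SocketAddrV4>` cannot be combined directly with an `Option<SocketAddrV6>`. The sketch below assumes the two accessors return those types, as the comment above indicates; the free functions and their parameters are hypothetical stand-ins for the ENR methods. Mapping both sides into `SocketAddr` makes `or` usable, though for a plain reachability check the original `||` form is simpler.

```rust
use std::net::{SocketAddr, SocketAddrV4, SocketAddrV6};

/// Combines the two optional sockets by lifting each into `SocketAddr`
/// first, preferring IPv4 when both are present.
fn any_socket(v4: Option<SocketAddrV4>, v6: Option<SocketAddrV6>) -> Option<SocketAddr> {
    v4.map(SocketAddr::V4).or(v6.map(SocketAddr::V6))
}

/// The reachability check in its original shape: no conversion needed.
fn is_reachable(v4: Option<SocketAddrV4>, v6: Option<SocketAddrV6>) -> bool {
    v4.is_some() || v6.is_some()
}
```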

@AgeManning
Author

Ok. I think I've addressed everything here. Will continue with the main review :)

@emhane emhane merged commit 5c89f12 into emhane:nat-hole-punch-discv5.2 Jan 17, 2024
6 checks passed