Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for the TCP_USER_TIMEOUT option #103

Merged
merged 5 commits into from
Nov 7, 2020

Conversation

mildsunrise
Copy link
Contributor

keep-alive is usually wanted for dead peer detection. But keep-alive is only one half of that; if the peer dies before acknowledging data, the TCP_USER_TIMEOUT option is what decides when the connection gets closed, and that's usually set to 10 minutes. (correct me if wrong)

So I suggest we add support for setting the TCP User Timeout on a socket, too. Currently only implemented for Linux, I haven't had time to look this on other platforms yet.

@codecov
Copy link

codecov bot commented Oct 30, 2020

Codecov Report

Merging #103 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #103   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            4         4           
  Lines           98       116   +18     
=========================================
+ Hits            98       116   +18     
Impacted Files Coverage Δ
lib/constants.js 100.00% <100.00%> (ø)
lib/index.js 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 47b25e7...6939eb7. Read the comment docs.

Copy link
Owner

@hertzg hertzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work @mildsunrise with preparing PR 👏 .

So I suggest we add support for setting the TCP User Timeout on a socket, too.

I have not used TCP_USER_TIMEOUT before and after reading a bit about it seems like it would complement the rest of the module functinality very well 👍.

Currently only implemented for Linux, I haven't had time to look this on other platforms yet.

So far what I found is that the rfc5482 might be considered "young(ish)" and seems to be either linux specific or just not yet implemented in other kernels.
It's unsupported on freebsd and darwin (don't have any strong facts for macos, only few words here and there) but apparently supported on win32.

  • Therefore I suggested throwing in set and get methods if they are called on unsupported platforms. I'm open to suggestions here.

screenshot of freebsd tcp rfc compliance wiki page listing RFC 5482 as unspported

It think it would be great to have an end to end test confirming that it does what it's supposed to do. A test similar to test-tcpdump.js but only checking that TCP_USER_TIMEOUT does what it's expected to do on linux.

A great drill-down by Couldflare regarding this (and many other cases) demonstrates that it could be visualised using tcpdump.

  • Can we create an e2e test using tcpdump to test that it was actually set and at least picked up by kernel?

All in all great job and thanks for the PR I'm all in favor of merging this once we resolve the comments 👌 :octocat:

lib/index.js Outdated Show resolved Hide resolved
lib/index.js Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
* Sets the TCP_USER_TIMEOUT value for specified socket.
*
* Note: The msec will be rounded towards the closest integer
*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest lightly mentioning the interaction between TCP_USER_TIMEOUT and TCP_KEEPALIVE. And provide a link to man page for it. For developer experience purposes.

Suggested change
*
* Note: When used with the TCP keepalive option, TCP_USER_TIMEOUT will override
keepalive to determine when to close a connection due to keepalive failure.

Source: https://man7.org/linux/man-pages/man7/tcp.7.html

@@ -23,6 +24,7 @@ switch (OS.platform()) {
default:
Constants.TCP_KEEPINTVL = 5
Constants.TCP_KEEPCNT = 6
Constants.TCP_USER_TIMEOUT = 18
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this feature is actually not supported on freebsd and darwin (not tested, assumption based, see review comment for details).

For those platforms the value would be undefiend which might cause segfault and I'm not sure what would happen in that case, most likely not what the user intended to do so I suggest throwing in set and get methods if this constant is null or undefiend (==null).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes a lot of sense, yes. I'll do that when I find some time.

@hertzg
Copy link
Owner

hertzg commented Oct 30, 2020

For now you can ignore the failing check for node v12 (PR / test-integration (ubuntu-latest, 12) (pull_request)), It's tracked in #93 and if other versions are passing we can ignore it.

mildsunrise and others added 2 commits October 30, 2020 16:38
Co-authored-by: George Hertz <[email protected]>
Co-authored-by: George Hertz <[email protected]>
@mildsunrise
Copy link
Contributor Author

Thanks for reviewing :)

Can we create an e2e test using tcpdump to test that it was actually set and at least picked up by kernel?

Okay, I'll try to implement an e2e test using tcpdump, I think it should be doable.

So far what I found is that the rfc5482 might be considered "young(ish)" and seems to be either linux specific or just not yet implemented in other kernels.
It's unsupported on freebsd and darwin (don't have any strong facts for macos, only few words here and there) but apparently supported on win32.

Not exactly... TCP itself defines the user timeout. What RFC5482 does is add an option to the protocol so that you can advertise the timeout to the other peer, rather than just setting it locally.

It's not clear how (or if) the TCP_USER_TIMEOUT option can be used with other kernels... FreeBSD seems to point to SO_SNDTIMEO, which is similar but I'm not sure it does the same since it apparently only affects blocking syscalls? So I'm not sure at this point.

Therefore I suggested throwing in set and get methods if they are called on unsupported platforms. I'm open to suggestions here.

Absolutely!

@hertzg
Copy link
Owner

hertzg commented Oct 30, 2020

Okay, I'll try to implement an e2e test using tcpdump, I think it should be doable.

That is amazing! I believe we could use some pieces from repo for cloud-flare blog post:

Not exactly... TCP itself defines the user timeout. What RFC5482 does is add an option to the protocol so that you can advertise the timeout to the other peer, rather than just setting it locally.

Ah! I see... Thanks for the explanation. 🙏

It's not clear how (or if) the TCP_USER_TIMEOUT option can be used with other kernels... FreeBSD seems to point to SO_SNDTIMEO, which is similar but I'm not sure it does the same since it apparently only affects blocking syscalls? So I'm not sure at this point.

I'm OK starting with linux only support, given that we document that fact as well as throw in other platforms so that users would know it.

Later we can see if there is a demand for it on other platforms.

@hertzg
Copy link
Owner

hertzg commented Nov 7, 2020

@mildsunrise Gentle ping, should I take over or do you have some progress?

@mildsunrise
Copy link
Contributor Author

you can take over :)

@hertzg hertzg changed the base branch from master to feat-user-timeout November 7, 2020 17:10
@hertzg hertzg merged commit 419a8db into hertzg:feat-user-timeout Nov 7, 2020
@hertzg hertzg mentioned this pull request Nov 7, 2020
5 tasks
@mildsunrise mildsunrise deleted the user-timeout branch November 8, 2020 10:29
@hertzg
Copy link
Owner

hertzg commented Dec 8, 2020

@all-contributors please add @mildsunrise for code, docs, tests

@allcontributors
Copy link
Contributor

@hertzg

I've put up a pull request to add @mildsunrise! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants