Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: keepalive race condition near timeout #52649

Closed
mweberxyz opened this issue Apr 22, 2024 · 9 comments
Closed

http: keepalive race condition near timeout #52649

mweberxyz opened this issue Apr 22, 2024 · 9 comments
Labels
http Issues or PRs related to the http subsystem.

Comments

@mweberxyz
Copy link

mweberxyz commented Apr 22, 2024

Version

v21.7.3

Platform

Darwin [me] 23.4.0 Darwin Kernel Version 23.4.0: Fri Mar 15 00:11:08 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T8122 arm64

Subsystem

http

What steps will reproduce the bug?

Run server.js:

const { createServer } = require("node:http");

const server = createServer(
  { keepAlive: true, keepAliveTimeout: 1000 },
  (req, res) => {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        data: "Hello World!",
      })
    );
  }
);

server.listen(3000, async () => {
  const endpoint = `http://localhost:${server.address().port}`;
  console.log(endpoint);
});

Run client.js:

const http = require("http");

const timeouts = [
  950, 960, 970, 980, 990, 995, 995, 996, 996, 997, 997, 998, 998, 999, 999,
];
let prevTimeout = 0;

const sendRequest = () => {
  http.get("http://127.0.0.1:3000", (res) => {
    res.on("data", () => {});
    res.on("end", () => {
      console.log(`success: ${prevTimeout}`);
      if (timeouts.length) {
        prevTimeout = timeouts.shift();
        setTimeout(sendRequest, prevTimeout);
      }
    });
  });
};

sendRequest();

How often does it reproduce? Is there a required condition?

Fairly reliably, might take a few runs to see it, or adjust the timeouts to account for device differences.

What is the expected behavior? Why is that the expected behavior?

% node client-http.js
success: 0
success: 950
success: 960
success: 970
success: 980
success: 990
success: 995
success: 995
success: 996
success: 996
success: 997
success: 997
success: 998
success: 998
success: 999
success: 999

It is expected that (generally) every request should succeed.

What do you see instead?

% node client-http.js
success: 0
success: 950
success: 960
success: 970
success: 980
success: 990
success: 995
success: 995
success: 996
node:events:497
      throw er; // Unhandled 'error' event
      ^

Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:217:20)
Emitted 'error' event on ClientRequest instance at:
    at Socket.socketErrorListener (node:_http_client:500:9)
    at Socket.emit (node:events:519:28)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  errno: -54,
  code: 'ECONNRESET',
  syscall: 'read'
}

Node.js v21.7.3

In the full debug logs - http.log - the server connection close is apparent, followed by the attempt to use the closed socket for the next request:

...
NET 92544: destroy
NET 92544: close
NET 92544: close handle
HTTP 92544: SOCKET ERROR: read ECONNRESET Error: read ECONNRESET
...

Additional information

The server is closing the connection at it's own 1000ms timeout, but the keepAlive timers in the client either:

  • haven't fired, or
  • fired, but didn't have a chance to remove the socket from the keepAlive pool in time for the next request to be made.

The issue existed prior to Node 20, but it's incidence is more pronounced now that keepAlive is defaulted to true in http.globalAgent.

A similar issue was identified in nodejs/undici#3141. Undici already had logic in place to close connections on the client side 1000ms prior to the keep-alive value provided by the server (to account for timer/transport latency) but said logic was failing to run due to an unrelated issue.

@RedYetiDev
Copy link
Member

@nodejs/http

@RedYetiDev RedYetiDev added the http Issues or PRs related to the http subsystem. label Apr 22, 2024
@mokimolo

This comment was marked as resolved.

@EthanTuning
Copy link

I'm experiencing this exact issue when trying to run ionic start to start a new ionic project.

image

@mweberxyz
Copy link
Author

@EthanTuning it's possible though unlikely related to this issue. ECONNRESET is a fairly general error code for when the server closes the socket unexpectedly. This issue is concerned with when an ECONNRESET occurs in the exact same tick of the event loop as a http request resuming on an previously used socket - which seems unlikely to be reproduced repeatedly except in the kind of narrow circumstances proposed in my post.

@mweberxyz
Copy link
Author

mweberxyz commented Apr 23, 2024

Is Node ignoring server sent Keep Alive hint headers on purpose? This method in _http_agent.js is suspect:

Agent.prototype.keepSocketAlive = function keepSocketAlive(socket) {
  socket.setKeepAlive(true, this.keepAliveMsecs);
  socket.unref();

  let agentTimeout = this.options.timeout || 0;

  if (socket._httpMessage?.res) {
    const keepAliveHint = socket._httpMessage.res.headers['keep-alive'];

    if (keepAliveHint) {
      const hint = RegExpPrototypeExec(/^timeout=(\d+)/, keepAliveHint)?.[1];

      if (hint) {
        const serverHintTimeout = NumberParseInt(hint) * 1000;

        if (serverHintTimeout < agentTimeout) {
          agentTimeout = serverHintTimeout;
        }
      }
    }
  }

  if (socket.timeout !== agentTimeout) {
    socket.setTimeout(agentTimeout);
  }

  return true;
};

The serverHintTimeout < agentTimeout conditional always evaluates to false (in the default conditions) because the default agentTimeout is 0 -- which means Node will never opportunistically close a connection and instead always wait for the server to close it first.

@mcollina
Copy link
Member

#52653 seems relevant.

@mcollina
Copy link
Member

The serverHintTimeout < agentTimeout conditional always evaluates to false (in the default conditions) because the default agentTimeout is 0 -- which means Node will never opportunistically close a connection and instead always wait for the server to close it first.

I think we should close them actually. PR?

@EthanTuning
Copy link

@EthanTuning it's possible though unlikely related to this issue. ECONNRESET is a fairly general error code for when the server closes the socket unexpectedly. This issue is concerned with when an ECONNRESET occurs in the exact same tick of the event loop as a http request resuming on an previously used socket - which seems unlikely to be reproduced repeatedly except in the kind of narrow circumstances proposed in my post.

Oh thanks, yeah I was very unsure, but googling got me here haha!

zanettea added a commit to zanettea/node that referenced this issue Apr 23, 2024
zanettea added a commit to zanettea/node that referenced this issue Apr 24, 2024
zanettea added a commit to zanettea/node that referenced this issue Apr 24, 2024
@mweberxyz
Copy link
Author

The serverHintTimeout < agentTimeout conditional always evaluates to false (in the default conditions) because the default agentTimeout is 0 -- which means Node will never opportunistically close a connection and instead always wait for the server to close it first.

I realize default in this context meant two different things.

Using the default http.globalAgent -- the default timeout is 5000ms.
Using a non-default agent, with keepAlive on, but otherwise default parameters, (ie- new http.Agent({ keepAlive: true }); -- the issue arises.

zanettea added a commit to zanettea/node that referenced this issue Apr 24, 2024
…lation between http1.1 servers and clients

reduced likelihood of race conditions on keep-alive timeout calculation between http1.1 servers and clients and honor server keep-alive timeout when agentTimeout is not set

Fixes: nodejs#47130
Fixes: nodejs#52649
zanettea added a commit to zanettea/node that referenced this issue Apr 24, 2024
…lation between http1.1 servers and clients

reduced likelihood of race conditions on keep-alive timeout calculation between http1.1 servers and clients and honor server keep-alive timeout when agentTimeout is not set

Fixes: nodejs#47130
Fixes: nodejs#52649
zanettea added a commit to zanettea/node that referenced this issue Apr 24, 2024
…lation between http1.1 servers and clients

reduced likelihood of race conditions on keep-alive timeout calculation between http1.1 servers and clients and honor server keep-alive timeout when agentTimeout is not set

Fixes: nodejs#47130
Fixes: nodejs#52649
zanettea added a commit to zanettea/node that referenced this issue Apr 30, 2024
reduce likelihood of race conditions on keep-alive timeout
calculation between http1.1 servers and clients and honor server
keep-alive timeout when agentTimeout is not set

Fixes: nodejs#47130
Fixes: nodejs#52649
jazelly added a commit to jazelly/node that referenced this issue Sep 9, 2024
jazelly added a commit to jazelly/node that referenced this issue Sep 10, 2024
jazelly added a commit to jazelly/node that referenced this issue Sep 10, 2024
RafaelGSS pushed a commit that referenced this issue Sep 16, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
RafaelGSS pushed a commit that referenced this issue Sep 16, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
RafaelGSS pushed a commit that referenced this issue Sep 17, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
targos pushed a commit that referenced this issue Sep 22, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
targos pushed a commit that referenced this issue Sep 26, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
targos pushed a commit that referenced this issue Oct 2, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
targos pushed a commit that referenced this issue Oct 2, 2024
Fixes: #52649
Refs: #54293

Co-authored-by: Arrigo Zanette <[email protected]>
PR-URL: #54863
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Reviewed-By: Jake Yuesong Li <[email protected]>
Reviewed-By: Ethan Arrowood <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants