When libp2p fails, node never re-attempts to get connection #2674

Closed
SjonHortensius opened this issue May 22, 2019 · 3 comments
Labels
Blocked Blocked by research or external factors

Comments

@SjonHortensius
Contributor

Not sure what caused this - but it might be a temporary network glitch:

[2019-05-22 10:57:10]  INFO initial-sync: Synced!
[2019-05-22 10:57:10]  INFO regular-sync: Listening for regular sync messages from peers
[2019-05-22 15:30:39] ERROR p2p: Failed to reconnect to peer failed to dial : all dials failed
  * [/ip4/23.202.xxx/tcp/4000] dial tcp4 23.202.xxx:4000: connect: connection refused
  * [/ip4/23.195.xxx/tcp/4000] dial tcp4 23.195.xxx:4000: connect: connection refused
  * [/ip4/23.217.xxx/tcp/4000] dial tcp4 23.217.xxx:4000: connect: connection refused
  * [/ip4/127.0.0.1/tcp/4000] failed to negotiate security protocol: message did not have trailing newline
  * [/ip4/23.217.xxx/tcp/4000] dial tcp4 23.217.xxx:4000: connect: connection refused
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 10.52.xxx:4000: connect: no route to host
  * [/ip4/35.224.xxx/tcp/30001] failed to negotiate security protocol: EOF
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 10.52.xxx:4000: connect: no route to host
  * [/ip4/10.55.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->10.55.xxx:4000: i/o timeout
  * [/ip4/92.242.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->92.242.xxx:4000: i/o timeout
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->10.52.xxx:4000: i/o timeout
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->10.52.xxx:4000: i/o timeout
  * [/ip4/18.211.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->18.211.xxx:4000: i/o timeout
[2019-05-22 15:30:52] ERROR p2p: Failed to reconnect to peer failed to dial : all dials failed
  * [/ip4/23.202.xxx/tcp/4000] dial tcp4 23.202.xxx:4000: connect: connection refused
  * [/ip4/23.195.xxx/tcp/4000] dial tcp4 23.195.xxx:4000: connect: connection refused
  * [/ip4/127.0.0.1/tcp/4000] failed to negotiate security protocol: message did not have trailing newline
  * [/ip4/23.217.xxx/tcp/4000] dial tcp4 23.217.xxx:4000: connect: connection refused
  * [/ip4/23.217.xxx/tcp/4000] dial tcp4 23.217.xxx:4000: connect: connection refused
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 10.52.2.193:4000: connect: no route to host
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 10.52.4.162:4000: connect: no route to host
  * [/ip4/92.242.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->92.242.xxx:4000: i/o timeout
  * [/ip4/10.55.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->10.55.xxx:4000: i/o timeout
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->10.52.xxx:4000: i/o timeout
  * [/ip4/10.52.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->10.52.xxx:4000: i/o timeout
  * [/ip4/18.211.xxx/tcp/4000] dial tcp4 0.0.0.0:12000->18.211.xxx:4000: i/o timeout
  * [/ip4/35.224.xxx/tcp/30001] failed to negotiate security protocol: read tcp4 192.168.xxx:12000->35.224.xxx/30001: read: connection reset by peer
[2019-05-22 15:30:58] ERROR p2p: Failed to reconnect to peer dial backoff

I see a few things wrong with this:

  • after this error the node hangs and never attempts to reconnect; it should schedule a reconnect after a few minutes instead
  • this might also indicate an issue with #2245 ("Do not directly dial a peer we cannot reach"), as 10.x is not in my list of reachable networks, so these nodes should be skipped
  • 127.0.0.1 should not be in the list of dialed addresses
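
For the second and third points, here is a minimal sketch of the kind of address filtering meant, assuming the advertised addresses are available as plain IP strings (real libp2p code works with multiaddrs instead). `isDialable` and its `allowPrivate` flag are hypothetical names, not part of libp2p or this project, and `net.IP.IsPrivate` needs Go 1.17 or newer:

```go
package main

import (
	"fmt"
	"net"
)

// isDialable reports whether an advertised IP is worth dialing from this node.
// Hypothetical helper: loopback and unspecified addresses are always skipped;
// private ranges (10/8, 172.16/12, 192.168/16, fc00::/7) are skipped unless
// allowPrivate is set.
func isDialable(addr string, allowPrivate bool) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return false // not a valid IP literal
	}
	if ip.IsLoopback() || ip.IsUnspecified() {
		return false
	}
	if ip.IsPrivate() && !allowPrivate {
		return false
	}
	return true
}

func main() {
	// 192.0.2.1 is a documentation address standing in for a public peer.
	for _, a := range []string{"127.0.0.1", "10.52.2.193", "192.0.2.1"} {
		fmt.Printf("%s dialable=%v\n", a, isDialable(a, false))
	}
}
```

A node that does sit on a private network alongside its peers would pass `allowPrivate = true`, which is why the flag is left to the caller rather than hard-coded.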

@prestonvanloon added the Blocked label (Blocked by research or external factors) on May 24, 2019

@dimchansky

Not sure about this case, but this pattern helped me solve a reconnection issue in my app:

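// Drop the cached addresses for the peer so stale ones are not dialed again.
host.Peerstore().ClearAddrs(targetPeerID)

// Clear the swarm's dial backoff for the peer; otherwise Connect can fail
// right away with "dial backoff".
if sw, ok := host.Network().(*swarm.Swarm); ok {
	sw.Backoff().Clear(targetPeerID)
}

// Dial again; targetPeerAddr carries the peer's ID and its addresses.
if err := host.Connect(ctx, *targetPeerAddr); err != nil {
	return nil, err
}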
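
To cover the case from the original report, where the node gives up after `dial backoff`, the pattern above could be wrapped in a retry loop. This is only a sketch using standard-library calls: `reconnectLoop`, `nextDelay`, and the `connect` callback are hypothetical names, and `connect` stands in for the clear-and-redial steps above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// nextDelay doubles the current delay, capped at maxDelay.
func nextDelay(cur, maxDelay time.Duration) time.Duration {
	if next := cur * 2; next < maxDelay {
		return next
	}
	return maxDelay
}

// reconnectLoop calls connect until it succeeds or ctx is cancelled,
// sleeping between attempts with a capped, doubling delay.
// connect is a hypothetical stand-in for the clear-and-redial pattern.
func reconnectLoop(ctx context.Context, connect func(context.Context) error,
	initial, maxDelay time.Duration) error {
	delay := initial
	for {
		err := connect(ctx)
		if err == nil {
			return nil
		}
		fmt.Printf("connect failed: %v; retrying in %s\n", err, delay)
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
		delay = nextDelay(delay, maxDelay)
	}
}

func main() {
	attempts := 0
	// Fake connect that fails twice, then succeeds.
	connect := func(context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("dial backoff")
		}
		return nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := reconnectLoop(ctx, connect, 10*time.Millisecond, 40*time.Millisecond)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

In a real node the delays would be on the order of seconds to minutes rather than milliseconds; the short values here only keep the demo fast.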

@rauljordan
Contributor

No longer relevant on the latest master.
