
sweeper+contractcourt: deadline aware in sweeping anchors #5148

Merged · 12 commits · Jun 29, 2021

Conversation

yyforyongyu
Collaborator

Pushing for concept ack before moving forward.

The current design records the deadline in AnchorResolution; alternatively, we could calculate the deadlines inside sweepAnchor. The former design has the advantage that we can finely control the deadlines for remote/local commitment transactions, while the latter requires a much smaller code change and keeps the deadline logic inside contractcourt.

Another thing to consider: once extracting info from the mempool is supported, things are likely to change here?

Next steps:

  • extend the deadline to HTLC claims
  • add unit tests and itests
  • make the sweeper use the deadline (the new conf target), i.e., build a new fee estimator

Some notes on deciding the deadline

For to_local and to_remote outputs, no deadline is needed as they can only be spent by us/them.
The revocation path doesn't need a deadline either, as all other spending paths are CSV locked.

In our local commitment transaction,

  • For incoming HTLCs, if we know the preimage, we should spend them before they can be timed out by the remote peer.
  • For outgoing HTLCs, we are safe because,
    • if things go well, the remote peer reveals the preimage and we can use it to collect the incoming HTLC.
    • if the remote peer disappears, we can time it out.
    • if the remote reveals the preimage after the timeout, we still have time to collect the incoming HTLC because we have a CLTV delta between the incoming/outgoing HTLCs. This is not in the remote peer's best interest, however, as we can take the money via the timeout and still collect the incoming money using the preimage.

This means the deadline for us would be the least CLTV value found in the incoming HTLCs.

In our remote commitment transaction,

  • For incoming HTLCs, it's money we paid to the remote, and it will time out back to us when things don't go well.
  • For outgoing HTLCs, it's money given to us, equivalent to the incoming HTLCs found in our local commitment transaction. We can use the preimage to spend them, or they can be timed out by the remote peer.

In this case, the deadline for us would be the least CLTV value found in the outgoing HTLCs. However, since we only learn of the remote commit tx after it's confirmed, there's no point in tracking the deadline here?
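
To make the heuristic above concrete, here's a minimal sketch of the deadline selection in Go (the htlc type and numbers are hypothetical, not the lnd API): on our local commitment the deadline follows the incoming HTLCs, on the remote commitment it would follow the outgoing ones.

```go
package main

import "fmt"

// htlc is a hypothetical, simplified view of an HTLC on a commitment tx;
// only the fields needed for the deadline heuristic are kept.
type htlc struct {
	incoming   bool   // true if the HTLC is incoming from our point of view
	cltvExpiry uint32 // absolute block height at which the HTLC times out
}

// deadlineHeight returns the smallest CLTV expiry among the HTLCs selected by
// wantIncoming, falling back to defaultHeight when no HTLC matches.
func deadlineHeight(htlcs []htlc, wantIncoming bool, defaultHeight uint32) uint32 {
	deadline := defaultHeight
	for _, h := range htlcs {
		if h.incoming != wantIncoming {
			continue
		}
		if h.cltvExpiry < deadline {
			deadline = h.cltvExpiry
		}
	}
	return deadline
}

func main() {
	htlcs := []htlc{
		{incoming: true, cltvExpiry: 800_100},
		{incoming: false, cltvExpiry: 800_050},
		{incoming: true, cltvExpiry: 800_020},
	}

	// Local commitment: the deadline follows the incoming HTLCs.
	fmt.Println(deadlineHeight(htlcs, true, 801_000)) // 800020

	// Remote commitment: the deadline would follow the outgoing HTLCs.
	fmt.Println(deadlineHeight(htlcs, false, 801_000)) // 800050
}
```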

Fix #4215

@Roasbeef
Member

Another thing to consider: once extracting info from the mempool is supported, things are likely to change here?

Do you mean the issue to start watching the mempool for revealed pre-images or something else entirely?

@Roasbeef
Member

In this case, the deadline for us would be the least CLTV value found in the outgoing HTLCs. However, since we only learn of the remote commit tx after it's confirmed, there's no point in tracking the deadline here?

Yeah, I think that makes sense. If by remote commitment you mean the latest commitment for the remote party, then IIRC we also only attempt to sweep those anchors at broadcast time: since we don't know what's in the mempool, we don't know exactly which transaction will confirm in the end. There's also a related proposal to simply re-use the source UTXOs for this case: #4860

@Roasbeef
Member

In our local commitment transaction,
we still have time to collect the incoming HTLC because we have a CLTV delta between the incoming/outgoing HTLCs.
This means the deadline for us would be the least CLTV value found in the incoming HTLCs.

What about the inclusion of the CLTV delta into this heuristic? Let's say we have an outgoing HTLC, and the remote peer is asleep at the wheel for whatever reason and won't come online to pull the preimage (or attempting to redeem the HTLC on chain would actually be a net loss). In this case we still want to ensure that the commitment gets confirmed before the CLTV delta starts to "tick", as we want to be able to utilize the full CLTV delta period to actually fully resolve the HTLC on chain.

@yyforyongyu
Collaborator Author

Do you mean the issue to start watching the mempool for revealed pre-images or something else entirely?

@Roasbeef No, I just want to leave room for future changes. I previously learnt from @halseth that we only care about the deadline for the local commitment transaction, because we only learn about a remote commit tx from the chain watcher after the tx is broadcast and mined, which means it's already confirmed and there's no need to accelerate it. However, if we switch to watching the mempool, we can most likely learn about the remote commit tx before it's confirmed (unless mempool data is split). In that case, we might have an interest in accelerating the remote commit tx.

Thus it would change where we perform the calculation of the deadline.

  • If we will use the mempool and may want to accelerate the remote commit tx, then it's better to perform the calculation when creating NewAnchorResolution.
  • Otherwise, the only place we care about the deadline is when we force close with our local commit tx; we can then do the calculation in ChannelArbitrator.sweepAnchors, and the lnwallet package doesn't need to know what a deadline is (rough sketch below).
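
A rough sketch of that second option, reusing the htlc and deadlineHeight helpers from the earlier sketch (names and signatures are hypothetical, not the actual ChannelArbitrator or sweeper API): the deadline stays inside contractcourt and only reaches the sweeper as a conf target.

```go
// sweepAnchorWithDeadline illustrates option two: the arbitrator derives the
// deadline from the confirmed HTLC set right before sweeping the anchor, and
// hands the sweeper nothing more than a conf target.
func sweepAnchorWithDeadline(currentHeight uint32, htlcs []htlc,
	sweep func(confTarget uint32) error) error {

	// Fallback when no HTLC constrains us (roughly one day of blocks).
	const defaultConfTarget uint32 = 144

	deadline := deadlineHeight(htlcs, true, currentHeight+defaultConfTarget)

	var confTarget uint32
	switch {
	// Deadline already passed: ask for the fastest possible confirmation.
	case deadline <= currentHeight:
		confTarget = 1

	// Otherwise sweep just fast enough to confirm before the deadline.
	default:
		confTarget = deadline - currentHeight
	}

	return sweep(confTarget)
}
```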

@yyforyongyu
Collaborator Author

What about the inclusion of the CLTV delta into this heuristic? Let's say we have an outgoing HTLC, and the remote peer is asleep at the wheel for whatever reason and won't come online to pull the preimage (or attempting to redeem the HTLC on chain would actually be a net loss). In this case we still want to ensure that the commitment gets confirmed before the CLTV delta starts to "tick", as we want to be able to utilize the full CLTV delta period to actually fully resolve the HTLC on chain.

Yeah, for our outgoing HTLCs we still want to go through the timeout path when necessary. I think the deadline logic can be further extended as follows. In our local commitment tx,

If we have the preimages, the deadline would be the smaller of,

  • the least CLTV from the incoming HTLCs which we can claim with a preimage, or
  • the least CLTV from the outgoing HTLCs.

If we don't have any preimage, our deadline would be the least CLTV from the outgoing HTLCs.
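
A minimal sketch of that refinement, again using the hypothetical htlc type from the earlier sketch, with a hypothetical hasPreimage lookup standing in for the real preimage store:

```go
// localCommitDeadline picks the deadline for our local commitment tx: the
// smallest CLTV among outgoing HTLCs, plus the CLTV of any incoming HTLC we
// can actually claim (i.e. we hold its preimage). hasPreimage is a
// hypothetical lookup, here keyed by the HTLC's position in the slice.
func localCommitDeadline(htlcs []htlc, hasPreimage func(i int) bool,
	defaultHeight uint32) uint32 {

	deadline := defaultHeight
	for i, h := range htlcs {
		// Incoming HTLCs only constrain us if we hold the preimage.
		if h.incoming && !hasPreimage(i) {
			continue
		}
		if h.cltvExpiry < deadline {
			deadline = h.cltvExpiry
		}
	}
	return deadline
}
```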

@Roasbeef Roasbeef added this to the 0.13.0 milestone Mar 31, 2021
@Roasbeef Roasbeef added the P2 should be fixed if one has time label Mar 31, 2021
@yyforyongyu yyforyongyu force-pushed the 4215-deadline-aware branch 2 times, most recently from 0b0b16e to 403397a on May 7, 2021 13:17
@yyforyongyu yyforyongyu requested a review from halseth May 7, 2021 13:19
@yyforyongyu
Collaborator Author

I've updated the PR to test out the alternative, which is to manage the deadline inside contractcourt. I think this approach is better as the only place we need a deadline is when we call sweepAnchors. Let me know what you think and I'll continue with the rest.

@halseth (Contributor) left a comment

Yeah, I like this approach 👍

@yyforyongyu yyforyongyu force-pushed the 4215-deadline-aware branch 2 times, most recently from 3f29178 to 27e4f93 on May 11, 2021 13:23
@yyforyongyu
Collaborator Author

Working on an itest; meanwhile, this is ready for another look.

// Calculate the deadline. Notice that our minHeight may never get
// updated, which means the passed HTLCs set is empty. We will then
// fall back to the default value.
deadline := minHeight - heightHint
Contributor

The height hint here is not necessarily the current block, is it?

Collaborator Author

I think it could be one block off if a new block is generated during these lines.

What I see here is,

  1. a new block comes in -> advanceState -> stateStep, which has the current block, starting from this line
  2. the other is from the loop within advanceState, when the state changes from StateBroadcastCommit to StateCommitmentBroadcasted at this line after we force close the channel. Depending on how fast these get executed, a new block might come in, and we would be back in case 1.

@yyforyongyu yyforyongyu force-pushed the 4215-deadline-aware branch 2 times, most recently from 2b5734b to 55f983b on May 28, 2021 12:39
@Roasbeef
Member

Most of the itest failures should be fixed by #5430, saw this on the latest run though:

    lnd_channel_force_close.go:132: 
        	Error Trace:	lnd_channel_force_close.go:132
        	            				lnd_channel_force_close.go:195
        	            				test_harness.go:112
        	            				lnd_test.go:13878
        	Error:      	Received unexpected error:
        	            	predicate not satisfied after time out
        	Test:       	TestLightningNetworkDaemon/58-of-80/neutrino/commitment_deadline
        	Messages:   	htlc mismatch

@Roasbeef
Member

change the minBlockTarget to be 1 for webAPIEstimator. Curious why we chose 2 as our min conf target?

IIRC, for bitcoind 1 actually maps to 2, and our implementation of the JSON response is sparsely encoded.
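
For reference, this kind of min/max conf target handling boils down to a simple clamp; a rough sketch (the bounds are illustrative, not the estimator's actual constants):

```go
// clampConfTarget keeps a requested conf target within the range the backend
// estimator accepts, pinning out-of-range requests to the nearest bound.
func clampConfTarget(target, minTarget, maxTarget uint32) uint32 {
	if target < minTarget {
		return minTarget
	}
	if target > maxTarget {
		return maxTarget
	}
	return target
}
```

So a deadline-derived target of 0 or 1 would be pinned up to the minimum the backend supports, and a very distant deadline down to the maximum.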

@yyforyongyu yyforyongyu force-pushed the 4215-deadline-aware branch 3 times, most recently from 925ed84 to 41328ba on June 28, 2021 12:55
@yyforyongyu
Collaborator Author

Most of the itest failures should be fixed by #5430, saw this on the latest run though:

    lnd_channel_force_close.go:132: 
        	Error Trace:	lnd_channel_force_close.go:132
        	            				lnd_channel_force_close.go:195
        	            				test_harness.go:112
        	            				lnd_test.go:13878
        	Error:      	Received unexpected error:
        	            	predicate not satisfied after time out
        	Test:       	TestLightningNetworkDaemon/58-of-80/neutrino/commitment_deadline
        	Messages:   	htlc mismatch

This is now fixed. The Travis build still failed due to suspected flakes.

@Roasbeef
Member

Commit check failing as one of the commits doesn't compile:

# github.com/lightningnetwork/lnd/lntest/itest [github.com/lightningnetwork/lnd/lntest/itest.test]
FAIL	github.com/lightningnetwork/lnd/lntest/itest [build failed]
lntest/itest/lnd_channel_force_close.go:98:26: not enough arguments in call to net.ConnectNodes
?   	github.com/lightningnetwork/lnd/lntest/mock	[no test files]
	have (context.Context, *lntest.HarnessNode, *lntest.HarnessNode)
?   	github.com/lightningnetwork/lnd/lntest/wait	[no test files]
	want (context.Context, *testing.T, *lntest.HarnessNode, *lntest.HarnessNode)
?   	github.com/lightningnetwork/lnd/lntypes	[no test files]
lntest/itest/lnd_channel_force_close.go:98:26: net.ConnectNodes(ctxt, alice, bob) used as value

log.Warnf("ChannelArbitrator(%v): deadline is passed with "+
"deadlineMinHeight=%d, heightHint=%d",
c.cfg.ChanPoint, deadlineMinHeight, heightHint)
deadline = 1
Member

Hmm curious to see what effect this actually has in the wild...

We'll need to double check that these outputs don't get filtered out from the sweeper due to providing "negative yield". The failure mode I'm thinking of here is something like:

  • We go to chain for some HTLCs
  • These HTLCs aren't very large
  • Node is offline for a while for some reason, causes us to miss the deadline
  • We publish, then try to sweep w/ 100 sat/vb (just an example)
  • The sweeper filters this out and the sweeps never go through /sadface

In theory this can already happen w/ smaller HTLCs, which is why we need to more dynamically control what we think is considered dust.

Member

Actually this is OK since we'll always set the Force flag for any anchor sweeps. We definitely have more follow-up work to do in this area, but it's certainly a start and better than the current state of things (no bumping at all, pray to Satoshi).
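
A minimal sketch of the negative-yield filter discussed above, with a Force-style override keeping anchor sweeps alive (hypothetical helper and units, not the sweeper's real logic):

```go
// worthSweeping is a hypothetical version of the yield check: an input is
// skipped when spending it costs more in fees than it is worth, unless force
// is set, which is how deadline-driven anchor sweeps avoid being dropped.
func worthSweeping(valueSats, inputWeight, feeRateSatPerKW int64,
	force bool) bool {

	if force {
		return true
	}

	// Fee to spend this input at the given fee rate (sat per 1000 weight).
	fee := feeRateSatPerKW * inputWeight / 1000

	return valueSats > fee
}
```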


// When the deadline is passed, we will fall back to the smallest conf
// target (1 block).
case deadlineMinHeight <= heightHint:
Member

👍


// We expect the default max fee rate is used. Allow some deviation
// because weight estimates during tx generation are estimates.
require.InEpsilonf(
Member

Nice test function. At times, stuff like fees can be pretty fuzzy, so I far prefer this to having some brittle arithmetic inline within the test itself.
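
For illustration, a fuzzy assertion in that style might look like the following (the test name and values are hypothetical; testify's require package is assumed):

```go
package sweep_test

import (
	"testing"

	"github.com/stretchr/testify/require"
)

// TestSweepFeeRateWithinEpsilon shows the fuzzy-comparison style: the swept
// fee rate only needs to land within 1% of the expected maximum, since weight
// estimation during tx generation introduces small deviations.
func TestSweepFeeRateWithinEpsilon(t *testing.T) {
	const (
		expectedFeeRate = 50_000.0 // sat/kw, hypothetical default max
		sweptFeeRate    = 49_700.0 // hypothetical rate read from the sweep tx
	)

	require.InEpsilonf(
		t, expectedFeeRate, sweptFeeRate, 0.01,
		"expected fee rate %v, got %v", expectedFeeRate, sweptFeeRate,
	)
}
```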

@Roasbeef (Member) left a comment

LGTM 🏆

Just need to make sure the commits build so it can pass one of the CI tests, and this should be good to land. Excellent work tackling this; it has been on my list of things we need to address to increase the security of the core routing network. Other worthy follow-ups include: the fee bumping function itself, updating CLTV values in real time based on on-chain conditions, and updating min HTLC and dust settings based on on-chain conditions.

In this commit, we add a check inside EstimateFeePerKW for bitcoind so that when the conf target exceeds the maxBlockTarget, we will use maxBlockTarget instead.

This commit changes the minBlockTarget used by the WebAPIEstimator to be 1, in line with bitcoind's accepted min conf target.

This commit adds a new struct, AnchorResolutions, which wraps the anchor resolutions for the local/remote/pending-remote commitment transactions. It is then returned from NewAnchorResolutions so the caller knows how to retrieve a certain anchor resolution.

In this commit, we make the change so that when sweeping anchors for the commitment transactions, we are aware of the deadline derived from their HTLC sets. It's very likely we will use a much larger conf target from now on, which will save us some sats.

This commit adds a deadline field to mockSweeper that can be used to track the customized conf target (deadline) used for sweeping anchors. The relevant test, TestChannelArbitratorAnchors, is updated to reflect that the deadlines indeed take effect.

This commit adds two tests to check that a) the correct deadline is used given different HTLC sets, and b) when sweeping anchors, the correct deadlines are used.

In this commit, we add a method to the fee service to allow specifying a fee rate with a conf target.
@yyforyongyu
Collaborator Author

Commit check failing as one of the commits doesn't compile:

This is now fixed after rebase. I think I will make an issue to keep track of what to do next.

@Roasbeef
Member

@yyforyongyu I added a new "task" (it seems they've made these official now?) to #4215 that tracks the next step: gradually bumping up the fee rate as we approach the deadline in real time (right now we'll just re-query the current fee for the specified deadline/conf-target). I think it makes sense to make new issues for the dynamic CLTV measures and the other aspects I mentioned above.
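
A rough sketch of that follow-up idea (purely illustrative, not the design tracked in #4215): linearly scale the fee rate from the initial estimate toward a maximum as the blocks remaining before the deadline shrink.

```go
// bumpedFeeRate linearly interpolates between startRate and maxRate as the
// deadline approaches: with the full budget of blocks left we pay startRate,
// and with zero blocks left we are willing to pay up to maxRate.
func bumpedFeeRate(startRate, maxRate int64, blocksLeft, totalBlocks uint32) int64 {
	if totalBlocks == 0 || blocksLeft >= totalBlocks {
		return startRate
	}

	progress := float64(totalBlocks-blocksLeft) / float64(totalBlocks)
	return startRate + int64(progress*float64(maxRate-startRate))
}
```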

@Roasbeef
Member

Some of the tests fail due to a new error in the itests:

<time> [ERR] RPCS: [/lnrpc.Lightning/CloseChannel]: rpc error: code = Canceled desc = context canceled

Which is benign really AFAICT, and is just part of the auto-logging we have for any errors triggered while handling an RPC request.

@Roasbeef Roasbeef merged commit 20ef37d into lightningnetwork:master Jun 29, 2021
@yyforyongyu yyforyongyu deleted the 4215-deadline-aware branch June 30, 2021 12:48
Labels
P2 should be fixed if one has time
Development

Successfully merging this pull request may close these issues.

sweeper+contractcourt: deadline aware HTLC claims & commitment confirmation
4 participants