Add infra to block ChannelMonitorUpdates on forwarded claims #2167

Merged

Conversation

@TheBlueMatt (Collaborator) commented Apr 7, 2023

When we forward a payment and receive an update_fulfill_htlc
message from the downstream channel, we immediately claim the HTLC
on the upstream channel, before even doing a commitment_signed
dance on the downstream channel. This implies that our
ChannelMonitorUpdates "go out" in the right order - first we
ensure we'll get our money by writing the preimage down, then we
write the update that resolves the HTLC (giving up the money) on the
downstream channel.

This is safe as long as ChannelMonitorUpdates complete in the
order in which they are generated, but of course looking forward we
want to support asynchronous updates, which may complete in any
order.

Here we add infrastructure to handle downstream
ChannelMonitorUpdates which are blocked on an upstream
preimage-containing one. We don't yet actually do the blocking, which
will come in a future commit.

Based on #2111; the follow-up will be based on #2112 plus this.
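For concreteness, here is a minimal, self-contained sketch of the bookkeeping such blocking implies. All names here (`BlockedMonitorUpdates`, `ChannelId`, `MonitorUpdate`) are hypothetical stand-ins, not the types this PR adds: downstream updates are parked keyed by the upstream channel, and released only once that channel's preimage-carrying update has persisted.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for LDK's real identifiers and update types.
type ChannelId = [u8; 32];
struct MonitorUpdate { update_id: u64 }

#[derive(Default)]
struct BlockedMonitorUpdates {
    // Downstream updates parked behind the upstream channel whose
    // preimage-containing update must persist first.
    blocked_on_upstream: HashMap<ChannelId, Vec<(ChannelId, MonitorUpdate)>>,
}

impl BlockedMonitorUpdates {
    // Park a downstream update until the upstream preimage update completes.
    fn block(&mut self, upstream: ChannelId, downstream: ChannelId, update: MonitorUpdate) {
        self.blocked_on_upstream.entry(upstream).or_default().push((downstream, update));
    }

    // Called once the upstream update has persisted: everything it was
    // blocking is now safe to hand to the monitor-update pipeline.
    fn upstream_completed(&mut self, upstream: ChannelId) -> Vec<(ChannelId, MonitorUpdate)> {
        self.blocked_on_upstream.remove(&upstream).unwrap_or_default()
    }
}
```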

@codecov-commenter commented Apr 7, 2023

Codecov Report

Patch coverage: 79.81% and project coverage change: +1.17% 🎉

Comparison is base (9e542ec) 90.94% compared to head (394f54d) 92.11%.


Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2167      +/-   ##
==========================================
+ Coverage   90.94%   92.11%   +1.17%     
==========================================
  Files         104      104              
  Lines       52750    66889   +14139     
  Branches    52750    66889   +14139     
==========================================
+ Hits        47971    61612   +13641     
- Misses       4779     5277     +498     
Impacted Files Coverage Δ
lightning/src/ln/channelmanager.rs 89.24% <76.53%> (+2.11%) ⬆️
lightning/src/ln/channel.rs 92.85% <92.85%> (+3.04%) ⬆️
lightning/src/ln/payment_tests.rs 99.26% <100.00%> (+1.69%) ⬆️
lightning/src/ln/priv_short_conf_tests.rs 97.60% <100.00%> (+<0.01%) ⬆️
lightning/src/ln/reload_tests.rs 95.63% <100.00%> (+0.02%) ⬆️
lightning/src/ln/reorg_tests.rs 100.00% <100.00%> (ø)
lightning/src/sync/nostd_sync.rs 100.00% <100.00%> (ø)

... and 27 files with indirect coverage changes


@TheBlueMatt (Collaborator, Author)

Slipping to 116.

@TheBlueMatt (Collaborator, Author)

Rebased, now no longer based on anything.

@TheBlueMatt force-pushed the 2023-04-monitor-e-monitor-prep branch 2 times, most recently from 37a595d to ab7c327 on May 4, 2023
Comment on lines 8209 to 8299
// The ChannelMonitor that gave us this
// preimage is for a now-closed channel -
// no further updates to that channel can
// happen which would result in the
// preimage being removed, thus we're
// guaranteed to regenerate this claim on
// restart as long as the source monitor
// sticks around.
Contributor

Does that mean we need to further keep monitors around even if they have zero claimable balances until we can unblock any dependent channels?

@TheBlueMatt (Collaborator, Author)

Uhhh, yea, kinda. I mean, only until the upstream ChannelMonitor gets its update persisted, but indeed we don't currently have any infrastructure to let the user know whether that's the case when pruning the downstream monitor. We may want to add something like that (e.g. "no removing monitors while any monitors are syncing"), but a few blocks of extra time should suffice in most cases.

@TheBlueMatt (Collaborator, Author)

Will rebase this on #2287 in a day or two, but that should go first, I think.

@TheBlueMatt force-pushed the 2023-04-monitor-e-monitor-prep branch 2 times, most recently from 34c6292 to bdd8f78 on May 12, 2023
@@ -5044,10 +5044,20 @@ impl<Signer: WriteableEcdsaChannelSigner> Channel<Signer> {
self.pending_monitor_updates.is_empty()
}

// Completes (drops) every pending monitor update with update_id <= `update_id`,
// retaining only strictly-later updates.
pub fn complete_all_mon_updates_through(&mut self, update_id: u64) {
self.pending_monitor_updates.retain(|upd| upd.update.update_id > update_id);
}
Contributor

Somewhat unrelated to this method, but can a counterparty force us to drop these updates in any way and then play a commitment onchain for which we'd need one of those dropped updates to claim funds?

@TheBlueMatt (Collaborator, Author)

That shouldn't be the case - if a monitor update is blocked/hasn't completed we should never ever ever give our peer whatever message would be required to broadcast the state included in that message. This is obviously different for payment preimages, which is why they "jump the queue".

/// completes a monitor update containing the payment preimage. In that case, after the inbound
/// edge completes, we will surface an [`Event::PaymentForwarded`] as well as unblock the
/// outbound edge.
EmitEventAndFreeOtherChannel {
Contributor

Why not just name it PaymentForwarded and rename the action below to downstream_action? That fully describes the action we're performing after the update completes.
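For readers following along, the quoted doc comment suggests a variant shaped roughly like the sketch below. The enum name, the stub field types, and the exact tuple contents are assumptions here, not the PR's literal definitions.

```rust
// Stub types for illustration only.
struct Event;
struct PublicKey;
struct OutPoint;
struct RAAMonitorUpdateBlockingAction;

// Sketch: pair the event to surface with the optional downstream
// (counterparty, funding outpoint, blocking action) to free once the
// preimage-carrying monitor update completes.
enum MonitorUpdateCompletionAction {
    EmitEventAndFreeOtherChannel {
        event: Event,
        downstream_counterparty_and_funding_outpoint:
            Option<(PublicKey, OutPoint, RAAMonitorUpdateBlockingAction)>,
    },
}
```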

// support async monitor updates even in LDK 0.0.115 and once we do we'll require no
// downgrades to prior versions. Thus, while this would break on downgrade, we don't
// support it even without downgrade, so if it breaks it's not on us ¯\_(ツ)_/¯.
(1, downstream_counterparty_and_funding_outpoint, option),
Contributor

So that means we could make this even once we land the follow-up PR as part of 117?

@TheBlueMatt (Collaborator, Author)

There's no mechanism currently to do that. Sadly, downstream_counterparty_and_funding_outpoint is always set, so we can't make it even without breaking all downgrades. Instead, we should make channel's pending_monitor_updates even, which for some reason it currently isn't. We'll want to do that for 116 - see #2317.
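The even/odd rule in play here follows the BOLT-style "it's OK to be odd" convention used by LDK's TLV serialization: unknown odd types may be skipped, while unknown even types must fail the read. A toy reader (not LDK's implementation) makes the downgrade implication concrete:

```rust
// Toy TLV-stream reader: unknown ODD types are optional and silently
// skipped (older readers keep working), while unknown EVEN types are
// required and must fail deserialization - exactly what breaks downgrades.
fn read_tlv_stream(tlvs: &[(u64, Vec<u8>)]) -> Result<(), &'static str> {
    for (typ, _value) in tlvs {
        // ...known types would be parsed here...
        if typ % 2 == 1 {
            continue; // unknown odd type: optional, skip it
        } else {
            return Err("unknown even TLV type"); // required: hard error
        }
    }
    Ok(())
}
```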

@TheBlueMatt (Collaborator, Author)

Still need to do a few more follow-ups but pushed a bunch of the more trivial changes here.

@dunxen (Contributor) commented May 24, 2023

Haven't looked at tests yet, but the rest LGTM so far.

@valentinewallace (Contributor)

> Still need to do a few more follow-ups but pushed a bunch of the more trivial changes here.

Did you mean to push, @TheBlueMatt? Also, feel free to squash IMO.

@TheBlueMatt force-pushed the 2023-04-monitor-e-monitor-prep branch from 2935e00 to 9f2e9f1 on May 26, 2023
@TheBlueMatt (Collaborator, Author)

I did.

Our `no-std` locks simply panic if a lock cannot be taken as there
should be no lock contention in a single-threaded environment.
However, the `held_by_thread` debug methods were delegating to the
lock methods which resulted in a panic when asserting that a lock
*is* held by the current thread.

Instead, they are updated here to call the relevant `RefCell`
testing methods.
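
A minimal sketch of that fix's shape, assuming a simplified no-std mutex (not LDK's actual `nostd_sync` types): the lock itself is a `RefCell`, and the debug probe uses `try_borrow_mut` instead of taking the lock.

```rust
use core::cell::{RefCell, RefMut};

// Simplified stand-in for a no-std "Mutex": single-threaded, so any
// contention is a bug and taking the lock may simply panic.
pub struct NoStdMutex<T> {
    inner: RefCell<T>,
}

impl<T> NoStdMutex<T> {
    pub fn new(t: T) -> Self { Self { inner: RefCell::new(t) } }

    pub fn lock(&self) -> RefMut<'_, T> {
        // Panics on contention, as the commit message describes.
        self.inner.try_borrow_mut().expect("lock contention in single-threaded env")
    }

    // The debug probe: answer "is this lock held?" via RefCell's
    // non-panicking test rather than by (re)taking the lock.
    pub fn held_by_thread(&self) -> bool {
        self.inner.try_borrow_mut().is_err()
    }
}
```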
This allows us to make the `force_shutdown` definition less verbose.

In the coming commits we'll need the counterparty node_id when
handling a background monitor update as we may need to resume
normal channel operation as a result. Thus, we go ahead and pipe it
through from the shutdown end, as it makes the codepaths
consistent.

Sadly, the monitor-originated shutdown case doesn't allow for a
required counterparty node_id as some versions of LDK didn't have
it present in the ChannelMonitor.
Rather than letting `AChannelManager` be bounded by all traits
being `Sized` we make them explicitly `?Sized`. We also make the
trait no longer test-only as it will be used in a coming commit.
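As a small illustration of the bound change (a hypothetical trait, not the real `AChannelManager` definition): by default, an associated type in a trait is implicitly `Sized`, and opting out allows implementors to use unsized types behind references.

```rust
// `type Watcher;` alone would imply `Watcher: Sized`. The explicit
// `?Sized` relaxes that, so trait objects and other unsized types work.
trait AChannelManagerLike {
    type Watcher: ?Sized;
    fn watcher(&self) -> &Self::Watcher;
}
```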
@TheBlueMatt force-pushed the 2023-04-monitor-e-monitor-prep branch from 9f2e9f1 to 5509788 on May 30, 2023
@TheBlueMatt (Collaborator, Author)

Squashed with one further commit added at the top to fix the no-std held_by_thread debug method.

@wpaulino (Contributor) left a comment

LGTM pending the follow-up work, but CI needs a small fix.

`BackgroundEvent` was used to store `ChannelMonitorUpdate`s which
result in a channel force-close, avoiding relying on
`ChannelMonitor`s having been loaded while `ChannelManager`
block-connection methods are called during startup.

In the coming commit(s) we'll also generate non-channel-closing
`ChannelMonitorUpdate`s during startup, which will need to be
replayed prior to any other `ChannelMonitorUpdate`s generated from
normal operation.

In the next commit we'll handle that by handling `BackgroundEvent`s
immediately after locking the `total_consistency_lock`.
When we generated a `ChannelMonitorUpdate` during `ChannelManager`
deserialization, we must ensure that it gets processed before any
other `ChannelMonitorUpdate`s. The obvious hook for this is when
taking the `total_consistency_lock`, which makes it unlikely we'll
regress by forgetting this.

Here we add that call in the `PersistenceNotifierGuard`, with a
test-only atomic bool to test that this criteria is met.
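
A heavily simplified, hypothetical sketch of that hook; names and structure here are illustrative only, and the real guard also handles persistence notification:

```rust
use std::sync::{Mutex, RwLock};

// Placeholder for whatever a background event carries.
struct BackgroundEvent;

struct Manager {
    total_consistency_lock: RwLock<()>,
    pending_background_events: Mutex<Vec<BackgroundEvent>>,
}

struct PersistenceNotifierGuard;

impl PersistenceNotifierGuard {
    fn notify_on_drop(mgr: &Manager) -> Self {
        let _read_guard = mgr.total_consistency_lock.read().unwrap();
        // Replay startup-regenerated ChannelMonitorUpdates before any
        // other work under the lock can generate new ones.
        let events = std::mem::take(&mut *mgr.pending_background_events.lock().unwrap());
        for _event in events { /* apply the stored ChannelMonitorUpdate */ }
        PersistenceNotifierGuard
    }
}
```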
If a `ChannelMonitorUpdate` was created and given to the user but
left uncompleted when the `ChannelManager` is persisted prior to a
restart, the user likely lost the `ChannelMonitorUpdate`(s). Thus,
we need to replay them for the user, which we do here using the
new `BackgroundEvent::MonitorUpdateRegeneratedOnStartup` variant.
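
Taken together, these commits suggest variants shaped like the following sketch. Apart from `MonitorUpdateRegeneratedOnStartup`, which the commit message names, the identifiers and field layout are illustrative assumptions with stubbed types.

```rust
// Stub types standing in for LDK's real ones.
struct OutPoint;
struct ChannelMonitorUpdate;
struct PublicKey;

enum BackgroundEvent {
    // Pre-existing: a channel-closing update, replayable without the
    // counterparty's node_id (older ChannelMonitors may lack it).
    ClosedMonitorUpdateRegeneratedOnStartup((OutPoint, ChannelMonitorUpdate)),
    // New: a non-closing update regenerated on startup; carries the
    // node_id so normal channel operation can be resumed afterwards.
    MonitorUpdateRegeneratedOnStartup {
        counterparty_node_id: PublicKey,
        funding_txo: OutPoint,
        update: ChannelMonitorUpdate,
    },
}
```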
When we forward a payment and receive an `update_fulfill_htlc`
message from the downstream channel, we immediately claim the HTLC
on the upstream channel, before even doing a `commitment_signed`
dance on the downstream channel. This implies that our
`ChannelMonitorUpdate`s "go out" in the right order - first we
ensure we'll get our money by writing the preimage down, then we
write the update that resolves the HTLC (giving up the money) on the
downstream channel.

This is safe as long as `ChannelMonitorUpdate`s complete in the
order in which they are generated, but of course looking forward we
want to support asynchronous updates, which may complete in any
order.

Here we add infrastructure to handle downstream
`ChannelMonitorUpdate`s which are blocked on an upstream
preimage-containing one. We don't yet actually do the blocking, which
will come in a future commit.
@TheBlueMatt force-pushed the 2023-04-monitor-e-monitor-prep branch from 5509788 to 394f54d on May 30, 2023
@TheBlueMatt (Collaborator, Author)

Oops, sorry - doc bug on an intermediate commit; shuffled the diff around across commits to fix it.

@dunxen (Contributor) left a comment

LGTM.

Bit off topic: I'm happy to wait for follow-ups to this one before considering #2077 for merge as it'll probably be much less hassle for you.

Also, with current work on V2 establishment at #2302, it gives me a chance to "vet" the utility of splitting those channels.

@valentinewallace (Contributor) left a comment

Nothing blocking!

) -> bool {
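// RAA monitor-update blockers may be recorded in the dedicated map, or
// still be pending as event completion actions - so check both places.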
actions_blocking_raa_monitor_updates
.get(&channel_funding_outpoint.to_channel_id()).map(|v| !v.is_empty()).unwrap_or(false)
|| self.pending_events.lock().unwrap().iter().any(|(_, action)| {
Contributor

Would it be cleaner to get rid of EventCompletionAction::ReleaseRAAChannelMonitorUpdate and store that data in actions_blocking_raa_monitor_updates now, so RAA monitor upd blockers are only stored in one place?

@TheBlueMatt (Collaborator, Author)

Hmm, maybe? I really hate the actions_blocking_raa_monitor_updates thing - it's redundant state in two places that always has to be kept in sync, whereas the event stuff is a single, non-redundant place that's checked in-place - it's comparatively harder to screw up.

The only reason for actions_blocking_raa_monitor_updates is that without it we'd have to lock and walk each peer and all our channels to figure out whether we're being blocked anywhere. Originally I was gonna do that if we found a blocking action, but figured it was overengineering - either way, I'd kinda rather not have the issue twice, even if we have to have it once.

@TheBlueMatt (Collaborator, Author)

> Bit off topic: I'm happy to wait for follow-ups to this one before considering #2077 for merge as it'll probably be much less hassle for you.

I don't think it's worth waiting - some of the follow-ups have to wait for 0.0.117, and the next round of follow-ups that do have to go in 116 I haven't finished writing yet 😭

@TheBlueMatt (Collaborator, Author)

Gonna get this out of the way. Given some follow-ups are required for 116 anyway, I'll address the above doc comments there.

@TheBlueMatt merged commit 32eb894 into lightningdevkit:main on May 31, 2023
@dunxen (Contributor) commented May 31, 2023

> I don't think it's worth waiting - some of the follow-ups have to wait for 0.0.117, and the next round of follow-ups that do have to go in 116 I haven't finished writing yet 😭

Ok, I'll rebase in the morning and we'll see.
