Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixPayChanV1 #4717

Open
wants to merge 19 commits into
base: develop
Choose a base branch
from
Open

fixPayChanV1 #4717

wants to merge 19 commits into from

Conversation

dangell7
Copy link
Collaborator

This commit introduces a new fix amendment (fixPayChanV1) which prevents the creation of new PaymentChannelCreate transaction with a CancelAfter time less than the current ledger time. It piggy backs off of fix1571.

Once the amendment is activated, creating a new PaymentChannel will require that if you specify the CancelAfter time/value, that value must be greater than or equal to the current ledger time.

Currently users can create a payment channel where the CancelAfter time is before the current ledger time. This results in the payment channel being immediately closed on the next PaymentChannel transaction.

I did swap tecNO_PERMISSION for temBAD_EXPIRATION as I believe that is a more descriptive error message.

@@ -456,6 +456,7 @@ REGISTER_FIX (fixReducedOffersV1, Supported::yes, VoteBehavior::De
REGISTER_FEATURE(Clawback, Supported::yes, VoteBehavior::DefaultNo);
REGISTER_FEATURE(AMM, Supported::yes, VoteBehavior::DefaultNo);
REGISTER_FEATURE(XChainBridge, Supported::yes, VoteBehavior::DefaultNo);
REGISTER_FIX (fixPayChanV1, Supported::yes, VoteBehavior::DefaultNo);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be Yes by default? Per my understanding, every fix amendment shout have vote by default set to yes.

REGISTER_FEATURE(fixPayChanV1, Supported::yes, VoteBehavior::DefaultYes);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look above at all the fixes that don't have the VoteBehavior::DefaultYes.

Yes is reserved for things that need immediate approval. This doesn't need immediate approval imo.

I'm happy to change it though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advocate for keeping it as is, to avoid risk of amendment blocking, since this is not critical.

{
auto const closeTime = ctx_.view().info().parentCloseTime;
if (ctx_.tx[~sfCancelAfter] && after(closeTime, ctx_.tx[sfCancelAfter]))
return temBAD_EXPIRATION;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't tem codes should only be used in preflight?

Copy link
Collaborator Author

@dangell7 dangell7 Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used temBAD_EXPIRATION because that is what was used in PaymentChannelFund. Its also a finite error code. This transaction cannot be reapplied and get a different error. It will always fail. However, because its so far into the tx in doApply, you could argue it should be a tec code and take the fee.

You guys tell me what to do :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we have good documentation for when to return different ter codes, and we should write some. In the meantime, here are my thoughts:

A transactions has three major phases:

  1. Preflight. This is cheap. It happens before signature checks (which are expensive) and cannot read the ledger. In order to prevent broadcasting transactions to the network that fail signature checks, and therefore can't claim a fee, preflight is not allowed to return tec codes. (Note: there are several places where we check signatures: preflight2, invoke_preclaim, and in network ops; But not allowing preflight to return tec codes was modivated by the signature checks happening after preflight).

  2. Preclaim. This is more expensive. This happens after signature checks (and after preflight) and has read access to the ledger.

  3. Apply. This happens after preclaim and has read/write access to the ledger.

Two important points are:

  1. Preflight is cheap and happens before signature checking.

  2. If something returns a tem code it is not broadcast to the network. If something returns a tec code it is broadcast.

There are places in the code that return tem codes after preflight. The bad thing about this is someone can try to overload an individual server by making it do relatively expensive operations (like signature checks) and never pay fees for the work. The good thing is it saves the other servers on the network from doing the work because it is never broadcast.

However, I don't see a way to prevent someone from submitting bad signatures to an individual server and making that server do the same work that a transaction with a valid signature that returns a tem code would do.

It's important to note that tem codes should only be returned for transactions that can't succeed in any ledger. For example, if an account has an insufficient balance in the current open ledger, that transaction should return a tec code and should still be broadcast. In some cases we may want to return tec codes even when a tem would to so the transaction is recorded in a ledger and can easily be retrieved and replayed to diagnose errors.

Given that returning a tec code would not prevent any attacks that I can see, and there are benefits to not broadcasting transactions, I don't think we want to blanket disallow returning tem codes from preclaim or apply or change the ones that are already there.

However, in most cases preclaim and apply should return tec codes.

Copy link
Collaborator

@mDuo13 mDuo13 Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to add on to this, I am supportive of creating this fix amendment, but I believe the appropriate error code for this situation is probably tecEXPIRED:

The transaction tried to create an object (such as an Offer or a Check) whose provided Expiration time has already passed.

It would not surprise me if EscrowCreate is inconsistent with this. Ideally we should fix that too.

But here's why I think a tec-class code is appropriate:

  1. The transaction isn't internally inconsistent or self-contradictory; it's only no good because there exists a previous ledger whose close time is higher than the specified expiration.
  2. Since transactions can be delayed an indefinite amount of time between creation and confirmation¹, this error can occur for transactions that were valid initially, got relayed to a bunch of servers (maybe queued), and then happened to have their expiration pass while they were awaiting confirmation. As a general rule, we want transactions to claim a fee (tec result code) if they've been relayed through the network, so transactions whose only fault is something is past expiration should use a tec code not a tem code.

¹ Some example situations that could cause a delay:

  • Transaction was created and signed, saved, and then machine lost network connectivity or power. When it came back up an unspecified amount of time later, it submitted the transaction.
  • Transaction was relayed to some servers during a period of high load, then fees went up and it didn't get relayed to enough servers to achieve consensus. Later on when load was lower some system retried the same transaction..
  • The transaction was relayed through the network and on track to be validated, but then XRPL network had a temporary outage preventing new ledgers from being confirmed for a while (for example, half of validators became unable to talk to the other half due to a massive netsplit).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both for the very very detailed explanation. This will help me a lot. I'm going to change the code to tecEXPIRED.

@intelliot
Copy link
Collaborator

Although the issue can be resolved with an amendment, it's not necessarily a flaw at the protocol level. Instead, it's a potential user error. While many such errors can be detected, catching every single one might not be feasible. A strong case can be made for identifying these mistakes through checks in client libraries or at the API level. With this approach, it wouldn't be necessary to make changes that impact transaction processing. For instance, xrplcluster prevents transactions with extremely high fees, or payments to known scammers. Instead of trying to address every possible user error within the protocol, it's more practical to handle these in a layer in front of rippled. One of the benefits is that the fix can be rolled out immediately, permissionlessly and without total consensus.

@dangell7 dangell7 closed this Sep 21, 2023
@intelliot
Copy link
Collaborator

Important notes:

  • I don't see anything wrong with this amendment. It fixes a problem (even if it's a small problem with satisfactory alternative workarounds). I'm very happy to see developers open small fix PRs like this one.
  • My comment offered a holistic view, considering both the benefits of the amendment and the practicality and immediacy of other solutions.

If @dangell7 (or anyone else in the community) would like this amendment, please proceed. Reopen this PR or open a new one. This protocol change can happen in parallel, and in addition to, any fixes at the client or API service layers.

@dangell7 dangell7 reopened this Sep 21, 2023
@mvadari
Copy link
Collaborator

mvadari commented Oct 3, 2023

@dangell7 it looks like tests are failing on this PR.

@intelliot intelliot requested a review from mvadari October 6, 2023 05:09
@intelliot
Copy link
Collaborator

@dangell7 - confirming - is this ready for review? (or are there any additional changes you're planning to make?)

@dangell7
Copy link
Collaborator Author

@intelliot Ready for review

Comment on lines 273 to 278
if (ctx_.view().rules().enabled(fixPayChanCancelAfter))
{
auto const closeTime = ctx_.view().info().parentCloseTime;
if (ctx_.tx[~sfCancelAfter] && after(closeTime, ctx_.tx[sfCancelAfter]))
return tecEXPIRED;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should happen before the SLE is created - perhaps around line 250.

@intelliot
Copy link
Collaborator

@dangell7 - do you have an opinion on whether this should be considered for the next release (2.0.0) or if it's ok for it to wait for 2024?

(The first release in 2024 is expected for Jan or Feb.)

@intelliot intelliot added this to the 2024 release milestone Oct 17, 2023
@dangell7
Copy link
Collaborator Author

@intelliot I have no opinion. 2024 is fine. I don't think many use paychan but its nice that this is fixed at some point.

Copy link
Collaborator

@scottschurr scottschurr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me. Thanks for adding the test!

@intelliot
Copy link
Collaborator

  • bug fix. Should not require perf signoff. cc @sophiax851

@intelliot
Copy link
Collaborator

@dangell7

  1. This PR has conflicts that must be resolved.
  2. After resolving, please confirm that you still think this PR is ready to merge.

@intelliot intelliot removed this from the 2.2.0 (June 2024) milestone May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 🏗 In progress
Development

Successfully merging this pull request may close these issues.

8 participants