Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible Inconsistencies in Node Contracts with Public IP Address #1002

Open
sameh-farouk opened this issue Sep 10, 2024 · 0 comments
Open
Assignees
Labels
tfchain tfchain repo issue type_bug Something isn't working
Milestone

Comments

@sameh-farouk
Copy link
Member

sameh-farouk commented Sep 10, 2024

Description: There are design inconsistencies when handling node contracts with public IP addresses, particularly in cases where the node contract exists alongside a rent contract.
The issue arises since, unlike resources consumption cost, public IP costs and related bandwidth consumption are billed to the node contract, rather than being offloaded to the rent contract. This introduces the possibility that the node contract will start managing its state independently, creating potential race conditions due to the mismatch in state management between the rent and node contracts. let's consider the following scenarios:

  1. Insufficient funds for rent contract but enough for IP rent: When the user has insufficient funds to settle the rent contract but enough to cover the IP rent, the rent contract gets suspended, bringing all associated node contracts down. However, the billing for the node contract with the public IP address eventually restores the associated workload, despite the rent contract being suspended. This could result in the workload being repeatedly restored, depending on user actions and timing.

  2. Sufficient funds for rent contract but not for IP rent: If the user can settle the rent contract but lacks the funds to cover the node contract with a public IP, the rent contract restoration brings up all node contracts, including those with unsettled costs. This would sorted eventually when billing is triggered for the node contract, correcting its state but with the side effect of starting over/resetting the grace period.

  3. Deleted rent contract with node contracts has yet reserved balance: When a rent contract is deleted due to insufficient funds, all associated node contracts are deleted immediately. Some node contracts with reserved balances (for IP rent) won't have the chance to handle the reserved funds correctly, leading to funds being stuck on hold without being transferred to the beneficiary or returned to the user.

  4. (less severity) If a Node contract switched to a grace period due to insufficient IP rent, and later the rent contract switched to a grace period, the rent contract would sync all node contracts to their state, overriding the node contract grace period start block

Proposed Solutions:

  • For the first two cases: Add a defensive check before a node contract transitions to the created or resumed state. Node contracts should first verify the state of the rent contract. Additionally, when the rent contract attempts to resume a node contract, it should check if there are any unsettled funds associated with the node contract.

  • For the third case: Delay the deletion of the rent contract until all node contracts (involving some cost) are independently billed and deleted. This approach would require checking if any active node contracts have reserved balances before proceeding with the deletion of the rent contract.

  • For the last case: Don't allow changing state from graceperiod(x) to graceperiod(y)

Notes:

Integration Testing: These scenarios should be comprehensively covered in integration tests. Ensuring that all possible state transitions are accounted for will help avoid unexpected behavior during production.

Case 1 and Case 2: Although neither of these cases is critical or results in permanent failure, handling them will enhance the system's robustness. By addressing these scenarios, unexpected behavior—such as temporary workload restoration during a suspended rent contract—can be avoided. Preventing such brief transient issues will ensure a smoother user experience and reduce confusion.

Case 3 (Higher Priority): This case carries greater significance, as it directly impacts the proper distribution of rewards. If not handled, this can interrupt the normal reward distribution flow, potentially leaving a portion of the user’s balance unusable.

@sameh-farouk sameh-farouk self-assigned this Sep 10, 2024
@sameh-farouk sameh-farouk added type_bug Something isn't working tfchain tfchain repo issue labels Sep 10, 2024
@ashraffouda ashraffouda added this to the 2.9.0 milestone Sep 24, 2024
sameh-farouk added a commit that referenced this issue Oct 2, 2024
* fix: fix inconsistencies in node contracts states

* test: add tests for scenarios related to issue #1002
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tfchain tfchain repo issue type_bug Something isn't working
Projects
Status: Pending Deployment
Development

No branches or pull requests

2 participants