Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: cleanup for release #2

Merged
merged 21 commits into from
May 27, 2024
Merged

Conversation

amusingaxl
Copy link
Contributor

@amusingaxl amusingaxl commented Apr 25, 2024

Get this repo up-to-date with the most recent engineering practices, clean it up and get it ready for release.

@amusingaxl amusingaxl self-assigned this Apr 25, 2024
@amusingaxl amusingaxl marked this pull request as ready for review April 25, 2024 22:48
oddaf
oddaf previously approved these changes Apr 26, 2024
0xdecr1pto
0xdecr1pto previously approved these changes Apr 30, 2024
.gitignore Show resolved Hide resolved
foundry.toml Show resolved Hide resolved
README.md Outdated

Used to deploy a single spell that can be elected to a privileged position. This requires that the `_spell` address parameter conforms to a MakerDAO conformant [spell](https://github.com/makerdao/spells-mainnet)
#### 1. `deploy(DssSpellLike _spell)(address)`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case would you prefer using deploy(DssSpellLike _spell)(address) over deploy(address _usr, bytes32 _tag, bytes memory _fax, uint256 _eta)(address)? I understand the problem with non-conformant (or rather malicious) spells, so in my opinion it's always safer to use the second option as it doesn't require an expertise to prove that the spell is conformant. It's just a bit more complicated to use as it requires parsing calldata of the original plot transaction (but that can also be automated outside of the chain)

Copy link
Contributor Author

@amusingaxl amusingaxl May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are cases where a conforming spell might need to be dropped. For instance: some last minute legal issue that was found or a bug in the spell.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that there are conformant spells that might need to be dropped. I rather question why not only have deploy(address _usr, bytes32 _tag, bytes memory _fax, uint256 _eta)(address) as it can be used to drop both types of spells and more importantly does not require expertise to decide if the spell is conformant. You can have a script along the repo to fetch all plotted spells and parse required parameters for calling this function no matter if the spell is conformant or not.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment for a scenario when calling deploy(DssSpellLike _spell)(address) could be weaponised

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it will make it more difficult instead of easier for users.
Code was updated.

README.md Outdated
Comment on lines 16 to 18
## Usage

### Drop an individual plan
### Create a single drop spell

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there are 4 different options to use with 4 different reasons and 2 different set of actions (voting for protego + calling drop vs calling for new spell + voting for new spell + calling drop), I find it necessary to have a very clear instructions (IF/ELSE, step-by-step) explaining which of the 4 functions to use when and in which order. As in case of emergency:

  • There will be a limited time to make a decision
  • There will be delegates/non-technical users who also need to be convinced to vote for a specific version of protego (imagine someone deliberately uses deploy in order to confuse delegates and don't vote for the protego itself)

As the next step, we might even consider writing a script that evaluates currently plotted spells, then fetches required parameters for the specific call and executes it. This way, the choice itself would also be reviewable/auditable in advance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it necessary to have a very clear instructions (IF/ELSE, step-by-step) explaining which of the 4 functions to use when and in which order.

Fully agree. Part of the deliverable is to provide a long-written post and also train governance facilitators on how to use the tool. We're just not sure the README is the best way to put the content verbatim. Maybe we'll have a link here to these documents once it's finished.

As the next step, we might even consider writing a script that evaluates currently plotted spells, then fetches required parameters for the specific call and executes it.

Not sure a script would be good enough in this case. There probably needs to be a UI for that and such feature be included there, however this is out of the scope of this repo.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main point is: the choice to use one of the four methods could be included in the scope of the repo specifically for it to be audited. This can be done the simplest way possible, for example, using a s.sol file or a js-based script

Copy link
Contributor Author

@amusingaxl amusingaxl May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: when you mention the "four methods", you are referring to:

  1. Deploy a drop spell with a spell address (for conforming spells)
  2. Deploy a drop spell from plot params (for non-conforming spells)
  3. Call Protego.drop() with a spell address (for conforming spells)
  4. Call Protego.drop() with plot params (for non-conforming spells)

Correct?

Do you mean we should include all flows into scripts for auditing?
What kind of additional value do you expect this to bring, given that those flows are already present in the test files?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct?

Correct

Do you mean we should include all flows into scripts for auditing? What kind of additional value do you expect this to bring, given that those flows are already present in the test files?

Current tests only check listed flows separately, isolated from each other. My proposal is to add usage instruction or (if possible) a script that would explain "how to choose between 4 methods" or directly execute a specific method (e.g. if there are multiple spells plotted and not executed). I propose this because this decision (no matter if described in words or in code) is security-critical (e.g. if a wrong method is used it might not drop the spell) and time-sensitive (highly likely there wouldn't be time to deploy another dropspell as voting also takes time) and therefore would highly benefit from external audit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping that the documentation is now clear enough, since there are only 2 methods left.
Maybe there should be something written in a "decision tree" format, but not sure if it's strictly necessary.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there should be something written in a "decision tree" format, but not sure if it's strictly necessary.

Let's do it in the follow up PR, as part of the script?

script/dependencies/ProtegoDeploy.sol Show resolved Hide resolved
uint256 eta;
}

contract BadAction {
contract MaliciousSpell {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you calling it malicious if it's actually not doing anything unexpected?

To clean this up a bit more, I would suggest to:

  • Move MaliciousSpell into the test subfolder
  • Rename two test spells to be consistent with introduced terms (ConformantSpell.sol and NonConformantSpell.sol)
  • Make NonConformantSpell define action, tag, sig, eta in way that is non-conformant / malicious
  • Update testDropSpellParams to use NonConformantSpell methods

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna rearrange the contracts the way you're suggesting, however I don't get this part:

Make NonConformantSpell define action, tag, sig, eta in way that is non-conformant / malicious

It's already doing that. MaliciousSpell allows anyone to plot any action + tag + sig + eta combination.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that the NonConformantSpell can actually try to be malicious: it can define action, tag, sig, eta not as public variables like ConformantSpell does, but as public methods with custom malicious logic. For example: tag can return a different value when called by the pause compared to when called by protego. Meaning that it would still be conformant interface-wise, but not conformant logic-wise: making it callable via deploy(DssSpellLike _spell)(address) which would then deploy incorrect DropSpell.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A scenario like that cannot be technically prevented from a code perspective, it is more like a process issue.

I understand what you are saying, but IMO it does not help to "prove" the correctness of the implementation, which is the main goal of the tests here.
It should be clear that for non-conforming spells the version with the parameters should be used. This applies even if a non-conforming spell looks like a conforming one.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cannot be technically prevented from a code perspective

Design-wise, I think if we don't even have deploy(DsSpellLike _spell)(address) and drop(DsSpellLike _spell) methods and instead have a script that parses required data from the original plot transaction it would be decrease probability of mistakes. Of course someone could still just trust the spell, but it would be less convenient.

IMO it does not help to "prove" the correctness of the implementation, which is the main goal of the tests here.

I understand that theoretically it should not make a difference, but I personally usually rather make a test more realistic than try to imagine all edge cases that less realistic test leaves out. So I can agree that it's an opinionated suggestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think I get your point. If we're talking about a conforming spell, it most likely was written by one of the teams in the spell roster, so they could easily assist governance facilitators figuring out the parameters. Maybe it makes sense to reduce the interaction surface by removing the redundant methods.

There are some things to consider, however:

[...] instead have a script that parses required data from the original plot transaction it would be decrease probability of mistakes [...]

Who would be the end users of such script?
We want Protego to be usable regardless of the availability of engineering teams.
I guess we most likely need a UI instead of a script.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who would be the end users of such script?
We want Protego to be usable regardless of the availability of engineering teams.
I guess we most likely need a UI instead of a script.

I agree that the end goal is enabling non-engineers to run it. But my other point is to include the usage into the scope of the audit. Also, nothing stops you from writing a simple js script that is then directly importable as a dependency into the UI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. However, could we do that in a follow-up PR?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link

@SidestreamColdMelon SidestreamColdMelon May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design-wise:

  • I'm not sure why do we need to have a custom drop function (for the deployed version) instead of conforming to the standard DssAction interface (using schedule to drop the spell as in emergency) and therefore leveraging TechOps-maintained chief keeper (which would otherwise would most probably be broken, endlessly trying to schedule a spell that is not done)
    • Anyway, it's a good idea to test-run both protego options with their keeper
  • I don't think the two use cases (deploying DropSpell vs lifting the contract itself) have to be part of the same contract as it adds confusion and additional surface area. At least improving the naming – not using the same protego word for both cases – would already help
  • Documentation, design and tests currently does not account for the aftermath: the logic of dropping the protego itself after it has been lifted. Is this a loophole or can a separate spell be lifted in order to disable it?

Copy link
Contributor Author

@amusingaxl amusingaxl May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why do we need to have a custom drop function (for the deployed version) instead of conforming to the standard DssAction interface (using schedule to drop the spell as in emergency) [...]

drop() exists only on Protego itself. On DropSpell we use cast(), which is part of the default interface. However, I see this can be confusing, as in regular spells we need to schedule() before cast(). In this case, we could move the pause.drop() call to schedule and make cast() an empty function to keep consistency.

However, since this is an emergency response, there will be enough eyes on it, so anyone can call the functions to not rely on the TechOps keeper.

I don't think the two use cases (deploying DropSpell vs lifting the contract itself) have to be part of the same contract as it adds confusion and additional surface area.

The counter-argument would be that there is one less moving part to keep track of. In case of emergencies, people will know Protego is the go-to place. If we split the functionality in 2 different contracts, the cognitive overhead would be slightly increased.

Documentation, design and tests currently does not account for the aftermath: the logic of dropping the protego itself after it has been lifted. Is this a loophole or can a separate spell be lifted in order to disable it?

Agreed that this needs to be better documented. Protego as a "drop any spell" facility only works as long as it has the hat. To disable it, governance just need to lift a different spell.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop() exists only on Protego itself. On DropSpell we use cast(), which is part of the default interface. However, I see this can be confusing, as in regular spells we need to schedule() before cast(). In this case, we could move the pause.drop() call to schedule and make cast() an empty function to keep consistency.

I think using schedule() + having done() method would be enough to improve keeper situation and making its interface more friendly for existing keepers

To disable it, governance just need to lift a different spell.

Does it needs to be a spell or could it be an empty/dead address? If only a spell, then you might consider adding a method to deploy a "blank" spell.

I would anyway add a test for the aftermath as lifecycle of the protego contract is not finished until it is voted out / can no longer drop spells.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be really anything: address(0), an old spell that has already been cast, etc.
In either case, I'll include it in the test for the happy path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using schedule() + having done() [...]

Not sure that we want to have state for done here. In a situation where there is a potential governance conflict, we could have the same bad spell being scheduled more than once. That would require the ones counter-acting it to deploy another drop spell.

I don't know the specifics about the implementation of the bot, but IMO it would be better to have it calling a no-op cast() indefinitely until the hat changes than requiring the deployment of a new spell.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a situation where there is a potential governance conflict, we could have the same bad spell being scheduled more than once.

I actually proposed to have done() as a method, not variable. It can return true as soon as no matching plan is found. So basically act as inverted planned()(bool). In other words be one method more compliant with DssExec.

I don't know the specifics about the implementation of the bot

FYI, the source code is actually available here https://github.com/makerdao/chief-keeper

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually proposed to have done() as a method, not variable. It can return true as soon as no matching plan is found. So basically act as inverted planned()(bool). In other words be one method more compliant with DssExec.

Fair enough. In this case, we have a revolving done(), that will return false again if the same spell is somehow plotted again.

Copy link
Contributor Author

@amusingaxl amusingaxl May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is updated to take that suggestion into account. Good stuff 💯 .

@amusingaxl amusingaxl dismissed stale reviews from 0xdecr1pto and oddaf via cd7db94 May 14, 2024 15:20
src/EmergencyDropSpell.sol Outdated Show resolved Hide resolved
src/EmergencyDropSpell.sol Outdated Show resolved Hide resolved
src/Protego.t.sol Outdated Show resolved Hide resolved
Comment on lines 89 to 91
function schedule() external {
pause.drop(action, tag, sig, eta);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also emit Drop event here for consistency (since it can be executed multiple times)? Or maybe even call protego.drop so that events accumulate under the same address

Copy link
Contributor Author

@amusingaxl amusingaxl May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protego.drop is only possible if Protego has the hat. Also not sure if we should call this event Drop or Schedule 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check now. I think I found a good solution for the naming issue.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protego.drop is only possible if Protego has the hat. Also not sure if we should call this event Drop or Schedule 😅

You can do delegatecall, but I don't know if it's a good practice to use delegate call for the sake of having events in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in this case it would work because there's no state involved, however I believe this would not land well with the existing security practices. A spell with delegatecall would cause an aneurysm in some people 😅.

src/test/NonConformingSpell.sol Outdated Show resolved Hide resolved
@amusingaxl amusingaxl merged commit fc1f94c into master May 27, 2024
1 check passed
@amusingaxl amusingaxl deleted the refactor/cleanup-for-release branch May 27, 2024 21:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants