
proposal: SARP Core Components #1842

Closed
wants to merge 8 commits into from

Conversation

masapr
Contributor

@masapr masapr commented Jul 12, 2023

Project Abstract

This is the follow-up to our initial research proposal, which we delivered here. The goal of this work package is to evaluate different software designs for implementing static code analysis on Substrate pallets with MIRAI. Furthermore, we want to investigate issues and bugs we found in MIRAI in the previous work package.

Pull requests of the previous grant:

Grant level

  • Level 1: Up to $10,000, 2 approvals
  • Level 2: Up to $30,000, 3 approvals
  • Level 3: Unlimited, 5 approvals (for >$100k: Web3 Foundation Council approval)

Application Checklist

  • The application template has been copied and aptly renamed (project_name.md).
  • I have read the application guidelines.
  • Payment details have been provided (bank details via email or BTC, Ethereum (USDC/DAI) or Polkadot/Kusama (USDT) address in the application).
  • The software delivered for this grant will be released under an open-source license specified in the application.
  • The initial PR contains only one commit (squash and force-push if needed).
  • The grant will only be announced once the first milestone has been accepted (see the announcement guidelines).
  • I prefer the discussion of this application to take place in a private Element/Matrix channel. My username is: @_______:matrix.org (change the homeserver if you use a different one)

@CLAassistant

CLAassistant commented Jul 12, 2023

CLA assistant check
All committers have signed the CLA.

Collaborator

@Noc2 Noc2 left a comment


Thanks a lot for the application. Let me also share it with @bhargavbh and @semuelle, who previously evaluated your delivery. I have one piece of initial feedback: the milestone deliveries are currently relatively unspecific, and you could fulfill them relatively quickly. Could you be more specific here, especially since this is already a follow-up grant? The milestone table basically lists the requirements of our contracts and should contain as many details as possible.

@Noc2 Noc2 requested review from semuelle and bhargavbh July 13, 2023 08:18
Contributor

@bhargavbh bhargavbh left a comment


In my opinion, the follow-up grant should clearly have a tool (albeit with limitations) as a deliverable. A follow-up grant on ideating solutions, without a tool that generates the MIRAI annotations for certain bug classes, is not the best way forward. The tool can assume a certain human-in-the-loop model if that makes tricky cases easier to handle. This would involve tweaks on the MIRAI side to make the analysis faster by tracking only the pallet-level dataflow graph.
Feel free to add more technical details in the milestones regarding the approach.

applications/sarp-2-swdesign.md (3 outdated review threads, resolved)
@Noc2 Noc2 added the changes requested label Jul 13, 2023
@masapr
Contributor Author

masapr commented Jul 13, 2023

Thanks for the feedback.

I updated the deliverables to be more concrete. I also included the delivery of a prototype tool.

I didn't incorporate all of your feedback, @bhargavbh. You're giving very valuable input, but my plan was to evaluate these things when we actually do the project, not beforehand. Is that OK for you?

@bhargavbh
Contributor

hi @masapr, thank you for making the changes. I think the tool should be the main deliverable and should list the vulnerability classes it supports (generating the MIRAI annotations automatically), demonstrated by a decent number of examples (ideally real-world, but synthetic examples/tests would suffice). The analysis of approaches would just be a step towards the tool. Could you please restructure the deliverables to reflect this prioritisation by adding more details on what the tool does (there is of course room to modify how it's achieved during the grant)?
Secondly, it's not convincing (from a usability point of view) to hope that the runtime pallets will not use the FRAME macros. Hence, it's better to focus on an approach where the tool works with FRAME macros.
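For readers unfamiliar with MIRAI's annotation style, a rough sketch of the kind of check such a generator might emit (for the "incorrect origin" vulnerability class mentioned in the original RFP) could look as follows. This is purely illustrative: the `Origin` type and `force_set_balance` extrinsic are simplified stand-ins rather than the real FRAME API, and the `precondition!` macro is defined locally as a no-op so the sketch compiles without the `mirai_annotations` crate.

```rust
// Stand-in for the `precondition!` macro from the `mirai_annotations` crate;
// defined as a no-op here so the sketch compiles without MIRAI installed.
macro_rules! precondition {
    ($e:expr) => {
        let _ = &$e; // MIRAI would statically verify this condition at call sites
    };
}

// Simplified, hypothetical origin type (the real FRAME origin is richer).
#[derive(PartialEq)]
enum Origin {
    Root,
    Signed(u64),
}

// Hypothetical privileged extrinsic: must only be callable with the root origin.
fn force_set_balance(origin: &Origin, new_balance: u64) -> Result<u64, &'static str> {
    // The annotation an automatic generator would emit for this bug class:
    precondition!(*origin == Origin::Root);
    match origin {
        Origin::Root => Ok(new_balance),
        _ => Err("BadOrigin"),
    }
}
```

Under a real MIRAI run, the `precondition!` would let the analyzer flag any call path that can reach `force_set_balance` with a non-root origin.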

@masapr
Contributor Author

masapr commented Jul 17, 2023

Hi @bhargavbh

OK, I have now been more concrete on the examples: we'll provide examples for 3 of the 5 vulnerability classes (of the original RFP). I have stated this explicitly in the deliverables now.

Also, I updated the approaches for the software design to include your suggestions, especially the preprocessing/annotation tool. I think it makes sense to look into this first. I still believe, though, that we have to figure out a way for the analysis not to run over specific parts of the code, simply to reduce the complexity. I don't know if this can be achieved by annotations.

To me, the main deliverable is not the tool but the documentation of the software design. The real tool will be delivered later. Here, we only show the feasibility of the approach.

semuelle
semuelle previously approved these changes Jul 28, 2023
Member

@semuelle semuelle left a comment


There were some small improvement suggestions in the evaluation of the last milestone. It would be nice to see them addressed. Other than that, I'm happy to proceed.

@semuelle semuelle added the ready for review and changes requested labels and removed the changes requested label Jul 28, 2023
@bhargavbh
Contributor

Hi @semuelle, Sabine and I had a chat recently on the technical feasibility of the proposal. There are still some high-level exploration and design decisions to be made, which need to be added as technical details to the proposal. Also, the main deliverable should be a tool and not a research report, as a tool is of more utility to the community. Can you please revert to "changes requested" status? Thanks

@semuelle semuelle added the changes requested label and removed the ready for review label Jul 28, 2023
@semuelle semuelle self-assigned this Jul 28, 2023
@Noc2
Collaborator

Noc2 commented Aug 9, 2023

@masapr I just wanted to follow up and ask about its current status.

@masapr
Contributor Author

masapr commented Aug 10, 2023

@Noc2 yes, I can give an update: after the call with @bhargavbh, our aim is a proposal with a clear tool delivery, together with a request for more funding. To make this proposal, we decided to investigate a bit more, to reduce the risk and to be able to make a serious estimate. I should be ready with an adjusted proposal in 1-2 weeks.

It also depends on the outcome of the question I asked at MIRAI

@Noc2 Noc2 added the on hold label Aug 11, 2023
@Noc2
Collaborator

Noc2 commented Aug 11, 2023

Thanks for the update. In this case, I put the application on hold for now.

@masapr masapr force-pushed the master branch 2 times, most recently from 3d704f5 to b6eebb8 Compare August 22, 2023 07:30
@masapr masapr changed the title proposal: SARP software design proposal: SARP Core Components Aug 22, 2023
@masapr
Contributor Author

masapr commented Aug 23, 2023

Hi @Noc2
Thanks for your feedback. I updated the proposal as follows:

  • I included an example in the tutorial.
  • I removed the "Engagement" point. This was a left-over from the previous version; I think with this proposal the usual engagement alongside the milestones is enough.
  • I added ink! smart contracts to the future roadmap. We haven't analyzed this yet, but you're right, this has great potential.

@semuelle
Member

Thanks for the updates, @masapr. I'll share the application with the rest of the committee.

@semuelle semuelle added the ready for review and changes requested labels and removed the changes requested label Aug 25, 2023
Noc2
Noc2 previously approved these changes Aug 25, 2023
Collaborator

@Noc2 Noc2 left a comment


I'm happy to go ahead with it. Thanks for the update.

@bhargavbh
Contributor

Hi @bhargavbh The plan is to mock the most basic functionality, induced by macros and storage. We do not plan to mock regular pallet functions. I don't have literature to point to. There's no deeper reason behind this approach than to reduce the complexity of the call graph that is being analyzed.

Hi @masapr. I am skeptical about the usefulness of the proposed tool, where the developer has to use mock-ups for basic functionalities. How trivial/difficult will this process be? Is it a matter of just replacing the dependencies, or a tussle with the Rust compiler to even get it compiled? It is hard to answer this question for any arbitrary pallet under analysis (which might use other basic functionalities than the provided mock-ups). It seems the problem statement is being adapted to match the tool (MIRAI) rather than the other way around. That said, if the committee is interested in funding exploratory approaches in the area of security tooling, this might be a good first effort and result in interesting insights.

Secondly, from my understanding, the main contribution of M1 is to write mock-ups of basic functionality at the right abstraction level, which can be substituted in various pallets. This is something that could be clarified in the deliverables, which should also mention which pallets will be supported.

@github-actions
Contributor

github-actions bot commented Aug 28, 2023

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@masapr
Contributor Author

masapr commented Aug 28, 2023

Hi @bhargavbh
I agree that it is an open question how easy the process with the compiler will be. The idea is to have a fully automated process, so that the user doesn't have to worry about it. This is why we create a copy and then adjust the Cargo.toml to point to the mock libraries. I'm not entirely sure whether this approach is too naïve and we're overlooking something, but I think it could work and should give good results.
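To make the dependency swap concrete, it could be done with a `[patch]` section in the copied pallet's Cargo.toml. This is only a hypothetical sketch: the source URL, crate names, and mock paths are illustrative, not part of the actual proposal.

```toml
# Hypothetical sketch of the described dependency swap: the copied pallet's
# Cargo.toml redirects heavyweight FRAME crates to trivial local mock crates,
# so MIRAI never descends into the real implementations. Paths and the
# source URL are illustrative.
[patch."https://github.com/paritytech/substrate"]
frame-support = { path = "../sarp-mocks/frame-support" }
frame-system = { path = "../sarp-mocks/frame-system" }
```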

I also agree that the result might still be that the complexity is too high because of dependencies (and the variety of those). But to be honest: then it is generally nonsense to do static code analysis on pallet code. The whole idea was that pallet code can be analyzed because it is not too complex in itself and doesn't have too many dependencies.

I updated the proposal text on the deliverables of milestones 1 & 2. Our plan is to have SARP running against the examples provided for the 2 vulnerability classes and to evaluate the tool against FRAME pallets. The actual adjustments needed for the FRAME pallets will then be done in milestone 3.

@masapr
Contributor Author

masapr commented Aug 28, 2023

I have read and hereby sign the Contributor License Agreement.

@takahser takahser self-requested a review August 30, 2023 13:22
Collaborator

@takahser takahser left a comment


I tend to agree with @bhargavbh here - requiring the developer to add annotations manually doesn't sound very straightforward. The interest in using a solution like this might be rather low. Hence, I don't see much value in M1.

Apart from that, I'm also concerned about the limitations that mocking the lib code brings with it.

@masapr
Contributor Author

masapr commented Sep 8, 2023

@takahser we don't want developers to add annotations manually. We cover that aspect in milestone 2.

@masapr
Contributor Author

masapr commented Sep 14, 2023

@takahser to add to my comment above: The value of M1 is to lay the foundation for the final tool and to know if the approach of mocking part of the library code actually delivers the results we hope for.

@bhargavbh
Contributor

Hi @bhargavbh I agree that it is an open question how easy the process with the compiler will be. The idea is to have a fully automated process, so that the user doesn't have to worry about it. This is why we create a copy and then adjust the Cargo.toml to point to the mock libraries. I'm not entirely sure whether this approach is too naïve and we're overlooking something, but I think it could work and should give good results.

Fair enough, worth giving it a shot. Just changing the dependency might work. However, the important question again is how comprehensive the mocks you write are, and in how many pallets they can be substituted to reduce the complexity.

I also agree that the result might still be that the complexity is too high because of dependencies (and the variety of those). But to be honest: then it is generally nonsense to do static code analysis on pallet code. The whole idea was that pallet code can be analyzed because it is not too complex in itself and doesn't have too many dependencies.

It is a bit harsh to conclude that static analysis on pallet code is nonsense because a particular tool times out. If the tool/technique times out, then generally the approach is to try and switch to a different level of abstraction to reduce complexity, one where the property (in our case incorrect origins) can still be expressed.

I updated the proposal text on the deliverables of milestones 1 & 2. Our plan is to have SARP running against the examples provided for the 2 vulnerability classes and to evaluate the tool against FRAME pallets. The actual adjustments needed for the FRAME pallets will then be done in milestone 3.

This basically highlights my point that the mock-ups you write are not universal. They have to be designed for each pallet. Writing mock-ups to bypass all the complex functionalities of pallets is not possible, because you do not know what functionality a (new) pallet uses. At best, this approach can be described as the team writing mock-ups for the 5 pallets and running MIRAI on them, and not really considered a tool. There is a very high likelihood that the mock-ups you write in M1 are not applicable to a new pallet. M2, however, can be seen as a tool in itself. If and when there is a possibility of running an analyser, the auto-generator for annotations can be plugged in.

Collaborator

@takahser takahser left a comment


@masapr I currently see no value in M1, since by itself it's useless. In theory, you could just deliver M1 and cash in USD 47k without ever delivering M2. Hence, I'm not going to approve M1.
Also, I'd like to hear your perspective on @bhargavbh's most recent comment.

@masapr
Contributor Author

masapr commented Sep 14, 2023

This basically highlights my point that the mock-ups you write are not universal. They have to be designed for each pallet. Writing mock-ups to bypass all the complex functionalities of pallets is not possible, because you do not know what functionality a (new) pallet uses. At best, this approach can be described as the team writing mock-ups for the 5 pallets and running MIRAI on them, and not really considered a tool. There is a very high likelihood that the mock-ups you write in M1 are not applicable to a new pallet. M2, however, can be seen as a tool in itself. If and when there is a possibility of running an analyser, the auto-generator for annotations can be plugged in.

Maybe this wasn't clear: we write mocks for the functions introduced by the pallet macros and for the most basic functionality of pallets, like using storage elements. There is no plan (currently) to mock all possible libraries or pallets someone might use. To make it more tangible, we commit ourselves to at least 5 working pallets from the FRAME pallets.
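To make "mocking the most basic functionality, like using storage elements" concrete, here is a minimal sketch. It is hypothetical - the real mocks would have to mirror FRAME's actual `StorageValue` API - but it shows the idea: keep the get/put interface while making the body trivial, so an analyzer never has to descend into the trie-backed runtime storage implementation.

```rust
use std::cell::RefCell;

// Hypothetical mock of a FRAME storage element: same get/put shape as a
// StorageValue, but backed by a thread-local variable instead of the runtime
// storage trie, keeping the call graph the analyzer sees trivial.
thread_local! {
    static BALANCE: RefCell<u64> = RefCell::new(0);
}

struct MockStorageValue;

impl MockStorageValue {
    // Read the stored value (trivial body, nothing for MIRAI to expand).
    fn get() -> u64 {
        BALANCE.with(|v| *v.borrow())
    }
    // Overwrite the stored value.
    fn put(val: u64) {
        BALANCE.with(|v| *v.borrow_mut() = val);
    }
}
```

Pallet code under analysis would call `MockStorageValue::get()`/`put()` exactly as it would call the real storage item, which is what makes the substitution via dependency swap possible.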

It is a bit harsh to conclude that static analysis on pallet code is nonsense because a particular tool times out. If the tool/technique times out, then generally the approach is to try and switch to a different level of abstraction to reduce complexity, one where the property (in our case incorrect origins) can still be expressed.

Sorry for being too harsh. There are two things I meant: 1. It was part of the original RFP: [Runtime Pallets] are usually concise pieces of standalone code with relatively few dependencies and clear specifications, hence tractable targets for performing static analysis and verification. So this was always the assumption. 2. I honestly believe that it is a huge problem to analyze very complex code. So if we introduce large complexity through libraries (and we cannot prevent the tool from analyzing the complex library), then we will always have a problem, no matter the level of abstraction. But I might underestimate the reduction in complexity we gain from choosing the right level of abstraction.

All in all: I believe our approach only works if the assumption:
[Runtime Pallets] are usually concise pieces of standalone code with relatively few dependencies and clear specifications, hence tractable targets for performing static analysis and verification.
holds true. If it holds true, we should be able to achieve a good analysis by mocking the most basic functions used in any pallet.

@masapr
Contributor Author

masapr commented Sep 14, 2023

@masapr I currently see no value in M1, since by itself it's useless. In theory, you could just deliver M1 and cash in USD 47k without ever delivering M2. Hence, I'm not going to approve M1. Also, I'd like to hear your perspective on @bhargavbh's most recent comment.

I could move parts from M1 to M2 and make M1 a smaller, pure research milestone with fewer deliverables. Then the damage would not be as big in case the approach doesn't work.

@bhargavbh
Contributor

Hi @masapr
Would the mock functionalities be selected after fixing the 5 pallets, or the pallets selected after fixing the mock functionality? Either way, how will this be used by a pallet developer who wants to analyse their newly written pallets?

At the fundamental level, the mock-ups are written to reduce the complexity of the analysis, right? How are these functionalities identified? I would assume by looking at the generated MIR graph (on which MIRAI acts). Why not just collapse the complex functionality (which is inlined) directly at the MIR level? The whole exercise of writing mock-ups has the same effect, except that the compressing/flattening of the graph happens at the source-code level, which requires additional replacement steps. By flattening, I mean collapsing the external inline call of the complex functionality and bypassing it.

Either way, the 'mocking functionality' approach may well work, but I do not see the advantages of the mock-ups approach. Both approaches require identifying the complex functionalities that time out, which is gathered at the MIR level and hence easier to rectify/patch at the MIR level.

As I said before, the mock-ups may be interesting for getting more insights into pallet static analysis, and I would be happy to see the approach work. But some reasoning for the rationale behind this approach, in comparison to more conventional ones, would be helpful. To summarise:

  • Do I see direct utility of the tool in the developer community?
    Probably not. If one is supposed to replace pallets with mock functionalities by changing dependencies, this is not going to be used much. Further, maintaining these mocks is even more difficult with ever-changing functionality/pallet pairs.
  • Is it a useful exercise which can provide insights and a first attempt at static analysis?
    Probably yes.

@masapr
Contributor Author

masapr commented Sep 19, 2023

I feel like the whole discussion has gone in the wrong direction. The mocks are just a technical detail of the implementation of the tool. We have no intention for users to write mocks themselves.
After analyzing how MIRAI works and talking with Herman, the main developer of MIRAI, we came to the conclusion to simply write mocks for the code that we don't want to be analyzed. But this happens in the background; the end-user should not see any of this.

We do want to deliver a useful tool here, not perform an exercise. We chose the "5 FRAME pallets" because they are real-world examples that follow a certain standard. We want to give a first proof that the tool is usable in real-world scenarios. (This is not perfect yet, of course, but a good starting point.)

@masapr
Contributor Author

masapr commented Sep 25, 2023

@bhargavbh Honestly, I really don't know what you expect from us. Do you want us to automatically identify on the MIR level which parts of the code are complex and can be left out? I don't see how we could identify those in a generic way. On the other hand, if we know which parts we want to leave out from the analysis (explicitly, after analyzing which parts lead to a certain complexity), we might as well mock them.

Contributor

@keeganquigley keeganquigley left a comment


Thanks for your thorough answers, @masapr, and for the deep dive, @bhargavbh. I'm willing to go forward with it as well.

@semuelle
Member

Hi @masapr. Thanks again for the application and thorough discussion. After internal debate, it's become clear that the application will not find the necessary support in the committee. I know that this isn't the news you were looking for, but I want to stress that we are very thankful for the insightful conversation and the work you put into the initial grant and application. I hope that this doesn't mean the end for the project, and that we will come together again in some capacity. In any case, best of luck going forward and feel free to apply again anytime.

@semuelle semuelle closed this Oct 11, 2023
Labels
ready for review
8 participants