Deitos Network Application #2040
CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅
I have read and hereby sign the Contributor License Agreement.
Hello @keeganquigley, I have added the comment "I have read and hereby sign the Contributor License Agreement." as requested, but it does not seem to be working. Is there anything else I should do to complete this step? Thanks!
@rvalera is the author of the commits in this PR, so they need to sign the CLA.
I have read and hereby sign the Contributor License Agreement.
Thanks a lot for the application. This sounds really interesting and very similar to Arweave. Have you looked at it? How is your solution different? One concern I always had in regard to Arweave is that you don't have any incentive to actually share the data with others (at least the last time I looked into it). You only have an incentive to share the hashes to get the rewards. How do you plan to address this? Could you add more technical details to the milestone specification?
Thank you very much for your interest and insights in our application, @Noc2! While Deitos Network shares similarities with platforms such as Crust, Arweave, and IPFS, our primary focus is distinct. We emphasize the processing, structuring, and utilization of data. Our direction leans more towards Big Data and AI functionalities than towards acting as a decentralized storage service. Though storage is a crucial component, it's not our sole emphasis. Currently, our efforts for this grant are channeled towards the foundational aspects of Infrastructure Providers and Consumers, with a significant focus on storage. As we progress, we intend to extend the agreement between Providers and Consumers to cover computational resources (such as vCPU and RAM) in addition to storage services. We also plan to add advanced privacy features to our security framework. However, these two phases will not be part of the deliverables included in this grant.
Regarding your observation on Arweave: You've pinpointed a valid concern. Certain platforms may lack compelling incentives to promote data sharing among users. However, for Deitos Network, our objective isn't necessarily broad data sharing among all participants. Our model aims to facilitate a collaborative yet private interaction between Infrastructure Providers and Consumers, all under the Deitos Network umbrella. This approach not only diversifies the options available to consumers but also fosters healthy competition among Providers. This is a stark contrast to the present scenario, where a handful of entities, such as AWS and Google Cloud, dominate.
It's important to highlight that we haven't established any specific policies or incentive mechanisms for data sharing among actors. This omission is intentional: the data analysis industry frequently deals with sensitive and private information, which stakeholders typically prefer not to share or expose publicly. The configuration for the infrastructure provider will encrypt data by default. However, within our protocol design, we're considering the possibility of providers holding public data as an open service, encouraging contributions to datasets or raw files.
A primary difference between our model and existing decentralized storage solutions like Arweave or IPFS is that once the agreed-upon duration between the infrastructure provider and the consumer expires, the data is deleted. This ensures that providers can free up their storage space for other consumers. Moreover, with minor adjustments, the toolset for data scientists remains largely unchanged. This means they won't be burdened with learning specialized software to access and use the data.
This is a refined specification of the milestones, enhanced with additional technical details. Please consider that certain implementation specifics are left out as they might change during the development process. Could you please confirm if this aligns with your expectations? Once confirmed, I'll move forward with updating the PR content:
- Milestone 1: Initial setup and infrastructure provider pallet.
- Milestone 2: Proxy, file uploader and data integrity protocol.
- Milestone 3: Disputes resolving and protocol documentation.
Thanks for the update and the detailed reply. Feel free to integrate these details into the application. If I understand you correctly, your solution will rely mostly on a Dispute Resolution Committee and/or the on-chain reputation system. Have you already looked into potential attacks of such systems and how to address these?
Hello @Noc2! Thank you for your feedback and your insightful question. As agreed, we have updated the milestones section with all the technical details.
When considering dispute resolution and on-chain reputation systems, we analyzed scenarios like collusion and reputation gaming, which could affect the fairness of dispute resolution processes and the accuracy of reputation scores. For the on-chain reputation system, it's noteworthy that scoring for a provider or consumer can only occur after the agreement has concluded and only if no disputes arose during the agreement. This design makes Sybil attacks, where a group aims to prejudice an actor, costly and time-consuming, as multiple agreements are a prerequisite. Although not foolproof, this design mitigates some risks associated with reputation manipulation.
It's crucial to note that while our governance model is still in the design phase, the overarching goal for Deitos is to establish a decentralized governance system. We're looking to draw significant inspiration from Polkadot's Referendum chamber, aiming for a community-driven, decentralized approach to decision-making.
Regarding dispute resolution, our vision is that if the Dispute Resolution Committee is found to be acting inappropriately, the aggrieved party can escalate the case to a Referendum. However, this step is discouraged among Deitos network actors because of the costly coordination and discussion it requires among all token holders. Escalation to a referendum requires a significant amount of locked tokens (configurable and adjustable by Referendum) and is only permitted within a specified timeframe after the dispute is resolved. After this period, escalation to a referendum is disallowed. This entire mechanism will be cautiously implemented once the governance model of the network is designed, agreed upon and implemented, given the tight relationship between these two components.
At present, our primary objective in this initial project stage is to assemble all protocol components and to measure and validate our assumptions carefully.
We observe parallels with Polkadot's iterative evolution concerning governance and collective roles. Currently, our dispute resolution module and on-chain reputation system are initial proposals, subject to refinement as the project progresses.
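To make the two rules above concrete, here is a minimal Python sketch of the checks as described in this comment. It is only an illustration, not the Deitos implementation: the `Agreement` type, the function names, and the parameters `escalation_window_blocks` and `min_locked_tokens` are hypothetical, and the actual values would be set by governance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agreement:
    """Hypothetical view of a provider/consumer agreement."""
    end_block: int      # block at which the agreement concludes
    had_dispute: bool   # True if any dispute arose during the agreement


def can_score(agreement: Agreement, current_block: int) -> bool:
    """Scoring is allowed only after the agreement has concluded
    and only if no dispute arose while it was active."""
    return current_block >= agreement.end_block and not agreement.had_dispute


def can_escalate(
    resolved_at_block: int,
    current_block: int,
    locked_tokens: int,
    escalation_window_blocks: int,  # assumed governance parameter
    min_locked_tokens: int,         # assumed governance parameter
) -> bool:
    """Escalation to a referendum needs enough locked tokens and must
    happen within the window that follows the dispute resolution."""
    deadline = resolved_at_block + escalation_window_blocks
    within_window = resolved_at_block <= current_block <= deadline
    return within_window and locked_tokens >= min_locked_tokens
```

Because a score requires a completed, dispute-free agreement, each fake rating costs at least one full agreement, which is where the cost of a Sybil attack comes from.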
Thanks again for the update. In this case, removing the last milestone for now might make sense. This way, it's also only a level 2 grant. Apart from this, I have one more question before I share it with the rest of the team: Usually, deliverables 0c and 0d, as well as 0e for the last milestone, are mandatory in our template. Could you add them again to the application or provide a reason why you removed them?
Hello David! Thanks again for your message.
Understood. We think your advice is very valid, and we've decided to remove the milestone related to the Dispute protocol and downgrade the current grant to Level 2 as suggested. Since the disputes protocol is a key part of our design, we expect that its development could be tackled in future grants after the successful delivery of this one.
We have updated deliverables 0a, 0b, 0c and 0d for both milestones according to the template provided. Deliverable 0e was also added to the second and last milestone. Please let us know if any other topic should be discussed or addressed. We would be more than pleased to answer any questions. Thanks.
@Ernih thanks, that looks interesting. However, I've got a couple of questions as well:
- What's the "YARN" component in the storage section of the architecture diagram about? Are you referring to the yarn dependency tool or something else here?
- We've recently signed a similar grant, could you compare your scope to theirs and add the analysis to the proposal? Also, it'd be good if the market analysis would include Arweave as well, so feel free to integrate what's been discussed into this proposal.
- Do you require the infrastructure provider to run some kind of service that measures certain metrics, like uptime, workload, specs, etc. on the host machine? If not, how would the dispute resolver committee be able to fairly resolve conflicts? For example, the consumers could just rent a bunch of computation power and, after 80% of the rental period, complain that there is a breach of contract and request a refund, although there was none and their model has already been trained successfully. How would you prevent or mitigate such a scenario? I assume the consumer doesn't need to have a reputation to start renting, right? And if that's wrong, how would they build it and at what cost?
- What are "block by block reward dynamics"?
Outline of storage quotas and its duration based on block by block reward dynamics
- Are any restrictions imposed in terms of the data, tools, or ML workflows that consumers can bring and run as per their specific use cases? For example, you mention Spark and Llama, but what if a consumer wants to use TensorFlow or PyTorch instead?
Hello @takahser, we appreciate your interest in the application. Below are the answers to your questions:
Actually, the YARN in our context is not the JavaScript package manager, but it does share the same acronym. We're referring to Apache Hadoop YARN (Yet Another Resource Negotiator). Apache Hadoop YARN is the resource management and job scheduling technology in the Hadoop Distributed Processing Framework. As one of the core components of Apache Hadoop, YARN is responsible for allocating system resources to the various applications running in a Hadoop cluster and scheduling tasks to run on different cluster nodes.
Given the limited details available from the project described in their grant, an in-depth comparison is challenging. Nonetheless, based on the grant information, it appears there are parallels in terms of decentralizing machine learning model training, where rewards are based on data model training contributions and parameter adjustments by governance. Our approach, however, adopts a distinct architecture and game theory strategy. We focus on infrastructure providers offering private services, competing to deliver optimal solutions to consumers. In future developments, these providers may also engage in maintaining and utilizing a shared public dataset, rewarded for hosting this data and processing consumer requests. This aspect, though, is not covered in the current grant scope as previously mentioned. Another difference is that our project is not only related to machine learning or AI, but also provides business intelligence capabilities such as data analytics, predictive modeling, and customized reporting. These features enable businesses to gain deeper insights from their data, supporting more informed decision-making and strategic planning.
Acknowledged. We've included a supplementary section on Arweave in the proposal, specifically addressing the distributed storage aspect of our project. While there are overlaps between Arweave and Deitos, it's important to note that Deitos is primarily geared towards processing and training data. In contrast, Arweave's focus is predominantly on distributed storage solutions.
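Your question raises key concerns about fairness and conflict resolution in computational resource rental scenarios. Apache Hadoop YARN will play a key role by providing logs of resource usage, which can be accessed by the Dispute Resolution Committee to inform their decisions. This grant is primarily focused on developing the storage layer and establishing the foundational data model, encompassing elements like network actor registration, agreement mechanisms, and data integrity protocols. Details about the Dispute Resolution mechanism, vital for mediating conflicts, will be covered in a future grant application as per @Noc2's request. The development stages envisioned for the network are as follows:
- Development of the storage layer and data model foundations, which is the current grant's focus.
- Addition of the execution aspects for agreements and the dispute resolution mechanism.
- Implementation of security measures such as infrastructure provider attestation to ensure execution integrity and reliability.
The execution aspect in this application is highlighted to provide a clear roadmap towards achieving the first usable version of the protocol. This aspect is a crucial element of the project's second stage of development.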
"Block by block reward dynamics" refers to the mechanism where the reward for infrastructure providers is allocated per block, based on the terms of an agreement with a consumer. For instance, in a scenario where a consumer and an infrastructure provider agree on a resource rental for 144,000 blocks (equivalent to roughly 10 days at a 6-second block time) at a total cost of 10,000 tokens, the provider would earn approximately 0.069444444444 tokens per block. As the project progresses, we anticipate refining and optimizing this mechanism to reduce computational costs while maintaining its effectiveness.
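The arithmetic in this example can be checked with a short Python sketch. It assumes the simplest reading of the description, a linear payout where the total cost is spread evenly over the agreement's blocks; the function names are hypothetical and the real pallet logic may differ.

```python
from fractions import Fraction

BLOCK_TIME_SECONDS = 6      # block time quoted in the example
SECONDS_PER_DAY = 86_400


def per_block_reward(total_cost_tokens: int, duration_blocks: int) -> Fraction:
    """Tokens the provider earns per block under an even (linear) payout."""
    if duration_blocks <= 0:
        raise ValueError("duration_blocks must be positive")
    return Fraction(total_cost_tokens, duration_blocks)


def accrued_reward(
    total_cost_tokens: int, duration_blocks: int, blocks_elapsed: int
) -> Fraction:
    """Tokens earned so far; elapsed blocks are clamped to the agreement length."""
    elapsed = max(0, min(blocks_elapsed, duration_blocks))
    return per_block_reward(total_cost_tokens, duration_blocks) * elapsed


def duration_days(duration_blocks: int) -> float:
    """Wall-clock length of the agreement at the assumed block time."""
    return duration_blocks * BLOCK_TIME_SECONDS / SECONDS_PER_DAY


if __name__ == "__main__":
    reward = per_block_reward(10_000, 144_000)
    print(reward, float(reward))      # exact fraction and its decimal value
    print(duration_days(144_000))     # agreement length in days
```

Exact fractions are used instead of floats so that the per-block amounts sum back to the agreed total; an on-chain version would more likely use fixed-point integer balances.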
The technology stack available on infrastructure providers is determined by the network to facilitate environment attestation, ensuring applications run as expected. This control is necessary to prevent issues like misreporting of resource allocation. However, if there's a community demand for additional tools like TensorFlow, the process involves raising the issue, discussing it within the network governance, and reaching a consensus. Once agreed upon, the integration and attestation of the new application are developed, allowing it to be included in the official tech stack.
Please let us know if there are any additional topics that need to be discussed or addressed. If there are no further concerns, our team is eager to commence work on the milestones and make a meaningful contribution to the ecosystem.
@Ernih thanks for your detailed answers, that's very helpful.
Your question raises key concerns about fairness and conflict resolution in computational resource rental scenarios.
Apache Hadoop YARN will play a key role by providing logs of resource usage, which can be accessed by the Dispute Resolution Committee to inform their decisions. This grant is primarily focused on developing the storage layer and establishing the foundational data model, encompassing elements like network actor registration, agreement mechanisms, and data integrity protocols.
Details about the Dispute Resolution mechanism, vital for mediating conflicts, will be covered in a future grant application as per @Noc2's request. The development stages envisioned for the network are as follows:
- Development of the storage layer and data model foundations, which is the current grant's focus.
- Addition of the execution aspects for agreements and the dispute resolution mechanism.
- Implementation of security measures such as infrastructure provider attestation to ensure execution integrity and reliability.
The execution aspect in this application is highlighted to provide a clear roadmap towards achieving the first usable version of the protocol. This aspect is a crucial element of the project's second stage of development.
Could you integrate this information into the proposal, maybe into the Mid-Term Plans section?
Given the limited details available from the project described in their grant, an in-depth comparison is challenging. Nonetheless, based on the grant information, it appears there are parallels in terms of decentralizing machine learning model training, where rewards are based on data model training contributions and parameter adjustments by governance.
Could you integrate this part as well? I think it'd be useful for other readers to be aware of this, even if you're not planning to make an in-depth comparison.
Considering the comprehensive nature and the interesting use case of your grant proposal, along with @kalaninja's proven expertise as evidenced in the previous grant for the substrate java client I'm fine with adding my approval to this proposal, once you've integrated the missing information. Meanwhile, I'll mark it as ready for review and share it with the rest of the committee.
Thank you very much, @takahser! The latest information we discussed has been incorporated into the application as suggested. We agree that the application is now much more informative following all the feedback we've received. Please let us know if there's anything else we need to do on our end.
Thx, I added 2 more inline comments, other than that LGTM.
Co-authored-by: S E R A Y A <[email protected]>
Hello @takahser! We have merged your inline comments, and because the content of the PR changed, your approval was dismissed. Could you please re-approve? Thanks.
Thank you very much, @takahser, @Noc2, and @keeganquigley, for your valuable feedback, support, and approval of this application. We understand that we have now achieved the required number of approvals. Is that correct? I would like to point out that the PR currently indicates that at least 5 approvals are needed for merging this application. This might be due to the initial submission being for a level 3 grant, which required 5 approvals. I'm unsure if any manual intervention is needed for this specific case. Please let us know if there's any further action required from our side.
Congratulations and welcome to the Web3 Foundation Grants Program! Please refer to our Milestone Delivery repository for instructions on how to submit milestones and invoices, our FAQ for frequently asked questions and the support section of our README for more ways to find answers to your questions.
Hello everyone, we are excited to announce that Hector Bulgarini (@hbulgarini) has recently joined the Deitos Network team in a leadership role. If any action is needed to update the team information for the grant, please let us know. Thank you.
@Ernih thanks for letting us know. It'd be good if you could amend the proposal accordingly. The approval process is the same as for the initial approval, but minor changes like this one are usually very easy to approve, so the approval can be expected to take much less time.
No problem, we are happy to do whatever is required to reflect the change.
Hello @Noc2, @takahser, @keeganquigley, we're reaching out to inform you that the Deitos team's operations were limited over the holiday period. As a result, we'll be delivering Milestone 1 on January 10th. Thank you for your understanding!
@hbulgarini thanks for the heads-up; that's no problem at all. Happy new year to you and the team!
Hello @Noc2, @takahser, @keeganquigley, we have conducted research on licensing and we are considering adopting the GPLv3 license for all work related to the Deitos Network. We have also observed that this license is the preferred choice among parachain teams. Given that Apache v2 was specified in the application, we would like to ask whether the grant program permits the delivery of milestones and code under the GPLv3 license. Thank you!
@hbulgarini changing the license to GPLv3 is not a problem, since it's one of our supported licenses. However, it'd be good if you could make an amendment PR, where you update the license in your milestone deliverables. This kind of amendment is usually merged rather quickly.
Thank you @takahser for the answer! @Ernih will proceed with the amendment PR.