
Adding application: Polkadot Analytics Platform #1883

Merged 5 commits into w3f:master on Aug 22, 2023

Conversation

@rbrandao (Contributor) commented Aug 3, 2023

Project Abstract

The Polkadot Analytics Platform aims to provide a comprehensive data analysis and visualization tool for the Polkadot ecosystem. The platform will allow users to retrieve and analyze data from various Polkadot-related sources (e.g., different parachains and components such as browser wallets), aligned with the POnto ontology [1, 2, 3]. Users will be able to specify their queries using a controlled natural language (CNL), and the platform will provide a query engine to process these queries. Additionally, the platform will provide a UI to support constructing queries and visualizing informative artifacts that represent query results, as well as support for composing customizable dashboards from these artifacts.

This is only the first stage of the roadmap to build the platform, and it comprises a subset of the platform's components.

[1] POnto source code: https://github.com/mobr-ai/POnto
[2] POnto documentation: https://www.mobr.ai/ponto
[3] POnto scientific paper: https://github.com/mobr-ai/POnto/raw/main/deliverables/milestone3/article.pdf

This is a follow-up grant application for the project A Knowledge-Oriented Approach to Enhance Integration and Communicability in the Polkadot Ecosystem (w3f#1420)

Grant level

  • Level 1: Up to $10,000, 2 approvals
  • Level 2: Up to $30,000, 3 approvals
  • Level 3: Unlimited, 5 approvals (for >$100k: Web3 Foundation Council approval)

Application Checklist

  • The application template has been copied and aptly renamed (project_name.md).
  • I have read the application guidelines.
  • Payment details have been provided (bank details via email or BTC, Ethereum (USDC/DAI) or Polkadot/Kusama (USDT) address in the application).
  • The software delivered for this grant will be released under an open-source license specified in the application.
  • The initial PR contains only one commit (squash and force-push if needed).
  • The grant will only be announced once the first milestone has been accepted (see the announcement guidelines).
  • I prefer the discussion of this application to take place in a private Element/Matrix channel. My username is: @_______:matrix.org (change the homeserver if you use a different one)

@CLAassistant commented Aug 3, 2023

CLA assistant check
All committers have signed the CLA.

@semuelle (Member) left a comment


Thanks for the application, @rbrandao. I have some questions. Feel free to amend the application accordingly.

  1. As you state in the application, we recently signed three projects addressing the same RFP. You have listed some technical differences between your project and the others, but what I don't understand is how an average user benefits from your approach.
  2. You are proposing building a platform funded by subscriptions. Have you done any market research on the size of the target audience and potential revenue? Given that there are already a number of existing platforms, this seems ambitious.
  3. Do you have experience with ETLs and the technologies you propose using? I was under the assumption that your background was largely academic and I'm worried that we are funding too many datasets and analytics platforms that stop being maintained. Going by the conversations we had with people working on such products, keeping up to date with the data and changes in the ecosystem is a challenge in itself.

@semuelle self-assigned this Aug 4, 2023
@rbrandao (Contributor, Author) commented Aug 4, 2023

Thank you for the comments, @semuelle!

  1. As you state in the application, we recently signed three projects addressing the same RFP. You have listed some technical differences between your project and the others, but what I don't understand is how an average user benefits from your approach.

For the average user, having a CNL (Controlled Natural Language) to perform queries is key. CNLs preserve most of the natural properties of their base language. In this sense, users can intuitively and easily specify their intent. In contrast to the other three projects addressing the same RFP, in our project users do not need to know or learn programming languages such as SQL, GraphQL, or any other.

In addition to the CNL, we designed the concept of informative artifacts to allow dashboard composability. This adds to the benefits to the average user, as they will be able to reuse and adapt existing artifacts through a visual and interactive interface.

We believe that by leveraging the ontological framework and the controlled natural language querying engine, users can easily perform complex cross-chain data analysis without requiring in-depth technical knowledge or familiarity with specific programming languages. This empowers the average user within the Polkadot ecosystem to efficiently retrieve and analyze blockchain data across multiple parachains, contributing to improved decision-making, research, and understanding of cross-chain effects and dynamics.

  2. You are proposing building a platform funded by subscriptions. Have you done any market research on the size of the target audience and potential revenue? Given that there are already a number of existing platforms, this seems ambitious.

In the application, we mentioned that the platform could be leveraged and monetized via additional funding applications or a SaaS subscription model (monthly fee payments and a free tier with limited capabilities). However, this is an early-stage technical project application. As the roadmap evolves, we will look for partnerships, sponsorships, or other supporting programs that might be suitable, e.g., builders programs, VC funding, or treasury funding. At that point, market research and potential revenue analysis will indeed be key to securing the requested funding.

  3. Do you have experience with ETLs and the technologies you propose using? I was under the assumption that your background was largely academic and I'm worried that we are funding too many datasets and analytics platforms that stop being maintained. Going by the conversations we had with people working on such products, keeping up to date with the data and changes in the ecosystem is a challenge in itself.

Both my partner and I at MOBR have been working with AI for roughly the last eight years. We worked on different industry R&D projects at IBM for seven years, specifically with knowledge engineering (KE). The computational KE field encompasses, among other activities, structuring domain knowledge in a way that it can be queried. Commonly, information in expert domains is rather unstructured and dynamic. In other words, data has to be extracted, cleansed, transformed, and injected into knowledge bases continuously.

As a matter of fact, in our previous work we designed and deployed a graph database technology (named Hyperknowledge) to address this specific issue of maintaining the dynamicity of domain knowledge and data.

More details about the projects we worked on previously can be found in our LinkedIn profiles [1, 2]. I'm glad to answer any questions you may have about them. Concerning the other technologies mentioned in the application, we also have prior experience with them from those projects.

[1] Dr. Moreno LinkedIn profile: https://linkedin.com/in/marcio-moreno-phd-598a459a/
[2] Dr. Brandao LinkedIn profile: https://linkedin.com/in/rafaelrmb/

@semuelle (Member) left a comment


Thanks for the clarifications, @rbrandao. I think this CNL approach might be useful. However, I would prefer to test this in a smaller context, e.g. by extending existing querying engines, to see what the actual benefit is. Or, if you have some concrete examples of how the CNL simplifies certain queries, feel free to add them.

In any case, I will share your application with the rest of the committee.

@semuelle added the "ready for review" label Aug 9, 2023
@rbrandao (Contributor, Author) commented Aug 9, 2023

Thanks again for your comments, @semuelle. Our idea for the query engine is to extend an existing SPARQL query engine and not to create a whole new query engine from scratch.

As described in Milestone 4, deliverable 1, in this component we will implement the logic for executing CNL queries by translating them into SPARQL queries to fetch data from the Knowledge Base.

Concerning concrete examples of how CNL may simplify certain queries, please consider Fig. 1 in the grant application, which gives an overview of the process. In it, we specifically illustrate two competency questions (CQ5 and CQ6) in a controlled natural language and their structured SPARQL equivalents. More examples and details are available in the published paper.

CQ5

CNL: How many transactions happened between July 4th 2023 and July 8th 2023 specifically in the Moonbeam parachain?

SPARQL: (the query was shared as an image in the original comment)
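
Since the original query was shared as an image, here is a rough sketch of what such a translation might look like; the `ponto:` vocabulary below is an assumption for illustration, not the exact terms from the paper.

```sparql
# Hypothetical reconstruction of CQ5 (the original was an image);
# all ponto:* property names are assumed for illustration.
PREFIX ponto: <https://www.mobr.ai/ponto#>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

SELECT (COUNT(?tx) AS ?txCount)
WHERE {
  ?tx    a               ponto:Transaction ;
         ponto:parachain ?chain ;
         ponto:timestamp ?ts .
  ?chain ponto:name      "Moonbeam" .
  FILTER (?ts >= "2023-07-04T00:00:00Z"^^xsd:dateTime &&
          ?ts <  "2023-07-09T00:00:00Z"^^xsd:dateTime)
}
```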

CQ6

CNL: What are the top 5 parachains by pull requests in the last 7 days?

SPARQL: (the query was shared as an image in the original comment)
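
Again, since the image is not reproduced here, a hedged sketch under the same assumed vocabulary; note how the CNL's "top 5 ... by ... in the last 7 days" maps onto GROUP BY, ORDER BY, and LIMIT:

```sparql
# Hypothetical reconstruction of CQ6 (the original was an image);
# the ponto:* terms and the pull-request modeling are assumptions.
PREFIX ponto: <https://www.mobr.ai/ponto#>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

SELECT ?chain (COUNT(?pr) AS ?prCount)
WHERE {
  ?pr    a                  ponto:PullRequest ;
         ponto:repository   ?repo ;
         ponto:createdAt    ?ts .
  ?repo  ponto:maintainedBy ?chain .
  ?chain a                  ponto:Parachain .
  # "last 7 days" pinned to a fixed window for illustration
  FILTER (?ts >= "2023-08-02T00:00:00Z"^^xsd:dateTime)
}
GROUP BY ?chain
ORDER BY DESC(?prCount)
LIMIT 5
```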

@dsm-w3f (Contributor) commented Aug 9, 2023

@rbrandao thank you for the grant application. I think this topic is interesting, but it seems that we already have some grants related to data analysis. The initiatives should work in synergy and ideally reuse each other's work. See this comment from Karim, who is one of the team leads of the data team at Parity, in another grant application. Given this, I have some doubts about your project.

How could this grant application reuse the efforts that are already in progress? Could this lead to a reduction in the scope and price of your application? How?

From what I saw in the other applications in the same area, this one is the most expensive, and my feeling is that we are already supporting part of it in other projects. Would it be possible for you to follow the recommendations from Karim and focus on the frontend (query part) or on capabilities that other grants cannot provide?

@rbrandao (Contributor, Author) commented

@rbrandao thank you for the grant application. I think this topic is interesting, but it seems that we already have some grants related to data analysis. The initiatives should work in synergy and ideally reuse each other's work. See this comment from Karim, who is one of the team leads of the data team at Parity, in another grant application. Given this, I have some doubts about your project.

Thanks for your comments, @dsm-w3f. We are excited to see this surge of analytics project proposals in the ecosystem. Check out our points below regarding your and Karim's comments.

How could this grant application reuse the efforts that are already in progress? Could this lead to a reduction in the scope and price of your application? How?
From what I saw in the other applications in the same area, this one is the most expensive, and my feeling is that we are already supporting part of it in other projects.

In terms of reusing the current efforts, the proposed architecture already foresees the reuse of available ETLs in the Data Layer (see Fig. 2 in our application). The idea is that the "Semantic ETL workflows" proposed in Milestone 2 will fetch data using available ETLs.

If the current ETLs, like Substrate-ETL and dot-etl (when available), turn out not to be suitable for our use cases, we could propose extensions to those existing projects. But this is not in the scope of our current application.

Nevertheless, for the proposed platform to succeed, we need to transform the fetched data and align it with the ontology, creating the knowledge representations that will be the basis for the upper layers of our architecture. So the scope of work cannot be reduced, since no other project offers a knowledge-oriented solution with the features this platform requires.

Concerning the price asked in this application, we think the budget is reasonable for the projected platform features and the work involved. The specified daily rate is the same as in the previous grant we delivered, and the scope of work is considerably broader.

One point that I'd like to mention is that we submitted our previous grant application before the publication of this RFP. The previous grant was focused on devising a domain ontology, which would be the stepping stone toward an envisioned analytics platform for the ecosystem. This platform is the focus of the current application. Our vision for it is based on our past experience as AI researchers and developers in different industries. In our approach, we combine a holistic view of data and domain knowledge to create a queryable knowledge base that can be leveraged to meet the demands of experts, developers, and average users.

The platform will provide an engine layer supporting informative artifacts and a controlled natural language (CNL) based on the previously developed domain ontology. Both the CNL and the informative artifacts will help users easily access and analyze the ecosystem data. This is useful for experts and developers, but key for average users, and it is a requirement stated in the RFP, i.e., "the tools should NOT demand that users need to know or learn technical query languages such as SQL, GraphQL, or any other."

In addition to providing a knowledge-oriented solution, we anticipate that the proposed platform could also be valuable in creating AI research opportunities, including exploring semantic reasoning in the KB to provide answers that are not explicitly represented, and research initiatives to support rich insights based on predictive models (e.g., link prediction, concept2vec).

Would it be possible for you to follow the recommendations from Karim and focus on the frontend (query part) or on capabilities that other grants cannot provide?

Technically, we are already focusing on the frontend and query features, as we do not plan to implement an ETL from scratch. However, our solution demands the implementation of intermediary layers to process, transform, and align information in order to maintain triples as knowledge representations in a triplestore database. In addition, it is necessary to build on top of triplestore query engines to support CNL queries, and to create the necessary endpoints, including those for consuming the knowledge properly. As far as we know, there is no alternative for the backend features we need.

Below are some reflections regarding Karim’s comment on the dot-etl proposal.

COMMENTS FROM KARIM:
High-level aspects

  1. Regarding querying a Postgres database in the browser.

As opposed to using relational databases, our approach uses triplestores to maintain a knowledge base, where ledger data is transformed and injected, and can be further enriched through other data sources and domain knowledge.

From our perspective, a triplestore database offers distinct advantages over a relational database due to its semantic model. It excels at capturing and querying complex relationships by utilizing RDF triples, which facilitate efficient representation of diverse and evolving data structures. Unlike the rigid schema of relational databases, triplestores allow schema-less data integration, making them adaptable to evolving data sources. They naturally support semantic reasoning and inferencing, enabling advanced querying for deriving new insights. This enables applications such as knowledge graphs, the semantic web, and linked data, fostering enhanced data integration, flexibility, and semantic querying capabilities that are often challenging to achieve with traditional relational databases.
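
To make the schema-less integration point concrete, here is a minimal sketch (all `ponto:` terms are hypothetical) of how facts from two unrelated sources could land in the same graph with no schema migration:

```sparql
# Minimal sketch with assumed ponto:* terms: one SPARQL update mixes
# ledger facts and GitHub facts in the same graph, with no schema change.
PREFIX ponto: <https://www.mobr.ai/ponto#>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

INSERT DATA {
  # on-chain facts from a ledger ETL
  ponto:moonbeam a ponto:Parachain ;
                 ponto:name "Moonbeam" .
  ponto:tx42     a ponto:Transaction ;
                 ponto:parachain ponto:moonbeam ;
                 ponto:timestamp "2023-07-05T10:00:00Z"^^xsd:dateTime .
  # facts from a different source (GitHub activity): new shape, same graph
  ponto:pr7      a ponto:PullRequest ;
                 ponto:repository ponto:moonbeamRepo .
}
```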

What would be beneficial here is to see how that ties in with online charting capabilities, like what Colorful Notion is proposing with the Apache Superset project they're integrating.

Considering the charting possibilities, we are proposing the concept of informative artifacts as query results for composing custom dashboards. We plan to reuse different libraries to deal specifically with rendering and dashboard composition. We will definitely consider Apache Superset for that matter as well.

How do you see users interacting and using the system that would be similar (or different) to Dune?

On the one hand, in our proposal the platform will provide average users with a UI offering querying capabilities. Their interaction will be facilitated through CNL-based query specification supported by autocomplete and contextualized suggestion features. Query results will be presented as visual content that users can interact with and customize. In the future, these artifacts can be leveraged through social engagement, e.g., sharing, ranking, bragging, etc. Dune Analytics has a similar social engagement approach.

On the other hand, Dune Analytics is built on top of a relational database and requires users to learn its custom SQL query language, which is a technical query language, while also demanding that users learn and understand its data model. There is no support for query-building features or user-friendly mechanisms, such as autocomplete or contextual suggestions based on the entities that compose the domain knowledge. In our approach, the proposed platform will be built on top of a knowledge base comprising a custom triplestore database, which is commonly leveraged to provide the semantics and additional information needed to support such features.

CNL is key for average users to specify their queries, and it meets the requirement stated in the RFP, i.e., "the tools should NOT demand that users need to know or learn technical query languages such as SQL, GraphQL, or any other."

You mention it also provides a post-ETL data engine, leveraging subxt to ingest data directly from the chains themselves.
A few general questions:
How do you plan to cover the costs of accessing all the nodes in the ecosystem?
How do you plan to maintain the system (system operations, DB backups)?

We plan to use existing ETLs (Substrate-ETL, dot-etl). However, we will still have to cover the costs of the computing power needed to keep data synchronized in the knowledge base, as well as storage costs.

The data and information extraction will be carried out by the planned Semantic ETL workflows: a series of Airflow tasks that will use existing ETLs to transform and align data and inject knowledge into the triplestore. These pipelines may be triggered on a schedule that can be adjusted depending on the affordable costs and the accepted delay.

As commented before, in the application we mentioned that the platform could be leveraged and monetized via additional funding applications or a SaaS subscription model (monthly fee payments and a free tier with limited capabilities). However, this is an early-stage technical project application. As the roadmap evolves, we will look for partnerships, sponsorships, or other supporting programs that might be suitable, e.g., builders programs, VC funding, or treasury funding. At that point, market research and potential revenue analysis will indeed be key to securing the requested funding.

How do you plan to share the data, if people would like direct access? Is this a use case you plan to cover?

Regarding sharing query results, each query will have an informative artifact associated with it. This artifact will provide not only the results, but also a reference to an endpoint so results can be accessed and polled dynamically.

As for direct access to the KB representation, we will provide endpoints for performing queries. Another possibility is to have dumps from the triplestore in Turtle format.
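
As a sketch of what that direct access could look like (the `ponto:` terms are assumptions on our side), a user could run a CONSTRUCT query against the SPARQL endpoint and receive the matching subgraph, typically serialized as Turtle:

```sparql
# Hypothetical direct-access query: export every parachain and its
# name as triples; the ponto:* terms are assumed for illustration.
PREFIX ponto: <https://www.mobr.ai/ponto#>

CONSTRUCT { ?chain a ponto:Parachain ; ponto:name ?name . }
WHERE     { ?chain a ponto:Parachain ; ponto:name ?name . }
```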

Milestones - backend:
Will you be providing the built docker containers too?

Yes, we will provide a Dockerfile in the project's GitHub repository. We specified Docker images as deliverables in our milestones.

How will you handle metadata upgrades and data backfills of the data? Some words on that would help.
Thoughts on data indexing, information extraction and ETL.

As illustrated in the application, we designed a systematic process to perform information extraction (reusing existing ETLs and existing Substrate interfaces) from the ecosystem. Specifically, this process will comprise workflows, i.e., streamlined Airflow tasks that continuously (on a configured schedule) fetch, transform, cleanse, and align the data, with a specific component in the architecture injecting or backfilling the structured knowledge, leveraging the flexibility of the triplestore data model. Going forward, the extraction processes, as well as the domain ontology, should be reviewed regularly to ensure that data is accurately represented and handled.
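
As a hedged sketch of what one backfill step might look like (every term below is an illustrative assumption, not the actual pipeline code), a scheduled task could issue a SPARQL update that drops the stale triples for an affected block range and re-injects the re-extracted facts:

```sparql
# Illustrative backfill update; the ponto:* terms and the block range
# are assumptions, not the actual pipeline.
PREFIX ponto: <https://www.mobr.ai/ponto#>

# 1) drop stale triples for the affected block range
DELETE { ?tx ?p ?o }
WHERE  {
  ?tx a ponto:Transaction ;
      ponto:blockNumber ?n ;
      ?p ?o .
  FILTER (?n >= 4000000 && ?n < 4100000)
} ;
# 2) re-inject the re-extracted facts
INSERT DATA {
  ponto:tx4000001 a ponto:Transaction ;
                  ponto:blockNumber 4000001 .
}
```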

@dsm-w3f (Contributor) commented Aug 11, 2023

@rbrandao thank you for the answer. I now understand that the project already considers the databases being developed by other projects as sources of information. A concern that remains is the price, since US$500 per hour is very expensive and not all of the scope of work needs PhDs to develop it. Would you consider giving us a discount? An option to be more cost-efficient might be to hire less expensive professionals for at least some parts of the development.

Another concern is the scope of the questions that the tool would answer. As far as I remember, the ones from the last grant were somewhat generic and not very tied to the Polkadot ecosystem. Are the questions that the tool could answer the same as those proposed in the previous grant? Any change on that?

@rbrandao (Contributor, Author) commented

@rbrandao thank you for the answer. I now understand that the project already considers the databases being developed by other projects as sources of information.

Thanks for the prompt comments, @dsm-w3f. Indeed, we already foresee the use of existing ETLs, Substrate interfaces, and datasets.

A concern that remains is the price, since US$500 per hour is very expensive and not all of the scope of work needs PhDs to develop it.

The specified rate is US$500 per day, not per hour. This is a reasonable daily rate for a PhD with the required expertise; it is honestly less than the commonly observed market rates. Note that it is an 8-month project roadmap.

Would you consider giving us a discount? An option to be more cost-efficient might be to hire less expensive professionals for at least some parts of the development.

The work needs a deep understanding of specialized technologies, demanding skills in NLP, AI/knowledge engineering, HCC (human-centered computing), and UI/UX. I don't see how hiring less specialized professionals could work here, at least not without compromising the results.

Another concern is the scope of the questions that the tool would answer. As far as I remember, the ones from the last grant were somewhat generic and not very tied to the Polkadot ecosystem. Are the questions that the tool could answer the same as those proposed in the previous grant? Any change on that?

The scope of the questions will be broad, supported by a combination of the concepts, properties, and individuals aligned with the POnto domain ontology and represented in the Knowledge Base.

In this sense, to illustrate the scope of the questions that the platform will be able to answer, consider the following categories:

| Category | Description | Structure / Example |
| --- | --- | --- |
| "What is" questions | The platform will be able to define any of the represented entities, including many individuals (instances) that will populate the knowledge base after Milestone 2 is accomplished | Structure: What is ‹entity›? E.g., What is multirole delegation? (see the translation sketch below the table) |
| Relationship questions | Answering how an entity is related (directly or indirectly) to others | Structure: What entities are ‹relationship› to ‹entity›? E.g., What are the components involved in the OpenGov mechanism? |
| Simple analysis questions | Answering about the current state of a specific entity according to specific characteristics | Structure: Which/How many ‹entity› have ‹characteristics›? E.g., How many accounts have DOT and KSM tokens? |
| Complex analysis questions | Same as simple, but correlating multiple entities and data sources | Structure: ‹quantify entity› according to ‹characteristics›. E.g., What are the top 5 parachains by pull requests in the last 7 days? How many transactions happened between July 4th and 8th specifically in the Moonbeam parachain? |
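
To make the first category concrete, here is a hedged sketch of how a "What is" question could translate, assuming (our assumption, not a confirmed design) that definitions are stored as rdfs:comment annotations on POnto concepts:

```sparql
# Hypothetical translation of "What is multirole delegation?";
# storing definitions in rdfs:comment is an assumption.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?definition
WHERE {
  ?concept rdfs:label   "multirole delegation"@en ;
           rdfs:comment ?definition .
}
```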

Concerning the scope of the questions, this is exactly the kind of discussion we tried to foster in the last milestone of our previous research grant, through the discussions over the Mural board and the questionnaire. We are addressing all suggestions from the feedback we received, as well as the types of queries stated in the RFP.

Note that after deploying an initial MVP of the platform, there will be many evolution opportunities, including the expansion of the ontology, which would further expand the query-answering capabilities as well.

@semuelle (Member) left a comment


As described in Milestone 4, deliverable 1, in this component we will implement the logic for executing CNL queries by translating them into SPARQL queries to fetch data from the Knowledge Base.

Thanks for the updates, @rbrandao. I agree that this concept could be quite useful and is worth pursuing. However, looking at the complexity of starting, running, and maintaining an analytics platform, as people in the ecosystem have shared with us, I would not recommend tying it to a complete platform. I'd be happy to give my +1 for a library that helps convert CNL to SPARQL, and possibly other tools that might be useful in this regard. That would also be a better fit for the grants program as a whole.

@rbrandao (Contributor, Author) commented Aug 14, 2023

Thanks for the updates, @rbrandao. I agree that this concept could be quite useful and is worth pursuing. However, looking at the complexity of starting, running, and maintaining an analytics platform, as people in the ecosystem have shared with us, I would not recommend tying it to a complete platform. I'd be happy to give my +1 for a library that helps convert CNL to SPARQL, and possibly other tools that might be useful in this regard. That would also be a better fit for the grants program as a whole.

Thanks for the comments and suggestions @semuelle.

Indeed, an analytics platform is a complex asset to develop and maintain, but there is no alternative when it comes to building robust software platforms that are not available off the shelf.

Considering your recommendation of developing a library to help convert CNL to SPARQL: as far as we know, there is no other project considering a knowledge base to support analytics in the Polkadot ecosystem. How could such a library be useful as a standalone solution without the required backend layers (a knowledge base with proper knowledge representation, endpoints, ontology and information alignment, etc.)?

The set of open-source libraries that compose the platform would be good contributions to the community. However, as isolated components, these libraries would not bring as much value as they would in a cohesive solution such as the envisioned platform.

@dsm-w3f (Contributor) commented Aug 14, 2023

@rbrandao thank you for the answers. The price charged per day is more reasonable than per hour. However, I agree with @semuelle that the core part of the app is the CNL-to-SPARQL conversion, and that maintaining ETL and other infrastructure at this moment would not be appropriate. Maybe we could try the proposed technology on a small scale and understand if and how it could generate value for our ecosystem before proceeding to fund a full product. Furthermore, I'd like to ask whether the proposed technology will be based on an existing one, such as Sparklis (docs here), or other available tools, or whether you plan to develop a new tool from scratch.

@rbrandao (Contributor, Author) commented Aug 14, 2023

@rbrandao thank you for the answers. The price charged per day is more reasonable than per hour. However, I agree with @semuelle that the core part of the app is the CNL-to-SPARQL conversion, and that maintaining ETL and other infrastructure at this moment would not be appropriate. Maybe we could try the proposed technology on a small scale and understand if and how it could generate value for our ecosystem before proceeding to fund a full product. Furthermore, I'd like to ask whether the proposed technology will be based on an existing one, such as Sparklis (docs here), or other available tools, or whether you plan to develop a new tool from scratch.

Hi @dsm-w3f, it's hard to cherry-pick a core component of the proposed approach. In our understanding, a foundational aspect relies on structuring a domain ontology, which is why we proposed it beforehand. For us, it makes no sense to focus specifically on the CNL-to-SPARQL aspects without considering the big picture of the platform (as commented before regarding @semuelle's suggestion).

Concerning the design of the query building, our initial interface proposal aims at providing a textual interface: an "omnibox" text input tool that will suggest terms and autocomplete expressions based on contextual information. This is opposed to the referred tool (https://github.com/sebferre/sparklis), which demands that users go through a series of clicks and selections over widgets and visual components. Indeed, we plan to develop our own tool, but that doesn't mean everything from scratch: we will definitely consider open-source solutions that may help, including the aforementioned work and others discussed in the comparative survey.

We have co-authored a number of papers and patents over the years, specifically dealing with and advancing the state of the art in querying technologies. Check out some of them:

An Extensible Approach for Query-Driven Multimodal Knowledge Graph Completion
Supporting Polystore Queries using Provenance in a Hyperknowledge Graph
Evaluating Semantic Queries for Dataset Engineering on the Hyperknowledge Platform
A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores

US Patents:
Automatically and incrementally specifying queries through dialog understanding in real time
Mapping application of machine learning models to answer queries according to semantic specification

If you want, we can talk about them, or share further details about our vision of the query-building aspects if you're interested.

@dsm-w3f (Contributor) commented Aug 15, 2023

@rbrandao thank you for the answers. Would translating CNL to SPARQL without selecting options, as other tools do, cause precision problems in the translation? How do you plan to deal with that?

Furthermore, let me know if you plan to make changes to the application document considering our discussion, or if this is the final version, so that we can give a decision on it.

@rbrandao (Contributor, Author) commented Aug 16, 2023

@rbrandao thank you for the answers. Would translating CNL to SPARQL without selecting options, as other tools do, cause precision problems in the translation? How do you plan to deal with that?

Hi @dsm-w3f. If by "selecting options" you mean interactivity through a visual UI to support the translation, not having it would not cause "precision problems". As specified in M3 of our proposal, we included the deliverables 1) CNL grammar, 2) CNL syntax definition, and 3) CNL semantics definition. These deliverables will support parsing and validating any valid query specification in the controlled natural language. In addition, our textual interface for query building includes autocomplete and contextualized suggestion features, which will also assist in the validation and "precision" of the query translation (a sketch of such a KB-backed suggestion lookup follows below).

In our vision, a visual approach to assist query building constrains productivity and hinders expansion of the supported CNL. Imagine if, to carry out a search on Google, users had to go through a series of clicks and widget interactions. It would certainly be a cumbersome and tiresome experience to come up with even a handful of interesting queries.
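
As a hedged sketch of how the knowledge base itself could back those suggestions (the label-based lookup is our illustration, not a committed design), the omnibox could issue a query such as:

```sparql
# Hypothetical autocomplete lookup: as the user types "moon", suggest
# matching entity labels straight from the knowledge base.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?entity ?label
WHERE {
  ?entity rdfs:label ?label .
  FILTER ( STRSTARTS(LCASE(STR(?label)), "moon") )
}
LIMIT 10
```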

Furthermore, let me know if you plan to make changes to the application document considering our discussion, or if this is the final version, so that we can give a decision on it.

Concerning the focus of this initiative, we think it is key to follow the currently proposed milestone order, since there are dependencies among the proposed assets; that is, each asset depends on the previous milestone's outcome. An alternative to this application would be breaking down each of the proposed milestones into separate L1/L2 grant applications. This would address @semuelle's and your concerns regarding a complete platform as a single application. Note that by proposing all of the deliverables together we reduce development costs, since we can plan the roadmap and scope of work ahead; whereas if we break it down into small projects, the roadmap can change substantially with discussions, and things may take a different route with different costs.

Currently, we have the following milestones on our radar: M1 (US$10k in 1 month), M2 (US$18k in 2 months), M3 (US$18k in 2 months), M4 (US$15.5k in 1.5 months), and M5 (US$18k in 2 months).

What do you think about this alternative?

@dsm-w3f (Contributor) commented Aug 17, 2023

@rbrandao thank you for the answer. It is not clear to me what the scope of the L1/L2 grant applications would be. Can you detail the scope of each one together with its budget? That way, we would be able to analyze it and give an opinion on it. Although the scope separation could lead to a higher overall price, it reduces our risk of funding large projects that might not go in a direction that provides value to our ecosystem. Furthermore, data-related projects are eligible for treasury funding through other initiatives, such as the Data Alliance, if they are aligned with it. Having a working prototype that shows the value of your tool for our ecosystem and does not overlap with projects that are already part of the Data Alliance could be a good way to ask for funds from the treasury. Usually, treasury proposals and bounties give more funds than grants.

@rbrandao (Contributor, Author) commented

@rbrandao thank you for the answer. It is not clear to me what the scope of the L1/L2 grant applications would be. Can you detail the scope of each one together with its budget? That way, we would be able to analyze it and give an opinion on it. Although the scope separation could lead to a higher overall price, it reduces our risk of funding large projects that might not go in a direction that provides value to our ecosystem. Furthermore, data-related projects are eligible for treasury funding through other initiatives, such as the Data Alliance, if they are aligned with it. Having a working prototype that shows the value of your tool for our ecosystem and does not overlap with projects that are already part of the Data Alliance could be a good way to ask for funds from the treasury. Usually, treasury proposals and bounties give more funds than grants.

@dsm-w3f, we have just changed our proposal to an L1 grant application. The scope of work is now limited to the first milestone of the previous application.

This is the final version of our application.

Thank you and @semuelle for your time and relevant feedback.

@semuelle (Member) left a comment


Thanks for the updates, @rbrandao. I'm happy to support this.

@Noc2 merged commit 87376fd into w3f:master Aug 22, 2023
6 checks passed
@github-actions (Contributor) commented

Congratulations and welcome to the Web3 Foundation Grants Program! Please refer to our Milestone Delivery repository for instructions on how to submit milestones and invoices, our FAQ for frequently asked questions and the support section of our README for more ways to find answers to your questions.

Before you start, take a moment to read through our announcement guidelines for all communications related to the grant or make them known to the right person in your organisation. In particular, please don't announce the grant publicly before at least the first milestone of your project has been approved. At that point or shortly before, you can get in touch with us at [email protected] and we'll be happy to collaborate on an announcement about the work you’re doing.

Lastly, please remember to let us know in case you run into any delays or deviate from the deliverables in your application. You can either leave a comment here or directly request to amend your application via PR. We wish you luck with your project! 🚀

@rbrandao (Contributor, Author) commented

Thanks for the updates, @rbrandao. I'm happy to support this.

Thanks @semuelle and all the W3F Grants team for this approval. We at MOBR have high hopes that this will be a really interesting initiative for the ecosystem as a whole. Let's rock!
