Please add more examples or use cases #303

Open
pragmaxim opened this issue Jun 17, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@pragmaxim

pragmaxim commented Jun 17, 2024

Hey,

I've found GroveDB extremely useful, but I have trouble telling whether it fits certain use cases. For instance, I'm currently trying to spike a data model for indexing the whole of Bitcoin into it, where I could use secondary indexes and SumTrees to save some disk space and to have address balances summed automatically. Think of it as a decentralized blockchain explorer.

I have a few concerns:

  1. I often find that I would need tx_id to be both a SumItem and a Reference at the same time.
  2. Performance: the distribution of transactional data might not be a good fit for GroveDB.
  3. References don't make much sense here, as we only persist small hashes.
  4. I'm not sure whether Trees can be used as an alternative to composite keys, as known from Cassandra, to avoid data duplication, i.e. whether we can have billions of tx_id trees that each contain items or references.

Could you please give me some hints and suggestions regarding this model? Overall, I have the feeling that GroveDB excels only in use cases where we store "bigger" objects that we can reference through secondary indexes.

https://docs.google.com/document/d/1dql0MTMeu1-3PE_1CtSCc9mtHCi4PD8jhOZ7Ta_sTVQ/edit?usp=sharing
Screenshot from 2024-06-20 15-49-18

@pragmaxim pragmaxim added the enhancement New feature or request label Jun 17, 2024
@QuantumExplorer
Member

Sorry for not responding sooner; I hadn't seen your message. We are releasing Dash Platform this month, so I won't be able to go into great detail here.

GroveDB is extremely useful for provable data: you put data in the database and can then get a Merkle proof for any data you query.
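
To make "a Merkle proof for queried data" concrete, here is a minimal sketch of a generic binary Merkle tree with inclusion proofs. It is not GroveDB's implementation or API: the function names (`build_levels`, `prove`, `verify`), the use of SHA-256, and the padding rule for odd levels are all assumptions chosen for illustration.

```python
import hashlib


def h(data):
    # SHA-256 is an assumption made for this sketch.
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from hashed leaves up to the root.

    Assumes at least one leaf.
    """
    level = [h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last node
        level = [h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root, leaf, proof):
    """Recompute the root from a leaf and its proof, then compare."""
    node = h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = h(b"\x01" + sibling + node)
        else:
            node = h(b"\x01" + node + sibling)
    return node == root


levels = build_levels([b"tx1", b"tx2", b"tx3"])
root = levels[-1][0]
proof_for_tx3 = prove(levels, 2)
print(verify(root, b"tx3", proof_for_tx3))
```

A verifier that holds only `root` can check a returned item against the proof without trusting the node that served it, which is the property a decentralized explorer would rely on.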

Indeed, the sum tree usage here can be quite interesting, as you can always verify that there is no inflationary issue.
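
The inflation check can be sketched with a toy Merkle sum tree, in which every node commits to the total of its subtree, so the root carries a total that can be compared against the expected supply. This is an illustrative model only, not GroveDB's SumTree implementation: `Node`, `leaf`, `parent`, `root_of`, the sample addresses, and the amounts are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    digest: bytes
    total: int


def encode(amount):
    # Fixed-width signed encoding; an assumption made for this sketch.
    return amount.to_bytes(8, "big", signed=True)


def leaf(key, amount):
    return Node(hashlib.sha256(b"leaf" + key + encode(amount)).digest(), amount)


def parent(left, right):
    # The parent's hash covers both children and the combined total.
    total = left.total + right.total
    payload = b"node" + left.digest + right.digest + encode(total)
    return Node(hashlib.sha256(payload).digest(), total)


def root_of(nodes):
    """Fold a non-empty list of nodes pairwise into a single root node."""
    if not nodes:
        raise ValueError("need at least one node")
    while len(nodes) > 1:
        paired = [parent(nodes[i], nodes[i + 1])
                  for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2 == 1:
            paired.append(nodes[-1])  # carry the odd node up unchanged
        nodes = paired
    return nodes[0]


# Hypothetical balances in satoshis.
balances = {b"addr_a": 50_000, b"addr_b": 125_000, b"addr_c": 25_000}
root = root_of([leaf(k, v) for k, v in sorted(balances.items())])

expected_supply = 200_000
print(root.total == expected_supply)
```

Because the total is part of each hash, changing any balance changes the root digest, so a mismatch between the root total and the expected supply cannot be hidden.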

As for performance, I wouldn't worry too much: in tests GroveDB has been extremely fast. It's not going to be as fast as some tailored solutions, but that only matters if you actually see a problem. Under the hood it uses RocksDB, and unless you are using proofs, it goes through an abstraction that bypasses the Merkle trees and queries the underlying RocksDB directly. RocksDB isn't the most performant for reads, but again, I wouldn't be concerned until you see real issues.

The point of references is that they point to other data, so if that data changes, the reference still points to it. Currently references are not bidirectional, so you need to keep track of all your references yourself and update them when you update data. While this may seem like a very weird way to use them, it's actually for performance. They were built for the use cases of Dash Platform (which is why we built this database).

In your use case of non-modifiable data, they would be useful if the data you are pointing to is large; otherwise I would not use them, as they take more seeks to the database.
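
The behaviour described in the last two paragraphs can be modelled with a small in-memory store. This is a toy sketch, not GroveDB's API: `ToyStore`, its methods, the `referrers` map, and the `seeks` counter (a stand-in for database seeks) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    value: bytes


@dataclass
class Reference:
    target: tuple  # path of the element this reference points to


class ToyStore:
    def __init__(self):
        self.elements = {}   # path -> Item or Reference
        self.referrers = {}  # target path -> referencing paths, kept by the application
        self.seeks = 0       # counts element lookups

    def put_item(self, path, value):
        self.elements[path] = Item(value)

    def put_reference(self, path, target):
        self.elements[path] = Reference(target)
        # One-directional references: the store itself does not know who
        # points at `target`, so the application records it here.
        self.referrers.setdefault(target, set()).add(path)

    def get(self, path):
        self.seeks += 1
        element = self.elements[path]
        if isinstance(element, Reference):
            return self.get(element.target)  # one extra lookup per hop
        return element.value

    def move_item(self, old, new):
        self.elements[new] = self.elements.pop(old)
        # Manually repoint every known reference to the new location.
        for ref_path in self.referrers.pop(old, set()):
            self.elements[ref_path] = Reference(new)
            self.referrers.setdefault(new, set()).add(ref_path)


store = ToyStore()
tx_path = ("txs", "tx1")
index_path = ("by_address", "addr_a", "tx1")

store.put_item(tx_path, b"raw tx bytes")
store.put_reference(index_path, tx_path)

store.put_item(tx_path, b"updated bytes")  # the reference sees the new value
print(store.get(index_path))

store.move_item(tx_path, ("archive", "tx1"))  # references must be repointed
print(store.get(index_path))
```

In this model a read through a reference costs two lookups instead of one, which is why storing a small hash directly tends to be cheaper than referencing it.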

I am not exactly sure how composite keys work in Cassandra, but in GroveDB you can query and combine tree paths to get your data, which gives you secondary-index-type functionality.
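
As a rough illustration of combining tree paths, the sketch below models nested trees as nested dictionaries, with a path acting much like the leading part of a composite key. It does not use GroveDB's API: `ToyGrove`, its methods, the tree names `txs` and `by_address`, and the sample data are hypothetical.

```python
class ToyGrove:
    def __init__(self):
        self.root = {}

    def _subtree(self, path, create=False):
        node = self.root
        for segment in path:
            node = node.setdefault(segment, {}) if create else node[segment]
        return node

    def insert(self, path, key, value):
        # Intermediate subtrees are created on demand in this toy model.
        self._subtree(path, create=True)[key] = value

    def get(self, path, key):
        return self._subtree(path)[key]

    def query(self, path, start=None, end=None):
        """Return (key, value) pairs under `path`, ordered by key, in [start, end)."""
        node = self._subtree(path)
        return [(k, v) for k, v in sorted(node.items())
                if (start is None or k >= start) and (end is None or k < end)]


db = ToyGrove()

# Primary tree: tx_id -> transaction data.
db.insert(["txs"], "tx1", b"raw bytes of tx1")
db.insert(["txs"], "tx2", b"raw bytes of tx2")
db.insert(["txs"], "tx3", b"raw bytes of tx3")

# Secondary index: one subtree per address, keyed by tx_id.
db.insert(["by_address", "addr_a"], "tx1", b"")
db.insert(["by_address", "addr_a"], "tx3", b"")
db.insert(["by_address", "addr_b"], "tx2", b"")


def txs_for_address(db, address):
    """Walk the index subtree for an address, then fetch from the primary tree."""
    tx_ids = [tx_id for tx_id, _ in db.query(["by_address", address])]
    return [(tx_id, db.get(["txs"], tx_id)) for tx_id in tx_ids]


print(txs_for_address(db, "addr_a"))
```

Here the path `["by_address", address]` plays the role of a partition key and the keys inside the subtree play the role of clustering keys. Whether billions of such subtrees perform acceptably in GroveDB is not something this toy model can answer.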
