Mantaray 1.0 #37

Open. Wants to merge 7 commits into base: master.

Conversation

nugaon
Member

@nugaon nugaon commented Dec 13, 2021

This proposal fixes some design issues of the current mantaray v0.2 data structure. It also provides better metadata handling and makes it possible to build reliable web3 services that deal with tree-like data structures.

@AuHau

AuHau commented Feb 28, 2022

Generally these improvements LGTM, but after reviewing the reference implementation I have reservations about one change: the metadata layout and handling.

First, I am not sure what the reasoning was for having metadata in Forks. I assume it is to store data like filenames. I would be interested to hear how this reasoning changed with the introduction of Node's metadata. Could somebody elaborate? Maybe @zelig?

Now to the point that I am not a fan of. I don't like having metadata spread across two places, as it adds considerable complexity to the implementation. For example:

  • Node's metadata is not size limited, but Fork's metadata is (2^5*32 ~ 1kB)
    • what happens when the Fork's metadata limit is exceeded?
  • there is a "metadata overriding" policy (e.g. Fork's metadata extends Node's metadata) => another level of complexity
  • if one Fork's metadata is very large, then every other Fork's metadata has to have the same large size (wasting space)
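
The size limit and the alignment cost in the list above can be illustrated with a minimal sketch. It assumes 32-byte metadata segments and a cap of 2^5 segments per fork, as the figures in the list suggest. The names `SEGMENT_SIZE`, `MAX_SEGMENTS` and `padded_fork_metadata_size` are hypothetical and not taken from the SWIP or the reference implementation.

```python
SEGMENT_SIZE = 32      # bytes per metadata segment (figure quoted in the comment)
MAX_SEGMENTS = 2 ** 5  # assumed cap on segments per fork, giving 1024 bytes


def padded_fork_metadata_size(metadata_lengths):
    """Return (bytes_per_fork, wasted_bytes) when every fork's metadata
    is padded to the size of the largest one in the same node."""
    def segments_needed(length):
        # Ceiling division: number of 32-byte segments needed for `length` bytes.
        return -(-length // SEGMENT_SIZE)

    widest = max((segments_needed(n) for n in metadata_lengths), default=0)
    if widest > MAX_SEGMENTS:
        raise ValueError("fork metadata exceeds the assumed size limit")

    per_fork = widest * SEGMENT_SIZE
    wasted = sum(per_fork - n for n in metadata_lengths)
    return per_fork, wasted


# Three forks carrying 40, 10 and 1000 bytes of metadata:
per_fork, wasted = padded_fork_metadata_size([40, 10, 1000])
print(per_fork, wasted)
```

In this sketch a single 1000-byte entry forces all three forks to reserve 1024 bytes each, and metadata larger than 1024 bytes is rejected outright. How the real format handles an oversized entry is exactly the open question raised above.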

I was thinking about what Fork's metadata allows in this setup, and I could think of only one thing: it allows a Fork to reference a file directly, without another "leaf" Mantaray node, because the needed metadata (filename etc.) can be put into the Fork's metadata.

My question is: is this a sufficient reason to introduce the additional level of complexity?

IMHO no. I imagine the main reason for this is to save space by not having the leaf node, but that somewhat contradicts the fact that the Fork metadata's size has to be aligned across all forks, which leads to another kind of waste. So we could simply drop the Fork's metadata (which would simplify the protocol a lot) and keep metadata only in the Node, which would require each file to have its own Node...

@nugaon
Member Author

nugaon commented Mar 7, 2022

Thanks for your comment, @AuHau.

> First I am not sure what was the reasoning to have metadata in Forks. I assume to store data like filenames etc. I would be interested to hear how this reasoning changed with the introduction of Node's metadata.

Handling metadata on forks has some advantages. It gives you information about the referenced resource before you fetch it, so the metadata you need at the parent level is already available (e.g. for a directory listing in your terminal).
Another use case: content has already been uploaded (potentially not by you), and you put it into your trie with different metadata that your application requires, or you simply want to handle the file differently.
Moreover, it lays the groundwork for other optimizations in which the fork's reference points to the content itself while the fork still has room for its own metadata, as we are used to (coming back to this, since you also mentioned it in your comment). For this feature, we have to agree on a convention in the metadata itself for what indicates this behavior, and act accordingly. I think we should start using this new format first, before building any new features on top of it.
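
As a rough illustration of the directory-listing case, the sketch below lists a parent node's entries from fork metadata alone, without fetching any child. The `Fork` class and the `filename` and `size` keys are hypothetical; the proposal does not prescribe these names.

```python
from dataclasses import dataclass, field


@dataclass
class Fork:
    prefix: bytes     # path segment leading to the child
    reference: bytes  # address of the child (never dereferenced here)
    metadata: dict = field(default_factory=dict)


def list_directory(forks):
    """Build a listing from the parent's forks only; no child is fetched."""
    return [
        (fork.metadata.get("filename", fork.prefix.decode()), fork.metadata.get("size"))
        for fork in forks
    ]


forks = [
    Fork(b"index.html", b"\x01" * 32, {"filename": "index.html", "size": 512}),
    Fork(b"img/", b"\x02" * 32),  # no fork metadata: fall back to the prefix
]
print(list_directory(forks))
```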

> • node's metadata are not size limited, but fork's metadata is (2^5*32 ~ 1kB)
>   - what happens when the fork's metadata is exceeded?

Fork metadata has to have a size limit if it is serialized in place. There is a plan to use content references for large and frequently used metadata later, but I think 1 KB per fork is quite enough for the first round.

> there is a "metadata overriding" policy (e.g. forks metadata extends node's metadata) => another level of complexity

Why exactly? You can get the forkMetadata and the nodeMetadata separately if you want.
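
A minimal sketch of the "fork metadata extends node metadata" policy under discussion, assuming both are plain key-value maps and that fork entries win on conflict. The function name `effective_metadata` is hypothetical, and the precedence rule is an assumption rather than something this thread confirms.

```python
def effective_metadata(node_metadata, fork_metadata):
    """Merge the two maps; fork entries override node entries (assumed policy)."""
    merged = dict(node_metadata)  # copy, so the caller's maps stay untouched
    merged.update(fork_metadata)
    return merged


node_metadata = {"filename": "a.txt", "mime": "text/plain"}
fork_metadata = {"filename": "b.txt"}

# Both maps remain available separately; the merged view is optional.
print(effective_metadata(node_metadata, fork_metadata))
```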

> if there is one Fork's metadata super large then all the others Fork's metadata has to have the same large size (wasting space)

This is a well-known property, but it has to be like this because of the fixed offsets in the forkArray. Nevertheless, it also has its ideal use cases, where the metadata required about a node at the parent level is uniform across all sibling nodes.
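
The fixed-offset argument can be sketched as follows: when every fork occupies the same number of bytes, the i-th fork is located by arithmetic alone, with no need to parse the forks before it. The sizes and the names `fork_offset` and `read_fork` are illustrative assumptions, not the actual Mantaray 1.0 layout.

```python
def fork_offset(index, header_size, fork_size):
    """Byte offset of the index-th fork in a node with equally sized forks."""
    return header_size + index * fork_size


def read_fork(buf, index, header_size, fork_size):
    """Slice the index-th fork out of the serialized node."""
    start = fork_offset(index, header_size, fork_size)
    if index < 0 or start + fork_size > len(buf):
        raise IndexError("fork index out of range")
    return buf[start:start + fork_size]


# Toy node: a 4-byte header followed by three 8-byte forks.
node = bytes(4) + b"A" * 8 + b"B" * 8 + b"C" * 8
print(read_fork(node, 2, header_size=4, fork_size=8))
```

With variable-sized fork metadata, this direct lookup would need either a scan over the preceding forks or an extra offset table. That is the trade-off behind padding every fork to the same size.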

@crtahlin crtahlin added the check-SWIP-status Check if the SWIP is still relevant and being pursued. label Jun 20, 2023