FIP-6: Flexible Storage #98
-
Do we have to impose storage restrictions on the user level? Are other solutions possible? Gmail became the most popular email client b/c it provided users with so much storage (compared to other email clients) that users didn't have to think about it. @kcchu mentioned that [this] problem could be solved later and by others. Perhaps better solution(s) could be found later without imposing storage limits on users, such as:
How does addressing this problem today help Farcaster clients/applications find product-market fit faster?
-
If cheaper/free signups are desired, could be nice to have different "tiers" of storage payment, such that if there is spare unused free space, it can be claimed with a lower-tier purchase:
There is a risk that this idea makes a free-rider problem more likely to emerge. As long as hubs all retain strong social consensus that the highest tier takes priority, and won't fork that out, they should coordinate to disable and drop second-tier support more aggressively if the network as a whole has too much of a free-rider problem.
-
Can storage be proportionate to a user’s account quality? By default everyone gets a low quota; if your account is popular among others (follows, likes, recasts), the quota increases. Should storage fees be distributed among hub operators?
-
Generally I favor local over global state. 'Light hubs', as I suggested to Greg at the Farcaster meetup in Paris, would solve DoS vectors: use the web of trust as a rate limiter, with power users (highly connected nodes in the social graph) hosting infrastructure for their communities (for example, 3 degrees of separation from direct follows). Social media isn't a strictly financialized adversarial environment; there are non-financial incentives as well. I'm willing to bet that most people would share idle PC hosting time to support their direct communities (people they follow, groups they are a part of). Side comment: I think Lens is making a mistake by over-financializing their platform.
Maybe there's a bootstrapping phase where new users need to find content from global state, but as the network evolves, a rich web of trust will form, at which point anti-DoS mechanisms which emphasize local state and local trust could work well. Global state could still benefit from tapping into the web of trust. Perhaps Farcaster power users who are willing to pay $5 for extra storage can lend that storage to their local web of trust.
-
This sounds reasonable. But I wonder if it's better to have a $1 account with 1/5th of the current limits for people to try out before upgrading to the $5 account. My concern comes from 2 points:
-
What about the case where you want to set up a shared umbrella of storage for a number of users? A Telegram group chat or a GitHub Discussions forum are examples. On Discord you can boost servers to give them more functionality/storage. You can set up an FID for that community, and then that FID can be given storage by anyone. The problem is that messages intended for a group/community are signed by an individual member, and not by the overarching group. This can be solved by having the group set up a server that signs messages on top of a base message from a user. The downside is that a group needs to set up a dedicated server with a private key that can sign messages.
-
The solution sounds quite reasonable for the short term. However, we will face a scenario where a user pays $10 to rent 2 units of storage and only pays $5 the next year. How are we going to prune the user's excess data: on a First-In-First-Out basis (delete the earlier messages and keep later ones), or should each application developer provide an interface for users to delete messages one by one? Also, the current requirement is that "each Hub must store a copy of every user’s data." Will we find a solution for this in the future? The capacity for the total number of casts is still restricted by the Hub's minimum disk space, which restricts the growth of the Farcaster ecosystem in the long term.
-
FIP: Flexible Storage
Title: Flexible Storage
Type: Implementation FIP
Authors: @cassie, @horsefacts, @v
Problem
Users on Farcaster can only store a fixed number of messages after which older messages are expired. Limited storage, along with restricted signups, has kept the size of the network from growing indefinitely during testnet. This has been important to make running a Hub practical, since each Hub must store a copy of every user’s data.
When mainnet launch happens, signups will become permissionless and inexpensive which creates a vector for unbounded growth. This will lead to a few problems:
Message storage space is a common resource which is rivalrous. Users would prefer not to have limits on what they can do while Hub operators would prefer to only store useful content. In the absence of guardrails, we should expect overconsumption by users and eventually, a tragedy of the commons where Hub operators try to implement controls to block certain users.
We need a better system to manage and allocate storage between users and hubs that is:
Specification
We propose a system that imposes a limit on the size of a Hub and divides it into equal shares called units. Users can rent units of storage by paying a yearly fee to a contract which allows them to store messages on Hubs.
A Hub’s capacity is divided into n equal units which is the total supply available for users. Acquiring a unit of storage costs a price p and is performed by making a transaction to a smart contract, which emits an event. Hubs monitor these events and increase or decrease a user’s storage limits. If a user acquires a unit, their CRDT limits on the Hub increase by:
These limits are chosen to cover the 99th percentile usage for casts and links, and the 90th percentile usage for reactions over the course of the year. Casts and links provide more long-term value to the network than reactions, so we set those limits higher than that of reactions. See the appendices for more details.
A unit of storage may take up to s bytes on disk, which depends on the type of messages in each CRDT and the limits assigned to them. Hubs must ensure that they reserve at least s*n bytes of storage for messages from users. A Hub that does not allocate enough space will not be able to synchronize the network and may be disconnected by its peers.
The storage parameters n, p, d and s can be modified as part of a protocol release which happens every 6 weeks. Parameters may be changed for the following reasons:
p may be increased if it is not preventing spam or decreased if it hinders onboarding.
n may be increased if ≥ 25% of s*n is in use by the network.
s may be increased or decreased if new message types are added, existing types are extended or if limits are changed to reflect common usage patterns.
The system is intended to last for a year and be manually tuned for that period. Afterwards, we may choose to replace it with a more dynamic system or continue with the current implementation.
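As a rough illustration of the parameter rules above, the reserved space s*n and the 25% trigger for raising n can be written out directly. This is only a sketch: the value of s used here is a made-up placeholder (the text does not give one), and the names StorageParams and supply_increase_warranted are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageParams:
    n: int  # total supply of storage units
    s: int  # maximum bytes one unit may occupy on disk

    def reserved_bytes(self) -> int:
        """Bytes a Hub must reserve for user messages: s * n."""
        return self.s * self.n


def supply_increase_warranted(params: StorageParams, bytes_in_use: int) -> bool:
    """True once at least 25% of s*n is in use by the network."""
    # Integer form of bytes_in_use / (s * n) >= 0.25, avoiding float rounding.
    return 4 * bytes_in_use >= params.reserved_bytes()


# n = 50,000 comes from the contract defaults; s = 1 MB is a hypothetical value.
params = StorageParams(n=50_000, s=1_000_000)
```

With these placeholder numbers a Hub would reserve 50 GB, and raising n would come under consideration once 12.5 GB is in use.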
Storage Contract
The storage contract lets users pay rent to acquire storage and manages the price and supply of storage. It’s deployed on the same L2 as the Farcaster Identity Registry, and has the following functionality:
Stores the price p, total supply n and deprecation timestamp d in storage, which are also initialized on deployment.
p is set to 500_000_000 (5 USDC or $5)
n is set to 50,000
d is set to block.timestamp + 365 days
Exposes a function rent to acquire storage units, which emits an event with the block.timestamp, number of purchased units (n), and the fid.
Exposes a function batchRent to acquire multiple units of storage for multiple fids, following the same logic as rent
Does not allow calls to rent or batchRent if block.timestamp > d
Allows p, d, and n to be changed at any time.
Hubs
Hubs must be updated to allow dynamic limits for each user within each CRDT type. They must also monitor the contract and update the limits as storage is purchased and expires.
Monitor Rent events and:
APIs
Hubs must also expose the following APIs which allows a caller to determine the storage limits for a user.
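One way to picture the Hub-side bookkeeping is a small model that records Rent events and answers a limits query for an fid. This is a sketch under stated assumptions, not the Hub implementation: the per-unit limits are invented placeholders (the actual values are not given in this text), a unit is assumed to expire exactly 365 days after the event timestamp, and the names Hub, RentEvent and storage_limits are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical per-unit limits; the real values are not given in this text.
LIMITS_PER_UNIT = {"casts": 100, "reactions": 50, "links": 50}
# Assumption: a rented unit lasts one year from the event's block timestamp.
RENT_PERIOD_SECONDS = 365 * 24 * 60 * 60


@dataclass(frozen=True)
class RentEvent:
    timestamp: int  # block.timestamp of the purchase
    units: int      # number of purchased units
    fid: int        # Farcaster id the storage is rented for


@dataclass
class Hub:
    events: list = field(default_factory=list)

    def on_rent(self, event: RentEvent) -> None:
        """Record a Rent event observed from the contract."""
        self.events.append(event)

    def active_units(self, fid: int, now: int) -> int:
        """Units for this fid that were purchased and have not yet expired."""
        return sum(
            e.units
            for e in self.events
            if e.fid == fid and e.timestamp <= now < e.timestamp + RENT_PERIOD_SECONDS
        )

    def storage_limits(self, fid: int, now: int) -> dict:
        """Per-CRDT message limits for this fid at time `now`."""
        units = self.active_units(fid, now)
        return {kind: units * limit for kind, limit in LIMITS_PER_UNIT.items()}
```

Deriving limits from the event log at query time means expiry needs no separate timer: a unit simply stops counting once its year has elapsed.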
Rationale
Can we prevent overconsumption by keeping the network invite-only?
Staying invite-only makes it harder for developers to onboard users and gatekeeps access to the network. This was necessary to bootstrap the network when it was being developed, but is becoming undesirable as the network grows. A permissionless system is a more level playing field for everyone.
Can we prevent overconsumption by kicking out the spammers?
People may have different perspectives on what spam is and making such decisions at the network level introduces a vector for censorship. A better approach is to reduce spam by adding fees and leave the final filtering steps to applications which can experiment with different, dynamic heuristics to identify useful content to surface.
Why is a unit set to a specific number of messages and not bytes?
Keeping track of bytes is much harder than keeping track of the number of messages and is less intuitive for users. A total message count is easier for end users to understand and model their behavior around.
Why is the price for a unit of storage $5/year and not more or less?
A “best guess” was made to select a price that was easy to communicate and was large enough to prevent overconsumption of storage. Networks like ENS and domain names have pricing models that are similar or in the same order of magnitude. We don’t yet know if this is the right number and may change it over time.
Why denominate in USDC instead of ETH?
Pricing in USDC makes it easier for end users - they pay a predictable fee every year that does not change based on the price of Ethereum, which is currently much more volatile.
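To make the predictability point concrete, here is a small sketch of how a fixed USD price could translate into an ETH amount at payment time. It assumes the fee is paid in ETH and converted through a USD price feed, that p = 500_000_000 is a fixed-point USD value with 8 decimals, and that the feed reports ETH/USD with the same 8 decimals. None of these details are confirmed by the text, and unit_cost_wei is a hypothetical name.

```python
WEI_PER_ETH = 10**18
FEED_DECIMALS = 8  # assumption: p and the ETH/USD feed both use 8 decimals


def unit_cost_wei(p: int, eth_usd_price: int) -> int:
    """Wei owed for one storage unit, rounded up so the fee is never underpaid."""
    return (p * WEI_PER_ETH + eth_usd_price - 1) // eth_usd_price


P = 500_000_000  # $5 under the 8-decimal assumption

# The USD fee stays fixed while the ETH amount moves with the market price.
cost_at_2000 = unit_cost_wei(P, 2000 * 10**FEED_DECIMALS)  # 0.0025 ETH
cost_at_1000 = unit_cost_wei(P, 1000 * 10**FEED_DECIMALS)  # 0.005 ETH
```

The user always pays the equivalent of $5; only the number of wei changes as the price of ETH moves.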
Why is the storage price fixed instead of market-based?
A fixed price system manually tuned by admins is simpler and less likely to have bugs. It helps us ship something quickly that we can develop as we observe user behavior. It is dependent on a price feed which adds some risk, but this is acceptable when the amounts are small. The right long term approach is a dynamic pricing approach with a GDA or VRGDA. This can maximize value captured while minimizing manual intervention. The main downside is that it is complex, takes a while to tune correctly and may result in “winner’s curse” during periods of high demand.
Who sets the storage price?
The price can be set by a multi-sig currently controlled by Dan & Varun. The price will be adjusted upwards if it is not effective enough at reducing spam and will be adjusted downwards if it is creating too much friction for user onboarding. For the first year, these decisions will be made by the team based on qualitative factors.
Where do the storage fees go?
Fees are expected to be minimal in the first year (~$10-20k) and are mostly for spam prevention. They will be collected in a multi-sig controlled by Dan & Varun. If they reach a significant amount ($100k+) we will consider spending them in ways that benefit the protocol.
Why rent storage instead of buying it permanently?
Storage units could be issued as transferrable ERC-20’s, which users can buy and re-sell when they no longer need it. While this creates a marketplace for storage, it creates more UX complexity, increases gas costs and makes it more difficult for app developers. It may also lead to hoarding and inefficient allocation, where users may lose or forget about tokens but Hubs can’t tell this and must keep reserving space for these “dead” tokens.
Release
Appendix A: Prior Work
Appendix B: Usage Patterns
The following data was collected from three groups of users: Group 1 (who signed up in 2021/2022), Group 2 (who signed up in 2022/2023) and Group 3 (who signed up in 2023). Data is obtained from Warpcast, which keeps an archive of pruned and revoked data, allowing us to get a broader picture of how many messages users generate over time.
Appendix C: Message Sizes
The following shows the maximum possible size of each message type allowed by the protobuf format and the average size observed on Hubs as of Jul 21st, 2023.
The following table shows the projected size of Hubs under three different scenarios for different numbers of storage slots:
We also multiply all estimates by a 2.42x overhead factor and a 5x buffer factor. The overhead accounts for indexes, events, sync tries and other consumption of storage. The buffer is simply to protect against underestimations and can be relaxed in the future.
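The adjustment described above is a straight multiplication, sketched here for clarity. The function name is hypothetical, and TB is taken to mean 10^12 bytes, which is an assumption.

```python
OVERHEAD_FACTOR = 2.42  # indexes, events, sync tries and other storage use
BUFFER_FACTOR = 5       # safety margin against underestimation
BYTES_PER_TB = 1e12     # assumption: decimal terabytes


def projected_hub_size_tb(raw_message_bytes: float) -> float:
    """Projected Hub size in TB after applying the overhead and buffer factors."""
    return raw_message_bytes * OVERHEAD_FACTOR * BUFFER_FACTOR / BYTES_PER_TB
```

Combined, the two factors multiply the raw message size by 12.1, so 1 TB of raw messages projects to roughly 12.1 TB of Hub storage.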
We expect realistic usage to fall somewhere between (1) and (2), likely closer to (1). The sizes are represented below in TB: