FIP-15: Link Defragmentation #169
-
It is not clear to me: Do we give Warpcast the power to re-define the Links of all FIDs?
It seems to me that the problem is not at the protocol level (add/remove messages), but on the implementation level (removed messages occupy more space in the db than they should/could). Why would this be fixed by a protocol change and not an implementation change? I may be missing something, please explain if I got everything wrong :-)
-
There is no guarantee that such a service will actually archive all messages. Any connectivity issue with Warpcast hubs (or any other hub that receives a lot of original messages, like neynar) would result in messages being pruned (compacted) before they even reach my archiving hub. This is not the case today. If I run a replicator today, it will store everything that's posted on the network, even if there are long periods of hub connectivity issues.
-
Maybe I'm missing something. Let's say this FIP is implemented. Then I follow a FID. Two days later I unfollow them. What kind of record will hubs have of these two actions, a week later? Will the timestamp and signer (i.e. app used) of my original follow and unfollow exist? My understanding is that the only on-hub record will be a CompactStateMessage that indicates that at some point in time before it was posted, I unfollowed this FID. Am I wrong?
-
I wanted to comment that older FIPs were more principled and recent ones feel narrower, but the newest version of this FIP is much better 👏 Minor: I still think that the term I think that something along the lines of
-
I think there should be a third rule, for Add/Remove messages posted after the CompactStateMessage?
-
This FIP implies that there is a single type of Link message (type="follow"). However, the protocol allows for arbitrary Link types that can be used to define various types of relationships between FIDs. For this FIP to be compatible with the current protocol specs, there are two alternatives. I think that both are reasonable and add minimal complexity, with A just being a more explicit version of the current proposal.
A. Narrow the scope of the FIP to Link.type="follow". Changes:
This means that CompactLinkMessages will only affect Link.type=="follow" and leave the rest of the LinkMessages unaffected.
B. Extend the scope to support any Link type. To do so, I propose the following changes:
-
Wrapping my head around this improvement for my own purposes and trying to document how to work with hub data. Currently it reads that hubs will, upon receiving a compaction message:
Should this be i.e. It feels like it should end up pruning the messages that are IN the compaction message so you preserve all link adds / current links?
-
Type: Implementation
Author: Aditya Kulkarni (@adityapk00)
Abstract
Link Defragmentation is a new process that can reclaim space in a Link store. A new "CompactStateLinkMessage" is added to represent the full state of a Link store for an fid. It will effectively compact the store by pushing out old LinkRemove messages, reducing the storage space required.
Problem
In our CRDT stores, delete messages occupy the same storage space as the add message. Deleting doesn't actually reduce the space used and users will eventually run out of space. When this happens, the CRDT evicts the oldest messages. This is actually reasonable for casts and reactions where the oldest content is less relevant and eventually all the delete messages get cycled out.
It is much more problematic for follows, where the oldest follows might be the most relevant. One way to solve this is to re-sign links so that the adds appear after the removes. Warpcast implements this and will periodically re-sign all LinkAdd messages with a higher timestamp than the LinkRemove, forcing the Removes to get evicted first when limits are reached. The downside of this approach is that it creates a lot of sync thrash.
For some users, thousands of messages are re-signed periodically to reclaim a few slots of storage space. This is creating issues where users are seeing their data get out of sync because the high volume of message changes is triggering rate limits and creating other problems.
Specification
A user can create a new type of message called CompactStateLinkMessage which contains the full state of the Link store. State here refers to all the users being followed, so it is simply a list of all fids that the user should be following. It "defragments" the Link store by cleaning up all old LinkRemove references. If the user's store is filled with a lot of removes, they can simply issue a new CompactStateLinkMessage with a list of all the fids they want to be following. When the hub receives this message it will:
1. Remove all LinkRemove messages where linkRemove.timestamp < compactStateMessage.timestamp
2. Remove all LinkAdd messages where linkAdd.timestamp < compactStateMessage.timestamp && linkAdd.targetFid NOT IN compactStateMessage.targetFids
This enables users to issue a single message which cleans up their state instead of having to re-sign and re-issue every LinkAdd message. A bonus side effect is that CompactStateLinkMessages can be used to discover missing LinkAdds that can be proactively synced.
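As a rough illustration, the compaction step can be sketched in Python. This assumes the two conditions above are eviction criteria: LinkRemoves older than the compact message are dropped, and older LinkAdds whose target is not listed are dropped. The `LinkMessage` and `CompactStateLinkMessage` classes and the `compact` and `missing_adds` helpers are hypothetical names for this sketch, not the hub's actual types or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkMessage:
    kind: str          # "add" or "remove" (hypothetical encoding)
    target_fid: int
    timestamp: int


@dataclass(frozen=True)
class CompactStateLinkMessage:
    target_fids: frozenset
    timestamp: int


def compact(store, compact_msg):
    """Return the messages that survive compaction (assumed semantics)."""
    kept = []
    for msg in store:
        older = msg.timestamp < compact_msg.timestamp
        if msg.kind == "remove" and older:
            continue  # old LinkRemoves are pushed out
        if msg.kind == "add" and older and msg.target_fid not in compact_msg.target_fids:
            continue  # old LinkAdds not listed in the compact state are dropped
        kept.append(msg)
    return kept


def missing_adds(store, compact_msg):
    """Fids listed in the compact state with no LinkAdd in the local store."""
    followed = {m.target_fid for m in store if m.kind == "add"}
    return sorted(compact_msg.target_fids - followed)


store = [
    LinkMessage("add", 10, 1),
    LinkMessage("remove", 11, 2),
    LinkMessage("add", 12, 3),
    LinkMessage("add", 13, 9),   # newer than the compact message, untouched
]
compact_msg = CompactStateLinkMessage(frozenset({10, 14}), 5)
kept = compact(store, compact_msg)
```

Here `kept` retains the LinkAdds for fids 10 and 13, and `missing_adds(kept, compact_msg)` reports fid 14 as a LinkAdd that could be proactively synced.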
If a user's Link set has no CompactStateLinkMessage there is no change to the set rules. But if a CompactStateLinkMessage is currently present in the set, the following rules apply:
1. If a LinkAdd arrives with msg.timestamp < compactStateMessage.timestamp, it is merged only if msg.targetFid IS IN compactStateMessage.targetFids.
2. If a LinkRemove arrives with msg.timestamp < compactStateMessage.timestamp, it is ignored.
3. If msg.timestamp > compactStateMessage.timestamp, normal rules apply.
Storage Considerations
The CompactStateLinkMessage will be large since it includes a list of fids. For a full link store this can be 2,500 fids and the message would be roughly 10 kB in size. Assuming that all users have a single storage unit, this would increase storage space by roughly 3% for compacted fids. A CompactStateLinkMessage cannot exceed 250,000 fids or 10 storage units.
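The figures above can be checked with a little arithmetic. This is a back-of-the-envelope sketch only: it takes 1 kB as 1,000 bytes, ignores any fixed per-message overhead (signature, hash, headers), and the per-unit total is merely what the quoted 3% implies, not a number stated in this FIP. All names are made up for the example.

```python
FIDS_IN_FULL_STORE = 2_500      # from the text above
MESSAGE_SIZE_BYTES = 10_000     # "roughly 10 kB", assuming 1 kB = 1,000 bytes
OVERHEAD_FRACTION = 0.03        # "roughly 3%"

# Average bytes attributed to each fid in the list (overhead ignored).
bytes_per_fid = MESSAGE_SIZE_BYTES / FIDS_IN_FULL_STORE

# Total per-unit storage implied by the 3% figure (derived, not stated).
implied_unit_bytes = MESSAGE_SIZE_BYTES / OVERHEAD_FRACTION


def estimate_message_bytes(fid_count):
    """Rough size of a compact message listing fid_count fids."""
    return fid_count * bytes_per_fid
```

Under these assumptions each listed fid costs about 4 bytes, and a single storage unit would hold on the order of 333 kB of messages in total.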
Release
The target release for this proposal is protocol version 5/1/2024.
Backwards compatibility for hubs
Older hubs will ignore the CompactStateMessages, and will temporarily fall out of sync from the network, so upgrading is recommended.
Backwards compatibility for consumers
APIs
If you have downstream apps, you don't need to specifically store a CompactStateMessage if you are handling all the LinkAdd / LinkRemoves. You can query getLinksByFid or getLinksByTarget and that will return the full set of all valid LinkAdd messages.
Events
When the hub merges a CompactStateMessage, it will generate a hub event like it does for all messages. The event will contain the CompactStateMessage as the mergedMessage, along with any conflicted out messages in the deletedMessages field. This will include any previous CompactStateMessage, all LinkRemoves and LinkAdds that were removed as a part of the compaction.
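For a consumer that keeps its own follow set, handling such an event might look like the sketch below. Only `mergedMessage` and `deletedMessages` come from the text above; the dictionary shape, the `type` and `targetFid` keys, the type strings, and the `apply_merge_event` helper are assumptions for illustration and do not reflect the actual hub event schema.

```python
def apply_merge_event(follows, event):
    """Update a set of followed fids from one merge event (assumed shape)."""
    # Conflicted-out messages first: a deleted LinkAdd means that follow
    # is no longer valid. Deleted LinkRemoves and compact messages need
    # no change to the follow set.
    for msg in event.get("deletedMessages", []):
        if msg["type"] == "LinkAdd":
            follows.discard(msg["targetFid"])

    # Then the merged message itself.
    merged = event["mergedMessage"]
    if merged["type"] == "LinkAdd":
        follows.add(merged["targetFid"])
    elif merged["type"] == "LinkRemove":
        follows.discard(merged["targetFid"])
    # A compact state message needs no extra handling here: its effect
    # on the follow set is already carried by deletedMessages.
    return follows


follows = {1, 2, 3}
event = {
    "mergedMessage": {"type": "CompactState", "targetFids": [1, 2]},
    "deletedMessages": [
        {"type": "LinkAdd", "targetFid": 3},
        {"type": "LinkRemove", "targetFid": 7},
    ],
}
apply_merge_event(follows, event)
```

After this event the follow set contains fids 1 and 2. Processing deletions before the merged message keeps the set correct when a new LinkAdd replaces an older LinkAdd for the same target.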