Replies: 3 comments
-
I just checked some metrics from the current Bisq data store.
-
Another idea to overcome the issue that protobuf is not guaranteed to be deterministic:
-
After some discussions with @sqrrm I think the data storage concept becomes a bit clearer. I'll try to summarize the current approach:

**Access control**

The access control layer will play the main role in protecting against DDoS attacks.

**Peer management**

Besides that, the peer management will attach a network reputation to peers. E.g. a peer which has been online for a long time and has decent response times will get a higher score than newer or less stable peers.

**Data store**

There are in-memory-only data (like offers) and persisted data. Each data type is kept independently in its own map and storage. All have their defined limits, and if those limits are exceeded the store enters "alert mode", where it throttles the accepted data and drops old data (see the sketch at the end of this comment). I am not decided yet but tend to stick with the file-based persistence approach. This can also help with recovery from an attack, as backup files from the rolling backup can replace polluted files, and it gives more fine-grained control than a database where all the data is combined. Starting a separate DB for each storage data type would add quite a bit of overhead.

Data like Filter, Alert or Dispute agents are signed by a privileged key (either hard-coded or derived from the DAO) and verification will be done at the p2p network level, so such data cannot be abused: invalid signatures lead to disconnecting the attacker and dropping the data.

**Use cases**

Here is a list of the different data types to see how they can be treated in the new model (even though some of the data will not be used, or will be used differently).

**ProtectedStoragePayload**

Data kept in memory only, not persisted. Protected by the signature of the owner, so only the owner can remove it (in the case of MAILBOX_STORAGE_PAYLOAD only the receiver can remove it).

ALERT, ARBITRATOR, MEDIATOR, REFUND_AGENT, FILTER:
All are protected by a hard-coded pubKey which needs to match the signature. This is done at the application layer in Bisq, but it should be moved lower so that invalid data gets dropped immediately, avoiding the risk that this data can be polluted. "Moving lower" does not mean that the p2p network layer handles the verification itself; rather, the data classes implement the verification and the p2p network calls that interface method.

OFFER_PAYLOAD:
We will likely keep offers online for a while (e.g. 2 days) even if the maker is offline. The maker still needs to send …

MAILBOX_STORAGE_PAYLOAD:
These are the most critical data, as they are important for completing a trade. We can apply a similar strategy as above. Active users will republish their mailbox messages frequently, thus updating their timestamp so the messages do not get dropped. The limits can be higher here. Currently in Bisq this data is about 4-5 MB but should be reducible by optimisations to 1-2 MB, so allowing up to 10-20 MB seems acceptable.

TEMP_PROPOSAL_PAYLOAD:
Not clear yet if the DAO will be part of Misq or be kept as an external app inside Bisq. Here is a list of the size of data we get at the initial data request for a fresh app:
- BlindVotePayload: 342 / 1,121 MB

**Persisted data** (as in current Bisq)

ACCOUNT_AGE_WITNESS_STORE (about 3 MB), SIGNED_WITNESS_STORE (7.6 MB):
We can drop old data if the user is not active anymore. Active users will republish their data, which updates the timestamps.

BLIND_VOTE_STORE (1.3 MB), PROPOSAL_STORE (140 kB), TEMP_PROPOSAL_STORE:
TEMP_PROPOSAL_STORE should not be used anymore.

DAO_STATE_STORE (126 MB):
That is the most problematic area, as it grows linearly with BSQ txs, and writing such a large file at each change (e.g. each new block) already causes performance issues.

TRADE_STATISTICS3_STORE (about 5 MB):
This data is not highly critical and mostly serves informational purposes, apart from its usage as the Bisq-internal price indicator, which is not used for %-based trades anyway due to its insecure nature (easy to manipulate).

So my conclusion is that the access control tool combined with more fine-grained storage will be sufficient to protect the network from DDoS attacks on the storage layer.
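To make the "alert mode" idea concrete, here is a minimal sketch of a per-type store with a size limit, throttling and oldest-first eviction. All names are hypothetical, not taken from the Bisq/Misq code base:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch: one store per data type, each with its own limit. When the
// limit is exceeded the store enters "alert mode": new data is only accepted
// by dropping the oldest entry, and the caller is told to throttle.
public class BoundedDataStore<K, V> {
    private final int maxEntries;
    private boolean alertMode = false;

    // LinkedHashMap keeps insertion order, so the first key is the oldest entry.
    private final Map<K, V> map = new LinkedHashMap<>();

    public BoundedDataStore(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Returns false while in alert mode so the caller can throttle inbound data.
    public synchronized boolean add(K key, V value) {
        alertMode = map.size() >= maxEntries;
        if (alertMode) {
            K oldest = map.keySet().iterator().next();
            map.remove(oldest); // drop old data to make room
        }
        map.put(key, value);
        return !alertMode;
    }

    public synchronized boolean isAlertMode() {
        return alertMode;
    }
}
```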
-
I have not found another solution for ensuring the validity of the key in the hashmap for the data storage.
We use that to look up whether we have already received the data; if so, we do not propagate it further in the gossip protocol.
It also gives protection against a malicious node occupying a map slot and thereby preventing the real data from being distributed.
Any scheme where we would trust the provided data (e.g. the message contains the hash so we do not need to calculate it) would not help for those two conflict scenarios, as we would then need to verify both the value in the map and the new value. As the first use case is very common, that would likely lead to even higher costs; if we calculate the hash ourselves, we can trust the map value and never need to recalculate it.
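As a rough illustration of always deriving the key locally rather than trusting a hash taken from the message (class and method names are hypothetical):

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the map key is always the locally computed hash of the serialized
// payload, never a hash taken from the message, so a malicious peer cannot
// occupy a map slot for data it did not actually send.
public class PayloadStore {
    private final Map<ByteBuffer, byte[]> store = new ConcurrentHashMap<>();

    // Returns true if the payload is new and should be gossiped further.
    public boolean addIfAbsent(byte[] serializedPayload) {
        // ByteBuffer gives content-based equals/hashCode for the byte[] key.
        ByteBuffer key = ByteBuffer.wrap(sha256(serializedPayload));
        return store.putIfAbsent(key, serializedPayload) == null;
    }

    private static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```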
Using signatures or an HMAC would not help either. A malicious node could create a valid signature but provide the wrong key (we could then use a normal uid instead of the hash). That would be an easier attack than the one above with the hashes: one could take the uid of a competing offer, wait until it is removed, then publish a fake offer with the same uid and a valid signature, and the maker, when back online, would fail to get their offer broadcast.
If anyone has another idea, let me know, but I fear there is no alternative to using the hash of the serialized payload, thus bringing us back to the issue that the serialisation method should be deterministic.
If the serialisation were different for different users it might be less critical, as long as it deterministically produces the same bytes for the local user, since that map is local. For the signature check, though, it would be a problem if user A got different bytes as input for the signature than user B uses for verifying it. But I think we could use a uid instead of the hash of the data. The verification uses the sequence number as well, and whether the sequence number is correct is checked beforehand. So I think it would be safe to use only the uid as input for the signature, thus removing the risk that a protobuf implementation generates different bytes.
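A minimal sketch of signing the uid plus sequence number instead of the payload hash, assuming a standard JCA EC key pair (all names hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Sketch: sign (uid || sequenceNumber) instead of the hash of the serialized
// payload, so signature verification does not depend on protobuf producing
// byte-identical serializations on both sides.
public class OwnershipSignature {

    static byte[] signedInput(String uid, int sequenceNumber) {
        byte[] uidBytes = uid.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(uidBytes.length + Integer.BYTES)
                .put(uidBytes)
                .putInt(sequenceNumber)
                .array();
    }

    static byte[] sign(String uid, int sequenceNumber, PrivateKey key)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withECDSA");
        sig.initSign(key);
        sig.update(signedInput(uid, sequenceNumber));
        return sig.sign();
    }

    static boolean verify(String uid, int sequenceNumber, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withECDSA");
        sig.initVerify(key);
        sig.update(signedInput(uid, sequenceNumber));
        return sig.verify(signature);
    }
}
```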
Though we never saw any such issues, apart from hashmaps being serialized deterministically in Java but not in Rust. But avoiding hashmaps in proto is easy and solves that known issue (see the sketch below).
So maybe we should not over-emphasize that theoretical risk. We can also freeze the protobuf version if a change were to break the deterministic behaviour.
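As a rough illustration of that fix (a hypothetical helper, not from the code base): instead of a proto map field, fill a repeated field with entries in a canonical order, so every implementation serializes the same byte sequence:

```java
import java.util.List;
import java.util.Map;

// Sketch: a proto map field has no guaranteed wire order across
// implementations; emitting the entries as a repeated field sorted by key
// makes the serialization order deterministic everywhere.
public class CanonicalMaps {
    public static List<Map.Entry<String, String>> canonicalEntries(Map<String, String> map) {
        return map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .toList();
    }
}
```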
Regarding the storage strategy:
A database would not make sense, as we need to look up the hashes frequently and a DB lookup would be too slow.
Also, creating schemas for all potentially persisted data would be very cumbersome. Using the protobuf serialization avoids the need for a second schema.
I tend towards a model similar to the one in Bisq now: storage services which are implemented by the clients. That way the clients only look up their local maps and get better performance.
The p2p lib just manages a list of services and delegates the storage to those, roughly as sketched below. Persistence will be implemented on that end as well.
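Roughly, the delegation could look like this (interfaces hypothetical, not the actual Bisq API):

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch: clients implement and register storage services; the p2p lib only
// routes add/remove calls to whichever service handles the payload type.
public interface StorageService {
    boolean canHandle(Object payload);
    boolean add(Object payload);   // returns true if new -> gossip further
    void remove(Object payload);
}

class P2pDataStorage {
    private final List<StorageService> services = new CopyOnWriteArrayList<>();

    void register(StorageService service) {
        services.add(service);
    }

    boolean onAddDataMessage(Object payload) {
        Optional<StorageService> service = services.stream()
                .filter(s -> s.canHandle(payload))
                .findFirst();
        return service.map(s -> s.add(payload)).orElse(false);
    }
}
```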
Currently we have one map which holds data that is never persisted (like offers) as well as persisted data, with different features (mainly protected data, which allows removal, and append-only data).
One open problem is how we can support data that is both protected and persisted. TempProposal was such a use case: only the owner can remove it, but it got persisted. That caused the problem that a user who was online at the time of the AddDataMessage persisted the object; later, when the user was offline and the owner removed the TempProposal, the user missed that RemoveDataMessage and kept the TempProposal in their persisted data.
To fix that we would need to keep the latest RemoveDataMessage (using the sequence number), so that when the user goes online again they receive the missed RemoveDataMessage and can apply it to remove the TempProposal. A sketch of that idea follows below.
Such a solution would require a TTL for the RemoveDataMessage so that those messages do not fill up storage, and that TTL would then need to be well adjusted to the TTL of the payload... Not sure if the added complexity is worth it to support that use case. For TempProposal we have removed the delete option to avoid that problem, but we could have implemented it as non-persisted data with a long TTL. That would have avoided the whole problem, at the small cost that each node needs to load this data at startup, and with the small risk that, if the network has a network-wide problem and forgets about the data, the lost data is harder to recover safely.
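For illustration, a minimal hypothetical sketch of that idea: keep the latest remove per uid (by sequence number) for a limited TTL, and apply it against persisted payloads at startup:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: retain the latest remove (by sequence number) for a limited TTL so a
// node coming back online can apply removes it missed while offline.
public class RemoveLog {
    record Remove(int sequenceNumber, Instant receivedAt) {}

    private final Duration ttl;
    private final Map<String, Remove> removesByUid = new ConcurrentHashMap<>();

    public RemoveLog(Duration ttl) {
        this.ttl = ttl;
    }

    // Keep only the remove with the highest sequence number per uid.
    public void onRemove(String uid, int sequenceNumber) {
        removesByUid.merge(uid, new Remove(sequenceNumber, Instant.now()),
                (old, fresh) -> fresh.sequenceNumber() > old.sequenceNumber() ? fresh : old);
    }

    // At startup, drop any persisted payload whose latest remove supersedes it.
    public boolean shouldDrop(String uid, int payloadSequenceNumber) {
        Remove remove = removesByUid.get(uid);
        return remove != null && remove.sequenceNumber() > payloadSequenceNumber;
    }

    // Periodically evict removes older than the TTL so the log stays bounded.
    public void evictExpired() {
        Instant cutoff = Instant.now().minus(ttl);
        removesByUid.values().removeIf(r -> r.receivedAt().isBefore(cutoff));
    }
}
```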