Replies: 3 comments
-
I just checked some metrics from the current Bisq data store.
-
Another idea to overcome the issue that protobuf is not guaranteed to be deterministic:
-
After some discussions with @sqrrm I think the data storage concept becomes a bit clearer. I'll try to summarize the current approach:

**Access control**

The access control layer will play the main role in protecting against DDoS attacks.

**Peer management**

Besides that, the peer management will attach a network reputation to peers. E.g. a peer which has been online for a long time and has decent response times will get a higher score than newer or less stable peers.

**Data store**

There are in-memory-only data (like offers) and persisted data. Each data type is kept independently in its own map and storage. All have their defined limits, and if those limits are exceeded the store enters "alert mode", where it throttles the accepted data and drops old data (see the sketch at the end of this comment). I am not decided yet but tend to stick with the file-based persistence approach. This can also help with recovery from an attack, as backup files from the rolling backup can replace polluted files, and it gives more fine-grained control than a database where all the data is combined. Starting a separate DB for each storage data type would add quite a bit of overhead.

Data like Filter, Alert or Dispute agents are signed by a privileged key (either hard-coded or derived from the DAO) and verification will be done at the p2p network level, so such data cannot be abused: invalid signatures lead to disconnecting the attacker and dropping the data.

**Use cases**

Here is a list of the different data types to see how they can be treated in the new model (even though some of the data will not be used, or will be used differently).

**ProtectedStoragePayload**

Data kept in memory only, not persisted. Protected by the signature of the owner, so only the owner can remove it (in the case of MAILBOX_STORAGE_PAYLOAD only the receiver can remove it).

ALERT, ARBITRATOR, MEDIATOR, REFUND_AGENT, FILTER:
All are protected by a hard-coded pubKey which needs to match the signature. This is done at the application layer in Bisq, but it should be moved lower so that invalid data gets dropped immediately, avoiding the risk that this data can be polluted. "Moving lower" does not mean that the p2p network layer handles the verification itself; rather, the data classes implement the verification and the p2p network calls that interface method.

OFFER_PAYLOAD:
We will likely keep offers online for a while (e.g. 2 days) even if the maker is offline. The maker still needs to send …

MAILBOX_STORAGE_PAYLOAD:
These are the most critical data, as they are important for completing a trade. We can apply a similar strategy as above. Active users will republish their mailbox messages frequently, thus updating their timestamp so the messages do not get dropped. The limits can be higher here. Currently in Bisq this data is about 4-5 MB but should be reducible by optimisations to 1-2 MB, so allowing up to 10-20 MB seems acceptable.

TEMP_PROPOSAL_PAYLOAD:
Not clear yet if the DAO will be part of Misq or be kept as an external app inside Bisq. Here is a list of the size of data we get at the initial data request for a fresh app:
- BlindVotePayload: 342 / 1,121 MB

**Persisted data** (as in current Bisq)

ACCOUNT_AGE_WITNESS_STORE (about 3 MB), SIGNED_WITNESS_STORE (7.6 MB):
We can drop old data if the user is not active anymore. Active users will republish their data, which updates the timestamps.

BLIND_VOTE_STORE (1.3 MB), PROPOSAL_STORE (140 kB), TEMP_PROPOSAL_STORE:
TEMP_PROPOSAL_STORE should not be used anymore.

DAO_STATE_STORE (126 MB):
That is the most problematic area, as it grows linearly with BSQ txs, and writing such a large file at each change (e.g. each new block) already causes performance issues.

TRADE_STATISTICS3_STORE (about 5 MB):
This data is not highly critical and mostly serves informational purposes, apart from its usage as the Bisq-internal price indicator, which is not used for %-based trades anyway due to its insecure nature (easy to manipulate).

So my conclusion is that the access control tool combined with more fine-grained storage will be sufficient to protect the network from DDoS attacks on the storage layer.
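To make the "alert mode" idea concrete, here is a minimal sketch of a per-type store with a size limit, throttling and oldest-first eviction. All names are hypothetical, not taken from the Bisq/Misq code base:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch: one store per data type, each with its own limit. When the
// limit is exceeded the store enters "alert mode": new data is only accepted
// by dropping the oldest entry, and the caller is told to throttle.
public class BoundedDataStore<K, V> {
    private final int maxEntries;
    private boolean alertMode = false;

    // LinkedHashMap keeps insertion order, so the first key is the oldest entry.
    private final Map<K, V> map = new LinkedHashMap<>();

    public BoundedDataStore(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Returns false while in alert mode so the caller can throttle inbound data.
    public synchronized boolean add(K key, V value) {
        alertMode = map.size() >= maxEntries;
        if (alertMode) {
            K oldest = map.keySet().iterator().next();
            map.remove(oldest); // drop old data to make room
        }
        map.put(key, value);
        return !alertMode;
    }

    public synchronized boolean isAlertMode() {
        return alertMode;
    }
}
```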
-
I have not found another solution for ensuring the validity of the key in the hashmap for the data storage.
We use that to look up whether we have already received the data; if so, we do not propagate it further in the gossip protocol.
It also gives protection against a malicious node occupying a map slot and thereby preventing the real data from being distributed.
Any scheme where we would trust the provided data (e.g. the message contains the hash so we do not need to calculate it) would not help for those two conflict scenarios, as we would then need to verify both the value in the map and the new value. As the first use case is very common, that would likely lead to even higher costs; if we calculate the hash ourselves, we can trust the map value and never need to recalculate it.
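As a rough illustration of always deriving the key locally rather than trusting a hash taken from the message (class and method names are hypothetical):

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the map key is always the locally computed hash of the serialized
// payload, never a hash taken from the message, so a malicious peer cannot
// occupy a map slot for data it did not actually send.
public class PayloadStore {
    private final Map<ByteBuffer, byte[]> store = new ConcurrentHashMap<>();

    // Returns true if the payload is new and should be gossiped further.
    public boolean addIfAbsent(byte[] serializedPayload) {
        // ByteBuffer gives content-based equals/hashCode for the byte[] key.
        ByteBuffer key = ByteBuffer.wrap(sha256(serializedPayload));
        return store.putIfAbsent(key, serializedPayload) == null;
    }

    private static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```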
Using signatures or an HMAC would not help either. A malicious node could create a valid signature but provide the wrong key (we could then use a normal uid instead of the hash). That would be an easier attack than the one above with the hashes: one could take the uid of a competing offer, wait until it is removed, then publish a fake offer with the same uid and a valid signature, and the maker, when back online, would fail to get their offer broadcast.
If anyone has another idea, let me know, but I fear there is no alternative to using the hash of the serialized payload, thus bringing us back to the issue that the serialisation method should be deterministic.
If the serialisation were different for different users it might be less critical, as long as it deterministically produces the same bytes for the local user, since that map is local. For the signature check, though, it would be a problem if user A got different bytes as input for the signature than user B uses for verifying it. But I think we could use a uid instead of the hash of the data. The verification uses the sequence number as well, and whether the sequence number is correct is checked beforehand. So I think it would be safe to use only the uid as input for the signature, thus removing the risk that a protobuf implementation generates different bytes.
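A minimal sketch of signing the uid plus sequence number instead of the payload hash, assuming a standard JCA EC key pair (all names hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Sketch: sign (uid || sequenceNumber) instead of the hash of the serialized
// payload, so signature verification does not depend on protobuf producing
// byte-identical serializations on both sides.
public class OwnershipSignature {

    static byte[] signedInput(String uid, int sequenceNumber) {
        byte[] uidBytes = uid.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(uidBytes.length + Integer.BYTES)
                .put(uidBytes)
                .putInt(sequenceNumber)
                .array();
    }

    static byte[] sign(String uid, int sequenceNumber, PrivateKey key)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withECDSA");
        sig.initSign(key);
        sig.update(signedInput(uid, sequenceNumber));
        return sig.sign();
    }

    static boolean verify(String uid, int sequenceNumber, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withECDSA");
        sig.initVerify(key);
        sig.update(signedInput(uid, sequenceNumber));
        return sig.verify(signature);
    }
}
```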
Though we never saw any such issues, apart from hashmaps being serialized deterministically in Java but not in Rust. But avoiding hashmaps in proto is easy and solves that known issue (see the sketch below).
So maybe we should not over-emphasize that theoretical risk. We can also freeze the protobuf version if a change were to break the deterministic behaviour.
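As a rough illustration of that fix (a hypothetical helper, not from the code base): instead of a proto map field, fill a repeated field with entries in a canonical order, so every implementation serializes the same byte sequence:

```java
import java.util.List;
import java.util.Map;

// Sketch: a proto map field has no guaranteed wire order across
// implementations; emitting the entries as a repeated field sorted by key
// makes the serialization order deterministic everywhere.
public class CanonicalMaps {
    public static List<Map.Entry<String, String>> canonicalEntries(Map<String, String> map) {
        return map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .toList();
    }
}
```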
Regarding the storage strategy:
A database would not make sense, as we need to look up the hashes frequently and a DB lookup would be too slow.
Also, creating schemas for all potentially persisted data would be very cumbersome. Using the protobuf serialization avoids the need for a second schema.
I tend towards a model similar to the one in Bisq now: storage services which are implemented by the clients. That way the clients only look up their local maps and get better performance.
The p2p lib just manages a list of services and delegates the storage to those, roughly as sketched below. Persistence will be implemented on that end as well.
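Roughly, the delegation could look like this (interfaces hypothetical, not the actual Bisq API):

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch: clients implement and register storage services; the p2p lib only
// routes add/remove calls to whichever service handles the payload type.
public interface StorageService {
    boolean canHandle(Object payload);
    boolean add(Object payload);   // returns true if new -> gossip further
    void remove(Object payload);
}

class P2pDataStorage {
    private final List<StorageService> services = new CopyOnWriteArrayList<>();

    void register(StorageService service) {
        services.add(service);
    }

    boolean onAddDataMessage(Object payload) {
        Optional<StorageService> service = services.stream()
                .filter(s -> s.canHandle(payload))
                .findFirst();
        return service.map(s -> s.add(payload)).orElse(false);
    }
}
```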
Currently we have one map which holds data that is never persisted (like offers) as well as persisted data, with different features (mainly protected data, which allows removal, and append-only data).
One open problem is how we can support data that is both protected and persisted. TempProposal was such a use case: only the owner can remove it, but it got persisted. That caused the problem that a user who was online at the time of the AddDataMessage persisted the object; later, when the user was offline and the owner removed the TempProposal, the user missed that RemoveDataMessage and kept the TempProposal in their persisted data.
To fix that we would need to keep the latest RemoveDataMessage (using the sequence number), so that when the user goes online again they receive the missed RemoveDataMessage and can apply it to remove the TempProposal. A sketch of that idea follows below.
Such a solution would require a TTL for the RemoveDataMessage so that those messages do not fill up storage, and that TTL would then need to be well adjusted to the TTL of the payload... Not sure if the added complexity is worth it to support that use case. For TempProposal we have removed the delete option to avoid that problem, but we could have implemented it as non-persisted data with a long TTL. That would have avoided the whole problem, at the small cost that each node needs to load this data at startup, and with the small risk that, if the network has a network-wide problem and forgets about the data, the lost data is harder to recover safely.
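For illustration, a minimal hypothetical sketch of that idea: keep the latest remove per uid (by sequence number) for a limited TTL, and apply it against persisted payloads at startup:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: retain the latest remove (by sequence number) for a limited TTL so a
// node coming back online can apply removes it missed while offline.
public class RemoveLog {
    record Remove(int sequenceNumber, Instant receivedAt) {}

    private final Duration ttl;
    private final Map<String, Remove> removesByUid = new ConcurrentHashMap<>();

    public RemoveLog(Duration ttl) {
        this.ttl = ttl;
    }

    // Keep only the remove with the highest sequence number per uid.
    public void onRemove(String uid, int sequenceNumber) {
        removesByUid.merge(uid, new Remove(sequenceNumber, Instant.now()),
                (old, fresh) -> fresh.sequenceNumber() > old.sequenceNumber() ? fresh : old);
    }

    // At startup, drop any persisted payload whose latest remove supersedes it.
    public boolean shouldDrop(String uid, int payloadSequenceNumber) {
        Remove remove = removesByUid.get(uid);
        return remove != null && remove.sequenceNumber() > payloadSequenceNumber;
    }

    // Periodically evict removes older than the TTL so the log stays bounded.
    public void evictExpired() {
        Instant cutoff = Instant.now().minus(ttl);
        removesByUid.values().removeIf(r -> r.receivedAt().isBefore(cutoff));
    }
}
```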