Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

irmin-(client|server): remove server-side state, simplify batch api, and cleanups #2272

Merged
merged 7 commits into from
Sep 29, 2023

Conversation

metanivek
Copy link
Member

This is a follow up PR from the initial irmin-server integration PR.

The big changes:

  1. Remove most server-side state. Previously, there was an API to manipulate trees on the server, and also "current branch/store" tracking per client. The server-side tree manipulation APIs have been removed and branch/store state is now kept on the client and passed when needed. We can add these sorts of client/server interactions in the future, but I think it is generally better to keep as little state on the server as possible.
  2. Simplify batch API. The batching API is now completely "build on the client and then send to the server for application". With the exception of (sometimes) adding local trees to a batch, creating batches will not need to call the server. The API is smaller then previously but we can grow it as needed based on use cases.

There are several other cleanups: adopting irmin's [%log.* ...] ppx since this is now an official irmin sub-project, adding a full example that demonstrates batching, and adding a mutex for requests so that the server is only processing one client request at a time.

It's also worth noting that I discovered a bug in irmin-git/ocaml-git while testing the batch apply API. I opened PR for ocaml-git that fixes the core issue, but until that is released and then irmin-git is patched, we cannot use the batch api's add_value with irmin-git since it results in a decoding issue on the server.

@metanivek metanivek requested review from art-w and clecat August 24, 2023 13:56
@metanivek metanivek added the no-changelog-needed No changelog is needed here label Aug 24, 2023
@samoht
Copy link
Member

samoht commented Aug 24, 2023

Thanks. Simplifying this API to store less state on the server is a good idea.

There is still something off in that API that we could improve. For instance, we have:

  • type tree ~= hash | concrete_tree for client/server
  • type batch ~= (path * tree) list

If you blink hard enough, both types should be Merkle proofs (i.e. a tree with some trees hidden behind a hash). I wonder if we could instead expose a client-side API to manipulate Merkle proofs and base all client/server messages on this instead of custom, non-optimised approximations.

I don't have more concrete suggestions apart that I believe you are going in the right direction :-) Also, another thing to consider for the client-side API is how it will look when using other languages (e.g. Python or C). So, we need to check what bindings would look like.

@metanivek
Copy link
Member Author

If you blink hard enough, both types should be Merkle proofs (i.e. a tree with some trees hidden behind a hash). I wonder if we could instead expose a client-side API to manipulate Merkle proofs and base all client/server messages on this instead of custom, non-optimised approximations.

The tree part of the batch api definitely could use more exploration (this PR only removes server-side id trees but keeps the rest mostly untouched). If you are adding a pre-existing tree, I think that the api works decently well (just using the tree's key), but if the tree doesn't have a key it falls back to concrete trees, which doesn't seem that great. I'll think about how to unify this using proofs but am open to more thoughts about how it might work if you have them to share!

I think the contents part of the batch works okay. It is either a new value or a hash for an existing value (although maybe it should be contents_key based instead).

Also, another thing to consider for the client-side API is how it will look when using other languages (e.g. Python or C). So, we need to check what bindings would look like.

Good point. I've exercised it lightly for OCaml through the example, but I agree looking at foreign bindings is a good exercise. I guess a first step would be adding irmin-client bindings to libirmin.

@samoht
Copy link
Member

samoht commented Aug 24, 2023

I think the initial goal we had with Zach was:

  1. the OCaml client bindings to be just an ordinary Irmin.S implementation, and so could just pass irmin-test
  2. tree operations have to be done on the client side.
  3. the client/server interface should be higher than the low-level store to allow some sort of batching

I think 1. and 2. are working for irmin-client, but using the low-level store API calls (so defeating 3.). The batch API was a middle-ground solution to fix 3. but I am not super happy with it either.

Maybe the right way to address this (not in this PR, which is already a reasonable simplification of the batch API) is to look at all the API calls that involve trees and see what kind of client/server interactions we want from those. I suspect we could re-implement a different tree.ml for clients, where reads will be on-demand (like they are today) but where we have a way to batch writes. Not totally sure how this would work without looking more in-depth tbh 😅

src/irmin-server/command.ml Outdated Show resolved Hide resolved
Instead of setting/getting a current branch from the server, use the
status of the store on the client to create it when needed on the
server.
@metanivek
Copy link
Member Author

@art-w thanks for the review! I've updated both the things you called out.

Copy link
Contributor

@art-w art-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@art-w art-w merged commit e24b729 into mirage:main Sep 29, 2023
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog-needed No changelog is needed here
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants