Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple back-ends #8

Open
ticki opened this issue Aug 5, 2017 · 8 comments
Open

Multiple back-ends #8

ticki opened this issue Aug 5, 2017 · 8 comments

Comments

@ticki
Copy link
Member

ticki commented Aug 5, 2017

I was wondering if the API could be generalized to the extend that switching to non-epoch (e.g. conc) backends would be possible.

For example, there could be a Backend trait with an interface something like this:

  1. gc for forcing garbage collection.
  2. add_garbage for - well - adding new garbage.
  3. start_block_dtor for blocking an object's destruction
  4. end_block_dtor for ending the block.

Under epochs, 3. and 4. would corresponds to pin and Guard::drop respectively. Under hazards, 3. and 4. would correspond to creating hazards and freeing them respectively.

cc. @stjepang.

@ticki
Copy link
Member Author

ticki commented Aug 5, 2017

Actually, it'd need start_critical too for the section before start_block_dtor (in conc, it would create Blocked hazard).

@jeehoonkang
Copy link
Contributor

I believe @stjepang named the new repo crossbeam-epoch in order to support other safe reclamation schemes in the future (e.g. crossbeam-hazard). Right?

Recently, after reading @ticki's blog on conc, I started to imagine what should be the interface of the Backend trait. I think there are many subtle issues. I will summarize them in int issue in a near future.

@ticki
Copy link
Member Author

ticki commented Aug 6, 2017

I believe @stjepang named the new repo crossbeam-epoch in order to support other safe reclamation schemes in the future (e.g. crossbeam-hazard). Right?

That's exactly how I got the idea.

Recently, after reading @ticki's blog on conc, I started to imagine what should be the interface of the Backend trait. I think there are many subtle issues. I will summarize them in int issue in a near future.

There are many issues, but I think that it's possible, because as long as the trait is extensive enough, the unused interfaces for either backend can simply be empty methods.

@ticki
Copy link
Member Author

ticki commented Aug 6, 2017

The trait could look like this:

trait Backend {
    fn try_gc() -> bool;
    fn gc() {
        while !Self::try_gc() {}
    }

    fn begin_epoch();
    fn end_epoch();

    fn begin_read() -> usize;
    unsafe fn protect(ptr: *const u8, handle: usize);
    unsafe fn unprotect(handle: usize);

    fn add_garbage(ptr: *const u8);
}

In the start of your manipulating function, you'd call begin_epoch(). Then before every concurrent read, you call begin_read(), then you use the handle with the pointer you read and call protect() until you don't need to read it anymore, which is where you call unprotect on the handle given by begin_read(), then when you need to queue destruction, you call add_garbage, and finally you call epoch_end.

It seems very messy like that, but imagine wrapping it in some convinient API:

  • begin_read, protect and unprotect is unified into a RAII type, Reader, akin to conc::Guard.
  • begin_epoch and end_epoch is unified into a RAII type, Epoch, akin to crossbeam::mem::sync::Guard.

@ghost
Copy link

ghost commented Aug 7, 2017

Having a common API that encapsulates both EBR and HP is definitely a compelling idea. Last year I implemented exactly that in C++ (the code is not public, sorry!). In the end this common backend API worked ok with simple stuff like lock-free stacks and queues, but was more painful with skiplists.

My general feeling was that the idea seems nice in theory, but in practice isn't worth it. However, that's just my 2 cents based on one semi-successful attempt, and I encourage you to explore this idea further.

@ticki
Copy link
Member Author

ticki commented Aug 7, 2017

@stjepang, could you elaborate on the cases where it fails?

@ghost
Copy link

ghost commented Aug 7, 2017

I think my main beef with it was that the API is the lowest common denominator of EBR and HP, so you have to deal with the pains of both. With very complex data structures the backend API gets kinda messy, because you have to mark the critical section (the EBR part) as well as protect each pointer read (the HP part). And then if you want to get into optimizations, it becomes even more painful.

What is compelling about this approach is that the user can choose the optimal garbage collection scheme for themselves. While that sounds nice, I actually tend to believe that the benefits of giving the choice to the user are very little. For concurrent data structures there usually already is the right choice.

For example, skiplists with HP are around 3 times slower than skiplists with EBR, so why would anyone want HP? Or, hash tables should be almost equally fast with HP and EBR (depends on the hash table flavor), so why would anyone want EBR?

The direction I'm more interested in exploring are hybrid approaches: combining EBR and HP so that you get the best of both worlds. For example, if you're searching for a key in a B+ tree, you probably want EBR because you're going to load multiple nodes while traversing the tree. However, if you're implementing an iterator over leaves of the tree, you want to protect nodes using HP (because they don't have those annoying critical sections).

@ticki
Copy link
Member Author

ticki commented Aug 8, 2017

I think my main beef with it was that the API is the lowest common denominator of EBR and HP, so you have to deal with the pains of both. With very complex data structures the backend API gets kinda messy, because you have to mark the critical section (the EBR part) as well as protect each pointer read (the HP part). And then if you want to get into optimizations, it becomes even more painful.

That's true.

What is compelling about this approach is that the user can choose the optimal garbage collection scheme for themselves. While that sounds nice, I actually tend to believe that the benefits of giving the choice to the user are very little. For concurrent data structures there usually already is the right choice.

Well, it means that complex structures such as hash tables won't have to be implemented twice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants