Removes the IGrainWithGuidKey constraint from ISyncWorker. #55

Merged 1 commit on Feb 9, 2024
88 changes: 47 additions & 41 deletions README.md
@@ -1,6 +1,6 @@
[![Build and test](https://github.com/OrleansContrib/Orleans.SyncWork/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/OrleansContrib/Orleans.SyncWork/actions/workflows/ci.yml) [![Coverage Status](https://coveralls.io/repos/github/OrleansContrib/Orleans.SyncWork/badge.svg?branch=main)](https://coveralls.io/github/OrleansContrib/Orleans.SyncWork?branch=main)

![Latest NuGet Version](https://img.shields.io/nuget/v/Orleans.SyncWork)
![Latest NuGet Version](https://img.shields.io/nuget/v/Orleans.SyncWork)
![License](https://img.shields.io/github/license/OrleansContrib/Orleans.SyncWork)

This package's intention is to expose an abstract base class to allow [Orleans](https://github.com/dotnet/orleans/) to work with long running, CPU bound, synchronous work, without becoming overloaded.
@@ -13,61 +13,67 @@ The project was built primarily with .net3 in mind, though the varying major ver

### Requirements

* [.net 6.0 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/6.0)
* [.net 7.0 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/7.0)
* [.net 8.0 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/8.0)
* [dotnet-format](https://github.com/dotnet/format)
- [.net 6.0 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/6.0)
- [.net 7.0 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/7.0)
- [.net 8.0 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/8.0)
- [dotnet-format](https://github.com/dotnet/format)

## Project Overview

There are several projects within this repository, all with the idea of demonstrating and/or testing the claim that the NuGet package https://www.nuget.org/packages/Orleans.SyncWork/ does what it claims to do.

Note that this project's major revision is kept in-line with the Orleans major version, so the project
does not necessarily abide by [SemVer](https://semver.org/), but we try as much as possible to do so. If breaking changes are introduced, descriptions of the breaking change and how to implement against it should be provided in release notes.

The projects in this repository include:

* [Orleans.SyncWork](#orleanssyncwork)
* [Orleans.SyncWork.Tests](#orleanssyncworktests)
* [Orleans.SyncWork.Demo.Api](#orleanssyncworkdemoapi)
* [Orleans.SyncWork.Demo.Api.Benchmark](#orleanssyncworkdemoapibenchmark)
* [Orleans.SyncWork.Demo.Services](#orleanssyncworkdemoservices)
- [Orleans.SyncWork](#orleanssyncwork)
- [Orleans.SyncWork.Tests](#orleanssyncworktests)
- [Orleans.SyncWork.Demo.Api](#orleanssyncworkdemoapi)
- [Orleans.SyncWork.Demo.Api.Benchmark](#orleanssyncworkdemoapibenchmark)
- [Orleans.SyncWork.Demo.Services](#orleanssyncworkdemoservices)

### Orleans.SyncWork

The meat and potatoes of the project. This project contains the abstraction of "Long Running, CPU bound, Synchronous work" in the form of an abstract base class [SyncWorker](https://github.com/OrleansContrib/Orleans.SyncWork/blob/main/src/Orleans.SyncWork/SyncWorker.cs); which implements an interface [ISyncWorker](https://github.com/OrleansContrib/Orleans.SyncWork/blob/main/src/Orleans.SyncWork/ISyncWorker.cs).
The meat and potatoes of the project. This project contains the abstraction of "Long Running, CPU bound, Synchronous work" in the form of an abstract base class [SyncWorker](https://github.com/OrleansContrib/Orleans.SyncWork/blob/main/src/Orleans.SyncWork/SyncWorker.cs); which implements an interface [ISyncWorker](https://github.com/OrleansContrib/Orleans.SyncWork/blob/main/src/Orleans.SyncWork/ISyncWorker.cs).

When long running work is identified, you can extend the base class `SyncWorker`, providing a `TRequest` and `TResult` unique to the long running work. This allows you to create as many `ISyncWorker<TRequest, TResult>` implementations as necessary, for all your long running CPU bound needs! (At least that is the hope.)
When long running work is identified, you can extend the base class `SyncWorker`, providing a `TRequest` and `TResult` unique to the long running work. This allows you to create as many `ISyncWorker<TRequest, TResult>` implementations as necessary, for all your long running CPU bound needs! (At least that is the hope.)

Basic "flow" of the SyncWork:

* `Start`
* Poll `GetStatus` until a `Completed` or `Faulted` status is received
* `GetResult` or `GetException` depending on the `GetStatus`
- `Start`
- Poll `GetStatus` until a `Completed` or `Faulted` status is received
- `GetResult` or `GetException` depending on the `GetStatus`
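
The flow above can be sketched as a small polling loop. This is a minimal illustration only: `IWorker`, `WorkStatus`, `WorkerPolling`, and `FakeWorker` are hypothetical stand-ins defined here for the example, not the package's actual types or signatures, and the fixed poll interval is an assumption.

```cs
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins for illustration; these are NOT the package's types.
public enum WorkStatus { Running, Completed, Faulted }

public interface IWorker<TRequest, TResult>
{
    Task Start(TRequest request);
    Task<WorkStatus> GetStatus();
    Task<TResult> GetResult();
    Task<Exception> GetException();
}

public static class WorkerPolling
{
    // Start the work, poll until a terminal status, then return the result or rethrow.
    public static async Task<TResult> StartAndPoll<TRequest, TResult>(
        IWorker<TRequest, TResult> worker, TRequest request, TimeSpan pollInterval)
    {
        await worker.Start(request);

        while (true)
        {
            switch (await worker.GetStatus())
            {
                case WorkStatus.Completed:
                    return await worker.GetResult();
                case WorkStatus.Faulted:
                    throw await worker.GetException();
                default:
                    // Still running: wait before asking again.
                    await Task.Delay(pollInterval);
                    break;
            }
        }
    }
}

// In-memory fake that reports Running for a fixed number of polls, then Completed.
public sealed class FakeWorker : IWorker<int, int>
{
    private int _request;
    private int _pollsRemaining;

    public FakeWorker(int pollsBeforeDone) => _pollsRemaining = pollsBeforeDone;

    public Task Start(int request) { _request = request; return Task.CompletedTask; }

    public Task<WorkStatus> GetStatus() =>
        Task.FromResult(_pollsRemaining-- > 0 ? WorkStatus.Running : WorkStatus.Completed);

    // The fake's "work" is simply doubling the request.
    public Task<int> GetResult() => Task.FromResult(_request * 2);

    public Task<Exception> GetException() =>
        Task.FromResult<Exception>(new InvalidOperationException("not faulted"));
}
```

With the fake, `await WorkerPolling.StartAndPoll(new FakeWorker(3), 21, TimeSpan.FromMilliseconds(10))` reports `Running` three times and then returns `42`. A real caller would probably also want a timeout or cancellation token so that a worker that never reaches a terminal status does not poll forever.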

This package introduces a few "requirements" against Orleans:

* In order to not overload Orleans, a `LimitedConcurrencyLevelTaskScheduler` is introduced. This task scheduler is registered (either manually or through the provided extension method) with a maximum level of concurrency for the silo being set up. This maximum concurrency ***MUST*** allow for idle threads, lest the Orleans server be overloaded. In testing, the general rule of thumb was `Environment.ProcessorCount - 2` max concurrency. The important part is that the CPU is not fully "tapped out" such that the normal Orleans asynchronous messaging can't make it through due to the blocking sync work - this will make things start timing out.
* Blocking grains are stateful, and are currently keyed on a Guid. If multiple grains of long running work are needed, each grain should be initialized with its own unique identity.
* Blocking grains *likely* ***CAN NOT*** dispatch further blocking grains. This is not yet tested under the repository, but it stands to reason that with a limited concurrency scheduler, the following scenario would lead to a deadlock:
* Grain A is long running
* Grain B is long running
* Grain A initializes and fires off Grain B
* Grain A cannot complete its work until it gets the results of Grain B
- In order to not overload Orleans, a `LimitedConcurrencyLevelTaskScheduler` is introduced. This task scheduler is registered (either manually or through the provided extension method) with a maximum level of concurrency for the silo being set up. This maximum concurrency **_MUST_** allow for idle threads, lest the Orleans server be overloaded. In testing, the general rule of thumb was `Environment.ProcessorCount - 2` max concurrency. The important part is that the CPU is not fully "tapped out" such that the normal Orleans asynchronous messaging can't make it through due to the blocking sync work - this will make things start timing out.
- Blocking grains are stateful, and are currently keyed on a Guid. If multiple grains of long running work are needed, each grain should be initialized with its own unique identity.
- Blocking grains _likely_ **_CAN NOT_** dispatch further blocking grains. This is not yet tested under the repository, but it stands to reason that with a limited concurrency scheduler, the following scenario would lead to a deadlock:

- Grain A is long running
- Grain B is long running
- Grain A initializes and fires off Grain B
- Grain A cannot complete its work until it gets the results of Grain B

In the above scenario, if "Grain A" is "actively being worked" and it fires off a "Grain B", but "Grain A" cannot complete its work until "Grain B" finishes its own, but "Grain B" cannot *start* its work until "Grain A" finishes its work due to limited concurrency, you've run into a situation where the limited concurrency task scheduler can never finish the work of "Grain A".
That was quite a sentence, hopefully the point was conveyed somewhat sensibly. There may be a way to avoid the above scenario, but I have not yet deeply explored it.
In the above scenario, if "Grain A" is "actively being worked" and it fires off a "Grain B", but "Grain A" cannot complete its work until "Grain B" finishes its own, but "Grain B" cannot _start_ its work until "Grain A" finishes its work due to limited concurrency, you've run into a situation where the limited concurrency task scheduler can never finish the work of "Grain A".

That was quite a sentence, hopefully the point was conveyed somewhat sensibly. There may be a way to avoid the above scenario, but I have not yet deeply explored it.
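
The `Environment.ProcessorCount - 2` rule of thumb mentioned earlier can be illustrated with a rough sketch. Note that this does not use the package's `LimitedConcurrencyLevelTaskScheduler`; it swaps in a plain `SemaphoreSlim` to show the same idea of capping how much blocking work runs at once. `ConcurrencyBudget`, `MaxSyncWorkConcurrency`, and `RunBounded` are hypothetical names introduced for this example.

```cs
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyBudget
{
    // Rule of thumb from the text: leave two cores idle, but never go below one.
    public static int MaxSyncWorkConcurrency(int processorCount, int reservedCores = 2) =>
        Math.Max(1, processorCount - reservedCores);

    // Runs `jobCount` copies of a blocking action with at most `maxConcurrency`
    // in flight, and returns the highest number observed running at once.
    // A SemaphoreSlim stands in for the package's scheduler (an assumption of this sketch).
    public static async Task<int> RunBounded(int jobCount, int maxConcurrency, Action blockingWork)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);
        int running = 0, peak = 0;

        var tasks = Enumerable.Range(0, jobCount).Select(async _ =>
        {
            await gate.WaitAsync();
            try
            {
                int now = Interlocked.Increment(ref running);
                int seen;
                // Lock-free "peak = max(peak, now)".
                while (now > (seen = Volatile.Read(ref peak)))
                    Interlocked.CompareExchange(ref peak, now, seen);

                // Push the blocking call onto the thread pool so the caller stays free.
                await Task.Run(blockingWork);
            }
            finally
            {
                Interlocked.Decrement(ref running);
                gate.Release();
            }
        }).ToArray();

        await Task.WhenAll(tasks);
        return peak;
    }
}
```

For example, `ConcurrencyBudget.MaxSyncWorkConcurrency(Environment.ProcessorCount)` gives the cap for the current machine, and `await ConcurrencyBudget.RunBounded(20, 3, () => Thread.Sleep(5))` never reports a peak above 3. The same cap is also why the nested scenario above is risky: if every slot is held by a job that is waiting on another job that still needs a slot, nothing can make progress.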

#### Usage

Extend the base class to implement a long running grain.
Create an interface for the grain, which implements `ISyncWorker<TRequest, TResult>`, as well as one of the `IGrainWith...Key` interfaces. Then create a new class that extends the `SyncWorker<TRequest, TResult>` abstract class, and implements the new interface that was introduced:

```cs
public class PasswordVerifier : SyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IGrain
public interface IPasswordVerifierGrain : ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IGrainWithGuidKey;

public class PasswordVerifierGrain : SyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IPasswordVerifierGrain
{
private readonly IPasswordVerifier _passwordVerifier;

public PasswordVerifierGrain(
ILogger<PasswordVerifier> logger,
LimitedConcurrencyLevelTaskScheduler limitedConcurrencyLevelTaskScheduler,
ILogger<PasswordVerifierGrain> logger,
LimitedConcurrencyLevelTaskScheduler limitedConcurrencyLevelTaskScheduler,
IPasswordVerifier passwordVerifier) : base(logger, limitedConcurrencyLevelTaskScheduler)
{
_passwordVerifier = passwordVerifier;
@@ -103,25 +109,25 @@ var request = new PasswordVerifierRequest()
Password = "my super neat password that's totally secure because it's super long",
PasswordHash = "$2a$11$vBzJ4Ewx28C127AG5x3kT.QCCS8ai0l4JLX3VOX3MzHRkF4/A5twy"
};
var passwordVerifyGrain = grainFactory.GetGrain<ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>>(Guid.NewGuid());
var passwordVerifyGrain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
var result = await passwordVerifyGrain.StartWorkAndPollUntilResult(request);
```

The above `StartWorkAndPollUntilResult` is an extension method defined in the package ([SyncWorkerExtensions](https://github.com/OrleansContrib/Orleans.SyncWork/blob/main/src/Orleans.SyncWork/SyncWorkerExtensions.cs)) that `Start`s, `Poll`s, and finally calls `GetResult` or `GetException` once the work has completed. There is room for improvement here as it relates to testing unexpected scenarios, configuration based polling, etc.
The above `StartWorkAndPollUntilResult` is an extension method defined in the package ([SyncWorkerExtensions](https://github.com/OrleansContrib/Orleans.SyncWork/blob/main/src/Orleans.SyncWork/SyncWorkerExtensions.cs)) that `Start`s, `Poll`s, and finally calls `GetResult` or `GetException` once the work has completed. There is room for improvement here as it relates to testing unexpected scenarios, configuration based polling, etc.

### Orleans.SyncWork.Tests

Unit testing project for the work in [Orleans.SyncWork](#orleanssyncwork). These tests bring up a "TestCluster" which is used for the full duration of the tests against the grains.
Unit testing project for the work in [Orleans.SyncWork](#orleanssyncwork). These tests bring up a "TestCluster" which is used for the full duration of the tests against the grains.

One of the tests in particular throws 10k grains onto the cluster at once, all of which are long running (~200ms each on my machine). That is more than enough time to overload the cluster if the limited concurrency task scheduler is not working alongside the `SyncWorker` base class correctly.

TODO: this could still use a few more unit tests, if nothing else to document behavior.

### Orleans.SyncWork.Demo.Api

This is a demo of the `ISyncWorker<TRequest, TResult>` in action. This project is used as both an Orleans silo and a client. Generally you would stand up the cluster's nodes separately from the clients that call it. Since we have only one node for testing purposes, this project acts as both the silo host and client.
This is a demo of the `ISyncWorker<TRequest, TResult>` in action. This project is used as both an Orleans silo and a client. Generally you would stand up the cluster's nodes separately from the clients that call it. Since we have only one node for testing purposes, this project acts as both the silo host and client.

The [OrleansDashboard](https://github.com/OrleansContrib/OrleansDashboard) is also brought up with the API. You can see an example of hitting an endpoint in which 10k password verification requests are received here:
The [OrleansDashboard](https://github.com/OrleansContrib/OrleansDashboard) is also brought up with the API. You can see an example of hitting an endpoint in which 10k password verification requests are received here:

![Dashboard showing 10k CPU bound, long running requests](/docs/images/dashboard.PNG)

@@ -186,7 +192,7 @@ public class Benchy
var tasks = new List<Task>();
for (var i = 0; i < TotalNumberPerBenchmark; i++)
{
var grain = grainFactory.GetGrain<ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>>(Guid.NewGuid());
var grain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
tasks.Add(grain.StartWorkAndPollUntilResult(_request));
}

@@ -197,14 +203,14 @@ public class Benchy

And here are the results:

| Method | Mean | Error | StdDev |
|---------------------- |---------:|---------:|---------:|
| Serial | 12.399 s | 0.0087 s | 0.0077 s |
| MultipleTasks | 12.289 s | 0.0106 s | 0.0094 s |
| Method | Mean | Error | StdDev |
| --------------------- | -------: | -------: | -------: |
| Serial | 12.399 s | 0.0087 s | 0.0077 s |
| MultipleTasks | 12.289 s | 0.0106 s | 0.0094 s |
| MultipleParallelTasks | 1.749 s | 0.0347 s | 0.0413 s |
| OrleansTasks | 2.130 s | 0.0055 s | 0.0084 s |
| OrleansTasks | 2.130 s | 0.0055 s | 0.0084 s |

Note that in the above, the Orleans tasks are *limited* to my local cluster. In a more realistic situation, where the cluster has multiple nodes, you could expect better timing, though you'd probably have to deal more with network latency.
Note that in the above, the Orleans tasks are _limited_ to my local cluster. In a more realistic situation, where the cluster has multiple nodes, you could expect better timing, though you'd probably have to deal more with network latency.

### Orleans.SyncWork.Demo.Services

2 changes: 1 addition & 1 deletion samples/Orleans.SyncWork.Demo.Api.Benchmark/Benchy.cs
@@ -59,7 +59,7 @@ public async Task OrleansTasks()
var tasks = new List<Task>();
for (var i = 0; i < TotalNumberPerBenchmark; i++)
{
var grain = grainFactory.GetGrain<ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>>(Guid.NewGuid());
var grain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
tasks.Add(grain.StartWorkAndPollUntilResult(_request));
}

4 changes: 2 additions & 2 deletions samples/Orleans.SyncWork.Demo.Api/Program.cs
@@ -41,7 +41,7 @@
{
try
{
var passwordVerifyGrain = grainFactory.GetGrain<ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>>(Guid.NewGuid());
var passwordVerifyGrain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
return await passwordVerifyGrain.StartWorkAndPollUntilResult(request);
}
catch (Exception e)
@@ -59,7 +59,7 @@

for (var i = 0; i < 10_000; i++)
{
var passwordVerifyGrain = grainFactory.GetGrain<ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>>(Guid.NewGuid());
var passwordVerifyGrain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
tasks.Add(passwordVerifyGrain.StartWorkAndPollUntilResult(request));
}

@@ -0,0 +1,4 @@
namespace Orleans.SyncWork.Demo.Services.Grains;

public interface IPasswordVerifierGrain
: ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IGrainWithGuidKey;
@@ -3,12 +3,12 @@

namespace Orleans.SyncWork.Demo.Services.Grains;

public class PasswordVerifier : SyncWorker<PasswordVerifierRequest, PasswordVerifierResult>
public class PasswordVerifierGrain : SyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IPasswordVerifierGrain
{
private readonly IPasswordVerifier _passwordVerifier;

public PasswordVerifier(
ILogger<PasswordVerifier> logger,
public PasswordVerifierGrain(
ILogger<PasswordVerifierGrain> logger,
LimitedConcurrencyLevelTaskScheduler limitedConcurrencyLevelTaskScheduler,
IPasswordVerifier passwordVerifier) : base(logger, limitedConcurrencyLevelTaskScheduler)
{
@@ -4,6 +4,22 @@

namespace Orleans.SyncWork.Demo.Services.TestGrains;

public class GrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable : SyncWorker<TestDelayExceptionRequest, TestDelayExceptionResult>, IGrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable
{
public GrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable(
ILogger<GrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable> logger,
LimitedConcurrencyLevelTaskScheduler limitedConcurrencyScheduler
) : base(logger, limitedConcurrencyScheduler) { }

protected override async Task<TestDelayExceptionResult> PerformWork(TestDelayExceptionRequest request)
{
Logger.LogInformation($"Waiting {request.MsDelayPriorToResult} on {this.IdentityString}");
await Task.Delay(request.MsDelayPriorToResult);

throw new TestGrainException("This is an expected exception, I'm testing for it!");
}
}

[GenerateSerializer]
public class TestGrainException : Exception
{
@@ -18,22 +34,5 @@ public class TestDelayExceptionRequest
}

[GenerateSerializer]
public class TestDelayExceptionResult
{
}

public class GrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable : SyncWorker<TestDelayExceptionRequest, TestDelayExceptionResult>
{
public GrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable(
ILogger<GrainThatWaitsSetTimePriorToExceptionResultBecomingAvailable> logger,
LimitedConcurrencyLevelTaskScheduler limitedConcurrencyScheduler
) : base(logger, limitedConcurrencyScheduler) { }
public class TestDelayExceptionResult;

protected override async Task<TestDelayExceptionResult> PerformWork(TestDelayExceptionRequest request)
{
Logger.LogInformation($"Waiting {request.MsDelayPriorToResult} on {this.IdentityString}");
await Task.Delay(request.MsDelayPriorToResult);

throw new TestGrainException("This is an expected exception, I'm testing for it!");
}
}