Cache clusters (#55)
* Extract SolidCache::Cluster, just one cluster for now

* Change config to accept clusters/cluster

* Write to multiple clusters, read from first

* Async writes to other clusters

* Update checkout action

* Fix merge errors

* Update readme with cluster config
djmb authored Aug 10, 2023
1 parent fa9d0dc commit 1a4a6a3
Showing 26 changed files with 682 additions and 415 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
@@ -31,7 +31,7 @@ jobs:
TARGET_DB: ${{ matrix.database }}
steps:
- name: Checkout code
uses: actions/checkout@v2
uses: actions/checkout@v3
- name: Setup Ruby and install gems
uses: ruby/setup-ruby@v1
with:
55 changes: 42 additions & 13 deletions README.md
@@ -1,22 +1,23 @@
# SolidCache
SolidCache is a database backed ActiveSupport cache store implementation.
SolidCache is a database-backed ActiveSupport cache store implementation.

Using SQL databases backed by solid state storage we can have caches that are much larger and cheaper than traditional memory only Redis or Memcached backed caches.
Using SQL databases backed by SSDs we can have caches that are much larger and cheaper than traditional memory only Redis or Memcached backed caches.

Testing on HEY shows that reads and writes are 25%-50% slower than with a Redis cache. However this is not a significant percentage of the overall request time.

If cache misses are expensive (up to 50x the cost of a hit on HEY), then there are big advantages to caches that can hold months rather than days worth of data.

## Usage

To set solid cache as your Rails cache, you should add this to your environment config:
To set SolidCache as your Rails cache, you should add this to your environment config:

```ruby
config.cache_store = :solid_cache_store
```

SolidCache is a FIFO (first in, first out) cache. While this is not as efficient as an LRU cache, this is mitigated by the longer cache lifespans and it provides some advantages:
SolidCache is a FIFO (first in, first out) cache. While this is not as efficient as an LRU cache, this is mitigated by the longer cache lifespans.

A FIFO cache is much easier to manage:
1. We don't need to track when items are read
2. We can estimate and control the cache size by comparing the maximum and minimum IDs.
3. By deleting from one end of the table and adding at the other end we can avoid fragmentation (on MySQL at least).
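
Point 2 can be sketched as follows. This is only an illustration, assuming auto-incrementing integer IDs and rows that are deleted only from the oldest end of the table; `estimated_entries` is a hypothetical helper, not part of SolidCache.

```ruby
# Hypothetical helper: estimates the row count of a FIFO table from its ID range.
# Assumes auto-incrementing integer IDs, with rows removed only from the low end.
def estimated_entries(min_id, max_id)
  return 0 if min_id.nil? || max_id.nil?
  max_id - min_id + 1
end

# Simulate a FIFO table: IDs 1..1000 were written, then the oldest 400 were trimmed.
ids = (1..1000).to_a
ids.shift(400)

estimate = estimated_entries(ids.min, ids.max)
puts estimate # equals ids.size as long as the ID range has no gaps
```

The estimate needs only the smallest and largest ID, so it avoids counting rows. Gaps in the ID sequence (for example from rolled-back inserts) would make it an overestimate.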
@@ -65,8 +66,8 @@ These can be set in your Rails configuration:
Rails.application.configure do
config.solid_cache.connects_to = {
shards: {
shard1: { writing: :cache_primary_shard1, reading: :cache_primary_shard1 },
shard2: { writing: :cache_primary_shard1, reading: :cache_primary_shard1 }
shard1: { writing: :cache_primary_shard1 },
shard2: { writing: :cache_primary_shard1 }
}
}
end
@@ -77,11 +78,13 @@ end
Solid cache supports these options in addition to the universal `ActiveSupport::Cache::Store` options.

- `error_handler` - a Proc to call to handle any `ActiveRecord::ActiveRecordError`s that are raised (default: log errors as warnings)
- `shards` - an Array of the database shards to connect to (shard connects_to must be configured separately via the SolidCache engine config)
- `trim_batch_size` - the batch size to use when deleting old records (default: `100`)
- `max_age` - the maximum age of entries in the cache (default: `2.weeks.to_i`)
- `max_entries` - the maximum number of entries allowed in the cache (default: `2.weeks.to_i`)
- `cluster` - a Hash of options for the cache database cluster, e.g. `{ shards: [:database1, :database2, :database3] }`
- `clusters` - an Array of Hashes for separate cache clusters (ignored if `:cluster` is set)

For more information on cache clusters see [Sharding the cache](#sharding-the-cache)
### Cache trimming

SolidCache tracks when we write to the cache. For every write it increments a counter by 1.25. Once the counter reaches the `trim_batch_size` it adds a task to run on a cache trimming thread. That task will:
@@ -92,7 +95,7 @@ SolidCache tracks when we write to the cache. For every write it increments a co

Incrementing the counter by 1.25 per write allows us to trim the cache faster than we write to it if we need to.

Only triggering trimming when we write means that the if the cache is idle the background thread is also idle.
Only triggering trimming when we write means that if the cache is idle, the background thread is also idle.
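
The counter arithmetic can be illustrated with a small sketch. `TrimCounter` is a hypothetical stand-in, not SolidCache's implementation. In particular, carrying the remainder over after each trim (rather than resetting to zero) is an assumption made here.

```ruby
# Hypothetical stand-in for the write counter described above; not SolidCache's code.
class TrimCounter
  INCREMENT = 1.25 # each write counts for more than one entry

  attr_reader :trims_scheduled

  def initialize(trim_batch_size:)
    @trim_batch_size = trim_batch_size
    @counter = 0.0
    @trims_scheduled = 0
  end

  # Record cache writes; schedule one trim per full batch accumulated.
  # Assumption: the remainder is carried over instead of resetting to zero.
  def record(writes = 1)
    @counter += writes * INCREMENT
    while @counter >= @trim_batch_size
      @counter -= @trim_batch_size
      @trims_scheduled += 1 # a real store would enqueue a background task here
    end
  end
end

counter = TrimCounter.new(trim_batch_size: 100)
400.times { counter.record }
puts counter.trims_scheduled
```

With a `trim_batch_size` of 100, a trim is scheduled every 80 writes. If each trim removes up to 100 entries, then 400 writes allow up to 500 deletions, which is how trimming can outpace writing.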

### Using a dedicated cache database

@@ -125,7 +128,7 @@ $ mv db/migrate/*.solid_cache.rb db/cache/migrate
Set the engine configuration to point to the new database:
```
Rails.application.configure do
config.solid_cache.connects_to = { database: { writing: :cache, reading: :cache } }
config.solid_cache.connects_to = { default: { writing: :cache } }
end
```

@@ -163,15 +166,41 @@ production:
Rails.application.configure do
config.solid_cache.connects_to = {
shards: {
cache_shard1: { writing: :cache_shard1, reading: :cache_shard1 },
cache_shard2: { writing: :cache_shard2, reading: :cache_shard2 },
cache_shard3: { writing: :cache_shard3, reading: :cache_shard3 },
cache_shard1: { writing: :cache_shard1 },
cache_shard2: { writing: :cache_shard2 },
cache_shard3: { writing: :cache_shard3 },
}
}

config.cache_store = :solid_cache_store, shards: [ :cache_shard1, :cache_shard2, :cache_shard3 ]
config.cache_store = [ :solid_cache_store, cluster: { shards: [ :cache_shard1, :cache_shard2, :cache_shard3 ] } ]
end
```

### Secondary cache clusters

You can add secondary cache clusters. Reads will only be sent to the primary cluster (i.e. the first one listed).

Writes will go to all clusters. Writes to the primary cluster are synchronous; writes to the secondary clusters are asynchronous.

To specify multiple clusters you can do:

```ruby
Rails.application.configure do
  config.solid_cache.connects_to = {
    shards: {
      cache_primary_shard1: { writing: :cache_primary_shard1 },
      cache_primary_shard2: { writing: :cache_primary_shard2 },
      cache_secondary_shard1: { writing: :cache_secondary_shard1 },
      cache_secondary_shard2: { writing: :cache_secondary_shard2 },
    }
  }

  primary_cluster = { shards: [ :cache_primary_shard1, :cache_primary_shard2 ] }
  secondary_cluster = { shards: [ :cache_secondary_shard1, :cache_secondary_shard2 ] }
  config.cache_store = [ :solid_cache_store, clusters: [ primary_cluster, secondary_cluster ] ]
end
```
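
The read and write behaviour described in this section can be modelled roughly as below. This is a toy sketch: `FanOutCache` is a hypothetical class, in-memory Hashes stand in for database clusters, and a plain `Thread` with a `Queue` stands in for whatever executor the store actually uses.

```ruby
# Hypothetical model of the read/write fan-out; Hashes stand in for clusters.
class FanOutCache
  def initialize(cluster_count)
    @clusters = Array.new(cluster_count) { {} }
    @queue = Queue.new
    # One background thread applies queued writes to the secondary clusters.
    @worker = Thread.new do
      while (job = @queue.pop)
        job.call
      end
    end
  end

  def write(key, value)
    primary, *secondaries = @clusters
    primary[key] = value # synchronous write to the primary cluster
    secondaries.each do |cluster|
      @queue << -> { cluster[key] = value } # asynchronous write to a secondary
    end
    value
  end

  # Reads only ever consult the primary (first) cluster.
  def read(key)
    @clusters.first[key]
  end

  # Wait for queued secondary writes and stop the worker (demonstration only).
  def flush
    @queue << nil
    @worker.join
  end

  def secondary(index)
    @clusters[index + 1]
  end
end

cache = FanOutCache.new(2)
cache.write("greeting", "hello")
primary_value = cache.read("greeting") # available immediately
cache.flush
secondary_value = cache.secondary(0)["greeting"] # available once the queue drains
```

Because secondary writes are queued, a secondary cluster may briefly lag behind the primary. Since reads never touch the secondaries, that lag is not visible to callers.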

### Enabling encryption

Add this to an initializer:
29 changes: 0 additions & 29 deletions lib/solid_cache/async_execution.rb

This file was deleted.

18 changes: 18 additions & 0 deletions lib/solid_cache/cluster.rb
@@ -0,0 +1,18 @@

module SolidCache
  class Cluster
    require "solid_cache/cluster/hash_ring"
    require "solid_cache/cluster/connection_handling"
    require "solid_cache/cluster/async_execution"
    require "solid_cache/cluster/trimming"
    require "solid_cache/cluster/stats"

    include ConnectionHandling, AsyncExecution
    include Trimming
    include Stats

    def initialize(options = {})
      super(options)
    end
  end
end
31 changes: 31 additions & 0 deletions lib/solid_cache/cluster/async_execution.rb
@@ -0,0 +1,31 @@
module SolidCache
  class Cluster
    module AsyncExecution
      def initialize(options)
        super()
        @executor = Concurrent::SingleThreadExecutor.new(max_queue: 100, fallback_policy: :discard)
      end

      private
        def async(&block)
          # Need current shard right now, not when block is called
          current_shard = Entry.current_shard
          @executor << ->() do
            wrap_in_rails_executor do
              with_shard(current_shard) do
                block.call(current_shard)
              end
            end
          end
        end

        def wrap_in_rails_executor
          if SolidCache.executor
            SolidCache.executor.wrap { yield }
          else
            yield
          end
        end
    end
  end
end
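
The comment in `async` about needing the current shard "right now" can be illustrated with a standalone sketch. Everything here is hypothetical: `Thread.current[:shard]` stands in for `Entry.current_shard`, and a plain `Thread` with a `Queue` stands in for the executor. It only shows why per-thread state has to be captured before the block is handed to another thread.

```ruby
# Hypothetical illustration; Thread.current[:shard] stands in for Entry.current_shard.
def with_shard(shard)
  previous = Thread.current[:shard]
  Thread.current[:shard] = shard
  yield
ensure
  Thread.current[:shard] = previous
end

queue = Queue.new
results = Queue.new
worker = Thread.new do
  while (job = queue.pop)
    results << job.call
  end
end

with_shard(:shard_one) do
  captured = Thread.current[:shard] # read now, on the calling thread
  # The second value is read later, on the worker thread.
  queue << -> { [captured, Thread.current[:shard]] }
end

queue << nil # tell the worker to stop
worker.join
captured_shard, shard_seen_by_worker = results.pop
```

The worker thread has no shard of its own, so only the value captured on the calling thread survives. That is why the captured shard is passed back into `with_shard` inside the queued block.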
94 changes: 94 additions & 0 deletions lib/solid_cache/cluster/connection_handling.rb
@@ -0,0 +1,94 @@
module SolidCache
  class Cluster
    module ConnectionHandling
      attr_reader :async_writes

      def initialize(options = {})
        super(options)
        @shards = options.delete(:shards)
        @async_writes = options.delete(:async_writes)
      end

      def writing_all_shards
        return enum_for(:writing_all_shards) unless block_given?

        shards.each do |shard|
          with_shard(shard) do
            async_if_required { yield }
          end
        end
      end

      def shards
        @shards || SolidCache.all_shard_keys || [nil]
      end

      def writing_across_shards(list:, trim: false)
        across_shards(list:) do |list|
          async_if_required do
            result = yield list
            trim(list.size) if trim
            result
          end
        end
      end

      def reading_across_shards(list:)
        across_shards(list:) { |list| yield list }
      end

      def writing_shard(normalized_key:, trim: false)
        with_shard(shard_for_normalized_key(normalized_key)) do
          async_if_required do
            result = yield
            trim(1) if trim
            result
          end
        end
      end

      def reading_shard(normalized_key:)
        with_shard(shard_for_normalized_key(normalized_key)) { yield }
      end

      private
        def with_shard(shard)
          if shard
            Record.connected_to(shard: shard) { yield }
          else
            yield
          end
        end

        def across_shards(list:)
          in_shards(list).map do |shard, list|
            with_shard(shard) { yield list }
          end
        end

        def in_shards(list)
          if shards.count == 1
            { shards.first => list }
          else
            list.group_by { |value| shard_for_normalized_key(value.is_a?(Hash) ? value[:key] : value) }
          end
        end

        def shard_for_normalized_key(normalized_key)
          hash_ring&.get_node(normalized_key) || shards&.first
        end

        def hash_ring
          @hash_ring ||= shards.count > 0 ? HashRing.new(shards) : nil
        end

        def async_if_required
          if async_writes
            async { yield }
          else
            yield
          end
        end
    end
  end
end
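
The grouping done by `in_shards` can be tried in isolation with the sketch below. `HashRing` is not shown in this diff, so a CRC32 checksum modulo the shard count is swapped in as a stand-in for `hash_ring.get_node`. That is plain modulo hashing, not a consistent hash ring, and `shard_for` is a hypothetical helper.

```ruby
require "zlib"

# Stand-in for hash_ring.get_node: a stable checksum modulo the shard count.
# This is simple modulo hashing, not the HashRing used by the cluster.
def shard_for(key, shards)
  shards[Zlib.crc32(key) % shards.size]
end

# Mirrors in_shards: entries may be bare keys or Hashes carrying a :key.
def in_shards(list, shards)
  return { shards.first => list } if shards.size == 1

  list.group_by { |value| shard_for(value.is_a?(Hash) ? value[:key] : value, shards) }
end

shards = [:cache_shard1, :cache_shard2, :cache_shard3]
keys = %w[users/1 users/2 posts/7 posts/8 comments/3]

grouped = in_shards(keys, shards)
grouped.each { |shard, shard_keys| puts "#{shard}: #{shard_keys.join(', ')}" }
```

Each group can then be processed inside a single connection switch per shard, which matches how `across_shards` wraps each group in `with_shard`.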