Make memo lifetime lte arguments with default hash
cevans87 committed Mar 9, 2020
1 parent 50ecb02 commit 6828eae
Showing 4 changed files with 346 additions and 55 deletions.
105 changes: 88 additions & 17 deletions README.md
@@ -13,18 +13,89 @@ Python 3.6+ decorators including

If 'duration' is provided, memoize will only retain return values for up to the given 'duration'.

If 'get_key' is provided, memoize will use the function to calculate the memoize hash key.
If 'keygen' is provided, memoize will use the function to calculate the memoize hash key.

If 'size' is provided, memoize will only retain up to 'size' return values.

A warning about arguments inheriting `object.__hash__`:

It doesn't make sense to keep a memo if it's impossible to generate the same input again.
Inputs that inherit the default `object.__hash__` are unique based on their id, and thus,
their location in memory. If such inputs are garbage-collected, they are assumed to be gone
forever. For that reason, when those inputs are garbage collected, `memoize` will drop memos
created using those inputs.
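
The lifetime rule described above can be sketched with the standard library alone. This is a minimal illustration, not the implementation in this commit: `memoize_by_identity` and its `memos` attribute are hypothetical names, and the sketch only handles a single argument that supports weak references.

```python
import gc
import weakref
from typing import Any, Callable, Dict


def memoize_by_identity(fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Hypothetical one-argument memoizer whose memos die with their input."""
    memos: Dict[int, Any] = {}

    def wrapper(arg: Any) -> Any:
        key = id(arg)  # same identity that object.__hash__ is based on
        if key not in memos:
            memos[key] = fn(arg)
            # When arg is garbage collected, drop the memo keyed on its id.
            weakref.finalize(arg, memos.pop, key, None)
        return memos[key]

    wrapper.memos = memos  # exposed only so the example can inspect it
    return wrapper


class Foo: ...


calls = []


@memoize_by_identity
def bar(foo: Foo) -> str:
    calls.append("called")
    return "result"


foo = Foo()
bar(foo)  # Function called. Result cached.
bar(foo)  # Function not called. Cached result returned.
live_memos = len(bar.memos)

del foo  # Last reference dropped; the finalizer removes the memo.
gc.collect()  # Only needed on interpreters without reference counting.
memos_after_gc = len(bar.memos)
```

Keying on `id` without the finalizer would be unsafe, because an id can be reused by a later object once the original is collected.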

Here are some common patterns where this behaviour will not cause any problems.

- Basic immutable types that have specific, consistent hash functions (int, str, etc.).
@memoize
def foo(a: int, b: str, c: Tuple[int, ...], d: range) -> Any: ...

foo(1, 'bar', (1, 2, 3), range(42)) # Function called. Result cached.
foo(1, 'bar', (1, 2, 3), range(42)) # Function not called. Cached result returned.

- Classmethods rely on classes, which inherit from `object.__hash__`. However, classes
are almost never garbage collected before the process exits, so memoize will work as
expected.

class Foo:

    @classmethod
    @memoize
    def bar(cls) -> Any: ...

foo = Foo()
foo.bar() # Function called. Result cached.
foo.bar() # Function not called. Cached result returned.

del foo # Memo not cleared since lifetime is bound to class Foo.

foo = Foo()
foo.bar() # Function not called. Cached result returned.
foo.bar() # Function not called. Cached result returned.

- Long-lasting object instances that inherit from `object.__hash__`.

class Foo:

    @memoize
    def bar(self) -> Any: ...

foo = Foo()
foo.bar() # Function called. Result cached.
foo.bar() # Function not called. Cached result returned.

del foo # Memo is cleared since lifetime is bound to instance foo.

foo = Foo()
foo.bar() # Function called. Result cached.
foo.bar() # Function not called. Cached result returned.

Here are common patterns that will not behave as desired (for good reason).

- Using ephemeral objects that inherit from `object.__hash__`. Firstly, these inputs
will only hash equally sometimes, by accident, if their id is recycled from a
previously deleted input. Secondly, we delete memos based on inputs that inherit
from `object.__hash__` at the same time as that input is garbage collected, so
generating the memo is wasted effort.

# Inherits object.__hash__
class Foo: ...

@memoize
def bar(foo: Foo) -> Any: ...

bar(Foo()) # Memo is immediately deleted since Foo() is garbage collected.
bar(Foo()) # Same as previous line. Memo is immediately deleted.
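
If ephemeral inputs really need to share a memo, one option is to give them a value-based `__eq__` and `__hash__` so they no longer inherit `object.__hash__`. The sketch below uses `functools.lru_cache` as a stand-in cache; that `memoize` treats such inputs the same way is an assumption, and `Point` and `norm1` are hypothetical names.

```python
from functools import lru_cache


class Point:
    """Hypothetical input type that hashes by value instead of by id."""

    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self) -> int:
        return hash((self.x, self.y))


calls = []


@lru_cache(maxsize=None)  # stand-in for memoize in this sketch
def norm1(p: Point) -> int:
    calls.append("called")
    return abs(p.x) + abs(p.y)


norm1(Point(1, 2))  # Function called. Result cached.
norm1(Point(1, 2))  # Function not called. Equal inputs hash equally.
```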

Examples:

- Body will run once for unique input 'bar' and result is cached.
@memoize
def foo(bar) -> Any: ...

foo(1) # Function actually called. Result cached.
foo(1) # Function not called. Previously-cached result returned.
foo(1) # Function not called. Cached result returned.
foo(2) # Function actually called. Result cached.

- Same as above, but async.
@@ -41,7 +112,7 @@ Python 3.6+ decorators including
def __init__(self, _): ...

Foo(1) # Instance is actually created.
Foo(1) # Instance not created. Previously-cached instance returned.
Foo(1) # Instance not created. Cached instance returned.
Foo(2) # Instance is actually created.

- Calls to foo(1), foo(bar=1), and foo(1, baz='baz') are equivalent and only cached once
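
One way such call forms can be made equivalent is to bind the arguments to the function signature and fill in defaults before building the key. This is a sketch of that general technique, not necessarily what memoize does internally; `normalized_key` is a hypothetical helper and the signature `foo(bar, baz='baz')` is assumed for illustration.

```python
import inspect
from typing import Any, Callable, Tuple


def normalized_key(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, ...]:
    """Hypothetical helper: map any equivalent call form to one key."""
    bound = inspect.signature(fn).bind(*args, **kwargs)
    bound.apply_defaults()  # fill in parameters the caller left out
    return tuple(bound.arguments.items())


def foo(bar, baz='baz'): ...  # assumed signature, for illustration only


key_positional = normalized_key(foo, 1)
key_keyword = normalized_key(foo, bar=1)
key_explicit_default = normalized_key(foo, 1, baz='baz')
```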
@@ -62,16 +133,16 @@ Python 3.6+ decorators including
def foo(bar) -> Any: ...

foo(1) # Function actually called. Result cached.
foo(1) # Function not called. Previously-cached result returned.
foo(1) # Function not called. Cached result returned.
sleep(61)
foo(1) # Function actually called. Previously-cached result was too old.
foo(1) # Function actually called. Cached result was too old.
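
The expiry behaviour above can be sketched by storing a timestamp next to each memo. This is a simplified illustration with hypothetical names (`memoize_for`, `clock`), not the library's code; the clock is injectable so the example does not need to sleep.

```python
import time
from typing import Any, Callable, Dict, Tuple


def memoize_for(duration: float, clock: Callable[[], float] = time.monotonic):
    """Hypothetical decorator: keep each result for at most `duration` seconds."""

    def decorator(fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
        memos: Dict[Any, Tuple[float, Any]] = {}

        def wrapper(arg: Any) -> Any:
            now = clock()
            hit = memos.get(arg)
            if hit is not None and now - hit[0] < duration:
                return hit[1]  # still fresh
            value = fn(arg)
            memos[arg] = (now, value)  # store with creation time
            return value

        return wrapper

    return decorator


fake_now = [0.0]  # fake clock so the example runs instantly
calls = []


@memoize_for(60, clock=lambda: fake_now[0])
def foo(bar: int) -> int:
    calls.append(bar)
    return bar * 2


foo(1)  # Function called. Result cached.
foo(1)  # Function not called. Cached result returned.
fake_now[0] += 61  # stands in for sleep(61)
foo(1)  # Function called. Cached result was too old.
```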

- Memoize can be explicitly reset through the function's 'memoize' attribute
@memoize
def foo(bar) -> Any: ...

foo(1) # Function actually called. Result cached.
foo(1) # Function not called. Previously-cached result returned.
foo(1) # Function not called. Cached result returned.
foo.memoize.reset()
foo(1) # Function actually called. Cache was emptied.

@@ -84,13 +155,13 @@ Python 3.6+ decorators including
len(foo.memoize) # returns 2

- Memoization hash keys can be generated from a non-default function:
@memoize(get_key=lambda a, b, c: (a, b, c))
@memoize(keygen=lambda a, b, c: (a, b, c))
def foo(a, b, c) -> Any: ...
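
To show why a custom key function matters, here is a minimal sketch in which the key deliberately ignores one argument. `memoize_with` is a hypothetical stand-in written for this example; only the `keygen` parameter name comes from the README.

```python
from typing import Any, Callable, Dict, Hashable


def memoize_with(keygen: Callable[..., Hashable]):
    """Hypothetical decorator: cache results under keygen(*args, **kwargs)."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        memos: Dict[Hashable, Any] = {}

        def wrapper(*args: Any, **kwargs: Any) -> Any:
            key = keygen(*args, **kwargs)  # keygen takes the same inputs as fn
            if key not in memos:
                memos[key] = fn(*args, **kwargs)
            return memos[key]

        return wrapper

    return decorator


calls = []


@memoize_with(keygen=lambda a, b, c: (a, b))  # 'c' is left out of the key
def foo(a: int, b: int, c: int) -> int:
    calls.append((a, b, c))
    return a + b


foo(1, 2, 3)   # Function called. Result cached under key (1, 2).
foo(1, 2, 99)  # Function not called. Key (1, 2) is already cached.
```

Leaving an argument out of the key is only safe when the result does not depend on it.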

- If part of the returned key from get_key is awaitable, it will be awaited.
- If part of the returned key from keygen is awaitable, it will be awaited.
async def await_something() -> Hashable: ...

@memoize(get_key=lambda bar: (bar, await_something()))
@memoize(keygen=lambda bar: (bar, await_something()))
async def foo(bar) -> Any: ...
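
Awaiting the awaitable parts of a key could look roughly like the sketch below. `resolve_key` is a hypothetical helper, not part of the library, and `asyncio.run` requires Python 3.7 or later.

```python
import asyncio
import inspect
from typing import Any, Hashable, Iterable, Tuple


async def resolve_key(parts: Iterable[Any]) -> Tuple[Hashable, ...]:
    """Hypothetical helper: await any awaitable key part, keep the rest as-is."""
    resolved = []
    for part in parts:
        if inspect.isawaitable(part):
            part = await part
        resolved.append(part)
    return tuple(resolved)


async def await_something() -> Hashable:
    return 'token'  # placeholder value for the example


async def main() -> Tuple[Hashable, ...]:
    # Mirrors keygen=lambda bar: (bar, await_something()) with bar == 'bar'.
    return await resolve_key(('bar', await_something()))


key = asyncio.run(main())
```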

- Properties can be memoized
@@ -101,11 +172,11 @@ Python 3.6+ decorators including

a = Foo()
a.bar # Function actually called. Result cached.
a.bar # Function not called. Previously-cached result returned.
a.bar # Function not called. Cached result returned.

b = Foo() # Memoize uses 'self' parameter in hash. 'b' does not share returns with 'a'
b.bar # Function actually called. Result cached.
b.bar # Function not called. Previously-cached result returned.
b.bar # Function not called. Cached result returned.

- Be careful with eviction on methods.
class Foo:
@@ -119,7 +190,7 @@ Python 3.6+ decorators including

- The default memoize key generator can be overridden. The inputs must match the function's.
class Foo:
    @memoize(get_key=lambda self, a, b, c: (a, b, c))
    @memoize(keygen=lambda self, a, b, c: (a, b, c))
    def bar(self, a, b, c) -> Any: ...

a, b = Foo(), Foo()
@@ -129,12 +200,12 @@ Python 3.6+ decorators including

# Hash key will again be (a, b, c)
# Be aware, in this example the returned result comes from a.bar(...), not b.bar(...).
b.bar(1, 2, 3) # Function not called. Previously-cached result returned.
b.bar(1, 2, 3) # Function not called. Cached result returned.

- If the memoized function is async and any part of the key is awaitable, it is awaited.
async def morph_a(a: int) -> int: ...

@memoize(get_key=lambda a, b, c: (morph_a(a), b, c))
@memoize(keygen=lambda a, b, c: (morph_a(a), b, c))
async def foo(a, b, c) -> Any: ...

- Values can persist to disk and be reloaded when memoize is initialized again.
@@ -146,7 +217,7 @@ Python 3.6+ decorators including

# Process is restarted. Upon restart, the state of the memoize decorator is reloaded.

foo(1) # Function not called. Previously-cached result returned.
foo(1) # Function not called. Cached result returned.

- Be careful with 'db' and memoize values that don't hash consistently upon process restart.

@@ -156,7 +227,7 @@ Python 3.6+ decorators including
def bar(cls, a) -> Any: ...

Foo.bar(1) # Function actually called. Result cached.
Foo.bar(1) # Function not called. Previously-cached result returned.
Foo.bar(1) # Function not called. Cached result returned.

# Process is restarted. Upon restart, the state of the memoize decorator is reloaded.

@@ -166,7 +237,7 @@ Python 3.6+ decorators including
# You can create a consistent hash key to avoid this.
class Foo:
    @classmethod
    @memoize(db=True, get_key=lambda cls, a: (f'{cls.__package__}:{cls.__name__}', a))
    @memoize(db=True, keygen=lambda cls, a: (f'{cls.__package__}:{cls.__name__}', a))
    def bar(cls, a) -> Any: ...
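
The idea behind a consistent key is to replace the class object, whose default hash changes between processes, with a string that names it. The sketch below is one possible key function. It uses `cls.__module__` and `cls.__qualname__` because plain classes do not define `__package__` (that attribute lives on modules). `stable_class_key` is a hypothetical name, and how the 'db' option stores keys is not assumed here.

```python
from typing import Any, Tuple


class Foo: ...


def stable_class_key(cls: type, a: Any) -> Tuple[str, Any]:
    """Hypothetical key function: name the class instead of hashing the object."""
    return (f'{cls.__module__}:{cls.__qualname__}', a)


first = stable_class_key(Foo, 1)
second = stable_class_key(Foo, 1)  # same value on every call and in every process
```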

- Alternative location of 'db' can also be given as pathlib.Path or str.