Checkpoint dirname argument can also be a callable #870
Solves #848
Description
For `Checkpoint`, as of now, `dirname` can only be a string. With this
update, it can also be a callable with no arguments that returns a string.
What this solves is that the directory that a model is saved in can now
contain a dynamic element. This way, if you run, e.g., grid search with
`n_jobs>1` + checkpoint, each checkpoint instance can have its own
directory name (e.g. using a function that returns a random name), while
the files inside the directory still follow the same naming.
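To illustrate the kind of callable meant here, below is a minimal sketch. The helper name `unique_dirname` and the base folder `checkpoints` are assumptions made for illustration and are not part of this PR. The `Checkpoint` usage is shown only as a comment, since it assumes the behavior introduced by this change.

```python
import uuid
from pathlib import Path


def unique_dirname(base="checkpoints"):
    """Hypothetical helper: return a new directory name on every call."""
    # uuid4 gives a random hex suffix, so parallel runs should not collide.
    return str(Path(base) / uuid.uuid4().hex)


# Assumed usage with this change (requires skorch, so left commented out):
# from skorch.callbacks import Checkpoint
# cp = Checkpoint(dirname=unique_dirname, load_best=True)
```

Because `base` has a default, the function can be called with no arguments, which is what the new `dirname` option expects.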
Without such a possibility, if a user runs grid search with `n_jobs>1`
and checkpoint with `load_best=True`, the loaded model would always be
whatever happens to be the latest one stored, which can result
in (silent) errors.
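The failure mode can be reproduced without skorch. The following is a simplified simulation, not the actual `Checkpoint` code: the functions `save_checkpoint` and `load_checkpoint` and the file name `params.pt` are hypothetical stand-ins, and the two "runs" are executed one after another rather than in parallel.

```python
import os
import tempfile
import uuid


def save_checkpoint(dirname, score):
    """Hypothetical stand-in for saving the best model of one run."""
    os.makedirs(dirname, exist_ok=True)
    with open(os.path.join(dirname, "params.pt"), "w") as f:
        f.write(str(score))


def load_checkpoint(dirname):
    """Hypothetical stand-in for what load_best=True would read back."""
    with open(os.path.join(dirname, "params.pt")) as f:
        return float(f.read())


root = tempfile.mkdtemp()

# Fixed dirname: both runs write to the same file, the last writer wins.
shared = os.path.join(root, "shared")
save_checkpoint(shared, 0.9)  # run A stores its best model
save_checkpoint(shared, 0.5)  # run B overwrites it
loaded_shared = load_checkpoint(shared)  # run A silently gets run B's model

# Dynamic dirname: each run gets its own directory, same file name inside.
dir_a = os.path.join(root, uuid.uuid4().hex)
dir_b = os.path.join(root, uuid.uuid4().hex)
save_checkpoint(dir_a, 0.9)
save_checkpoint(dir_b, 0.5)
loaded_a = load_checkpoint(dir_a)  # run A gets its own model back
```

With the shared directory, run A loads the value written by run B and no error is raised, which is the silent error described above.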
If anyone has a better idea how to solve the underlying problem, I'm
open to it.
Implementation
As a consequence of the `dirname` now not being known at `__init__`
time, I removed the validation of the filenames from there. We still validate
them inside `initialize`, which is sufficient in my opinion.

In theory, we could call the `dirname` function inside `__init__` to
validate it, and then call it again inside `initialize` to actually set
it, but I don't like that. The reason is that we would call a
function that is possibly non-deterministic or might have side effects
twice, with unknown consequences. This should be avoided if possible.
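The design choice can be sketched roughly as follows. This is not the code from this PR: the class `CheckpointSketch`, the attribute `dirname_`, and the validation shown are assumptions that only illustrate resolving the callable once, inside `initialize`.

```python
import os


class CheckpointSketch:
    """Hypothetical, heavily simplified stand-in for a checkpoint callback."""

    def __init__(self, dirname="", f_params="params.pt"):
        # Only store the argument; a callable is deliberately not called here.
        self.dirname = dirname
        self.f_params = f_params

    def initialize(self):
        # Resolve the directory exactly once, then validate the result.
        self.dirname_ = self.dirname() if callable(self.dirname) else self.dirname
        self._validate()
        return self

    def _validate(self):
        # Assumed validation: the resolved directory name must be a string.
        if not isinstance(self.dirname_, str):
            raise TypeError("dirname must be a str or return a str")

    def params_path(self):
        return os.path.join(self.dirname_, self.f_params)


# A callable with a side effect makes the number of calls observable.
calls = []


def make_dirname():
    calls.append(1)
    return "run-%d" % len(calls)


cp = CheckpointSketch(dirname=make_dirname)
calls_after_init = len(calls)  # the callable has not been invoked yet
cp.initialize()
calls_after_initialize = len(calls)  # invoked exactly once
```

If `__init__` also called `make_dirname` for validation, the directory that was validated would differ from the one actually used, which is the situation the paragraph above argues against.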
Example
Before, code like this would fail:
With this feature, we can do:
and it works.