Add initial patience & delta mode to early_stopping #14

Open · wants to merge 2 commits into main
Conversation

JamesAllingham (Contributor) commented:
(Mostly) addresses #11.

The main changes are:

  • Fix min_delta being ignored by the early_stopping callback.
  • Add a delta_mode option to early stopping that allows min_delta to be interpreted as either an "absolute" or a "relative" difference.
  • Add an initial_patience option, which behaves like patience but applies only during the first n steps. This is useful when the initial stages of training are very noisy. It is similar to start_from_epoch in Keras.
  • Renamed existing "mode" to "optimization_mode" for clarity. (Also in the checkpoint callback, for consistency).
  • Added tests.
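To make the first two bullets concrete, here is a minimal sketch of how min_delta, delta_mode, and optimization_mode could interact in the improvement check. This is an illustrative reconstruction, not the actual ciclo implementation; the helper name is_improvement is hypothetical.

```python
from enum import Enum


class OptimizationMode(str, Enum):
    min = "min"
    max = "max"


class DeltaMode(str, Enum):
    absolute = "absolute"
    relative = "relative"


def is_improvement(
    value: float,
    best: float,
    min_delta: float = 0.0,
    delta_mode: DeltaMode = DeltaMode.absolute,
    optimization_mode: OptimizationMode = OptimizationMode.min,
) -> bool:
    """Return True if `value` beats `best` by at least `min_delta`."""
    if delta_mode is DeltaMode.relative:
        # Interpret min_delta as a fraction of the current best value.
        min_delta = min_delta * abs(best)
    if optimization_mode is OptimizationMode.min:
        return value < best - min_delta
    return value > best + min_delta
```

For example, with best = 2.0 and min_delta = 0.1, a new loss of 1.85 counts as an improvement in "absolute" mode (it clears the 1.9 threshold) but not in "relative" mode (where the threshold is 2.0 - 0.1 * 2.0 = 1.8).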

This doesn't address the problem that the best weights are only returned when the patience has run out. After looking more closely at the code, I think fixing it requires a more substantial refactor, so I'll open another issue to discuss it. In short, I would like to extend the callback functionality so that each callback can provide multiple callback functions. E.g., the early_stopping callback could have a generic __loop_callback__ that is called on the schedule specified by the user, and an on_epoch_end that is called at the end of training. This seems to be how the Keras callbacks work, and as far as I understand, it isn't possible with the current setup. But more on that later; let me know what you think of these changes!

  * Renamed existing "mode" to "optimization_mode" for clarity.
  * Added test.

codecov bot commented Sep 2, 2023

Codecov Report

Merging #14 (9270181) into main (f39f265) will increase coverage by 0.49%.
Report is 1 commit behind head on main.
The diff coverage is 93.75%.

@@            Coverage Diff             @@
##             main      #14      +/-   ##
==========================================
+ Coverage   78.39%   78.88%   +0.49%     
==========================================
  Files          12       12              
  Lines        1578     1596      +18     
==========================================
+ Hits         1237     1259      +22     
+ Misses        341      337       -4     
Files Changed         Coverage Δ
ciclo/loops/loop.py   86.46% <ø> (+2.25%) ⬆️
ciclo/callbacks.py    65.73% <93.54%> (+2.03%) ⬆️
ciclo/__init__.py     100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes


@@ -55,6 +55,11 @@ class OptimizationMode(str, Enum):
max = auto()


class DeltaMode(str, Enum):
JamesAllingham (Contributor, Author) commented:
I used the same convention as the existing mode argument (now called optimization_mode), i.e., using an Enum to represent the options. For consistency, perhaps it makes sense to define a similar Enum for the InnerLoopAggregation options introduced in #9? (Or we could use that PR's approach of defining a string Literal type for the different options?)
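As context for the Enum-vs-Literal question, here is a minimal side-by-side sketch of the two conventions. The names are illustrative (explicit string values are used here rather than auto(), and resolve is a hypothetical helper, not part of the ciclo API):

```python
from enum import Enum
from typing import Literal, Union

# Enum convention: a str-backed Enum, matching the style of OptimizationMode.
# Explicit string values ensure DeltaMode.absolute.value == "absolute".
class DeltaMode(str, Enum):
    absolute = "absolute"
    relative = "relative"

# Literal convention (the approach from #9): a plain string type that
# type checkers constrain to the allowed values, with no runtime class.
DeltaModeLiteral = Literal["absolute", "relative"]

def resolve(mode: Union[str, DeltaMode]) -> DeltaMode:
    # A str-backed Enum accepts either a member or its string value,
    # so a user-facing API can take plain strings and normalize them.
    return DeltaMode(mode)
```

One practical upside of the str mixin is that DeltaMode.relative == "relative" evaluates to True, so callers who pass or compare plain strings keep working.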

cgarciae (Owner) commented Sep 5, 2023:

Thanks @JamesAllingham! Will review it soon :)
