-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP (do not merge yet): Export callback functions in R package (fixes #2479) #5018
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for working on this @mayer79 !
First, I think that before exporting these callbacks, we should change the names. I don't think R packages should use names with dots for anything other than S3 methods. There are many things in this package that violate that rule (e.g. lgb.train()
, lgb.cv()
) and they've been left unchanged for historical reasons, but I'd like to take this opportunity where new functions are being exported to try to avoid introducing more such names.
What do you think about this?
cb.early.stop()
-->cb_early_stop()
cb.print.evaluation()
-->cb_print_evaluation()
cb.record.evaluation()
-->cb_record_evaluation()
If you agree with that, would you be willing to open a separate PR just changing those names?
passing a list of callback functions to
lgb.train()
etc. does not yet work properly because the "old" parametersearly_stopping_rounds
,eval_freq
etc. override the callback list. How should we deal with this?
Thanks for pointing this out! Over in the Python package, we've recently removed these special keyword arguments to lgb.train()
and lgb.cv()
in favor of encouraging the use of callbacks (see #4574 and #4908).
I think the R package should do something similar. That would mean:
- For now, raise deprecation warnings in
lgb.train()
andlgb.cv()
if keyword argumentearly_stopping_rounds
,eval_freq
, etc. are non-NULL, and in those deprecation warnings encourage the use of the corresponding callbacks. - Add an issue to the backlog to remove those warnings and arguments after the release of LightGBM 4.0.
HOWEVER...we also should continue to support the case where parameters like early_stopping_rounds
is passed via params
.
In the Python package, the possibility of passing both params={"early_stopping_rounds": 10}
and callbacks=[lgb.early_stopping(15)]
isn't handled explicitly. If the parameter early_stopping_rounds
is passed in params
, an lgb.early_stopping()
is added to the callbacks
list.
LightGBM/python-package/lightgbm/engine.py
Lines 198 to 209 in f185695
if "early_stopping_round" in params: | |
callbacks_set.add( | |
callback.early_stopping( | |
stopping_rounds=params["early_stopping_round"], | |
first_metric_only=first_metric_only, | |
verbose=_choose_param_value( | |
main_param_name="verbosity", | |
params=params, | |
default_value=1 | |
).pop("verbosity") > 0 | |
) | |
) |
I think that what will happen in those places is that both callbacks will be evaluated on each iteration, and early stopping will happen if either of them thinks early stopping has been met (I'd have to test that to be sure).
I think that in the R package, since we're just newly exporting the callbacks, we should be stricter and raise an error prior to training beginning if early_stopping_rounds
from params
and cb_early_stopping()
are both provided.
The name and all arguments passed in are stored in the callback function's attributes...
LightGBM/R-package/R/callback.R
Lines 318 to 319 in f185695
attr(callback, "call") <- match.call() | |
attr(callback, "name") <- "cb.early.stop" |
...so it should be possible to check whether or not any of the functions in callbacks
have name "cb.early.stop"
AND whether the values passed to that function conflict with early_stopping_rounds
passed through params
.
@jameslamb Thanks a lot for your detailed feedback and instructions. I am a bit short in time but will start to change the callback functions as you proposed in a separate PR. It definitively makes sense to get rid of the dots in those function names. |
Thanks very much @mayer79 ! Take your time. I don't think exposing |
@jameslamb : I think we can close the PR for now, thanks for the remainder. We can pick up the ideas later. |
Ok no problem. Thanks as always for your help with LightGBM! |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
This PR fixes #2479, i.e., it exports and documents the callback functions in R.
@jameslamb Currently, passing a list of callback functions to
lgb.train()
etc. does not yet work properly because the "old" parametersearly_stopping_rounds
,eval_freq
etc. override the callback list. How should we deal with this?callbacks
. This will need a couple of changes inlgb.train()
,lgb.cv()
,lightgbm()
.