Hyperparameter Tuning #228

Open
kmishra9 opened this issue Aug 7, 2019 · 2 comments


kmishra9 commented Aug 7, 2019

I'm curious -- as I haven't seen built-in hyperparameter optimization functionality in sl3, is there a recommended way to go about doing that? Right now I'm essentially using caret to tune, then taking the bestTune of every model I fit and plopping those arguments into the appropriate make_learner call. Any plans to build this into sl3, or is the workflow I'm describing essentially the recommended move?

jeremyrcoyle self-assigned this Aug 7, 2019

jeremyrcoyle (Collaborator) commented:

Great question (thanks for all your high-quality feedback lately). As I understand it, you're essentially using caret to do a discrete Super Learner over a grid of hyperparameter values for each learner, and then combining those discrete SLs into a continuous SL (although hopefully not nesting the cross-validation between those two).

We need to better support building out a set of learners over a grid of hyperparameter values, and it's been on the to-do list for far too long (see, e.g., #2). For now, this is unfortunately still a DIY thing.

Having enumerated such a grid for each learner, I think you would be better off just "concatenating" the grids together to form your continuous SL library instead of doing the two-stage SL. If you wanted to enforce sparsity in the set of learners selected by the Super Learner, you could do so by adjusting your metalearner with an appropriate constraint. It's worth having a worked example of this, so I'll plan to add one.
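
To make that concrete, here's a rough DIY sketch (nothing official in sl3 yet, and the grid values below are placeholders rather than a recommended grid): enumerate the grid yourself, instantiate one Lrnr_xgboost per row, and stack those alongside the rest of your library for a single continuous SL:

# purely illustrative grid -- choose values appropriate for your problem
xgb_grid = expand.grid(
    nrounds   = c(50, 100, 200),
    max_depth = c(3, 6),
    eta       = c(0.05, 0.1, 0.3)
)

# one Lrnr_xgboost per grid row, splicing that row in as named parameters
xgb_learners = lapply(seq_len(nrow(xgb_grid)), function(i) {
    do.call(make_learner, c(list(Lrnr_xgboost), as.list(xgb_grid[i, ])))
})

# concatenate the grid-based learners with any others into one Stack and fit one SL;
# swapping Lrnr_nnls for a penalized metalearner (e.g. Lrnr_glmnet) is one way to
# encourage sparsity across a large library
stack         = do.call(make_learner, c(list(Stack), xgb_learners))
super_learner = Lrnr_sl$new(learners = stack, metalearner = make_learner(Lrnr_nnls))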


kmishra9 commented Aug 7, 2019

Ah, so I'm actually using caret like this (which may provide a helpful basis for a worked example or vignette):

library(caret)    # filter(), pull(), and %>% below also assume dplyr is loaded

training_parameters = trainControl(
    method           = "cv",
    number           = 5,
    search           = "random",
    returnData       = TRUE,
    verboseIter      = TRUE,
    predictionBounds = c(0, 150),
    allowParallel    = TRUE
)

model_11 = train(
    x          = final_log_continuous_dataset_caret_boosting_train,
    y          = final_log_continuous_dataset_train %>% filter(num_claims > 9) %>% pull(eGFR),
    method     = "xgbTree",
    weights    = final_log_continuous_dataset_train %>% filter(num_claims > 9) %>% pull(num_eGFRs),
    trControl  = training_parameters,
    tuneLength = 70
)

So some distinctions: random search instead of grid search, and training each discrete model one at a time rather than within the sl3 framework. I think I read a paper somewhere indicating that a tuneLength of > 60 will, on average, reach hyperparameters that are at most 5% from optimal, so 70 is just for extra security. And then I do this with sl3:

library(sl3)

train_task = make_sl3_Task(
    data         = final_log_continuous_dataset_train,
    covariates   = covariates,
    outcome      = outcome,
    outcome_type = "continuous",
    weights      = weights
)

[...]
# splice the tuned values in as named arguments so Lrnr_xgboost picks them up individually
lrnr_xgboost = do.call(make_learner, c(list(Lrnr_xgboost), model_11$bestTune %>% as.list()))

stack         = make_learner(Stack, lrnr_glm, lrnr_randomForest, lrnr_xgboost)
metalearner   = make_learner(Lrnr_nnls)
super_learner = Lrnr_sl$new(learners = stack,
                            metalearner = metalearner)

model_13 = super_learner$train(train_task)

Hope that's helpful! I also tuned the rf model with caret in the exact same way, omitted for brevity.
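
Roughly, that rf call looked like this (the exact call wasn't posted, so the caret method and the name of the x object here are placeholders):

model_12 = train(
    x          = final_log_continuous_dataset_caret_rf_train,   # placeholder for the rf design matrix
    y          = final_log_continuous_dataset_train %>% filter(num_claims > 9) %>% pull(eGFR),
    method     = "rf",                                           # assumed random forest method
    trControl  = training_parameters,
    tuneLength = 70
)
# weights omitted here: not every caret method accepts case weights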
