Hyperparameter Tuning #228

Open
kmishra9 opened this issue Aug 7, 2019 · 2 comments


kmishra9 commented Aug 7, 2019

I'm curious -- as I haven't seen built-in hyperparameter optimization functionality in sl3, is there a recommended way to go about doing that? Right now I'm essentially using caret to tune, then taking the bestTune of every model I fit and plopping those arguments into the appropriate make_learner call. Any plans to build this into sl3, or is the workflow I'm describing essentially the recommended move?

jeremyrcoyle self-assigned this Aug 7, 2019

jeremyrcoyle (Collaborator) commented:

Great question (thanks for all your high-quality feedback lately). As I understand it, you're essentially using caret to do a discrete Super Learner over a grid of hyperparameter values for each learner, and then combining those discrete SLs into a continuous SL (although hopefully not nesting the cross-validation between those two).

We need to better support building out a set of learners over a grid of hyperparameter values, and it's been on the to-do list for far too long (see, e.g., #2). For now, this is unfortunately still a DIY thing.

Having enumerated such a grid for each learner, I think you would be better off just "concatenating" the grids together to form your continuous SL library instead of doing the two-stage SL. If you wanted to enforce sparsity in the set of learners selected by the Super Learner, you could do so by adjusting your metalearner with an appropriate constraint. It's worth having a worked example of this, so I'll plan to add one.
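
To make that concrete, here's a rough DIY sketch (nothing official in sl3 yet, and the grid values below are placeholders rather than a recommended grid): enumerate the grid yourself, instantiate one Lrnr_xgboost per row, and stack those alongside the rest of your library for a single continuous SL:

# purely illustrative grid -- choose values appropriate for your problem
xgb_grid = expand.grid(
    nrounds   = c(50, 100, 200),
    max_depth = c(3, 6),
    eta       = c(0.05, 0.1, 0.3)
)

# one Lrnr_xgboost per grid row, splicing that row in as named parameters
xgb_learners = lapply(seq_len(nrow(xgb_grid)), function(i) {
    do.call(make_learner, c(list(Lrnr_xgboost), as.list(xgb_grid[i, ])))
})

# concatenate the grid-based learners with any others into one Stack and fit one SL;
# swapping Lrnr_nnls for a penalized metalearner (e.g. Lrnr_glmnet) is one way to
# encourage sparsity across a large library
stack         = do.call(make_learner, c(list(Stack), xgb_learners))
super_learner = Lrnr_sl$new(learners = stack, metalearner = make_learner(Lrnr_nnls))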


kmishra9 commented Aug 7, 2019

Ah, so I'm actually using caret like this (which may provide a helpful basis for a worked example or vignette):

library(caret)    # filter(), pull(), and %>% below also assume dplyr is loaded

training_parameters = trainControl(
    method           = "cv",
    number           = 5,
    search           = "random",
    returnData       = TRUE,
    verboseIter      = TRUE,
    predictionBounds = c(0, 150),
    allowParallel    = TRUE
)

model_11 = train(
    x          = final_log_continuous_dataset_caret_boosting_train,
    y          = final_log_continuous_dataset_train %>% filter(num_claims > 9) %>% pull(eGFR),
    method     = "xgbTree",
    weights    = final_log_continuous_dataset_train %>% filter(num_claims > 9) %>% pull(num_eGFRs),
    trControl  = training_parameters,
    tuneLength = 70
)

So some distinctions: random search instead of grid search, and training each discrete model one at a time rather than within the sl3 framework. I think I read a paper somewhere indicating that a tuneLength of > 60 will, on average, reach hyperparameters that are at most 5% from optimal, so 70 is just for extra security. And then I do this with sl3:

library(sl3)

train_task = make_sl3_Task(
    data         = final_log_continuous_dataset_train,
    covariates   = covariates,
    outcome      = outcome,
    outcome_type = "continuous",
    weights      = weights
)

[...]
# splice the tuned values in as named arguments so Lrnr_xgboost picks them up individually
lrnr_xgboost = do.call(make_learner, c(list(Lrnr_xgboost), model_11$bestTune %>% as.list()))

stack         = make_learner(Stack, lrnr_glm, lrnr_randomForest, lrnr_xgboost)
metalearner   = make_learner(Lrnr_nnls)
super_learner = Lrnr_sl$new(learners = stack,
                            metalearner = metalearner)

model_13 = super_learner$train(train_task)

Hope that's helpful! I also tuned the rf model with caret in the exact same way, omitted for brevity.
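
Roughly, that rf call looked like this (the exact call wasn't posted, so the caret method and the name of the x object here are placeholders):

model_12 = train(
    x          = final_log_continuous_dataset_caret_rf_train,   # placeholder for the rf design matrix
    y          = final_log_continuous_dataset_train %>% filter(num_claims > 9) %>% pull(eGFR),
    method     = "rf",                                           # assumed random forest method
    trControl  = training_parameters,
    tuneLength = 70
)
# weights omitted here: not every caret method accepts case weights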
