Lrnr_rpart$train() works, but as part of stack fails #230

Open
kmishra9 opened this issue Aug 9, 2019 · 5 comments

kmishra9 commented Aug 9, 2019

Hey there,

So as the title indicates, I had a stack with a bunch of learners running within the delayed framework:

stack = make_learner(
    Stack,
    lrnr_glm,
    lrnr_randomForest,
    lrnr_xgboost,
    lrnr_xgboost_limited,
    lrnr_rpart,
    lrnr_svm,
    lrnr_solnp,
    lrnr_earth
)
[...]
scheduled_super_learner = Scheduler$new(
    delayed_object = delayed_learner_train(learner = super_learner, task = train_task),
    job_type = FutureJob,
    nworkers = cpus_logical,
    verbose = TRUE
)

model_13 = scheduled_super_learner$compute()

and got this error:

Error in order(results$index) : argument 1 is not a vector
In addition: There were 11 warnings (use warnings() to see them)
Failed on predict
Error in self$compute_step() : 
  Error in order(results$index) : argument 1 is not a vector

updating chain from ready to running
run:1 ready:0 workers:12
updating chain from running to resolved
Failed on chain
Error in self$compute_step() : Error in self$compute_step() : 
  Error in order(results$index) : argument 1 is not a vector

Removing lrnr_rpart from the stack works, but using lrnr_rpart on the train_task directly also appears to work. 🤷‍♂

No worries if this is unhelpful, vague, or just irrelevant; I'm trying to provide feedback when I run into bugs in case it helps the package mature! For now, I simply removed lrnr_rpart from the stack and continued on my way.

Big fan of the sl3 framework thus far!

jeremyrcoyle self-assigned this Aug 9, 2019

jeremyrcoyle (Collaborator) commented

Thanks. I'll try to reproduce!

nhejazi added the bug label Sep 27, 2019

jeremyrcoyle (Collaborator) commented

Hi @kmishra9, sorry for the long delay. I tried to reproduce the issue with rpart in stacks as follows:

# try to reproduce https://github.com/tlverse/sl3/issues/230
library(sl3)
library(testthat)
library(rpart)

# define test dataset
data(mtcars)
task <- sl3_Task$new(mtcars, covariates = c(
  "cyl", "disp", "hp", "drat", "wt", "qsec",
  "vs", "am", "gear", "carb"
), outcome = "mpg")


lrnr_rpart <- Lrnr_rpart$new()
lrnr_mean <- Lrnr_mean$new()
stack <- Stack$new(lrnr_rpart, lrnr_mean)

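# train the stack, then predict on the training task
# (named `preds` to avoid masking base R's predict())
stack_fit <- stack$train(task)
preds <- stack_fit$predict()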

But I wasn't able to. I understand it may not be possible due to private data or other concerns, but I think I'll need an MRE (minimal reproducible example) to identify the issue here. Until then, I'll close this issue.

jeremyrcoyle (Collaborator) commented

With categorical data, this happens because Lrnr_rpart needs to pack_predictions. Will fix ASAP.

jeremyrcoyle (Collaborator) commented

It seems this is affecting Lrnr_ranger as well.

nhejazi (Member) commented Mar 2, 2021

Has this been resolved, @jeremyrcoyle?

nhejazi changed the title from "lrnr_rpart$train works, but as part of stack fails" to "Lrnr_rpart$train() works, but as part of stack fails" on Mar 2, 2021