pfr(): predict not working for fpc terms #63

Open
fabian-s opened this issue Apr 12, 2016 · 1 comment

@fabian-s

library(refund)
data(DTI)
m <- pfr(pasat ~ fpc(rcst), data=DTI[complete.cases(DTI),][1:100,])
predict(m, newdata = DTI[complete.cases(DTI),][-(1:100),])
# Error in eval(expr, envir, enclos) : object 'X.tmat' not found
# In addition: Warning message:
# In (function (object, newdata, type = "link", se.fit = FALSE, terms = NULL,  :
#  not all required variables have been supplied in  newdata!

@jgellar: sorry to keep filing bugs against your code, but not being able to generate predictions really sucks... something like this may help

@sbrockhaus

The same problem occurs for lf.vd() terms. I want to use a model with a variable-domain covariate for a binary response. To assess prediction accuracy, out-of-bag prediction is essential.

library(refund)
data(sofa)
fit.vd1 <- pfr(death ~ lf.vd(SOFA) + age + los, family="binomial", data=sofa)
pred <- predict(fit.vd1, newdata = sofa)
# Error in eval(expr, envir, enclos) : object 'SOFA.arg' not found

A workaround is to use weights:

## fit the model using weights 
train_ind <- sample(0:1, size = nrow(sofa), replace=TRUE)
fit_train <- pfr(death ~ lf.vd(SOFA) + age + los, family="binomial", data=sofa, 
                 weights = train_ind)
## only keep the predictions with weight 0
pred_oob <- predict(fit_train, type = "response")[train_ind == 0]
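
The same zero-weight idea extends to K folds, giving one out-of-sample prediction per row. The following is only a minimal sketch under stated assumptions: it uses simulated data and glm() as a stand-in for sofa and pfr(), and the names d, fold, and pred_cv are hypothetical. Whether pfr() treats zero-weight rows exactly the way glm() does is not confirmed here.

```r
## Hypothetical stand-in: simulated data and glm() instead of sofa and pfr()
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 * d$x))

K <- 5
fold <- sample(rep(seq_len(K), length.out = n))  # random fold label per row
pred_cv <- rep(NA_real_, n)

for (k in seq_len(K)) {
  d$w <- as.numeric(fold != k)                   # weight 0 for the held-out fold
  fit_k <- glm(y ~ x, family = binomial, data = d, weights = w)
  ## predict() without newdata returns fitted values for all rows,
  ## including those with weight 0; keep only the held-out ones
  pred_cv[fold == k] <- predict(fit_k, type = "response")[fold == k]
}

## out-of-sample Brier score over all rows
mean((d$y - pred_cv)^2)
```

Each model is refit on the full data frame with the held-out fold down-weighted to 0, so no newdata argument is needed.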

But this is rather tedious, and I am not sure how the observations with weight 0 enter the model anyway. Compare with the following fit, which uses only the training data instead of weights. The models fit_train and fit_train_data should then be equivalent.

## compare the model fit with weights to the model fit on the training data only
train_data <- sofa[train_ind == 1, ]
fit_train_data <- pfr(death ~ lf.vd(SOFA) + age + los, family="binomial", data=train_data) 
## the two models should be equivalent, but e.g. the means differ 
fit_train$pfr$datameans
fit_train_data$pfr$datameans
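
One possible explanation for the differing means, offered only as an assumption and not verified against the pfr() source: zero-weight rows drop out of the likelihood, but any summary computed from the data before fitting (such as covariate means) still sees them. The sketch below illustrates that distinction with simulated data and glm() as a stand-in; d, fit_w, and fit_sub are hypothetical names.

```r
## Hypothetical stand-in: simulated data and glm() instead of sofa and pfr()
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 * d$x))
d$w <- sample(0:1, n, replace = TRUE)            # 1 = training row, 0 = held out

fit_w   <- glm(y ~ x, family = binomial, data = d, weights = w)
fit_sub <- glm(y ~ x, family = binomial, data = d[d$w == 1, ])

## zero-weight rows do not contribute to the likelihood, so the
## coefficients of the two fits should agree up to numerical tolerance
all.equal(coef(fit_w), coef(fit_sub), tolerance = 1e-6)

## summaries taken over the data passed to the fit differ, however
mean(d$x)               # all rows, including weight 0
mean(d$x[d$w == 1])     # training rows only
```

If pfr() centers or stores means over all rows of data regardless of weights, that alone would make the datameans of the two fits differ without the fitted coefficients necessarily changing.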
