Hello,

I am following the tutorial and trying to look at the difference between CV-TMLE and TMLE with the perinatal dataset.

perinatal.csv

To keep things simple, I only use a glm as the model for both the propensity score and the outcome mean. I am surprised to see that the output is exactly the same for both procedures. The CV-TMLE seems to complain about glm not being "CV-aware", which might be the reason. However, I don't understand why that should be the case. My understanding of CV-TMLE is that:

1. The dataset should be split into V folds.
2. The glm models (for both A and Y) should be fitted on each split, so we should have V instantiations of each glm, each trained on a different split.
3. The targeting step is pooled over the predictions of the V glm model pairs on their respective validation sets.
4. The final estimate is the average of the estimates across validation folds.
5. The variance is estimated from the influence curve (I am not entirely sure whether it is pooled across validation samples or whether one variance estimate is made per fold and the estimates are then averaged).

As I understand it, we could have used a Super Learner instead of a GLM, which would have resulted in a nested cross-validation procedure, but a Super Learner is not a requirement of CV-TMLE. The code to reproduce is below: you can tweak the `learner_list` to switch to a super learner, and then two different outputs are returned and no "CV-aware" complaint is raised.

I would appreciate some clarification on the procedure and why this is happening! Thanks!
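
To make the CV-TMLE steps listed in this issue concrete, here is a minimal sketch of the cross-fitting structure as I understand it. It is not the reproduction code mentioned above and does not use the tutorial library's API: it is plain Python on simulated data, and every name in it (`simulate`, `fit_cell_means`, `cv_tmle_ate`) is hypothetical. Smoothed cell means stand in for the glm, the parameter is assumed to be the average treatment effect with binary A and Y, and the variance uses the pooled influence curve, which is only one of the two options raised in the last step.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def simulate(n, seed=0):
    """Toy data (hypothetical): binary covariate W, treatment A, outcome Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = int(rng.random() < 0.5)
        a = int(rng.random() < expit(-0.5 + w))
        y = int(rng.random() < expit(-1.0 + a + 0.5 * w))
        rows.append((w, a, y))
    return rows


def fit_cell_means(rows):
    """Stand-in learner: smoothed cell means for g(W)=P(A=1|W) and Q(A,W)=E[Y|A,W]."""
    def smoothed_mean(values):
        # The +0.5 / +1 smoothing keeps every estimate strictly inside (0, 1).
        return (sum(values) + 0.5) / (len(values) + 1.0)

    g = {w: smoothed_mean([a for (w2, a, _) in rows if w2 == w]) for w in (0, 1)}
    q = {(a, w): smoothed_mean([y for (w2, a2, y) in rows if (a2, w2) == (a, w)])
         for a in (0, 1) for w in (0, 1)}
    return g, q


def cv_tmle_ate(rows, n_folds=5):
    n = len(rows)
    # Step 1: assign each observation to one of V validation folds.
    fold_of = [i % n_folds for i in range(n)]

    # Step 2: one (g, Q) pair per fold, fitted on the training part only.
    fits = []
    for v in range(n_folds):
        train = [row for row, f in zip(rows, fold_of) if f != v]
        fits.append(fit_cell_means(train))

    # Out-of-fold predictions: each row is scored by the pair that never saw it.
    obs = []
    for (w, a, y), v in zip(rows, fold_of):
        g, q = fits[v]
        h = a / g[w] - (1 - a) / (1.0 - g[w])  # clever covariate
        obs.append((v, a, y, g[w], q[(1, w)], q[(0, w)], h))

    # Step 3: a single fluctuation parameter eps, fitted on the pooled
    # validation predictions by Newton's method on the logistic score.
    eps = 0.0
    for _step in range(50):
        score = info = 0.0
        for (v, a, y, g1, q1, q0, h) in obs:
            p = expit(logit(q1 if a else q0) + eps * h)
            score += h * (y - p)
            info += h * h * p * (1.0 - p)
        if abs(score) < 1e-10:
            break
        eps += score / info

    # Step 4: targeted plug-in estimate per validation fold, then averaged.
    fold_sums = [0.0] * n_folds
    fold_sizes = [0] * n_folds
    targeted = []
    for (v, a, y, g1, q1, q0, h) in obs:
        q1s = expit(logit(q1) + eps / g1)
        q0s = expit(logit(q0) - eps / (1.0 - g1))
        targeted.append((a, y, h, q1s, q0s))
        fold_sums[v] += q1s - q0s
        fold_sizes[v] += 1
    psi = sum(s / m for s, m in zip(fold_sums, fold_sizes)) / n_folds

    # Step 5 (assumption): variance from the influence curve pooled over all rows.
    ic = [h * (y - (q1s if a else q0s)) + (q1s - q0s) - psi
          for (a, y, h, q1s, q0s) in targeted]
    se = math.sqrt(sum(c * c for c in ic) / n / n)
    return {"psi": psi, "se": se, "eps": eps, "mean_ic": sum(ic) / n}


if __name__ == "__main__":
    result = cv_tmle_ate(simulate(1000, seed=1), n_folds=5)
    print(result["psi"], result["se"])
```

The part that matters for this question is step 2: the initial predictions come from V separately fitted model pairs, each evaluated only on its own validation fold. If the learner were instead fitted once on the full data and used to predict on that same data, nothing would distinguish the cross-validated procedure from the ordinary one.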