add_cumu_hazard does not aggregate hazard values correctly if input data are not sorted correctly #227
Comments
But in example 2 you additionally have to group by celltype, don't you?
Yes, but it's simply not documented that you have to do that, and that you get garbage otherwise.
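To make the effect concrete, here is a minimal base R sketch with made-up numbers. It assumes the cumulative hazard is accumulated as a running sum of hazard times interval length; the data frame and its column values are hypothetical and only mimic the shape of the data discussed here.

```r
# Toy piece-wise exponential data (hypothetical values):
# two celltype groups, three intervals each, stacked on top of each other.
ped <- data.frame(
  celltype = rep(c("a", "b"), each = 3),
  tend     = rep(c(1, 2, 3), times = 2),
  intlen   = 1,
  hazard   = c(0.1, 0.2, 0.3, 0.5, 0.5, 0.5)
)

# Without grouping, the running sum carries over from group "a" into group "b".
ped$cumu_naive <- cumsum(ped$hazard * ped$intlen)

# With grouping, the running sum restarts within each celltype.
ped$cumu_grouped <- ave(ped$hazard * ped$intlen, ped$celltype, FUN = cumsum)

print(ped)
```

In this toy data the first row of group "b" gets 1.1 in the ungrouped column but 0.5 in the grouped one, which is the kind of silent error described above.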
I see, yes. I thought about it, but I'm not sure how to handle it. There is no way to know that a variable is a grouping factor rather than, e.g., a time-dependent covariate.
@fabian-s Any ideas how to do it technically? Check if nrow(unique(newdata)) != length(unique(tend)) and, if so, group internally by all variables that are not "PED" variables?
Wouldn't that break if time-dependent covariates with any duplicated values are present, like you said yourself 2 comments above? If that can be avoided, that seems like a reasonable fix. Since this seems fragile and opaque, I'd suggest making prepping the input data the user's responsibility: verify that the input data for EDIT: something like:
BTW: code above should probably check against the actual minimal
So my thinking was: if the same tend value appears multiple times, then we should group, such that each sequence of non-unique time points is one group. In the case of TDCs with a concurrent effect, we'd have multiple values of the covariate but a unique sequence of time points, so no grouping.
Yep, probably need to go back to the drawing board...
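A rough base R sketch of that idea follows. The helper name `run_id` is hypothetical, and the sketch assumes the rows of each sequence are contiguous and already in time order, so it does not help with unsorted input.

```r
# Hypothetical helper: start a new group whenever tend fails to increase,
# i.e. whenever a new sequence of time points begins.
run_id <- function(tend) {
  if (length(tend) == 0) return(integer(0))
  cumsum(c(TRUE, diff(tend) <= 0))
}

tend_grouped <- c(1, 2, 3, 1, 2, 3)  # two stacked sequences: two groups
tend_tdc     <- c(1, 2, 3, 4)        # one unique sequence: a single group

run_id(tend_grouped)
run_id(tend_tdc)
```

A time-dependent covariate that changes along `tend_tdc` would not trigger any grouping here, because only the time points are inspected.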
I don't have enough detailed context to know what the PED for cumulative effects etc. would look like, but this all seems to reinforce my general point above: it would be cleaner to check the minimal conditions the input data HAS to satisfy instead of trying to guess what the user may have wanted to achieve and then automagically modifying the input accordingly -- these conditions include at least:
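As an illustration of the check-instead-of-guess approach, and not of the specific conditions meant above, a base R sketch could refuse input whose tend values are not strictly increasing within each group the user declares. The function name `check_ped_order` and its arguments are hypothetical.

```r
# Hypothetical validator: stop unless the time variable is strictly
# increasing within each user-declared group.
check_ped_order <- function(data, group_vars = character(0), time_var = "tend") {
  groups <- if (length(group_vars) > 0) {
    interaction(data[group_vars], drop = TRUE)
  } else {
    factor(rep(1L, nrow(data)))  # no grouping declared: one single group
  }
  ok <- vapply(
    split(data[[time_var]], groups),
    function(tt) all(diff(tt) > 0),
    logical(1)
  )
  if (!all(ok)) {
    stop("'", time_var, "' must be strictly increasing within each group; ",
         "sort the data and declare grouping variables first.", call. = FALSE)
  }
  invisible(data)
}

stacked <- data.frame(celltype = rep(c("a", "b"), each = 3),
                      tend     = rep(1:3, times = 2))

check_ped_order(stacked, group_vars = "celltype")   # passes silently
res <- try(check_ped_order(stacked), silent = TRUE) # errors: no groups declared
```

The point of the design is that the function never reorders or regroups anything itself; it only tells the user that the input does not meet the stated condition.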
I think prediction of S(t), H(t), CIF(t) etc. at specific time points is not an unreasonable request, for example for evaluation measures. We do have this functionality here: https://github.com/adibender/pammtools/blob/master/R/predict.R I'll think about it. Thanks for the input 🙏🏻
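For H(t) and S(t), one way this could look in base R is sketched below. It relies on the fact that a piece-wise constant hazard gives a piece-wise linear cumulative hazard, so interpolating between interval end points is exact. The numbers and the helper names `cumu_hazard_at` and `surv_at` are hypothetical, and this is not necessarily how the linked predict.R does it.

```r
# Hypothetical interval end points and piece-wise constant hazards.
tend   <- c(1, 2, 3)
hazard <- c(0.1, 0.2, 0.3)
intlen <- diff(c(0, tend))

# Cumulative hazard at the interval end points, with H(0) = 0 prepended.
knots_t <- c(0, tend)
knots_H <- c(0, cumsum(hazard * intlen))

# H(t) is linear between end points, so linear interpolation is exact
# for 0 <= t <= max(tend); times outside that range yield NA.
cumu_hazard_at <- function(t) approx(knots_t, knots_H, xout = t)$y
surv_at        <- function(t) exp(-cumu_hazard_at(t))

cumu_hazard_at(c(0.5, 1.5, 3))
surv_at(2.5)
```

With grouped data, the same construction would have to be applied separately within each group.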
agree. to enable this, it seems feasible to -- internally, automatically --
I think this strategy might even have worked with the "pam_fit_wrong" input data in the top example...
Created on 2023-03-08 with reprex v2.0.2