Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi state #244

Open
wants to merge 84 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
84 commits
Select commit Hold shift + click to select a range
a20b16f
Add additional test for multistate. So far only example
johannespiller Jan 8, 2024
c05ca98
Add temporary folder to upload debugging files, can be deleted when f…
johannespiller Jan 9, 2024
2034a1a
Added split-data.R functions to debug
johannespiller Jan 9, 2024
be55d24
Fix error 'get_cut applied to an object of class list' by specifying …
johannespiller Jan 10, 2024
d40f11a
seperate debugging file in functions and script
johannespiller Jan 10, 2024
1d59898
Added dimensionality tests for as_ped_multistate
johannespiller Jan 15, 2024
56e0de4
initial version of add_trans_prob
johannespiller Feb 26, 2024
f16f5e8
pammtools comparison with beyersmann example
johannespiller Feb 26, 2024
071b8bb
update add_trans_prob and run scripts so that make_newdata with group…
johannespiller Feb 28, 2024
e0137de
add run script of survival package example. excessive runtime for fit…
johannespiller Mar 1, 2024
dc0fd87
include scam class in warn_about_time_points to accept scam splines
johannespiller Mar 1, 2024
ce32ca4
fix type in warning
johannespiller Mar 1, 2024
781bac2
add x-make.scam for scam objects2
johannespiller Mar 1, 2024
2ba2adc
add object class scam
johannespiller Mar 1, 2024
baec409
Bugfix of hard coded dimensions
johannespiller Mar 1, 2024
f603a5a
add all calculated examples
johannespiller Mar 1, 2024
8b4fc0e
add some comments
johannespiller Mar 1, 2024
0124a88
update picture of hgb spline for iwsm abstract
johannespiller Mar 11, 2024
8b7bef3
update transition probability picture
johannespiller Mar 11, 2024
370402a
rename plots with different effects: contour, groups
johannespiller Mar 12, 2024
561d389
push up-to-date code for sample calculation
johannespiller Mar 12, 2024
88e91bf
add comment
johannespiller Mar 12, 2024
aef4a84
add new figure for iwsm abstract with contour lines
johannespiller Mar 14, 2024
b8ee5f8
update code to calculate example
johannespiller Mar 14, 2024
cbc6946
add iwsm files
johannespiller Mar 18, 2024
3eff881
add debugging file of fct trans_prob: improve performance, debugging,…
johannespiller Mar 18, 2024
c05d8fa
Performance update transition matrices, testing on large dataset ndata
johannespiller Mar 21, 2024
7fdcd69
initial simulation file for debugging transition probability code
johannespiller Mar 21, 2024
0d91e77
initial simulation codes for msm
johannespiller Apr 25, 2024
80dce34
added new simulations and extended beyersmann example calculations fo…
johannespiller May 13, 2024
2293cfe
add comparison of mvna results and pammtools for multistate with time…
johannespiller May 22, 2024
7d0f846
up-to-date-code
johannespiller May 22, 2024
8d31416
up-to-date files
johannespiller May 22, 2024
62ab5c1
re-organize simulations
johannespiller May 22, 2024
b840f8c
re-name
johannespiller May 22, 2024
d2df669
add basic example
johannespiller May 22, 2024
3ce8f2d
throw out simulation code from beyersmann msm example
johannespiller May 22, 2024
223b098
bugfixes and minor changes
johannespiller Jun 3, 2024
4272b45
initial comparison example to cox with recurrent events and multistate
johannespiller Jun 4, 2024
da89701
update examples used for IWSM Mini Paper for poster session
johannespiller Jun 17, 2024
26bbc3e
add add_trans_prob
johannespiller Jun 27, 2024
37a3675
update simulation and example codes
johannespiller Jun 27, 2024
ac9ca56
add further simulation results
johannespiller Jun 27, 2024
b6418bf
add further example figures for iwsm and useR
johannespiller Jun 27, 2024
30aef6f
add_trans_prob: wrapper w/o ci
johannespiller Jul 9, 2024
65b408b
include lines of code to debug wrapper
johannespiller Jul 9, 2024
8a32e83
debugging and building of confidence intervals
johannespiller Jul 10, 2024
f02f854
add add_trans_ci function
johannespiller Jul 11, 2024
f68d9b5
include add_trans_ci in add_functions
johannespiller Jul 12, 2024
c7aa047
initial version for multistate vignette with prothr example from work…
johannespiller Jul 12, 2024
43cd435
add keyword to use get_sim_cumu
johannespiller Jul 30, 2024
e5bca5a
optimize add_trans_ci code to reduce data storage
johannespiller Aug 1, 2024
39df399
include AJ estimator
johannespiller Aug 1, 2024
0c8e8c6
update docu
adibender Aug 1, 2024
53ccd8f
Merge remote-tracking branch 'origin/master' into multi-state
adibender Aug 1, 2024
d22858c
push tmp folder
johannespiller Aug 6, 2024
cd32792
merge conflict man
johannespiller Aug 6, 2024
047353a
merge conflict
johannespiller Aug 6, 2024
43e7340
resolve descr and namespace
johannespiller Aug 6, 2024
b9683ec
multi state helpers
johannespiller Aug 6, 2024
c4ea59e
include add_trans_ci in add_trans_prob
johannespiller Aug 6, 2024
ddc8184
resolve error
johannespiller Aug 7, 2024
5323cf6
commit all changes due to errors loading the package
johannespiller Aug 7, 2024
30099bd
remove .Rdata
johannespiller Aug 7, 2024
d01c17c
push most current multi state version
johannespiller Aug 8, 2024
27a3584
comparision of transition probabilities are added
johannespiller Aug 12, 2024
dd2dd6e
change colors of plots
johannespiller Aug 12, 2024
dcb7495
correct wrong data table test
johannespiller Aug 12, 2024
5ddb571
delete print statements
johannespiller Aug 12, 2024
e059733
most current version
johannespiller Aug 12, 2024
c1cd017
include minimax fct to overcome well-definedness issues as a quick fix
johannespiller Aug 30, 2024
7f42eef
include test for add_trans_prob fct
johannespiller Aug 30, 2024
05391f5
adapt split-data.R to work with variable cut. Include event time poin…
johannespiller Sep 12, 2024
579ec35
save debugging file for next weekly
johannespiller Sep 12, 2024
de74e6b
debug split-data max_time. now working for multi state data as well
johannespiller Sep 17, 2024
8df3096
fix error message of tests: outcome var was not defined in cut-points.R
johannespiller Sep 17, 2024
b2e9109
fix error message: include mstate package to load prothr data for tes…
johannespiller Sep 17, 2024
3758ae2
fix error message: include multi-state vignette in pkgdown
johannespiller Sep 17, 2024
dcaef08
fix error with geom_stepribbon: no manual found, copied code from master
johannespiller Sep 18, 2024
2735212
add manuals which have been missing, created using devtools::document()
johannespiller Sep 18, 2024
e5afe01
update namespace and documentation with missing fcts
johannespiller Sep 18, 2024
463ef44
debugging error messages in from package
johannespiller Sep 18, 2024
08ca968
Fix headline and exclude tidyverse lib
johannespiller Sep 18, 2024
ffb5cb4
add msm and mvna library to website
johannespiller Sep 18, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Empty file added .here
Empty file.
7 changes: 5 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -32,19 +32,22 @@ Imports:
Formula,
mvtnorm,
pec,
vctrs (>= 0.3.0)
vctrs (>= 0.3.0),
mstate
Suggests:
testthat
Config/Needs/website:
coxme,
eha,
etm,
scam,
msm,
mvna,
TBFmultinomial
License: MIT + file LICENSE
LazyData: true
URL: https://adibender.github.io/pammtools/
BugReports: https://github.com/adibender/pammtools/issues
RoxygenNote: 7.1.2
RoxygenNote: 7.3.2
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
3 changes: 3 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -61,11 +61,14 @@ export(GeomStepHazard)
export(GeomStepribbon)
export(GeomSurv)
export(add_cif)
export(add_counterfactual_transitions)
export(add_cumu_hazard)
export(add_hazard)
export(add_surv_prob)
export(add_tdc)
export(add_term)
export(add_trans_ci)
export(add_trans_prob)
export(arrange)
export(as_ped)
export(as_ped_multistate)
Expand Down
236 changes: 232 additions & 4 deletions R/add-functions.R
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ add_term <- function(

col_ind <- map(term, grep, x = names(object$coefficients)) %>%
unlist() %>% unique() %>% sort()
is_gam <- inherits(object, "gam")
is_gam <- (inherits(object, "gam") | inherits(object, "scam"))

X <- prep_X(object, newdata, reference, ...)[, col_ind, drop = FALSE]

Expand All @@ -71,24 +71,39 @@ add_term <- function(

}

#' Create model matrix from model object
#' @param object a model object
#' @keywords internal
make_X <- function(object, ...) {

UseMethod("make_X", object)

}

#' @inherit make_X
#' @keywords internal
make_X.default <- function(object, newdata, ...) {

X <- model.matrix(object$formula[-2], data = newdata, ...)

}

#' @inherit make_X
#' @keywords internal
make_X.gam <- function(object, newdata, ...) {

X <- predict.gam(object, newdata = newdata, type = "lpmatrix", ...)

}

#' @inherit make_X
#' @keywords internal
make_X.scam <- function(object, newdata, ...) {

X <- predict.scam(object, newdata = newdata, type = "lpmatrix", ...)

}

prep_X <- function(object, newdata, reference = NULL, ...) {

X <- make_X(object, newdata, ...)
Expand Down Expand Up @@ -231,7 +246,7 @@ get_hazard.default <- function(
type <- match.arg(type)
ci_type <- match.arg(ci_type)

is_gam <- inherits(object, "gam")
is_gam <- (inherits(object, "gam") | inherits(object, "scam"))
if (is.null(time_var)) {
time_var <- ifelse(is_gam, "tend", "interval")
} else {
Expand Down Expand Up @@ -495,7 +510,7 @@ add_ci <- function(

ci_type <- match.arg(ci_type)

is_gam <- inherits(object, "gam")
is_gam <- (inherits(object, "gam") | inherits(object, "scam"))
if (is_gam) {
V <- object$Vp
} else {
Expand Down Expand Up @@ -705,7 +720,7 @@ get_cif.default <- function(
sim_coef_mat,
...) {

is_gam <- inherits(object, "gam")
is_gam <- (inherits(object, "gam") | inherits(object, "scam"))
if (is.null(time_var)) {
time_var <- ifelse(is_gam, "tend", "interval")
} else {
Expand Down Expand Up @@ -750,3 +765,216 @@ get_cif.default <- function(
newdata

}

## Transition Probability Matrix for multi-state data
#' @keywords internal
get_trans_prob <- function(
newdata,
# object,
time_var = NULL,
interval_length = "intlen",
transition = "transition",
tend = "tend",
cumu_hazard = "cumu_hazard",
...) {

# interval_length
assert_character(interval_length)
assert_subset(interval_length, colnames(newdata))
# transition
assert_character(transition)
assert_subset(transition, colnames(newdata))
# time
assert_character(tend)
assert_subset(tend, colnames(newdata))
# cumu_hazard
assert_character(cumu_hazard)
assert_subset(cumu_hazard, colnames(newdata))
assert_data_frame(newdata, all.missing = FALSE)

# assert_class(object, classes = "glm")

interval_length <- sym(interval_length)
transition <- sym(transition)

# include from and to, to obtain transition probability in multidim array
newdata <- newdata %>%
mutate(from = as.numeric(gsub("->.*", "", !!transition))
, to = as.numeric(gsub(".*->", "", !!transition)))

# get unique transitions to build transition matrix
unique_transition <- data.frame(unique(newdata %>% select(!!transition, from, to)))
# get unique time points
unique_tend <- data.frame(unique(newdata %>%
ungroup(!!transition) %>%
select(!!tend)))

# transition matrix
m <- sapply(unique_transition[,c(2,3)], max) + 1 #transition starts at 0, integer of matrix at 1
M <- array(0, dim=c(max(m), max(m), nrow(unique_transition)))


# create transition matrices to be used at every time point,
# multiply matrices with "scalar" alpha_ij_k which is the delta cumu hazard at time t_k for transition i->j

for (iter in 1:nrow(unique_transition)){
M[unique_transition$from[iter] + 1, unique_transition$to[iter] + 1,iter] <- 1
M[unique_transition$from[iter] + 1, unique_transition$from[iter] + 1,iter] <- -1
}

# add cumu hazards to dataset
newdata <- newdata %>%
group_by(!!transition) %>%
mutate(delta_cumu_hazard = cumu_hazard - ifelse(is.na(lag(cumu_hazard)), 0, lag(cumu_hazard)))

# create dA array, to calculate transition probabilities
alpha <- array(rep(0, nrow(unique_tend)*nrow(unique_transition)), dim=c(nrow(unique_tend), nrow(unique_transition)))
I <- array(rep(diag(max(m)), nrow(unique_tend))
, dim=c( max(m), max(m), nrow(unique_tend)))
A <- array(0, dim=c(max(m), max(m), nrow(unique_tend)))
cum_A <- array(0, dim=c(max(m), max(m), nrow(unique_tend)))

# calculate differences in hazards
alpha <- sapply(1:nrow(unique_transition), function(iter){
val <- newdata %>% ungroup() %>% filter(transition == unique_transition[iter,1]) %>% arrange(tend)
val$delta_cumu_hazard
})

for (t in 1:nrow(unique_tend)) {
for (trans in 1:nrow(unique_transition)){
A[,,t] <- A[,,t] + M[,,trans] * alpha[t, trans]
}
}

# # for debugging
# print(A[,,nrow(unique_tend)])

# prepare transition probabilities
A <- I + A

for (iter in 1:nrow(unique_tend)) {
if (iter == 1) {
cum_A[,,iter] = A[,,iter]
} else {
cum_A[,,iter] = cum_A[,,iter-1] %*% A[,,iter] #use matrix multiplikation
}
}

# transform array so that transition probability can be joined via tend and transition
tmp <- cbind(unique_tend
, sapply(1:nrow(unique_transition), function(row) cum_A[unique_transition$from[row] + 1, unique_transition$to[row] + 1, ]))
colnames(tmp) <- c("tend", as.character(unique_transition$transition))
trans_prob_df <- tmp %>%
pivot_longer(cols = c(as.character(unique_transition$transition)), names_to = "transition", values_to = "trans_prob") %>%
mutate(trans_prob = pmin(pmax(trans_prob, 0), 1))

# join probabilities and return matrix
newdata <- newdata %>%
left_join(trans_prob_df, by=c("tend", "transition")) %>%
select(-delta_cumu_hazard, -from, -to)

return(newdata)

}

#' Add transition probabilities
#' @export
add_trans_prob <- function(
newdata
, object
, overwrite = FALSE
, ci = FALSE
, alpha = 0.05
, n_sim = 100L
, time_var = NULL
, interval_length = "intlen",
...
) {


if (!overwrite) {
if ("trans_prob" %in% names(newdata)) {
stop("Data set already contains 'trans_prob' column.
Set `overwrite=TRUE` to overwrite")
}
} else {
rm.vars <- intersect(
c("trans_prob"
, "trans_lower"
, "trans_upper"
),
names(newdata))
newdata <- newdata %>% select(-one_of(rm.vars))
}

# add confidence intervals if wanted
if (ci) {
newdata <- newdata |>
add_trans_ci(object) |>
add_cumu_hazard(object, overwrite = T, ci=F)
} else {
newdata <- newdata |>
add_cumu_hazard(object, overwrite = T, ci=F)
}


old_groups <- dplyr::groups(newdata)
res_data <- newdata %>% ungroup(transition)
out_data <- group_split(res_data) |>
map(res_data, .f = ~ group_by(.x, transition))|>
map(res_data, .f = ~ get_trans_prob(.x)) |>
map(res_data, .f = ~ group_by(.x, !!!old_groups)) |>
bind_rows() |>
select(-cumu_hazard)

return(out_data)

}

#' helper function for add_trans_ci
#' @keywords internal
get_sim_cumu <- function(newdata, ...) {

newdata$cumu_hazard <- cumsum(newdata$intlen * newdata$hazard)

newdata

}

#' Add transition probabilities confidence intervals
#' @keywords internal
add_trans_ci <- function(newdata, object, n_sim=100L, alpha=0.05, ...) {

X <- predict.gam(object, newdata = newdata, type = "lpmatrix")
coefs <- coef(object)
V <- object$Vp

# define groups: 1. all grouping variables -> cumu hazards, 2. all but transition -> trans_prob
groups_array <- group_indices(newdata)
groups_trans <- newdata %>% ungroup(transition) %>% group_indices()

sim_coef_mat <- mvtnorm::rmvnorm(n_sim, mean = coefs, sigma = V)
sim_fit_mat <- apply(sim_coef_mat, 1, function(z)
exp(X %*% z))

# create list with replicated newdata
nlst <- as.list(replicate(n_sim, newdata[,c("tend", "transition", "intlen")], simplify=F))

# add cumu-hazard in each element and calculate trans_prob with perturbed hazards
nlst <- lapply(1:n_sim, function(i) {
nlst[[i]] <- cbind(nlst[[i]], hazard = sim_fit_mat[, i]) # add hazard
# split by group and calculate cumu hazard
nlst[[i]] <- split(nlst[[i]], groups_array) %>% map_dfr(get_sim_cumu)
nlst[[i]] <- split(nlst[[i]], groups_trans) %>% map_dfr(get_trans_prob)

nlst[[i]]
})

sim_trans_probs <- do.call(cbind, lapply(nlst, function(df) df$trans_prob))
newdata$trans_lower <- apply(sim_trans_probs, 1, quantile, probs = alpha / 2)
newdata$trans_upper <- apply(sim_trans_probs, 1, quantile, probs = 1 - alpha / 2)

newdata
}


2 changes: 1 addition & 1 deletion R/as-ped.R
Original file line number Diff line number Diff line change
Expand Up @@ -372,7 +372,7 @@ as_ped_multistate <- function(
dots$transition <- transition
dots$min_events <- min_events
dots$timescale <- timescale

ped <- do.call(split_data_multistate, dots)
attr(ped, "time_var") <- get_lhs_vars(dots$formula)[1]

Expand Down
28 changes: 23 additions & 5 deletions R/get-cut-points.R
Original file line number Diff line number Diff line change
Expand Up @@ -20,8 +20,8 @@ get_cut.default <- function(
event = 1L,
...) {

outcome_vars <- get_lhs_vars(formula)
if (is.null(cut)) {
outcome_vars <- get_lhs_vars(formula)
if (length(outcome_vars) == 2) {
cut <- unique(data[[outcome_vars[1]]][1L * (data[[outcome_vars[2]]]) == event])
} else {
Expand All @@ -33,14 +33,32 @@ get_cut.default <- function(
cut <- cut[cut < max_time]
cut <- c(cut, max_time)
}
} else {
if (length(outcome_vars) == 2) {
# sort interval cut points in case they are not (so that interval factor
# variables will be in correct ordering)
cut <- sort(unique(cut))
} else {
# sort interval cut points in case they are not (so that interval factor
# variables will be in correct ordering)
# add transitions within interval cut points to not lose information
cut <- sort(unique(union(cut, data[[outcome_vars[2]]][1L * (data[[outcome_vars[3]]]) == event &
1L * (data[[outcome_vars[2]]]) < max(cut)])
)
)
}
if (!is.null(max_time)) {
cut <- cut[cut < max_time]
cut <- c(cut, max_time)
}
}
# sort interval cut points in case they are not (so that interval factor
# variables will be in correct ordering)
sort(unique(cut))

return(sort(unique(cut)))

}


#' @rdname get_cut
#' @inherit get_cut
get_cut.list <- function (
data,
formula,
Expand Down
Loading
Loading