Result Discrepancy between lifelines CoxTimeVaryingFitter and R survival #1600

Open
opendataminer opened this issue Feb 29, 2024 · 1 comment

Comments

@opendataminer

A very simple test case that illustrates the discrepancy: the fitted coefficient for x is identical in both cases, but the baseline cumulative hazards differ, and even the time index t takes different values. Apologies if this is due to my misuse of either package.

Python Version

import pandas as pd  # needed for pd.DataFrame below
from lifelines import CoxTimeVaryingFitter

test2 = pd.DataFrame(dict(
    start=[1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
    stop =[2, 3, 6, 7, 8, 9, 9, 9,14,17],
    event=[1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    x    =[1, 0, 0, 1, 0, 1, 1, 1, 0, 0 ]
))

s_model = CoxTimeVaryingFitter(penalizer=0)
s_model.fit(test2, event_col='event', start_col='start', stop_col='stop', 
            formula=' ~ x', show_progress=False, robust=False)

s_model.baseline_cumulative_hazard_

Output
image

R Version

library(survival)

test2 <- list(start=c(1, 2, 5, 2, 1, 7, 3, 4, 8, 8),
                   stop =c(2, 3, 6, 7, 8, 9, 9, 9,14,17),
                   event=c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0),
                   x    =c(1, 0, 0, 1, 0, 1, 1, 1, 0, 0) )

cox_pp_00 <- coxph( Surv(start, stop, event) ~ x, test2, robust = FALSE, ties = 'efron', method = 'efron')

basehaz(cox_pp_00, centered=FALSE)

Output
image

@MetzgerSK

For what it's worth: the apparent R/Python mismatch might have the same source as the R/Stata mismatch, as described here.

I mention it only because when I changed basehaz(cox_pp_00, centered=FALSE) in your MWE to basehaz(cox_pp_00, centered=TRUE), the output was identical:

> basehaz(cox_pp_00, centered=FALSE)
     hazard time
1 0.5052761    2
2 0.8409462    3
3 1.0434840    6
4 1.2974621    7
5 1.5514402    8
6 2.0066161    9
7 2.0066161   14
8 2.0066161   17
> basehaz(cox_pp_00, centered=TRUE)
     hazard time
1 0.5052761    2
2 0.8409462    3
3 1.0434840    6
4 1.2974621    7
5 1.5514402    8
6 2.0066161    9
7 2.0066161   14
8 2.0066161   17

The identical output suggests to me that centered isn't touching the part of survival's behavior that gives rise to the R/Stata differences, meaning that behavior is still in play as an explanation for the R/Python discrepancy.
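
To make it easier to see what R is reporting, here is a rough hand calculation of the baseline cumulative hazard for the MWE, in plain Python. It is only a sketch under several assumptions that are mine and not taken from either package's source: the risk set at time t is taken to be the rows with start < t <= stop, the coefficient `beta` is back-solved from the first basehaz row above (hazard 0.5052761 at time 2, where one x=1 row and one x=0 row are at risk), and the function name `manual_cumhaz` and its `ties` argument are hypothetical.

```python
import math

# Same data as the MWE above.
start = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8]
stop  = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17]
event = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
x     = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]

# Assumption: beta is not fitted here. It is back-solved from the first
# basehaz row, where 1 / (exp(beta) + 1) = 0.5052761.
beta = math.log(1 / 0.5052761 - 1)


def manual_cumhaz(beta, ties="efron"):
    """Hypothetical helper: uncentered cumulative baseline hazard at every
    distinct stop time, with Efron or Breslow handling of tied events."""
    risk = [math.exp(beta * xi) for xi in x]
    total, out = 0.0, {}
    for t in sorted(set(stop)):
        # Assumed risk-set convention: intervals are (start, stop].
        at_risk = [i for i in range(len(stop)) if start[i] < t <= stop[i]]
        deaths = [i for i in at_risk if stop[i] == t and event[i] == 1]
        denom = sum(risk[i] for i in at_risk)
        tied = sum(risk[i] for i in deaths)
        d = len(deaths)
        for k in range(d):
            # Efron removes a growing fraction of the tied risk from the
            # denominator; Breslow keeps the full denominator each time.
            frac = k / d if ties == "efron" else 0.0
            total += 1.0 / (denom - frac * tied)
        out[t] = total  # times without events carry the total forward
    return out


efron = manual_cumhaz(beta)
breslow = manual_cumhaz(beta, ties="breslow")
for t in efron:
    print(t, round(efron[t], 6), round(breslow[t], 6))
```

Under these assumptions the Efron column agrees with the eight basehaz rows above to roughly six decimal places, and it is indexed by every distinct stop time (including 14 and 17, where no event occurs). The Breslow column differs only from time 9 onward, the one time with two tied events. Comparing both columns, and the time grid, against s_model.baseline_cumulative_hazard_ might help narrow down which convention differs.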
