-
Notifications
You must be signed in to change notification settings - Fork 1
/
pbc.Rnw
470 lines (430 loc) · 20.1 KB
/
pbc.Rnw
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
\documentclass{article}[11pt]
\usepackage{amsmath}
\addtolength{\textwidth}{1in}
\addtolength{\oddsidemargin}{-.5in}
\setlength{\evensidemargin}{\oddsidemargin}
\newcommand{\code}[1]{\texttt{#1}}
\title{Time dependent covariates and absolute risk}
\author{Terry Therneau}
\date{19 Sept 2023}
\begin{document}
\maketitle
<<setup, echo=FALSE>>=
library(knitr)
library(survival)
library(ggplot2)
opts_chunk$set(comment=NA, tidy=FALSE, highlight=FALSE, echo=FALSE,
fig.width=4.5, fig.height=3, fig.path="figures/",
device="pdf", dev.args=list(pointsize=8),
cache=FALSE, background="#ffffff",
prompt=TRUE,
strip.white=FALSE, mymar=TRUE)
options(contrasts= c("contr.treatment", "contr.poly"),
show.signif.stars = FALSE, continue=" ", width=65)
# because "mymar" is set to TRUE above, this hook will be run on every chunk.
# it sets the margins, font, etc to values that work
knit_hooks$set(mymar= function(before, options, envir) {
if (before) {
look.for <- c("mar", "font", "cex") # options we want
plist <- options[match(look.for, names(options), nomatch=0)]
if (is.null(plist$mar)) plist$mar <- c(4, 4, .5, .5)
if (is.null(plist$cex)) plist$cex <- 1
do.call("par", plist)
} else NULL
})
@
\section{Introduction}
A common question, and pitfall, is how to create survival curves from a
Cox model with time-dependent covariates.
A survival curve is a statement, at some given point in time, of predicted
future survival given known covariates \emph{at that time}.
When the model includes time-dependent covariates, such a prediction requires,
as well, joint prediction of future covariate paths for any time-dependent
covariates.
Standard software does not provided this joint prediction, instead there is
wide opportunity to produce incorrect or misleading results.
(Note that this is a different question than time-dependent coefficients, which
appear in a discussion of proportional hazards.)
As an illustration, we reprise an example from Section 10.2.4 of
\cite{Therneau00}.
<<pbcurve1>>=
# Create the time1, time2 data set
pdat0 <- subset(pbcseq, !duplicated(id))
pdat0$bili3 <- cut(pdat0$bil, c(0, 1, 5, 100), c("normal", "1-5", ">5"))
pdata <- tmerge(pdat0[,c(1,4:6)], pdat0, id=id,
death = event(futime, status==2),
options= list(tstart="day1", tstop="day2"))
pdata <- tmerge(pdata, pbcseq, id=id, edema = tdc(day, edema),
bili= tdc(day, bili), albumin = tdc(day, albumin),
protime = tdc(day, protime))
pdata <- survSplit(Surv(day1, day2, death)~ ., pdata, cut=c(365, 365*3),
episode="enum")
pdata$year <- factor(pdata$enum, c(3,1,2), c("3+", "0-1", "1-3"))
pdata$enum <- NULL # no longer needed
pdata$age1 <- pdata$age + pdata$day1/365.25
pdata$age2 <- pdata$age + pdata$day2/365.25
pdata$bili3 <- cut(pdata$bili, c(0, 1, 5, 100), c("normal", "1-5", ">5"))
# Now create the multistate data set using bilirubin groups
# There is already a new obs whenever bilirubin changes
new <- which(diff(pdata$id)==0 & diff(as.numeric(pdata$bili3))!=0)
temp <- 4*(pdata$death) # vector of 0 or 4
temp[new] <- as.integer(pdata$bili3)[new+1]
pdata$bstate <- factor(temp, 0:4, c("none", "normal", "1-5", ">5", "death"))
# pfit0 = the original pbc model
# pfit1 = non-time dependent fit, updated data
# pfit2 = time-dependent fit
# pfit3 = time-dependent, on age scale
# pcheck = verify that years since enrollment has only minor impact
pfit0 <- coxph(Surv(time, status==2) ~ age + edema + log(bili) + albumin +
log(protime), pbc, x=TRUE)
pfit1 <- coxph(Surv(futime, status==2) ~ age + edema + log(bili) + albumin +
log(protime), pdat0, x=TRUE)
pfit2 <- coxph(Surv(day1, day2, death) ~ age + edema + log(bili) +
albumin + log(protime), pdata)
pfit3 <- coxph(Surv(age1, age2, death) ~ edema + log(bili) + albumin +
log(protime), pdata)
pcheck <- update(pfit3, . ~ .+ year)
# anova(pfit3, pcheck)
@
\begin{table} \centering
\begin{tabular}{ccccc}
&All 424 & 312 & 312 \\ &Time fixed & Time fixed & Time dependent \\ \hline
<<coeftable, results="asis">>=
iqr <- apply(pfit1$x, 2, function(x) diff(quantile(x, c(.25, .75))))
iqr[2] <- 1 #doesn't work for edema
temp <- iqr* cbind(pfit0$coef, pfit1$coef, pfit2$coef, c(0, pfit3$coef))
temp <- abs(temp)
temp2 <- c(concordance(pfit0)$concordance, concordance(pfit1)$concordance,
concordance(pfit2)$concordance, concordance(pfit3)$concordance)
cat("age &", paste(sprintf("%4.2f", temp[1,1:3]), collapse= ' & '), "\\\\ \n")
for (i in 2:5)
cat(rownames(temp)[i], '&', paste(sprintf("%4.2f", temp[i,]),
collapse= ' & '), "\\\\ \n")
cat("\\hline concordance&", sprintf("%4.2f", temp2), "\n")
@
\end{tabular}
\caption{Standardized coefficients for the original PBC model using all
424 subjects,
and using the 312 study subjects with either time fixed or time dependent
predictors on entry scale, and then time-dependent on age scale.
These represent the effect of a 1 unit change in edema, and a change
from the 25th to 75th percentile for the other 4.}
\label{pbctab1}
\end{table}
This study recuited subjects with primary biliary cholangitis to a placebo
controlled trial of D-penicillamine. PBC is a chronic
condition that at the time had no effective therapy.
During the recruitment period 424 patients met the eligivility criteria;
312 agreed to participate fully and another 108 to initial laboratory
measurements and long term follow-up.
Sequential laboratory data and further follow-up is available on the 312
enrollees. The survival package data set \code{pbc} contains baseline values
and survival for all 418 and \code{pbcseq} the sequential lab values.
As background, table \ref{pbctab1} shows the coefficients for each
of the 5 variables used in the risk score model of
Dickson et al (doi: 10.1002/hep.1840100102),
3 models fit to the extended data, and the model concordance.
The models using baseline data for all 418 and the 312 in the study
have essentially the
same predictive power, but the time-dependent covariate models are clearly
stronger.
Bilirubin is the most important predictor: a subject at the 75th percentile
has exp(1.2) $\approx 3.3$ fold hazard as someone at the 25th percentile of
bilirubin.
As an aside, the coefficients for enrollment time, in the age scale model,
are not significant but also not trivial (estimated 1.3 increase).
But the first 5 years' survival for the study and expanded cohort essentially
overlap.
<<coef, echo=TRUE>>=
print(pcheck, digits=2)
@
\begin{figure}
<<pbcfig0>>=
p1 <- survfit(Surv(time, status==2) ~1, pbc)
p2 <- survfit(Surv(futime, status==2)~1, pdat0)
plot(p1, xmax= 5*365.25, xscale= 365.25, conf.int=F, fun='event', lwd=2,
xlab="Years from randomization", ylab="Death")
lines(p2, col=2, lty=2, lwd=2, fun='event', conf.int=FALSE)
legend(100, .25, c("All 418", "Randomized 312"), lty=1:2, col=1:2, lwd=2)
@
\caption{Comparison of the 312 randomized to the full cohort.}
\label{pbcfig0}
\end{figure}
\begin{figure}
<<peffect>>=
scores <- pfit1$x %*% diag(coef(pfit1))
scores <- scale(scores, scale=FALSE) # subtract the mean
dscore <- data.frame(eta= c(scores), variable=names(pfit1$coef)[col(scores)])
ggplot(dscore, aes(x=variable, y=eta)) +
geom_violin(trim=FALSE, fill='lightblue')+
geom_boxplot(width=0.03) + theme_minimal()
@
\caption{Relative effect of each covariate on the risk score $\eta$.}
\label{peffect}
\end{figure}
\begin{figure}
<<pbcfig1>>=
temp <- survcondense(Surv(age1, age2, death) ~ bili+ albumin, id=id, pdata)
count <- table(temp$id)
p10 <- subset(temp, id %in% names(count)[count>9]) # 10 or more
idx <- unique(p10$id)
oldpar <- par(mfrow=c(2,2), mar=c(5,5,1,1))
for (j in 1:4) {
plot(bili ~ age1, p10, log='y',type='n', xlab="Age")
for (i in seq(j, length(idx), by=4)) {
who <- (p10$id == idx[i])
lines(bili ~age1, p10, subset= who, col= 1 + i%%7)
k <- who & (p10$death==1)
if (any(k)) points(p10$age1[k], p10$bili[k], pch=1,
col= 1 + i%%7)
}
abline(h=1.3, lty=3)
}
par(oldpar)
@
\caption{Bilirubin trajectories for the 53 subjects with 10 or more laboratory
measurements, separated into 4 panels to decrease overlap. A bilirubin
of $\le 1$ is normal, and above 1.3 begins to be a cause for concern
(the horizontal dotted line). Deaths are marked with a circle.}
\label{pbcfig1}
\end{figure}
Figure \ref{pbcfig1} helps explain the superiority of the time-dependent model.
To paraphase one of the MD investigators, the liver has a moderate amount of
excess capacity. As the disease's inflammatory process continues, it steadily
converts functioning tissue to scar, and liver function tests slowly rise to a
point but then rapidly increase when that excess capacity is exhausted.
At the start
of the trial subjects ranged from early to late disease; as time goes on
we see that many of them experience the bilirubin acceleration.
\subsection{KM curves}
\begin{figure}
<<pkm1>>=
km1 <- survfit(Surv(futime, status==2) ~ bili3, pdat0)
km2 <- survfit(Surv(day1, day2, death) ~ bili3, pdata)
aj3 <- survfit(Surv(day1, day2, bstate) ~ 1, pdata, id=id, istate=bili3)
aj3a <- survfit(Surv(day1, day2, bstate) ~ 1, pdata, id=id, istate=bili3,
p0=c(1,0,0,0))
aj3b <- survfit(Surv(day1, day2, bstate) ~ 1, pdata, id=id, istate=bili3,
p0=c(0,1,0,0))
aj3c <- survfit(Surv(day1, day2, bstate) ~ 1, pdata, id=id, istate=bili3,
p0=c(0,0,1,0))
plot(km1, col=1:3, lwd=2, fun="event", xscale= 365.25, xmax=12*365.25, ylim=0:1,
xlab="Years from randomization", ylab="Death")
lines(km2, col=1:3, lwd=2, lty=2, fun="event")
#lines(aj3a[4], col=1, lwd=2, lty=3, conf.int=F)
#lines(aj3b[4], col=2, lwd=2, lty=3, conf.int=F)
#lines(aj3c[4], col=3, lwd=2, lty=3, conf.int=F)
legend(5,1, c("bilirubin >5", "bilirubin 1-5", "normal"), col=3:1, lty=1,
lwd=2, bty='n')
@
\caption{Survival from baseline for the 3 bilirubin groups. Solid lines are
the Kaplan-Meier, dashed are the ``extended Kaplan-Meier''.}
\label{pkm1}
\end{figure}
For illustration we will divide the subjects into 3 groups: normal bilirubin
of $\le 1$, 1--5 and 5+.
Figure \ref{pkm1} shows the standard Kaplan-Meier for each of the 3 subgroups
as a solid line, divided by bilirubin level at baseline.
The dashed lines are the ``extended Kaplan-Meier'' recommended by Snapinn et al
\cite{Snapinn05}.
These curves make direct use of the time-dependent data set: at each time point
the increment to bili$>$5 curve is based on the set of subjects
currently at risk and \emph{currently} having a bilirubin level that is $>5$.
In the usual KM the increment is based on the initial bilirubin.
The impact on counts is shown in table \ref{pkm2}.
Notice that nearly all the death (100/140) are now attributed to the
high bilirubin group.
\begin{table} \centering
\begin{tabular}{cccccc}
& \multicolumn{4}{c}{Number at risk} & Total \\
& 0 & 4y & 8y & 12y & Deaths \\ \hline
\multicolumn{1}{l}{Usual KM\phantom{MMMMM}} \\
<<pkm2, results="asis">>=
temp1 <- summary(km1, times=c(0,4,8,12)*365.25, extend=TRUE)
#nrisk is currently incorrect for a time-dependent KM
rcount <- function(time, data=pdata) {
if (time==0) atrisk <- (data$day1==0) else
atrisk <- (data$day1 < time) & (data$day2 >= time)
table(data$bili3[atrisk])
}
temp2 <- c(rcount(0), rcount(4*365.25), rcount(8*365.25), rcount(12*365.25))
temp3 <- rbind(matrix(temp1$n.risk, ncol=4, byrow=TRUE),
matrix(temp2, ncol=4, byrow=FALSE))
temp4 <- table(pdat0$status, pdat0$bili3)
temp5 <- table(pdata$death, pdata$bili3)
temp3 <- cbind(temp3, c(temp4[3,], temp5[2,]))
xx <- c("Normal", "1--5", "$\\>$5")
for (i in 1:3)
cat(xx[i], " & ", paste(temp3[i,], collapse= "&"), "\\\\ \n")
cat("\\multicolumn{1}{l}{Extended KM} \\\\ \n")
for (i in 1:3)
cat(xx[i], " & ", paste(temp3[i+3,], collapse="& "), "\\\\\n")
@
\end{tabular}
\caption{Number at risk a 0, 4, 8, and 12 years along with the total
number of deaths for each group, for the two estimators.}
\label{pkm2}
\end{table}
How do we interpret these curves? The simple KM is simple: each curve is
an estimate of the future survival, from randomization, for a set of subjects
in the given state at randomization. No use is made of the follow-up
lab values.
For the extended KM, the argument is made that the normal bilirubin curve
estimates the survival of subjects who start and remain in that state, i.e.,
their bilirubin never rises above 1, and likewise that curve 2 represents
subjects whose bilirubin remains between 1 and 5. A more cautionary note
is provided in \cite{Sjolander20}, who looks at the risk sets more carefully
from a causal models perspective; who finds that underlying premise that those
currently in group 1--3 represent subjects who are always in that state requires
additional strong assumptions.
Our view is more simple: even if the curve can estimate what it claims to
estimate,
of what use is it? In this disease the liver status will invariably
seriously decline over time; the estimator has created curves for someone who
does not exist.
\begin{figure}
<<pkm3>>=
oldpar <- par(mfrow=c(2,2))
par(mar=c(1,1,1,1))
states <- c("normal\nbilirubin", "1-5", ">5", "death")
smat <- matrix(0L, 4,4, dimnames= list(states, states))
smat[1:3,4] <- 1
smat[1,2] <- smat[2,3] <- 1
smat[2,1] <- smat[3,2] <- 1.4
statefig(matrix(c(3,1), ncol=1), smat, alty=c(2,1,2,1,1,1,1))
par(mar=c(5,5,1,1))
plot(aj3a, col=1:4, lwd=2, xscale=365.25, xlab="Years from randomization",
ylab="P(state)")
abline(0,0,lty=3)
plot(aj3b, col=1:4, lwd=2, xscale=365.25, xlab="Years from randomization",
ylab="P(state)")
legend(4*365, 1, c("Normal","1-5", ">5","Death"), col=1:4, lty=1, bty='n')
abline(0,0,lty=3)
plot(aj3c, col=1:4, lwd=2, xscale=365.25, xlab="Years from randomization",
ylab="P(state)")
abline(0,0,lty=3)
par(oldpar)
@
\caption{Potential state space for the PBC data, along with Aalen-Johansen
estimates assuming that everyone starts in the normal state (upper right),
the bilirubin 1--5 state (lower left), or bilirubin $>5$ state
(lower right).
}
\label{pkm3}
\end{figure}
An alternative that does make use of the evolving laboratory data, but also
estimates a quantity of direct interest, is a multi-state model shown in
figure \ref{pkm3}. Figure \ref{pkm4} shows the predictions for each starting
state along with the simple KM.
The multi-state curve for the 'normal bilirubin state has an increment, at
each death time, of
\begin{equation*}
\sum_i P(s(t) =i | s(0)=1) P({\rm death} | s(t)=i)
\end{equation*}
where $s(t)$ is the state.
We speculate that the increase in death rate for the 'normal' curve is a
reflection of reclassification as subjects go between the first and
second state.
\begin{figure}
<<pmk4>>=
plot(km1, fun='event', xscale=365.25, col=1:3, lwd=2, xmax=12*365.25,
xlab="Years from randomization", ylab="P(death)")
lines(aj3a[4], col=1, lwd=2, lty=2, conf.int=FALSE)
lines(aj3b[4], col=2, lwd=2, lty=2, conf.int=FALSE)
lines(aj3c[4], col=3, lwd=2, lty=2, conf.int=FALSE)
@
\caption{Standard Kaplan-Meier (solid) along with multi-state estimates of
survival for the PBC data.}
\label{pkm4}
\end{figure}
\subsection{Hazard models}
We can repeat the same exercise with predicted curves from a Cox model.
Figure \ref{pbccurve2} contains the overall KM as a reference, along with
the predicted curve based on the fit using baseline values, and the fit using
time-dependent covariaties.
In both cases these are for a subject with average covariate values.
The simple curve (dotted) is quite a bit below the KM, which is a consequence
of the fact that $E(f(X)) \ne f(E(X))$ for any non-linear function $f$.
The predicted survival curve from a Cox model is a quite non-linear
function of the linear predictor $\eta$, we have drawn the curve for an
average $\eta$.
The marginal prediction from the model is an average of all $n=312$ predicted
curves from the model, one for each subjects, and it is very close to the
KM. (Not exactly identical due to differential follow-up and the PH assumption.)
The prediction from the time-dependent model suffers from the same issue as
the extendend KM: it is attempting to predict the survival of a subject who
starts with these average covariate values, and then never changes.
The computational mechanism is different than the EKM estimate: all observations
participate in creating a single baseline hazard estimate, and this is then used
to predict the curve for a subject with constant covariates.
Whether said computation is justifiable or not, in our opinion the result
is simply not interesting.
\begin{figure}
<<pbccurve2>>=
psurv1 <- survfit(Surv(futime, status==2) ~1, pdat0)
dummy <- data.frame(age=50, bili=1.35, edema=0, albumin=3.5, protime=10.6)
psurv2 <- survfit(pfit1, newdata=dummy)
psurv3 <- survfit(pfit2, newdata=dummy)
psurv4 <- survfit(pfit1, newdata=pdat0)
psurv4$surv <- rowMeans(psurv4$surv)
plot(psurv1, lwd=2, conf.int=FALSE, xscale=365.25, xmax=365.25*12, fun='event',
xlab="Years post enrollment", ylab="Death")
lines(psurv2[1], lty=3, lwd=2,conf.int=FALSE, fun="event")
lines(psurv3[1], lty=2, lwd=2, fun="event", conf.int=FALSE)
lines(psurv4, lty=3, lwd=1,fun='event', conf.int=FALSE)
@
\caption{Overall survival for the PBC dataset (dotted line), along with
predicted survival from the time-fixed (solid) and time-dependent
Cox models for a subject with median covariate values at enrollment:
age, bilirubin, edema, albumin and prothrombin times of 50, 1.35, absent,
3.5 and 10.6, respectively. The marginal curve is shown as a dotted line,
and is nearly coincident with the KM.}
\label{pbccurve2}
\end{figure}
A possible solution would be to create a marginal curve.
The software allows for specification of a time-dependent covariate path,
and will then produce the curve corresponding to that path. Compute the
curves for all $n$ patients and average them.
The problem with this idea is differential follow-up. Subject 1, for instance,
died at day 400. Unless we are willing to project some hypothetical future
covariate path for the observation, the predicted curve also stops after just
over 1 year.
Taking a simple average, i.e., marginal(t) = mean of all curves defined at t,
will certainly be discontinuous and may not even be monotone.
Multistate models again offer a potential solution to the issue.
Consider the following model:
<<pbcm1, echo=TRUE>>=
test <- survcheck(Surv(day1, day2, bstate) ~1, pdata, id=id, istate= bili3)
test$transitions
mfit1 <- coxph(list(Surv(day1, day2, bstate) ~ 1,
c(1:3):"death" ~ age + edema + albumin +
log(protime) / common + shared),
data= pdata, id=id, istate= bili3)
print(mfit1, digits=2)
msurv1 <- survfit(mfit1,newdata=dummy, p0=c(1,0,0,0))
test <- coxph(Surv(day1, day2, death) ~ age + edema + albumin + log(protime)
+ bili3, data= pdata)
@
This is a more complex model.
I have forced no covariates for the transitions between bilirubin
states. Bilirubin itself should not be a preditor of those, due to edge
effects if nothing else, e.g., a subject with bili of 4.9 is more likely to
transition to the 5+ state than someone with a bilirubin of 3 simply due to
measurement variability.
Proportional hazards has been assumed for the 3 transitions to death, resulting
in scale factors of 1, 1.7, and 2.2.
Though not printed above, the coefficients and log likelihood of the test
fit are identical to mfit1.
The advantage of mfit comes when we ask for a predicted survival curve:
the bilirubin level is given as an intial state which then evolves over time,
rather than a fixed numerical value which does not. The other three
covariates still appear as time fixed variables, so this solution is incomplete,
but it gives the flavor.
A plot of the death state for msurv1 corresponds to death for a hypothetical
subject whose bilrubin can evolve but other variables remain fixed (perhaps
even more uninterpreatable than the EKM).
Bilirubin is by far the strongest predictor, however, and it is not surprising
that the curve lies between the dashed and dotted lines of
figure \ref{pbccurve2} (not shown).
\subsection{Using age scale}
\end{document}