From c19d28b21199653d9aad0ee4cd68b2b6d0125aa7 Mon Sep 17 00:00:00 2001 From: Ben Bolker Date: Wed, 27 Mar 2024 14:18:07 -0400 Subject: [PATCH] working on ms (spellcheck and tweaks) --- doc.Rnw | 97 ++++++++++++++++++++++++++++----------------------------- 1 file changed, 47 insertions(+), 50 deletions(-) diff --git a/doc.Rnw b/doc.Rnw index 7dce49d..f9af74f 100644 --- a/doc.Rnw +++ b/doc.Rnw @@ -1,20 +1,16 @@ \section*{Abstract} -Rabies spread by domestic dogs continues to cause tens of thousands of human deaths every year in low- and middle-income countries. Despite this heavy mortality burden, rabies is often neglected, perhaps because it has been effectively controlled from high-income countries through mass dog vaccination. -%% \mr{Maybe discuss neglect elsewhere; there are issues.} \mli{Repharsed a bit, not seeing a big issue in general.} -Current estimates of the intrinsic reproductive number (\rzero) (a metric disease spread risk) of canine rabies from a wide range of times and locations are low (values \textless 2), with narrow confidence intervals compared to many infectious diseases. -%% This consistent narrow range of estimates across historical outbreaks is surprising. -%% \mr{Can we assume readers of this journal know about \rzero?} \mli{I tried to add a few words to help this; I think it is ok.} -We combined incidence data from historical outbreaks of canine rabies from around the world (1917-2003) with high-quality contact-tracing data from Tanzania (2002-present) to investigate initial growth rates (\littler), generation-interval distributions (\G) and reproductive numbers (\rzero). -We updated earlier work by: choosing outbreak windows algorithmically; fitting \littler using a more appropriate statistical method that accounts for decreases through time; and propagating uncertainty from both \littler and \G when estimating \rzero. 
+% Please keep the Author Summary between 150 and 200 words +%% current length not including comments: 197 words +Rabies spread by domestic dogs continues to cause tens of thousands of human deaths every year in low- and middle-income countries. Nevertheless rabies is often neglected, perhaps because it has been controlled through dog vaccination in high-income countries. +%% \mr{Maybe discuss neglect elsewhere; there are issues.} \mli{Rephrased a bit, not seeing a big issue in general.} +Estimates of canine rabies's intrinsic reproductive number (\rzero), a metric of disease spread, from a wide range of times and locations are relatively low (values $<2$), with narrow confidence intervals. Given rabies's persistence, this consistently low and narrow range of estimates is surprising. +We combined incidence data from historical outbreaks of canine rabies from around the world with high-quality contact-tracing data from Tanzania to investigate initial growth rates (\littler), generation-interval distributions (\G), and reproductive numbers (\rzero). +We improved on earlier estimates by choosing outbreak windows algorithmically; fitting \littler using a more appropriate statistical method that accounts for decreases through time; and incorporating uncertainty from both \littler and \G in our confidence intervals on \rzero. Our \rzero estimates are larger than previous estimates, with wider confidence intervals. +Our novel hybrid approach for estimating \rzero and its uncertainty is applicable to other disease systems where researchers estimate \rzero by combining estimates of \littler and \G. %\mr{is it still a "hybrid" approach now that uncertainty is estimated via bootstraps rather than a Bayesian MCMC sampler?} -This hybrid approach for estimating \rzero and its uncertainty is applicable to other disease systems where researchers estimate \rzero by combining estimates of \littler and \G. 
- -% Please keep the Author Summary between 150 and 200 words -% Use first person. PLOS ONE authors please skip this step. -% Author Summary not valid for PLOS ONE submissions. % \linenumbers %% switch for line number @@ -23,75 +19,75 @@ This hybrid approach for estimating \rzero and its uncertainty is applicable to Canine rabies, primarily spread by domestic dogs, is a vaccine-preventable disease that continues to cause tens of thousands of human deaths every year in low- and middle-income countries (LMICs) \citep{taylor2017difficulties, minghui2018new}. Canine rabies has been effectively eliminated from high-income countries by mass dog vaccination \citep{rupprecht2008can}. -Despite the effectiveness of vaccinating dogs, rabies continues to cause many human deaths and large economic losses in LMICs due to the limited implementation of rabies control strategies \citep{hampson2015estimating}. -The past two decades have seen an increase in rabies control efforts --- including dog vaccination campaigns and improvements in surveillance \citep{kwoba2019dog, mtema2016mobile, gibson2018one, mazeri2018barriers, wallace2015establishment}. -%\jd{How badly to COVID impact this progress?} -%\mli{This would not be the place to put this I think.. maybe in the discussion?} +Despite the effectiveness of dog vaccination, rabies continues to cause many human deaths and large economic losses in LMICs due to the limited implementation of rabies control strategies \citep{hampson2015estimating}. +The past two decades have seen an increase in rabies control efforts, including dog vaccination campaigns and improvements in surveillance \citep{kwoba2019dog, mtema2016mobile, gibson2018one, mazeri2018barriers, wallace2015establishment}. The World Health Organization (WHO) and partners (OIE, FAO, GARC) joined forces to support LMICs in eliminating human deaths from dog-mediated rabies by 2030 \citep{minghui2018new, abela20162016}. 
Mass dog vaccination campaigns have begun in some LMICs and are being scaled up \citep{castillo2019socio, evans2019implementation}. However, the emergence of SARS-CoV-2 pandemic disrupted rabies control and elimination efforts \citep{nadal2022impact}. -As the SARS-CoV-2 pandemic is transitioning out of global emergency, rabies control programmes are slowly unpaused. +As the SARS-CoV-2 pandemic is transitioning out of global emergency, rabies control programmes are slowly resuming. % https://www.who.int/news-room/fact-sheets/detail/rabies An understanding of rabies epidemiology --- in particular, reliable estimates of the basic reproductive number (\rzero), a quantitative measure of disease spread that is often used to guide vaccination strategies --- could inform rabies control efforts. -\rzero is defined as the expected number of secondary cases generated from each primary case in a fully susceptible population \citep{macdonald1952analysis}. -Estimates of \rzero using various methods (i.e., direct estimates from infection histories, epidemic tree reconstruction, and epidemic curve methods) based on historical outbreaks of rabies have generally been surprisingly low, typically between 1 and 2 with narrow confidence intervals for variety of regions and time periods \citep{hampson2009transmission, kurosawa2017rise, kitala2002comparison}. -With such a low \rzero one might expect rabies to fade out from behavioural control measures combined with stochastic fluctuations, even in the absence of vaccination. -In contrast to diseases with a large \rzero (e.g., rinderpest, with \rzero $\approx 4$ \citep{mariner2005model}), \rzero estimates for rabies imply that control through vaccination should be relatively easy. - +The basic reproductive number \rzero is defined as the expected number of secondary cases generated from each primary case in a fully susceptible population \citep{macdonald1952analysis}. 
+Estimates of \rzero using various methods (direct estimates from infection histories, epidemic tree reconstruction, and epidemic curve methods) based on historical outbreaks of rabies have generally been surprisingly low, typically between 1 and 2 with narrow confidence intervals for a variety of regions and time periods \citep{hampson2009transmission, kurosawa2017rise, kitala2002comparison}. +With such a low \rzero one might expect rabies to fade out due to a combination of behavioural control measures and stochastic fluctuations, even in the absence of vaccination. +In contrast to diseases with a large \rzero (e.g., rinderpest, with \rzero $\approx 4$ \citep{mariner2005model}), \rzero estimates for rabies suggest that control through vaccination should be relatively easy. % \rzero estimates for rabies using various methods (i.e., direct estimates from infection histories, epidemic tree reconstruction and have been consistently low, with narrow confidence intervals \citep{hampson2009transmission}. -Here we revisit and explore why rabies, with low \rzero, nonetheless persists in many countries around the world. +Here we revisit and explore why rabies, with its low \rzero, nonetheless persists in many countries around the world. Such persistence suggests that rabies's potential for spread, and therefore the difficulty of rabies control, may have been underestimated. -In this paper, we will combine information derived from epidemic curves with a high-resolution contact tracing dataset that provides large number of observed generation intervals (which is rare for infectious disease studies) to estimate \rzero. +We will combine information derived from epidemic curves with a high-resolution contact tracing data set that provides a large number of observed generation intervals (rare for infectious disease studies) to estimate \rzero. %% This reassessment can reevaluate the estimation of $\rzero$ for rabies outbreaks and understanding of disease control more generally. 
%% \mr{I think the "more generally" part desserves a bit more preamble/evidence... maybe this is a bit late to get to the general overall, although this may depend where you're publishing and the hopes that this can spur rabies vaccination campaigns} \mli{Happy to get rid of it.} %% need a \rzero P about euler and bad CI \section*{Materials and Methods} -\rzero is often estimated by combining two other epidemiological quantities: the initial growth rate of an epidemic (\littler) and the generation interval (\G) distribution, where a \G is defined as the time between successive infections along a transmission chain \citep{park2018exploring}. -The growth rate \littler is often estimated by fitting a growth rate to time series data from the early stages of epidemics. +\rzero is often estimated by combining two other epidemiological quantities: the initial growth rate of an epidemic (\littler) and the generation interval (\G) distribution, where the generation interval is defined as the time between successive infections along a transmission chain \citep{park2018exploring}. +The initial growth rate \littler is often estimated by fitting a model to time series data from the early stages of epidemics. \G is an individual-level quantity that measures the time between an individual getting infected to infecting another individual. The generation interval distribution is the natural way to link \littler and \rzero \citep{wallinga2006generation, champredon2015intrinsic}. -During an outbreak in a fully susceptible population, \rzero can be calculated from \littler and the \G distribution +During an outbreak in a fully susceptible population, +\bmb{Not sure we need the first clause here? 
$\littler$ implies that we are at the beginning of an outbreak?} +\rzero can be calculated from \littler and the \G distribution by the Euler-Lotka equation \citep{wallinga2006generation} \begin{equation} \rzero = \frac{1}{\sum_{t=1}^{\infty} G(t)e^{-rt}}, \label{eq:EL} \end{equation} where $t$ is time, and $G(t)$ is the generation interval distribution. -This formula is convenient to calculate point estimates of \rzero; however, propagating uncertainty from the estimates of \littler and the \G distribution has rarely been done. +This formula is convenient to calculate point estimates of \rzero; however, researchers rarely propagate uncertainty from the estimates of \littler and the \G distribution through this formula. \subsection*{Initial growth rate} Disease incidence typically increases approximately exponentially during the early stages of an epidemic. The initial growth rate \littler is often estimated by fitting exponential curves from near the beginning to near the peak of an epidemic. However, growth rates estimated from an exponential model can be biased downward, overconfident, and sensitive to the choice of fitting windows \citep{ma2014estimating}. -Here we used logistic, rather than exponential curves to more robustly estimate \littler \citep{ma2014estimating, chowell2017fitting}. +Here we used logistic rather than exponential curves to more robustly estimate \littler \citep{ma2014estimating, chowell2017fitting}. %% \mr{This seems like such a big piece of the difference between KH and WZLi estimates; I'm surprised it only gets mentioned here in M\&M but not highlighted in abstract/methods. 
Especially since KH is an author, seems fine to be more firm in claiming logistic is more appropriate and likely to resolve some of the "paradox"\\ } \mli{Added a few words in the abstract.} %\mr{code-formatted variable names may be alienating, maybe describe in more conventional terms with either just words, or mathy variable names?} -We selected fitting windows algorithmically for each outbreak as follows: 1) we broke each time series into “phases:” a new phase starts after a peak of height at least \code{minPeak} cases followed by a proportional decline of at least \code{declineRatio}; 2) In each phase, we identify a prospective fitting window starting after the last observation of 0 cases and extending one observation past the highest value in the phase (unless the highest value is itself the last observation); 3) we then fit our model to the cases in the fitting window if (and only if) it has a peak of at least \code{minPeak} cases, a length of at least \code{minLength} observations, and a ratio of at least \code{minClimb} between the highest and lowest observations. We tried a handful of parameter combinations before settling on a final set during an expert consultation. These explorations are detailed, and the final choices noted, in our code repository. 
+We selected fitting windows algorithmically for each outbreak as follows: (1) we break each time series into “phases”: a new phase starts after a peak of height at least \code{minPeak} cases followed by a proportional decline of at least \code{declineRatio}; (2) In each phase, we identify a prospective fitting window starting after the last observation of 0 cases and extending one observation past the highest value in the phase (unless the highest value is itself the last observation); (3) we then fit our model to the cases in the fitting window if (and only if) it has a peak of at least \code{minPeak} cases, a length of at least \code{minLength} observations, and a ratio of at least \code{minClimb} between the highest and lowest observations. We tried a handful of parameter combinations before settling on a final set during an expert consultation. These explorations are detailed, and the final choices noted, in our code repository. \subsection*{Observed Generation intervals} -Transmission events are generally hard to observe for most diseases. -In an earlier, influential rabies paper, estimated generation intervals were constructed by summing two quantities: a latent period (the time from infection to infectiousness), and a wait time (time from infectiousness to transmission) \citep{hampson2009transmission}. +Transmission events are hard to observe directly for most diseases. +An earlier, influential rabies paper constructed estimated generation intervals by summing two quantities: a latent period (the time from infection to infectiousness), and a wait time (time from infectiousness to transmission) \citep{hampson2009transmission}. Since clinical signs and infectiousness appear at nearly the same time in rabies, the incubation period (the time from infection to clinical signs) is routinely used as a proxy for the latent period. -In the Hampson et al. 
analysis, latent (really, incubation) periods and infectious periods were randomly and independently resampled from empirically observed distributions \citep{hampson2009transmission}, and then waiting times sampled uniformly from the selected infection periods. +\citeauthor{hampson2009transmission} randomly and independently resampled latent (really, incubation) periods and infectious periods from empirically observed distributions \citep{hampson2009transmission}, and then sampled waiting times uniformly from the selected infection periods. -However, this approach for constructing \G values (i.e., summing independently resampled values of incubation and infectious periods) does not account for the possibility of multiple transmissions from the same individual, nor does it account for correlations between time distributions and biting behaviour. +However, constructing \G values by summing independently resampled values of incubation and infectious periods accounts neither for the possibility of multiple transmissions from the same individual, nor for correlations between time distributions and biting behaviour. \fref{intervals} illustrates the generation intervals of a single transmission event from a rabid animal (comprising a single incubation period plus a waiting time) and multiple transmission events from a rabid animal (comprising a single incubation period and three waiting times). +\bmb{the commented-out statement here seems important?} % To account for multiple transmission, the incubation periods needs to be reweighted by the number of transmissions. -For diseases like rabies, where transmissions links (and generation intervals) are observable, multiple transmissions and possible correlation structures are all implicitly accounted for within the observation processes through contact tracing. 
+For diseases like rabies, where transmission links (and generation intervals) are observable, multiple transmissions and possible correlation structures are all implicitly accounted for within the observation processes through contact tracing. %\mr{I'm confused by the "implicitly accounted for"; why not simply explicitly observed?} \begin{center} \begin{figure}[ht!] \includegraphics[scale = 0.5]{./interval.png} \caption{\textbf{Decomposing generation intervals.} -Generation intervals start when a focal animal acquires infection (open red circle) and end after a period of viral replication (dashed line) when an animal shows clinical signs (blue star), becomes infectious (solid black circle) and infects another animal --- in rabies, the onset of clinical signs and of becoming infectious are closely synchronized. +Generation intervals start when a focal animal acquires infection (open red circle) and end after a period of viral replication (dashed line) when an animal shows clinical signs (blue star), becomes infectious (solid black circle) and infects another animal --- in rabies, the onset of clinical signs and of infectiousness are closely synchronized. Once the infectious period (grey block) starts, there is a wait time (solid black line) until a susceptible host (solid red circles) is bitten. The infectious period ends with the death of the focal host (black X). -The generation interval is the interval between the focal animal getting infected, and when it infects a new case (red interval between open and solid circles). (right) If a single biter transmits multiple times, the wait times are generally different, but the incubation period is the same for each transmission event.} +The generation interval is the interval between the focal animal getting infected, and when it infects a new case (red interval between open and solid circles). 
(right) If a single biter transmits multiple times, the wait times generally vary, but the incubation period is the same for each transmission event.} \flabel{intervals} \end{figure} \end{center} @@ -102,7 +98,7 @@ The generation interval is the interval between the focal animal getting infecte In a population where some animals are not susceptible, calculations based on estimates of \littler and the \G distribution (\ref{eq:EL}) estimate the \emph{realized} average number of cases per case, also known as the effective reproductive number \re. In the case of rabies, vaccination is the only known cause of immunity (case fatality in dogs is believed to be 100\%). -For a given population with $\nu$ vaccination proportion, the estimated $\rzero$ with vaccination correction is the following: +For a given population with $\nu$ vaccination proportion, the estimated $\rzero$ with vaccination correction is \begin{equation} \rzero = \frac{\re}{(1 - \nu)}. \label{eq:RE} @@ -117,11 +113,11 @@ load("check.rda") @ We used data from December 2002 -- November 2022, from an ongoing contact tracing project in Tanzania \citep{hampson2008rabies, hampson2009transmission}. -Since 2002, there are \Sexpr{dogsTransmissionNum} domestic dog recorded events (i.e., domestic dogs bitten by an animal), and \Sexpr{dogsSuspectedNum} suspected rabid dogs in the Serengeti, Tanzania. +The data set contains \Sexpr{dogsTransmissionNum} domestic dog recorded events (i.e., domestic dogs bitten by an animal), and \Sexpr{dogsSuspectedNum} suspected rabid dogs in the Serengeti, Tanzania. Transmission events were documented through retrospective interviews with witnesses, applying diagnostic epidemiological and clinical criteria from the six-step method \citep{tepsumethanon2005six}. Each dog was given a unique identifier, and date of the bite and clinical signs were recorded if applicable and available. -\Sexpr{nrow(dogsUnknownBiter)} of dog transmissions were from unidentified domestic animals or wildlife. 
-We restricted our analysis in this paper to domestic dog transmissions (i.e., dog to dog), and obtained \Sexpr{countVec["Generation Interval"]} directly observed generation intervals (i.e. both biter and secondary case have "time bitten" records). +\Sexpr{nrow(dogsUnknownBiter)} of the dog transmissions were from unidentified domestic animals or wildlife. +We restricted our analysis to domestic dog transmissions (i.e., dog to dog), and obtained \Sexpr{countVec["Generation Interval"]} directly observed generation intervals (i.e., both biter and secondary case have ``time bitten'' records). There were four observed dogs with multiple exposures (i.e., bitten by different identified biters), generating extra generation intervals, but it is unclear which transmission event transmitted rabies to these dogs. For simplicity, we omitted these four dogs and their generation intervals from our analysis. @@ -134,12 +130,13 @@ load("slow/egf_R0.rda") @ To propagate uncertainties for both \littler and \G, we used a hybrid approach. -We first fitted logistic models, with negative binomial observation error, to incidence data to estimate \littler implemented in the R package ``epigrowthfit'' \cite{epigrowthfit}. -We then compute a sample of \Sexpr{nsamp} $\hat{\rzero}$ values using equation (\ref{eq:RE}); for each value of $\hat{\rzero}$, we first draw a value of $\hat{\littler}$ from a normal distribution from the estimates of the logistic fit and an independent sample of \G from the empirical contact tracing data. To sample \G from the empirical contact tracing data, we first take a weighted sample of \Sexpr{nboot} biters, which accounts for biter-level variation, and for each biter, we sample a \G from its respective transmission event, to account for individual variation. +We first fit logistic models, with negative binomial observation error, to incidence data to estimate \littler, as implemented in the R package {\tt epigrowthfit} \cite{epigrowthfit}. 
+We then compute a sample of \Sexpr{nsamp} $\hat{\rzero}$ values using equation (\ref{eq:RE}). +For each value of $\hat{\rzero}$, we first draw a value of $\hat{\littler}$ from a Normal distribution matching the estimated sampling distribution of the logistic fit parameters and an independent sample of \G from the empirical contact tracing data. To sample \G from the empirical contact tracing data, we first take a weighted sample of \Sexpr{nboot} biters, which accounts for biter-level variation, and for each biter, we sample a \G from its respective transmission event, to account for individual variation. We then matched samples of \G to the \littler samples to produce a range of estimates for \rzero. -This hybrid sampling approach incorporates both sources of uncertainties from \littler and \G. -when calculating \rzero estimates. -Finally, we take the 2.5, 50, 97.5\% percentiles of the distribution of \rzero estimates for each rabies outbreak. +This hybrid sampling approach incorporates the uncertainties in both \littler and \G +in the distribution of \rzero estimates. +Finally, we use the 2.5, 50, and 97.5 percentiles of the distribution of \rzero estimates to get point estimates and confidence limits for $\rzero$ for each rabies outbreak. \section*{Results} @@ -167,7 +164,7 @@ The weighted incubation period distribution more closely resembles the generatio \begin{figure}[h] \includegraphics[page=1,scale = 0.7]{rplot_combo.Rout.pdf} \caption{\textbf{Growth rate estimates for global historical outbreaks of rabies.} Estimates and 95\% confidence intervals of \littler in global historical outbreaks estimated from exponential (dotted) and logistic (solid) model fits.} -Different colors represents different phases from the times series data. +Different colors represent different phases from the time series data. 
\flabel{littler} \end{figure} \end{center} @@ -205,7 +202,7 @@ The hybrid approach provides larger values of \rzero and wider confidence interv \begin{figure}[h] \includegraphics[page=1,scale = 0.7]{mexico.Rout.pdf} \caption{\textbf{Effects of \littler, corrected \G on the estimates of \rzero in Mexico outbreak.} -Exponential \littler and naive \G is the analgous fitting to \cite{hampson2009transmission} using our algorithmic windowing selection. \rzero using logistic to estimate \littler is larger than exponential and corrected \G increase \rzero and uncertainty. +The combination of exponential \littler and naive \G is analogous to the fit in \cite{hampson2009transmission}, but uses our algorithmic window selection. Estimating \littler with the logistic model gives a larger \rzero than the exponential model, and the corrected \G increases both \rzero and its uncertainty. %\bmb{more complete caption? Make this horizontal to match the other figs? Extend R0-axis to have a lower limit at \rzero=1?} %\mr{what ben said :-)} %\mli{I actually like it this way.} @@ -240,11 +237,11 @@ Re-analysis of these data also allowed us to identify an overlooked fact about r Nevertheless, our estimates suggest that rabies \rzero may be larger, and more uncertain, than previously thought. This finding may explain some of the formerly unexplained variations in the success of rabies-control programs (e.g., low levels of coverage (30–50\%) have been successful in some settings while high coverage 75\% was not enough to control rabies in others \citep{eng1993urban}). -While our primary goal was to understand why estimates of rabies \rzero were small with narrow confidence intervals, our analysis also revealed an interesting biological process through the lense of generation intervals from contact tracing data: the need to account for biting behaviour in the incubation period distribution, in order to match the generation interval distribution. 
+While our primary goal was to understand why estimates of rabies \rzero were small with narrow confidence intervals, our analysis also revealed an interesting biological process through the lens of generation intervals from contact tracing data: the need to account for biting behaviour in the incubation period distribution, in order to match the generation interval distribution. \rzero is typically used as a first approximation for interventions such as vaccination to determine herd immunity thresholds. However, both heterogeneity in contacts and the correlations between incubation periods and transmission that we observed here through the generation interval suggest that simple \rzero estimation methods may be inadequate and should be used with caution. -Rabies is a nice system to hightlight this effect because transmission events and latent periods are observable and directly recorded with contact tracing. The correlation effect we observed here may be important in other disease systems in general, despit ethe fact that generation intervals are not always observable and uncertainties may be even greater. +Rabies is a nice system to highlight this effect because transmission events and latent periods are observable and directly recorded with contact tracing. The correlation effect we observed here may be important in other disease systems in general, despite the fact that generation intervals are not always observable and uncertainties may be even greater. %\mr{not sure which paper is destined to come out first, or if you're subitting them (i.e., this one an correlations paper) together somehow... I think there could be a more explicit linking between the two papers. I think the sentence Ben suggested you refine could be more positive: you advocate estimating R0 with uncertainty propagation (and also why logistic is more appropriate than exponential, and how general you think that should be. 
I don't love landing on the mechanistic fitting suggestion; think it would be good to end on strong sentences focused on take-home messages.) Consider moving most of last paragraph up to first in discussion. I think more can be said about how Rabies is a nice model system because latent period and transmission events are so easy to observe, and natural immunity is not thought to exist, but your findings can be applicable in disease systems where the uncertainties in generation interval may be even greater, or something like that.} %\mli{suggstion accepted}
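Reviewer aid (not part of the patch): the manuscript's Euler-Lotka step, the vaccination correction, and the idea of carrying uncertainty from both \littler and \G into \rzero can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the function names, the interval values, and the growth-rate estimate are hypothetical, and the plain bootstrap of intervals is a simplification of the biter-weighted two-stage sampling the Methods describe. The manuscript's own analysis is in R (using {\tt epigrowthfit}) and is not reproduced here.

```python
import math
import random


def r0_euler_lotka(r, gen_intervals):
    """Euler-Lotka with an empirical generation-interval sample.

    With G(t) taken as the empirical distribution of the observed intervals,
    sum_t G(t) * exp(-r * t) becomes the sample mean of exp(-r * g).
    """
    mean_discount = sum(math.exp(-r * g) for g in gen_intervals) / len(gen_intervals)
    return 1.0 / mean_discount


def correct_for_vaccination(r_eff, nu):
    """R0 = Re / (1 - nu), where nu is the vaccinated proportion."""
    return r_eff / (1.0 - nu)


def r0_draws(r_hat, r_se, gen_intervals, nsamp=2000, seed=1):
    """Sorted R0 draws reflecting uncertainty in both r and the intervals.

    Assumption: r is drawn from a normal approximation to its sampling
    distribution, and intervals are resampled with replacement (a plain
    bootstrap, not the manuscript's biter-weighted scheme).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(nsamp):
        r = rng.gauss(r_hat, r_se)
        g = rng.choices(gen_intervals, k=len(gen_intervals))
        draws.append(r0_euler_lotka(r, g))
    return sorted(draws)


def quantile(sorted_vals, p):
    """Nearest-rank quantile of an already sorted list."""
    idx = min(len(sorted_vals) - 1, max(0, math.ceil(p * len(sorted_vals)) - 1))
    return sorted_vals[idx]


# Hypothetical inputs: intervals in days, growth rate per day.
gi_days = [12, 18, 21, 25, 30, 34, 41, 56, 73, 110]
draws = r0_draws(r_hat=0.01, r_se=0.003, gen_intervals=gi_days)
lower, median, upper = (quantile(draws, p) for p in (0.025, 0.5, 0.975))
print(f"R0 median {median:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

Redrawing both inputs on every iteration is what widens the interval relative to plugging point estimates into the formula; fixing either input at its estimate would understate the uncertainty in \rzero.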