---
title: "Uncertainty bands for fitted disease progress curves"
output:
  rmarkdown::html_vignette:
    df_print: paged
vignette: >
  %\VignetteIndexEntry{Uncertainty bands for fitted disease progress curves}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = "center",
  fig.width = 7,
  fig.height = 4.8
)
```

```{r setup}
library(epifitter)
```

## Why add intervals?

Fitted disease progress curves are usually shown as a single line, but the line is only an estimate. Confidence bands help communicate the uncertainty around the fitted epidemic curve.

This article focuses on uncertainty in the fitted mean curve. The bands shown by `plot_fit()` are not prediction intervals for future observations, and they do not replace a full model-based analysis of treatment effects. They are a visual tool for asking how stable the fitted curve is under residual resampling.

In `epifitter`, confidence bands are added directly in `plot_fit()`:

```r
plot_fit(fit, conf_int = TRUE, ci_method = "bootstrap", nsim = 500)
```

The argument `conf_int` controls whether the interval is drawn. The argument `ci_method` controls how the interval is estimated.

Use these intervals to compare fitted shapes cautiously. If the scientific question is a formal comparison among treatments, use a statistical model that matches the experimental design after defining the target quantity.

## Example data

We first simulate one logistic epidemic with moderate observational noise and fit the four linearized epidemic models. The noise is intentionally visible in this example so the confidence bands are easy to inspect.

```{r}
set.seed(123)

epi <- sim_logistic(
  N = 30,
  y0 = 0.01,
  dt = 5,
  r = 0.25,
  alpha = 0.40,
  n = 1
)

fit <- fit_lin(time = epi$time, y = epi$random_y)
head(fit$data)
```

## Residual bootstrap confidence bands

The default bootstrap method resamples residuals from the fitted curve, creates new plausible disease progress curves at the observed assessment times, refits the model many times, and builds the band from the distribution of fitted curves.

This approach assumes that the residual pattern is a reasonable basis for generating plausible alternative curves. It is useful for visualization, but it should still be checked against the biology of the system and the assessment design.

```{r fig.alt="Fitted logistic disease progress curve with a bootstrap confidence band."}
plot_fit(
  fit,
  models = "Logistic",
  conf_int = TRUE,
  ci_method = "bootstrap",
  nsim = 100,
  seed = 123
)
```

For final analyses, increase `nsim`, for example:

```r
plot_fit(fit, conf_int = TRUE, ci_method = "bootstrap", nsim = 500)
```

Larger values of `nsim` make the interval more stable, but also make the plot slower to compute.

## Wild bootstrap confidence bands

The wild bootstrap method randomly changes the sign of resampled residuals before refitting the model. This keeps the original assessment times intact and is available with `ci_method = "wild"`.

Wild bootstrap intervals are often useful as a sensitivity check when residual variation changes across the range of fitted values.

```{r fig.alt="Fitted logistic disease progress curve with a wild bootstrap confidence band."}
plot_fit(
  fit,
  models = "Logistic",
  conf_int = TRUE,
  ci_method = "wild",
  nsim = 100,
  seed = 123
)
```

## Compare multiple models

Confidence bands can also be drawn for more than one model at a time.

```{r fig.alt="Fitted exponential, logistic, and Gompertz disease progress curves with bootstrap confidence bands."}
plot_fit(
  fit,
  models = c("Exponential", "Logistic", "Gompertz"),
  conf_int = TRUE,
  ci_method = "bootstrap",
  nsim = 100,
  seed = 123,
  y_bounds = NULL
)
```

Here `y_bounds = NULL` leaves fitted values on their natural model scale. This is useful when comparing model forms because the exponential model can extrapolate above 1, which is a warning sign when disease intensity is a proportion.

The fitted curve also does not have to sit in the middle of the band: bootstrap intervals are built from quantiles of refitted curves, so they can be asymmetric when the model is nonlinear or poorly matched to the data.

## Nonlinear fits

The same interface works with nonlinear fits returned by `fit_nlin()` and `fit_nlin2()`.

```{r fig.alt="Nonlinear fitted logistic disease progress curve with a bootstrap confidence band."}
nlin_fit <- fit_nlin(
  time = epi$time,
  y = epi$random_y,
  starting_par = list(y0 = 0.01, r = 0.1),
  maxiter = 200
)

plot_fit(
  nlin_fit,
  models = "Logistic",
  conf_int = TRUE,
  ci_method = "bootstrap",
  nsim = 100,
  seed = 123
)
```

## Choosing a method

Use `ci_method = "bootstrap"` for the main fitted-curve intervals. Use `ci_method = "wild"` as a sensitivity check when you want a second residual bootstrap strategy that preserves the original assessment times.

In very smooth datasets, the confidence band may be narrow. Wider bands usually appear when the observations are noisier, when there are fewer assessment times, or when the model parameters are estimated less precisely.

With either method, `level` controls the confidence level:

```r
plot_fit(fit, conf_int = TRUE, ci_method = "bootstrap", level = 0.90)
plot_fit(fit, conf_int = TRUE, ci_method = "bootstrap", level = 0.95)
```

The bands shown by `plot_fit()` are confidence bands for the fitted curve, not prediction intervals for future observations.
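
The two resampling schemes described earlier can be sketched in base R to make the mechanics concrete. This is a minimal illustration under stated assumptions, not the code that `plot_fit()` runs: the helper `boot_band()`, the simulated data, the noise level, and the choice to fit a straight line to logit-transformed values with `lm()` are all hypothetical. The wild variant shown is the textbook sign-flip version, which may differ in detail from the one in `epifitter`.

```r
# Sketch only: residual and wild bootstrap bands for a linearized logistic fit.
# All names and settings below are illustrative assumptions.
set.seed(123)

# Hypothetical data: a logistic curve with noise added on the logit scale
time <- seq(0, 30, by = 5)
y_true <- 1 / (1 + ((1 - 0.01) / 0.01) * exp(-0.25 * time))
logit_y <- qlogis(y_true) + rnorm(length(time), sd = 0.4)

# Linearized logistic model: logit(y) is linear in time
fit0 <- lm(logit_y ~ time)
fitted0 <- fitted(fit0)
res0 <- residuals(fit0)

boot_band <- function(method = c("bootstrap", "wild"), nsim = 500, level = 0.95) {
  method <- match.arg(method)
  n <- length(time)
  # Each replicate returns one refitted curve; the result is an n x nsim matrix
  curves <- replicate(nsim, {
    new_res <- if (method == "bootstrap") {
      sample(res0, n, replace = TRUE)             # reshuffle residuals with replacement
    } else {
      res0 * sample(c(-1, 1), n, replace = TRUE)  # keep each residual in place, flip its sign
    }
    y_star <- fitted0 + new_res                   # pseudo-data at the observed times
    refit <- lm(y_star ~ time)                    # refit the same model
    plogis(fitted(refit))                         # back-transform to the proportion scale
  })
  a <- 1 - level
  data.frame(
    time = time,
    fit = plogis(fitted0),
    lower = apply(curves, 1, quantile, probs = a / 2),     # pointwise lower quantile
    upper = apply(curves, 1, quantile, probs = 1 - a / 2)  # pointwise upper quantile
  )
}

band_resid <- boot_band("bootstrap", nsim = 200)
band_wild <- boot_band("wild", nsim = 200)
```

Comparing `band_resid` and `band_wild` row by row shows how much the band depends on the resampling scheme. Because the limits are pointwise quantiles of refitted curves, they need not be symmetric around `fit`.
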
Report `nsim`, `level`, the fitting method, and the interval method when using these plots in reports or manuscripts.

By default, `plot_fit()` keeps fitted curves and bands on the disease proportion scale from 0 to 1. To inspect unconstrained fitted values, use:

```r
plot_fit(fit, conf_int = TRUE, y_bounds = NULL)
```