Package 'epifitter'

Title: Analysis and Simulation of Plant Disease Progress Curves
Description: Tools for analysis, visualization, and simulation of plant disease progress curves. Includes functions to calculate area-under-the-curve summaries, fit and compare exponential, monomolecular, logistic, and Gompertz models using linear or nonlinear regression, work with single or multiple epidemics, and produce 'ggplot2'-based visualizations. Also includes an experimental powdery mildew dataset for reproducible teaching and research workflows. See Madden, Hughes, and van den Bosch (2007) <doi:10.1094/9780890545058> for background on the epidemiological methods.
Authors: Kaique dos S. Alves [aut, cre] (ORCID: <https://orcid.org/0000-0001-9187-0252>), Emerson M. Del Ponte [aut] (ORCID: <https://orcid.org/0000-0003-4398-409X>), Adam H. Sparks [aut] (ORCID: <https://orcid.org/0000-0002-0061-8359>)
Maintainer: Kaique dos S. Alves <[email protected]>
License: MIT + file LICENSE
Version: 1.0.1
Built: 2026-06-20 17:25:50 UTC
Source: https://github.com/alvesks/epifitter

Help Index


Area under the disease progress curve

Description

Calculate the area under a disease progress curve using the trapezoidal method.
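The trapezoidal rule can be illustrated with a few lines of base R. This is a minimal sketch rather than the package's implementation: 'audpc_sketch' is a hypothetical helper, and it ignores the 'y_proportion', 'type', and 'aggregate' options.

```r
# Hypothetical helper (not part of 'epifitter'): trapezoidal area under y(time).
audpc_sketch <- function(time, y) {
  stopifnot(length(time) == length(y), length(time) >= 2)
  ord <- order(time)          # make sure observations are in time order
  time <- time[ord]
  y <- y[ord]
  n <- length(y)
  # Each interval contributes its width times the mean of its two end values.
  sum(diff(time) * (y[-n] + y[-1]) / 2)
}

time <- c(0, 5, 10, 15)
y <- c(0.01, 0.05, 0.20, 0.50)
area <- audpc_sketch(time, y)
area
```

With equal 5-unit intervals, the result is 5 times the sum of the three interval means (0.03, 0.125, and 0.35).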

Usage

AUDPC(
  time,
  y,
  y_proportion = TRUE,
  type = "absolute",
  aggregate = c("mean", "median", "none")
)

Arguments

time

A numeric vector of assessment times.

y

A numeric vector of disease intensity values.

y_proportion

Logical. Are the 'y' values expressed as proportions?

type

Either '"absolute"' or '"relative"'.

aggregate

How to handle multiple observations at the same time point. The default, '"mean"', averages replicated observations before calculating area. '"median"' uses the median and '"none"' requires unique time values. A warning is issued when repeated time values are aggregated.
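As an illustration of what aggregating replicated time points means, the sketch below averages observations that share an assessment time before applying the trapezoidal rule. It uses base R only and assumes nothing about how 'AUDPC()' does this internally; all object names are illustrative.

```r
# Two replicates per assessment time (illustrative values).
time <- c(0, 0, 5, 5, 10, 10)
y <- c(0.01, 0.03, 0.10, 0.14, 0.40, 0.44)

# Collapse replicates to one value per unique time (the "mean" strategy).
time_u <- sort(unique(time))
y_mean <- as.numeric(tapply(y, time, mean))

# Trapezoidal area on the aggregated curve.
n <- length(y_mean)
area <- sum(diff(time_u) * (y_mean[-n] + y_mean[-1]) / 2)
area
```

Replacing 'mean' with 'median' in the 'tapply()' call gives the corresponding median-based aggregation.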

Value

A numeric scalar with the AUDPC value.

References

Madden, L. V., Hughes, G., and van den Bosch, F. (2007). The Study of Plant Disease Epidemics. American Phytopathological Society.

Examples

epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.5, n = 1)
AUDPC(time = epi$time, y = epi$y, y_proportion = TRUE)

Estimate AUDPC from two observations

Description

Estimate the area under the disease progress curve from only the initial and final observations under a logistic epidemic assumption.
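To show what the logistic assumption implies, the sketch below integrates a logistic curve that passes through the two observations. The closed form is a direct integral of that curve, derived here for illustration; it is not taken from the package source, and 'audpc_2pt_sketch' is a hypothetical name. The example cross-checks it against numerical integration.

```r
# Hypothetical helper: area under a logistic curve through (0, y0) and (time, yT).
audpc_2pt_sketch <- function(time, y0, yT) {
  stopifnot(time > 0, y0 > 0, y0 < 1, yT > 0, yT < 1, y0 != yT)
  # Rate implied by the two points on the logit scale.
  r <- (qlogis(yT) - qlogis(y0)) / time
  # Integral of plogis(qlogis(y0) + r * t) from 0 to 'time'.
  log((1 - y0) / (1 - yT)) / r
}

y0 <- 0.01
yT <- 0.80
total_time <- 30
closed_form <- audpc_2pt_sketch(total_time, y0, yT)

# Numerical check: integrate the same logistic curve directly.
r <- (qlogis(yT) - qlogis(y0)) / total_time
numeric_area <- integrate(function(t) plogis(qlogis(y0) + r * t),
                          lower = 0, upper = total_time)$value
c(closed_form = closed_form, numeric_area = numeric_area)
```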

Usage

AUDPC_2_points(time, y0, yT)

Arguments

time

Time elapsed between the two assessments.

y0

Initial disease intensity as a proportion.

yT

Final disease intensity as a proportion.

Value

A numeric scalar with the estimated AUDPC.

References

Jeger, M. J., and Viljanen-Rollinson, S. L. H. (2001). The use of the area under the disease-progress curve (AUDPC) to assess quantitative disease resistance in crop cultivars. Theoretical and Applied Genetics, 102, 32-40.

Examples

epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.5, n = 1)
AUDPC_2_points(time = epi$time[7], y0 = epi$y[1], yT = epi$y[7])

Area under the disease progress stairs

Description

Calculate the area under the disease progress stairs, an alternative to AUDPC that gives more balanced weight to the first and last observations.
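The relationship between the two summaries can be sketched in base R. Following Simko and Piepho (2012), AUDPS equals AUDPC plus the mean of the first and last observations multiplied by the average interval length D / (n - 1), where D is the total duration. 'audps_sketch' is a hypothetical helper and is not the package's code.

```r
# Hypothetical helper: AUDPS as AUDPC plus an end-point correction.
audps_sketch <- function(time, y) {
  n <- length(time)
  stopifnot(n == length(y), n >= 2)
  ord <- order(time)
  time <- time[ord]
  y <- y[ord]
  audpc <- sum(diff(time) * (y[-n] + y[-1]) / 2)
  # Give the first and last observations a full "stair" of average width.
  audpc + (y[1] + y[n]) / 2 * (time[n] - time[1]) / (n - 1)
}

time <- c(0, 5, 10)
y <- c(0.1, 0.2, 0.4)
audps <- audps_sketch(time, y)
audps
```

For equally spaced assessments this reduces to the interval length times the sum of all observations, here 5 * 0.7.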

Usage

AUDPS(
  time,
  y,
  y_proportion = TRUE,
  type = "absolute",
  aggregate = c("mean", "median", "none")
)

Arguments

time

A numeric vector of assessment times.

y

A numeric vector of disease intensity values.

y_proportion

Logical. Are the 'y' values expressed as proportions?

type

Either '"absolute"' or '"relative"'.

aggregate

How to handle multiple observations at the same time point. The default, '"mean"', averages replicated observations before calculating area. '"median"' uses the median and '"none"' requires unique time values. A warning is issued when repeated time values are aggregated.

Value

A numeric scalar with the AUDPS value.

References

Simko, I., and Piepho, H.-P. (2012). The area under the disease progress stairs: Calculation, advantage, and application. Phytopathology, 102, 381-389.

Examples

epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.5, n = 1)
AUDPS(time = epi$time, y = epi$y, y_proportion = TRUE)

Exponential model differential equation

Description

Internal helper used by the simulation functions to solve the exponential epidemic model with 'deSolve::ode()'.
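A derivative function with this '(t, y, par)' signature returns the rate of change wrapped in a list, which is the form 'deSolve::ode()' expects. The sketch below assumes the standard exponential model dy/dt = r y and that 'par' carries an element named 'r'; both are assumptions, and 'expo_sketch' and 'euler' are hypothetical names. A crude fixed-step Euler loop stands in for the solver so the example needs only base R.

```r
# Assumed model: dy/dt = r * y (standard exponential growth).
expo_sketch <- function(t, y, par) {
  list(par[["r"]] * y)
}

# Minimal fixed-step Euler integrator, used here only as a stand-in solver.
euler <- function(fun, y0, times, par, h = 0.001) {
  out <- numeric(length(times))
  out[1] <- y0
  y <- y0
  t <- times[1]
  for (i in seq_along(times)[-1]) {
    while (t < times[i] - 1e-12) {
      step <- min(h, times[i] - t)
      y <- y + step * fun(t, y, par)[[1]]
      t <- t + step
    }
    out[i] <- y
  }
  out
}

times <- seq(0, 10, by = 2)
y_num <- euler(expo_sketch, y0 = 0.01, times = times, par = list(r = 0.1))
y_exact <- 0.01 * exp(0.1 * times)   # closed-form solution for comparison
max_err <- max(abs(y_num - y_exact))
max_err
```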

Usage

expo_fun(t, y, par)

Arguments

t

Time.

y

State variable.

par

Model parameters.

Value

A list containing the rate of change.


Fit epidemic models using linearization

Description

Fit exponential, monomolecular, logistic, and Gompertz models to disease progress data using linearized forms of each model.
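The idea behind linearization can be sketched with base R. The transformations below are the standard ones for these four models (see Madden et al., 2007) with the asymptote fixed at 1; whether 'fit_lin()' uses exactly these expressions is an assumption, and 'linearize' is a hypothetical helper. Each transformed response is regressed on time, and the slope estimates the rate 'r'.

```r
# Hypothetical helper: standard linearizing transformations (asymptote = 1).
linearize <- function(y, model) {
  switch(model,
    Exponential   = log(y),
    Monomolecular = log(1 / (1 - y)),
    Logistic      = log(y / (1 - y)),
    Gompertz      = -log(-log(y)),
    stop("unknown model: ", model)
  )
}

# Noise-free logistic curve with y0 = 0.01 and r = 0.3.
time <- seq(0, 30, by = 5)
y <- plogis(qlogis(0.01) + 0.3 * time)

models <- c("Exponential", "Monomolecular", "Logistic", "Gompertz")
fits <- lapply(models, function(m) lm(linearize(y, m) ~ time))
names(fits) <- models

# Slope of each linearized fit is that model's estimate of r.
r_hat <- sapply(fits, function(f) unname(coef(f)[2]))

# R-squared on the transformed scale, computed by hand.
r2 <- sapply(models, function(m) {
  z <- linearize(y, m)
  1 - sum(residuals(fits[[m]])^2) / sum((z - mean(z))^2)
})
round(cbind(r_hat, r2), 4)
```

Because the data were generated from a logistic curve, the logistic transformation is exactly linear in time and recovers the rate used in the simulation.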

Usage

fit_lin(time, y)

Arguments

time

Numeric vector of assessment times.

y

Numeric vector of disease intensity values.

Value

A list with fit statistics, parameter estimates, and prediction data.

Examples

set.seed(1)
epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.2, n = 4)
fit_lin(time = epi$time, y = epi$random_y)

Fit models to multiple disease progress curves

Description

Apply 'fit_lin()', 'fit_nlin()', or 'fit_nlin2()' to multiple disease progress curves stored in a data frame.
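The split-apply-combine pattern behind fitting several curves can be sketched in base R. This is an illustration only: 'fit_one' is a hypothetical stand-in that fits a linearized logistic model, and the real function returns richer output.

```r
# Two noise-free logistic epidemics with different rates (illustrative data).
time <- seq(0, 30, by = 5)
dat <- rbind(
  data.frame(curve = "a", time = time, y = plogis(qlogis(0.01) + 0.3 * time)),
  data.frame(curve = "b", time = time, y = plogis(qlogis(0.01) + 0.2 * time))
)

# Hypothetical per-curve fitter: linearized logistic regression.
fit_one <- function(d) {
  f <- lm(qlogis(y) ~ time, data = d)
  data.frame(curve = d$curve[1],
             y0 = plogis(coef(f)[[1]]),   # back-transform the intercept
             r = coef(f)[[2]])
}

# Split by the grouping column(s), fit each piece, and bind the estimates.
strata_cols <- "curve"
groups <- split(dat, dat[strata_cols], drop = TRUE)
est <- do.call(rbind, lapply(groups, fit_one))
est
```

Passing several column names in 'strata_cols' makes 'split()' group by their combination.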

Usage

fit_multi(
  time_col,
  intensity_col,
  data,
  strata_cols = NULL,
  starting_par = list(y0 = 0.01, r = 0.03, K = 0.8),
  maxiter = 500,
  nlin = FALSE,
  estimate_K = FALSE,
  weights_col = NULL,
  weight_method = c("none", "binomial", "mean", "cv", "power"),
  weight_eps = 0.01,
  weight_power = 1
)

Arguments

time_col

Character name specifying the time column.

intensity_col

Character name specifying the disease intensity column.

data

A data frame containing the variables for model fitting.

strata_cols

Character vector specifying grouping columns. Use 'NULL' to fit all rows as a single epidemic. Defaults to 'NULL'.

starting_par

Named list of starting values for model parameters.

maxiter

Maximum number of iterations for nonlinear fitting. Must be a positive number.

nlin

Logical. Should nonlinear fitting be used?

estimate_K

Logical. Should the asymptote 'K' be estimated?

weights_col

Optional character name specifying a column of positive weights for weighted nonlinear least squares. Used only when 'nlin = TRUE'.

weight_method

Weighting strategy passed to 'fit_nlin()' or 'fit_nlin2()'. Use '"none"' for ordinary nonlinear least squares or '"binomial"', '"mean"', '"cv"', or '"power"' for two-step fitted-value weighting approximations.

weight_eps

Small positive constant used by fitted-value weighting methods to keep weights finite near 0 and 1.

weight_power

Non-negative power used when 'weight_method = "power"'.

Value

A list with grouped parameter estimates and prediction data.

Examples

set.seed(1)
epi1 <- sim_gompertz(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.2, n = 2)
epi2 <- sim_gompertz(N = 30, y0 = 0.01, dt = 5, r = 0.2, alpha = 0.2, n = 2)
data <- dplyr::bind_rows(epi1, epi2, .id = "curve")
fit_multi(time_col = "time", intensity_col = "random_y", data = data, strata_cols = "curve")
fit_multi(time_col = "time", intensity_col = "random_y", data = data)

Fit epidemic models using nonlinear regression

Description

Fit exponential, monomolecular, logistic, and Gompertz models to disease progress data using nonlinear regression.

Usage

fit_nlin(
  time,
  y,
  starting_par = list(y0 = 0.01, r = 0.03),
  maxiter = 50,
  weights = NULL,
  weight_method = c("none", "binomial", "mean", "cv", "power"),
  weight_eps = 0.01,
  weight_power = 1
)

Arguments

time

Numeric vector of assessment times.

y

Numeric vector of disease intensity values.

starting_par

Named list with starting values for 'y0' and 'r'. When omitted or partially specified, 'epifitter' supplies data-driven fallback values.

maxiter

Maximum number of iterations. Must be a positive number.

weights

Optional numeric vector of positive weights, or a function that receives a data frame with columns 'time', 'y', 'predicted', and 'model' and returns positive weights. When supplied, these weights are used directly and 'weight_method' must be '"none"'.

weight_method

Weighting strategy for nonlinear least squares. Use '"none"' for ordinary nonlinear least squares, '"binomial"' for 1 / (\hat{p}(1 - \hat{p}) + \epsilon), '"mean"' for 1 / (\hat{p} + \epsilon), '"cv"' for 1 / (\hat{p}^2 + \epsilon), or '"power"' for 1 / (|\hat{p}|^{2\theta} + \epsilon).

weight_eps

Small positive constant added to fitted-value variance approximations to keep weights finite near 0 and 1.

weight_power

Non-negative power \theta used only when 'weight_method = "power"'.

Details

Weighted fits use weighted nonlinear least squares, not a binomial or beta likelihood. 'weight_method' options other than '"none"' use a two-step working approximation: first fit the model without weights, derive weights from the fitted values, and then refit the model. Report the selected weighting rule as an assumption.
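The two-step procedure can be sketched with 'stats::nls()'. This is an illustration under stated assumptions, not the package's code: 'working_weights' is a hypothetical helper that implements the weight formulas listed under 'weight_method', and the logistic model is written on the logit scale ('a' is the logit of 'y0') purely to keep the example numerically well behaved.

```r
# Hypothetical helper implementing the documented fitted-value weight rules.
working_weights <- function(p_hat,
                            method = c("binomial", "mean", "cv", "power"),
                            eps = 0.01, power = 1) {
  method <- match.arg(method)
  v <- switch(method,
    binomial = p_hat * (1 - p_hat),
    mean     = p_hat,
    cv       = p_hat^2,
    power    = abs(p_hat)^(2 * power)
  )
  1 / (v + eps)
}

# Simulated logistic epidemic with additive noise (illustrative data).
set.seed(1)
time <- rep(seq(0, 30, by = 5), each = 4)
y <- plogis(qlogis(0.01) + 0.3 * time) + rnorm(length(time), sd = 0.02)
dat <- data.frame(time = time, y = y)

# Step 1: unweighted fit.
fit0 <- nls(y ~ plogis(a + r * time), data = dat,
            start = list(a = qlogis(0.02), r = 0.25))

# Step 2: derive weights from the fitted values, then refit.
dat$w <- working_weights(as.numeric(fitted(fit0)), method = "binomial")
fit1 <- nls(y ~ plogis(a + r * time), data = dat,
            start = as.list(coef(fit0)), weights = w)

rbind(unweighted = coef(fit0), weighted = coef(fit1))
```

The weights are computed once from the first fit and are not updated again, which is what makes this a working approximation rather than a likelihood-based fit.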

Value

A list with fit statistics, parameter estimates, and prediction data.

Examples

set.seed(1)
epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.2, n = 4)
fit_nlin(time = epi$time, y = epi$random_y, starting_par = list(y0 = 0.01, r = 0.03))

Fit epidemic models and estimate the asymptote

Description

Fit monomolecular, logistic, and Gompertz epidemic models using nonlinear regression while also estimating the maximum disease intensity parameter 'K'.
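Estimating the asymptote adds a third parameter to the nonlinear fit. The sketch below does this for the logistic model with 'stats::nls()'; it is an illustration and not the package's implementation. The parameterization y = K * plogis(a + r * time), with a = logit(y0 / K), is an assumption chosen for numerical stability, and the data are simulated.

```r
# Simulated logistic epidemic with asymptote K = 0.8 and y0 = 0.01.
set.seed(1)
time <- rep(seq(0, 30, by = 5), each = 4)
y <- 0.8 * plogis(qlogis(0.01 / 0.8) + 0.3 * time) +
  rnorm(length(time), sd = 0.02)
dat <- data.frame(time = time, y = y)

# Three-parameter fit: K (asymptote), a (logit of y0 / K), and r (rate).
fit <- nls(y ~ K * plogis(a + r * time), data = dat,
           start = list(K = 0.9, a = qlogis(0.02), r = 0.25))

est <- coef(fit)
# Back-transform to the initial intensity on the proportion scale.
y0_hat <- est[["K"]] * plogis(est[["a"]])
c(K = est[["K"]], r = est[["r"]], y0 = y0_hat)
```

The asymptote is only identifiable when the observations reach the plateau, as they do here in the last assessments.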

Usage

fit_nlin2(
  time,
  y,
  starting_par = list(y0 = 0.01, r = 0.03, K = 0.8),
  maxiter = 50,
  weights = NULL,
  weight_method = c("none", "binomial", "mean", "cv", "power"),
  weight_eps = 0.01,
  weight_power = 1
)

Arguments

time

Numeric vector of assessment times.

y

Numeric vector of disease intensity values.

starting_par

Named list with starting values for 'y0', 'r', and 'K'. When omitted or partially specified, 'epifitter' supplies data-driven fallback values.

maxiter

Maximum number of iterations. Must be a positive number.

weights

Optional numeric vector of positive weights, or a function that receives a data frame with columns 'time', 'y', 'predicted', and 'model' and returns positive weights. When supplied, these weights are used directly and 'weight_method' must be '"none"'.

weight_method

Weighting strategy for nonlinear least squares. Use '"none"' for ordinary nonlinear least squares, '"binomial"' for 1 / (\hat{p}(1 - \hat{p}) + \epsilon), '"mean"' for 1 / (\hat{p} + \epsilon), '"cv"' for 1 / (\hat{p}^2 + \epsilon), or '"power"' for 1 / (|\hat{p}|^{2\theta} + \epsilon).

weight_eps

Small positive constant added to fitted-value variance approximations to keep weights finite near 0 and 1.

weight_power

Non-negative power \theta used only when 'weight_method = "power"'.

Details

Weighted fits use weighted nonlinear least squares, not a binomial or beta likelihood. 'weight_method' options other than '"none"' use a two-step working approximation: first fit the model without weights, derive weights from the fitted values, and then refit the model. Report the selected weighting rule as an assumption.

Value

A list with fit statistics, parameter estimates, and prediction data.

Examples

set.seed(1)
epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.5, n = 4)
fit_nlin2(
  time = epi$time,
  y = epi$random_y * 0.8,
  starting_par = list(y0 = 0.01, r = 0.1, K = 0.8),
  maxiter = 1024
)

Gompertz model differential equation

Description

Internal helper used by the simulation functions to solve the Gompertz epidemic model with 'deSolve::ode()'.
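For orientation, the sketch below writes a Gompertz derivative in the same '(t, y, par)' form. It assumes the standard equation dy/dt = r y (ln K - ln y) and a 'par' object with elements 'r' and 'K'; the exact form used by 'gompi_fun()' is not stated here, and 'gompi_sketch' is a hypothetical name. The example locates the intensity at which the rate peaks, which for this equation is K / e.

```r
# Assumed model: dy/dt = r * y * (log(K) - log(y)).
gompi_sketch <- function(t, y, par) {
  list(par[["r"]] * y * (log(par[["K"]]) - log(y)))
}

par <- list(r = 0.3, K = 0.8)

# Find the disease intensity where the absolute rate is largest.
peak <- optimize(function(y) gompi_sketch(0, y, par)[[1]],
                 interval = c(1e-6, par$K), maximum = TRUE)
c(numerical = peak$maximum, analytical = par$K / exp(1))
```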

Usage

gompi_fun(t, y, par)

Arguments

t

Time.

y

State variable.

par

Model parameters.

Value

A list containing the rate of change.


Logistic model differential equation

Description

Internal helper used by the simulation functions to solve the logistic epidemic model with 'deSolve::ode()': dy/dt = r y (1 - y / K).
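The logistic equation above can be written as a derivative function and checked against its closed-form solution. This is a sketch, not the package's code: 'logi_sketch' and 'logi_solution' are hypothetical names, and the element names 'r' and 'K' in 'par' are assumptions.

```r
# Rate of change for dy/dt = r * y * (1 - y / K), returned as a list.
logi_sketch <- function(t, y, par) {
  list(par[["r"]] * y * (1 - y / par[["K"]]))
}

# Closed-form solution of the same equation with y(0) = y0.
logi_solution <- function(t, y0, r, K) {
  K / (1 + ((K - y0) / y0) * exp(-r * t))
}

par <- list(r = 0.3, K = 0.8)
t <- seq(0, 30, by = 5)
h <- 1e-5

# Slope of the closed-form curve by central finite differences.
slope_fd <- (logi_solution(t + h, 0.01, par$r, par$K) -
             logi_solution(t - h, 0.01, par$r, par$K)) / (2 * h)

# Slope predicted by the derivative function at the same points.
slope_ode <- logi_sketch(t, logi_solution(t, 0.01, par$r, par$K), par)[[1]]

max_diff <- max(abs(slope_fd - slope_ode))
max_diff
```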

Usage

logi_fun(t, y, par)

Arguments

t

Time.

y

State variable.

par

Model parameters.

Value

A list containing the rate of change.


Monomolecular model differential equation

Description

Internal helper used by the simulation functions to solve the monomolecular epidemic model with 'deSolve::ode()'.
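As a point of reference, the sketch below assumes the standard monomolecular equation dy/dt = r (K - y), whose solution is y(t) = K - (K - y0) exp(-r t). The exact form used by 'mono_fun()' is not stated here, and 'mono_sketch' and 'mono_solution' are hypothetical names. Integrating the rate along the solution should reproduce the total increase in disease.

```r
# Assumed model: dy/dt = r * (K - y).
mono_sketch <- function(t, y, par) {
  list(par[["r"]] * (par[["K"]] - y))
}

# Closed-form solution with y(0) = y0.
mono_solution <- function(t, y0, r, K) {
  K - (K - y0) * exp(-r * t)
}

par <- list(r = 0.1, K = 0.9)
y0 <- 0.01

# Increase in disease between t = 0 and t = 30 from the closed form.
gain_closed <- mono_solution(30, y0, par$r, par$K) - y0

# The same increase obtained by integrating the rate over time.
gain_integrated <- integrate(
  function(t) mono_sketch(t, mono_solution(t, y0, par$r, par$K), par)[[1]],
  lower = 0, upper = 30
)$value

c(closed = gain_closed, integrated = gain_integrated)
```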

Usage

mono_fun(t, y, par)

Arguments

t

Time.

y

State variable.

par

Model parameters.

Value

A list containing the rate of change.


Plot fitted epidemic models

Description

Create a faceted 'ggplot2' panel showing observed and fitted values for the selected epidemic models. Optionally, add confidence bands around the fitted curves.

Usage

plot_fit(
  object,
  point_size = 1.2,
  line_size = 1,
  models = c("Exponential", "Monomolecular", "Logistic", "Gompertz"),
  conf_int = FALSE,
  ci_method = c("bootstrap", "wild"),
  nsim = 500,
  level = 0.95,
  seed = NULL,
  n_grid = 100,
  ci_alpha = 0.2,
  y_bounds = c(0, 1)
)

Arguments

object

A fitted object returned by 'fit_lin()', 'fit_nlin()', or 'fit_nlin2()'.

point_size

Point size for observed values.

line_size

Line width for fitted curves.

models

Character vector with the models to display.

conf_int

Logical. If 'TRUE', draw confidence bands around fitted curves.

ci_method

Method used to estimate confidence bands. Use '"bootstrap"' for residual bootstrap intervals, or '"wild"' for wild residual bootstrap intervals. The older '"case"' option is accepted as a deprecated alias for '"wild"'.

nsim

Number of bootstrap samples used when 'conf_int = TRUE'.

level

Confidence level for the interval.

seed

Optional random seed used for interval estimation.

n_grid

Number of time points used to draw fitted curves and confidence bands.

ci_alpha

Transparency of the confidence band.

y_bounds

Numeric vector of length two used to constrain plotted fitted values and confidence bands. The default keeps disease intensity on the usual proportion scale from 0 to 1. Use 'NULL' to show unconstrained fitted values.
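The two interval methods can be illustrated with a small base R sketch. It is not the package's implementation: 'boot_band' is a hypothetical helper, it works on a linearized logistic fit only, and the use of random sign flips for the wild bootstrap is an assumption. Residuals are resampled (or sign-flipped), added back to the fitted values, and the model is refitted; pointwise quantiles of the refitted curves form the band.

```r
# Hypothetical helper: bootstrap band for a linearized logistic fit.
boot_band <- function(time, y, method = c("bootstrap", "wild"), nsim = 200,
                      level = 0.95, n_grid = 50, seed = NULL) {
  method <- match.arg(method)
  if (!is.null(seed)) set.seed(seed)
  z <- qlogis(y)                       # requires 0 < y < 1
  fit <- lm(z ~ time)
  res <- residuals(fit)
  fitted_z <- fitted(fit)
  grid <- seq(min(time), max(time), length.out = n_grid)
  X_grid <- cbind(1, grid)
  sims <- replicate(nsim, {
    e <- if (method == "bootstrap") {
      sample(res, replace = TRUE)      # resample residuals with replacement
    } else {
      res * sample(c(-1, 1), length(res), replace = TRUE)  # random sign flips
    }
    z_star <- fitted_z + e
    b <- coef(lm(z_star ~ time))
    plogis(drop(X_grid %*% b))         # back to the proportion scale
  })
  a <- (1 - level) / 2
  data.frame(time = grid,
             lower = apply(sims, 1, quantile, probs = a),
             upper = apply(sims, 1, quantile, probs = 1 - a))
}

# Illustrative data: logistic curve with noise on the logit scale.
set.seed(1)
time <- rep(seq(0, 30, by = 5), each = 4)
y <- plogis(qlogis(0.01) + 0.3 * time + rnorm(length(time), sd = 0.3))

band <- boot_band(time, y, method = "bootstrap", nsim = 100, seed = 42)
band_wild <- boot_band(time, y, method = "wild", nsim = 100, seed = 42)
head(band)
```

Because predictions are back-transformed with 'plogis()', the band stays within 0 and 1 without further clipping.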

Value

A 'ggplot2' object.

Examples

epi <- sim_logistic(N = 30, y0 = 0.01, dt = 5, r = 0.3, alpha = 0.2, n = 4)
fit <- fit_lin(time = epi$time, y = epi$random_y)
plot_fit(fit)

plot_fit(fit, conf_int = TRUE, nsim = 100)

Powdery mildew disease progress curves in organic tomato

Description

Experimental disease progress curve data for powdery mildew under different irrigation systems and soil moisture levels in organic tomato.

Format

A data frame with 240 rows and 5 variables:

irrigation_type

Irrigation system.

moisture

Soil moisture level.

block

Experimental block.

time

Assessment time.

sev

Disease severity as a proportion.

References

Lage, D. A. C., Marouelli, W. A., and Cafe-Filho, A. C. (2019). Management of powdery mildew and behaviour of late blight under different irrigation configurations in organic tomato. Crop Protection, 125, 104886.

Examples

data("PowderyMildew")
str(PowderyMildew)

Print fitted model summaries

Description

Print method for objects returned by 'fit_lin()' and compatible fitters.

Usage

## S3 method for class 'fit_lin'
print(x, ...)

Arguments

x

An object produced by 'fit_lin()' or 'fit_nlin()'.

...

Further arguments passed to 'print()'.


Print fitted model summaries with asymptote estimates

Description

Print method for objects returned by 'fit_nlin2()'.

Usage

## S3 method for class 'fit_nlin2'
print(x, ...)

Arguments

x

An object produced by 'fit_nlin2()'.

...

Further arguments passed to 'print()'.


Simulate an exponential disease progress curve

Description

Simulate disease progress data under the exponential epidemic model, with optional replicated observations.

Usage

sim_exponential(N = 10, dt = 1, y0 = 0.01, r, n, alpha = 0.2)

Arguments

N

Total epidemic duration. Must be positive.

dt

Time interval between assessments. Must be positive and less than or equal to 'N'.

y0

Initial disease intensity as a proportion, strictly between 0 and 1.

r

Apparent infection rate. Must be positive.

n

Number of replicated curves. Must be a positive whole number.

alpha

Non-negative noise level applied to replicated observations.

Value

A data frame with simulated disease progress values and replicated noisy observations.

Examples

sim_exponential(N = 30, dt = 5, y0 = 0.01, r = 0.05, n = 4)

Simulate a Gompertz disease progress curve

Description

Simulate disease progress data under the Gompertz epidemic model, with optional replicated observations.

Usage

sim_gompertz(N = 10, dt = 1, y0 = 0.01, r, K = 1, n, alpha = 0.2)

Arguments

N

Total epidemic duration. Must be positive.

dt

Time interval between assessments. Must be positive and less than or equal to 'N'.

y0

Initial disease intensity as a proportion, strictly between 0 and 1.

r

Apparent infection rate. Must be positive.

K

Maximum disease intensity as a proportion. Must be greater than or equal to 'y0' and less than or equal to 1.

n

Number of replicated curves. Must be a positive whole number.

alpha

Non-negative noise level applied to replicated observations.

Value

A data frame with simulated disease progress values and replicated noisy observations.

Examples

sim_gompertz(N = 30, dt = 5, y0 = 0.01, r = 0.05, K = 1, n = 4)

Simulate a logistic disease progress curve

Description

Simulate disease progress data under the logistic epidemic model, with optional replicated observations.

Usage

sim_logistic(N = 10, dt = 1, y0 = 0.01, r, K = 1, n, alpha = 0.2)

Arguments

N

Total epidemic duration. Must be positive.

dt

Time interval between assessments. Must be positive and less than or equal to 'N'.

y0

Initial disease intensity as a proportion, strictly between 0 and 1.

r

Apparent infection rate. Must be positive.

K

Maximum disease intensity as a proportion. Must be greater than or equal to 'y0' and less than or equal to 1.

n

Number of replicated curves. Must be a positive whole number.

alpha

Non-negative noise level applied to replicated observations.

Value

A data frame with simulated disease progress values and replicated noisy observations.

Examples

sim_logistic(N = 30, dt = 5, y0 = 0.01, r = 0.05, K = 1, n = 4)

Simulate a monomolecular disease progress curve

Description

Simulate disease progress data under the monomolecular epidemic model, with optional replicated observations.

Usage

sim_monomolecular(N = 10, dt = 1, y0 = 0.01, r, K = 1, n, alpha = 0.2)

Arguments

N

Total epidemic duration. Must be positive.

dt

Time interval between assessments. Must be positive and less than or equal to 'N'.

y0

Initial disease intensity as a proportion, strictly between 0 and 1.

r

Apparent infection rate. Must be positive.

K

Maximum disease intensity as a proportion. Must be greater than or equal to 'y0' and less than or equal to 1.

n

Number of replicated curves. Must be a positive whole number.

alpha

Non-negative noise level applied to replicated observations.

Value

A data frame with simulated disease progress values and replicated noisy observations.

Examples

sim_monomolecular(N = 30, dt = 5, y0 = 0.01, r = 0.05, K = 1, n = 4)