Skip to contents

Proportion data are common in ecology and evolution, but the measurement process matters. Some proportions are counts of successes out of known trials: germinated seeds out of seeds planted, infected hosts out of hosts checked, or occupied quadrats out of quadrats surveyed. Other proportions are continuous rates already measured on the unit interval, such as leaf damage, vegetation cover, or the fraction of time an animal spends foraging.

This article stays inside the currently implemented drmTMB bounded-response surface:

  • use beta_binomial() for counted successes out of known trials;
  • use beta() for continuous proportions strictly inside (0, 1);
  • keep structural zeros and ones as planned neighbours unless the current family already represents them through binomial sampling.

The source motivation comes from Nakagawa et al. (2026), who treat overdispersed proportions as part of the same location-scale framework as continuous and count responses. They distinguish discrete proportions from continuous proportions, recommend beta-binomial models for successes out of trials when extra-binomial variation is present, and warn that exact 0 or 1 values in continuous beta responses need explicit zero-one processes rather than being hidden by the model.

Choose The Proportion Route

Start from how the response was measured:

Response Example First drmTMB family Boundary status
Counted successes out of trials germinated seeds out of planted seeds beta_binomial() successes can be 0 or all trials through the binomial count process
Continuous proportion strictly inside (0, 1) leaf-area damage, vegetation cover, foraging time fraction beta() exact 0 or 1 is not supported by the strict beta likelihood
Continuous proportion with structural 0 or 1 values 0% disease, 100% cover, complete absence or saturation planned zero-one-inflated beta or ordered beta not a fitted drmTMB tutorial path yet

The distinction is not cosmetic. A denominator-aware model knows that 3 successes out of 6 trials and 30 successes out of 60 trials carry different sampling information. A strict beta model assumes the observation is already a continuous rate and that the boundaries are not part of the observed support.

Beta-Binomial Equation And Syntax

For successes out of known trials, beta_binomial() uses

Yini,piBinomial(ni,pi),piμi,σiBeta(αi,βi),logit(μi)=β0+β1shelteredi+β2moisturei,log(σi)=γ0+γ1shelteredi,ϕi=1/σi2,αi=μiϕi,βi=(1μi)ϕi,E[Yi/ni]=μi,Var(Yi/ni)=μi(1μi)(1+niσi2)ni(1+σi2). \begin{aligned} Y_i \mid n_i, p_i &\sim \operatorname{Binomial}(n_i, p_i),\\ p_i \mid \mu_i, \sigma_i &\sim \operatorname{Beta}(\alpha_i, \beta_i),\\ \operatorname{logit}(\mu_i) &= \beta_0 + \beta_1 \text{sheltered}_i + \beta_2 \text{moisture}_i,\\ \log(\sigma_i) &= \gamma_0 + \gamma_1 \text{sheltered}_i,\\ \phi_i &= 1 / \sigma_i^2,\\ \alpha_i &= \mu_i\phi_i,\\ \beta_i &= (1 - \mu_i)\phi_i,\\ E[Y_i / n_i] &= \mu_i,\\ \operatorname{Var}(Y_i / n_i) &= \frac{\mu_i(1 - \mu_i)(1 + n_i\sigma_i^2)} {n_i(1 + \sigma_i^2)}. \end{aligned}

The matching drmTMB syntax is:

drmTMB(
  bf(cbind(germinated, failed) ~ treatment + moisture, sigma ~ treatment),
  family = beta_binomial(),
  data = seed_trials
)

Read each parameter before interpreting the fitted model:

Symbol or syntax Meaning In the seed-germination example
YiY_i, germinated number of successes seeds that germinated in tray ii
nin_i, germinated + failed known number of trials viable seeds planted in tray ii
μi\mu_i, mu expected success probability germination probability for a seed in that tray
𝛃μ\boldsymbol{\beta}_{\mu} location coefficients on the logit probability scale treatment and moisture effects on germination probability
σi\sigma_i, sigma extra-binomial scale tray-to-tray probability variation beyond binomial sampling
ϕi\phi_i beta precision phi_i = 1 / sigma_i^2, so larger sigma means lower precision
𝛃σ\boldsymbol{\beta}_{\sigma} scale coefficients on the log sigma scale treatment effects on extra-binomial variation

If gamma_1 is the sheltered-treatment coefficient, then

σshelteredσopen=exp(γ1),ϕshelteredϕopen=exp(2γ1). \frac{\sigma_\text{sheltered}}{\sigma_\text{open}} = \exp(\gamma_1), \qquad \frac{\phi_\text{sheltered}}{\phi_\text{open}} = \exp(-2\gamma_1).

That second ratio reverses direction because ϕ=1/σ2\phi = 1 / \sigma^2. In drmTMB, report sigma when you want the direction to mean more or less modelled variation.

Seed Germination Example

Suppose seeds are planted in trays under open and sheltered microsites during a restoration trial. Shelter may increase average germination by reducing heat stress, and it may also make tray outcomes more predictable by buffering microclimate. Each tray has a known number of seeds, so the denominator belongs in the model.

This transparent simulation gives us known structure before fitting:

set.seed(197)
n <- 360
seed_trials <- data.frame(
  treatment = factor(
    rep(c("open", "sheltered"), each = n / 2),
    levels = c("open", "sheltered")
  ),
  moisture = as.numeric(scale(runif(n, 0.1, 0.9))),
  trials = sample(18:32, n, replace = TRUE)
)

sheltered <- as.numeric(seed_trials$treatment == "sheltered")
mu_seed <- plogis(-0.55 + 0.70 * sheltered + 0.45 * seed_trials$moisture)
sigma_seed <- exp(-1.15 - 0.35 * sheltered)
phi_seed <- 1 / sigma_seed^2
tray_probability <- rbeta(
  n,
  shape1 = mu_seed * phi_seed,
  shape2 = (1 - mu_seed) * phi_seed
)

seed_trials$germinated <- rbinom(n, size = seed_trials$trials, prob = tray_probability)
seed_trials$failed <- seed_trials$trials - seed_trials$germinated

head(seed_trials)
#>   treatment    moisture trials germinated failed
#> 1      open  0.79595177     24          4     20
#> 2      open -0.09180242     19          9     10
#> 3      open -1.42799682     29          6     23
#> 4      open -1.54868848     21          3     18
#> 5      open  1.65321214     30         20     10
#> 6      open  0.27586227     20          3     17

Fit the beta-binomial location-scale model:

fit_seed <- drmTMB(
  bf(cbind(germinated, failed) ~ treatment + moisture, sigma ~ treatment),
  family = beta_binomial(),
  data = seed_trials
)

Run diagnostics before interpreting coefficients:

check_drm(fit_seed)
#> <drm_check: 10 checks>
#> ok: 9; notes: 0; warnings: 1; errors: 0
#>                      check  status
#>      optimizer_convergence      ok
#>           optimizer_budget      ok
#>           finite_objective      ok
#>             fixed_gradient warning
#>            sdreport_status      ok
#>  hessian_positive_definite      ok
#>     standard_errors_finite      ok
#>               dropped_rows      ok
#>             positive_scale      ok
#>   fixed_effect_design_size      ok
#>                                                                                   value
#>                                                                                       0
#>                                                 iterations=26; function=37; gradient=27
#>                                                                                   991.7
#>                                                      max=0.001523; component=beta_mu[1]
#>                                                                                      ok
#>                                                                                    TRUE
#>                                                                  range=[0.03677,0.1156]
#>                                                                     nobs=360; dropped=0
#>                                                                              min=0.2520
#>  total_mb=0.06031; max_cols=3; largest=mu; largest_class=matrix; largest_density=0.8333
#>                                                                              message
#>                                                        nlminb convergence code is 0.
#>  Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#>                                             Objective and log-likelihood are finite.
#>         Maximum absolute fixed gradient is > 0.001; largest component is beta_mu[1].
#>                                              TMB::sdreport() completed successfully.
#>                                        sdreport reports a positive-definite Hessian.
#>                                         All fixed-effect standard errors are finite.
#>                   No rows were dropped by model-frame or known-covariance filtering.
#>                                     All fitted scale values are finite and positive.
#>                          Dense fixed-effect design matrices are modest for this fit.

The mu coefficients describe the log-odds of germination. The sigma coefficients describe extra-binomial variation in the latent tray-level success probability:

coef(fit_seed, "mu")
#>        (Intercept) treatmentsheltered           moisture 
#>         -0.5127846          0.6006894          0.4683619
coef(fit_seed, "sigma")
#>        (Intercept) treatmentsheltered 
#>         -1.1893314         -0.1889192

sigma_ratio <- exp(coef(fit_seed, "sigma")["treatmentsheltered"])
c(
  sigma_ratio_sheltered_vs_open = sigma_ratio,
  precision_ratio_sheltered_vs_open = sigma_ratio^(-2)
)
#>     sigma_ratio_sheltered_vs_open.treatmentsheltered 
#>                                            0.8278534 
#> precision_ratio_sheltered_vs_open.treatmentsheltered 
#>                                            1.4591272

For a response-scale summary, predict the expected success probability and the proportion-level standard deviation for trays with the same number of seeds:

new_seed_trays <- data.frame(
  treatment = factor(c("open", "sheltered"), levels = levels(seed_trials$treatment)),
  moisture = c(0, 0),
  trials = c(24, 24)
)

mu_hat <- predict(fit_seed, newdata = new_seed_trays, dpar = "mu")
sigma_hat <- predict(fit_seed, newdata = new_seed_trays, dpar = "sigma")
prop_var <- mu_hat * (1 - mu_hat) *
  (1 + new_seed_trays$trials * sigma_hat^2) /
  (new_seed_trays$trials * (1 + sigma_hat^2))

data.frame(
  treatment = new_seed_trays$treatment,
  trials = new_seed_trays$trials,
  expected_probability = mu_hat,
  expected_successes = new_seed_trays$trials * mu_hat,
  sigma = sigma_hat,
  phi = 1 / sigma_hat^2,
  proportion_sd = sqrt(prop_var)
)
#>   treatment trials expected_probability expected_successes     sigma      phi
#> 1      open     24             0.374541           8.988983 0.3044247 10.79047
#> 2 sheltered     24             0.521962          12.527089 0.2520190 15.74466
#>   proportion_sd
#> 1     0.1697103
#> 2     0.1570892

This table keeps the two parts of the question separate: expected germination probability and extra-binomial scatter around that probability.

Strict Continuous Proportions

Use beta() when the response is a continuous proportion strictly inside the open interval. For example, vegetation cover measured from image analysis may record 0.03, 0.41, or 0.82 rather than successes out of a known number of quadrats. The beta model uses

Yiμi,σiBeta(αi,βi),logit(μi)=β0+β1grazedi+β2moisturei,log(σi)=γ0+γ1grazedi,ϕi=1/σi2,Var(Yi)=μi(1μi)σi21+σi2. \begin{aligned} Y_i \mid \mu_i, \sigma_i &\sim \operatorname{Beta}(\alpha_i, \beta_i),\\ \operatorname{logit}(\mu_i) &= \beta_0 + \beta_1 \text{grazed}_i + \beta_2 \text{moisture}_i,\\ \log(\sigma_i) &= \gamma_0 + \gamma_1 \text{grazed}_i,\\ \phi_i &= 1 / \sigma_i^2,\\ \operatorname{Var}(Y_i) &= \frac{\mu_i(1 - \mu_i)\sigma_i^2}{1 + \sigma_i^2}. \end{aligned}

The fitted syntax is the same one-formula-per-parameter pattern:

drmTMB(
  bf(cover ~ grazing + moisture, sigma ~ grazing),
  family = beta(),
  data = cover_data
)

A small transparent example:

set.seed(198)
n_cover <- 300
cover_data <- data.frame(
  grazing = factor(
    rep(c("ungrazed", "grazed"), each = n_cover / 2),
    levels = c("ungrazed", "grazed")
  ),
  moisture = as.numeric(scale(runif(n_cover, 0.05, 0.95)))
)

grazed <- as.numeric(cover_data$grazing == "grazed")
mu_cover <- plogis(0.35 - 0.75 * grazed + 0.35 * cover_data$moisture)
sigma_cover <- exp(-1.10 + 0.45 * grazed)
phi_cover <- 1 / sigma_cover^2
cover_data$cover <- rbeta(
  n_cover,
  shape1 = mu_cover * phi_cover,
  shape2 = (1 - mu_cover) * phi_cover
)

fit_cover <- drmTMB(
  bf(cover ~ grazing + moisture, sigma ~ grazing),
  family = beta(),
  data = cover_data
)

check_drm(fit_cover)
#> <drm_check: 10 checks>
#> ok: 10; notes: 0; warnings: 0; errors: 0
#>                      check status
#>      optimizer_convergence     ok
#>           optimizer_budget     ok
#>           finite_objective     ok
#>             fixed_gradient     ok
#>            sdreport_status     ok
#>  hessian_positive_definite     ok
#>     standard_errors_finite     ok
#>               dropped_rows     ok
#>             positive_scale     ok
#>   fixed_effect_design_size     ok
#>                                                                                   value
#>                                                                                       0
#>                                                 iterations=28; function=38; gradient=29
#>                                                                                  -93.38
#>                                                     max=0.0007654; component=beta_mu[1]
#>                                                                                      ok
#>                                                                                    TRUE
#>                                                                 range=[0.04484,0.09263]
#>                                                                     nobs=300; dropped=0
#>                                                                              min=0.3531
#>  total_mb=0.05065; max_cols=3; largest=mu; largest_class=matrix; largest_density=0.8333
#>                                                                              message
#>                                                        nlminb convergence code is 0.
#>  Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#>                                             Objective and log-likelihood are finite.
#>        Maximum absolute fixed gradient is <= 0.001; largest component is beta_mu[1].
#>                                              TMB::sdreport() completed successfully.
#>                                        sdreport reports a positive-definite Hessian.
#>                                         All fixed-effect standard errors are finite.
#>                   No rows were dropped by model-frame or known-covariance filtering.
#>                                     All fitted scale values are finite and positive.
#>                          Dense fixed-effect design matrices are modest for this fit.

Here, grazing changes both expected cover and the variation of cover among sampling units:

coef(fit_cover, "mu")
#>   (Intercept) grazinggrazed      moisture 
#>     0.3732633    -0.7837726     0.3590303
coef(fit_cover, "sigma")
#>   (Intercept) grazinggrazed 
#>    -1.0410602     0.3442394

new_cover <- data.frame(
  grazing = factor(c("ungrazed", "grazed"), levels = levels(cover_data$grazing)),
  moisture = c(0, 0)
)

mu_cover_hat <- predict(fit_cover, newdata = new_cover, dpar = "mu")
sigma_cover_hat <- predict(fit_cover, newdata = new_cover, dpar = "sigma")

data.frame(
  grazing = new_cover$grazing,
  expected_cover = mu_cover_hat,
  sigma = sigma_cover_hat,
  phi = 1 / sigma_cover_hat^2,
  cover_sd = sqrt(
    mu_cover_hat * (1 - mu_cover_hat) *
      sigma_cover_hat^2 / (1 + sigma_cover_hat^2)
  )
)
#>    grazing expected_cover     sigma      phi  cover_sd
#> 1 ungrazed      0.5922473 0.3530802 8.021459 0.1636107
#> 2   grazed      0.3987900 0.4981666 4.029497 0.2183348

Do not use this strict beta route if the observed response includes exact 0 or 1 values. A sprayed plot with 0% damage or a quadrat with 100% cover is a boundary-generating process, not an interior beta observation. Those cases need a future zero-one-inflated beta or ordered-beta route, or a deliberate measurement-process decision outside this tutorial.

Current Boundary

The implemented bounded-response path is fixed-effect and univariate. Use beta_binomial() for counted successes and failures, and use beta() for a single continuous proportion that is strictly between 0 and 1:

drmTMB(
  bf(cbind(successes, failures) ~ treatment, sigma ~ treatment),
  family = beta_binomial(),
  data = dat
)
drmTMB(
  bf(proportion ~ treatment, sigma ~ treatment),
  family = beta(),
  data = dat
)

Do not teach the following as fitted proportion examples yet:

  • random effects in mu or sigma;
  • sd(group) ~ ... random-effect scale models;
  • meta_V(V = V), or its meta_known_V(V = V) compatibility alias, with beta or beta-binomial responses;
  • phylo(), spatial(), animal(), or relmat() bounded-response models;
  • bivariate or mixed-response bounded models such as family = c(beta(), gaussian());
  • fixed-effect zoi or coi likelihoods, zero-one-inflated beta, ordered beta, or beta-binomial zero-inflation;
  • denominator shorthand such as successes / trials as a replacement for cbind(successes, failures).

Those are useful future routes, but each needs likelihood code, simulation recovery, diagnostics, and source-map evidence before it becomes tutorial syntax.