Many ecological responses are counts: fledglings per nest, parasites per host, insects in a trap, or soil invertebrates in a quadrat. A Poisson model is the natural baseline, but it fixes the variance equal to the mean. The location-scale count question is sharper: do predictors change the expected abundance, the extra-Poisson variation around that abundance, or the chance of a separate structural zero?
This article stays inside the currently implemented
drmTMB count surface: univariate NB2 models, fixed-effect
zero-inflated NB2 models, and the first ordinary NB2 mu
random-effect path. The worked example below remains fixed-effect so the
mean, overdispersion, and structural-zero pieces are easy to read before
adding group effects.
The source motivation comes from Nakagawa et al. (2026), who use
location-scale models to discuss heteroscedasticity in continuous,
count, and proportion data. Their count section highlights
negative-binomial models for fledglings, insect colony size, parasites,
and soil invertebrates, and it separates overdispersion from
structural-zero processes. drmTMB uses the same scientific
split, but reports the NB2 scale as public sigma rather
than the native size or precision parameter often written as
theta.
Model Equation And Syntax
For an overdispersed count model, nbinom2() uses
The matching drmTMB syntax is:
drmTMB(
bf(
springtails ~ habitat + moisture + offset(log(trap_nights)),
sigma ~ habitat
),
family = nbinom2(),
data = soil_counts
)Read each parameter before interpreting the fitted model:
| Symbol or syntax | Meaning | In the soil-invertebrate example |
|---|---|---|
,
springtails
|
observed non-negative integer count | springtails found in trap |
,
trap_nights
|
sampling effort exposure | number of nights the trap was active |
offset(log(trap_nights)) |
forces expected count to scale with effort | estimates abundance rate per trap night |
,
mu
|
expected count after accounting for effort | expected springtail abundance in a trap |
| location coefficients on the log mean scale | habitat and moisture effects on expected abundance | |
,
sigma
|
NB2 extra-Poisson scale | variation beyond the Poisson expectation |
scale coefficients on the log sigma scale |
habitat effects on extra-Poisson variation | |
,
theta in some papers |
native NB2 size or precision |
size_i = 1 / sigma_i^2, so larger sigma
means smaller size
|
The sigma slope is a variability slope. If
gamma_1 is the restored-habitat coefficient, then
If a paper reports the native NB2 parameter , the direction reverses:
That reversal is why the tutorial keeps saying sigma: in
drmTMB, larger sigma always means more
modelled variation for this family.
A Soil-Invertebrate Example
Suppose springtails are counted from soil traps in degraded and restored grassland plots. Restoration may increase average abundance, but restored plots may also be more patchy while litter and vegetation structure re-establish. A few bare-soil microsites may be true absences where springtails are not available to be trapped.
This transparent simulation gives us known structure before fitting:
set.seed(106)
n <- 360
soil_counts <- data.frame(
habitat = factor(
rep(c("degraded", "restored"), each = n / 2),
levels = c("degraded", "restored")
),
surface = factor(
sample(c("litter", "bare"), n, replace = TRUE, prob = c(0.72, 0.28)),
levels = c("litter", "bare")
),
moisture = as.numeric(scale(runif(n, 0.15, 0.95))),
trap_nights = sample(2:5, n, replace = TRUE)
)
restored <- as.numeric(soil_counts$habitat == "restored")
bare <- as.numeric(soil_counts$surface == "bare")
rate <- exp(log(2.4) + 0.35 * restored + 0.30 * soil_counts$moisture)
mu <- soil_counts$trap_nights * rate
sigma_nb2 <- exp(-0.75 + 0.40 * restored)
zi <- plogis(-3.2 + 1.6 * bare - 0.25 * restored)
structural_zero <- runif(n) < zi
soil_counts$springtails <- ifelse(
structural_zero,
0L,
rnbinom(n, size = 1 / sigma_nb2^2, mu = mu)
)
head(soil_counts)
#> habitat surface moisture trap_nights springtails
#> 1 degraded litter -1.1665372 3 1
#> 2 degraded bare 1.5939065 4 4
#> 3 degraded litter 0.4203317 5 14
#> 4 degraded litter 0.6364855 5 7
#> 5 degraded litter 1.0379937 4 5
#> 6 degraded litter -1.4584601 2 3Start with the NB2 location-scale model before adding a structural-zero submodel:
fit_nb2 <- drmTMB(
bf(
springtails ~ habitat + moisture + offset(log(trap_nights)),
sigma ~ habitat
),
family = nbinom2(),
data = soil_counts
)Run diagnostics before interpreting coefficients:
check_drm(fit_nb2)
#> <drm_check: 10 checks>
#> ok: 10; notes: 0; warnings: 0; errors: 0
#> check status
#> optimizer_convergence ok
#> optimizer_budget ok
#> finite_objective ok
#> fixed_gradient ok
#> sdreport_status ok
#> hessian_positive_definite ok
#> standard_errors_finite ok
#> dropped_rows ok
#> positive_scale ok
#> fixed_effect_design_size ok
#> value
#> 0
#> iterations=17; function=29; gradient=17
#> 1130.
#> max=0.0007836; component=beta_mu[3]
#> ok
#> TRUE
#> range=[0.04334,0.1013]
#> nobs=360; dropped=0
#> min=0.6499
#> total_mb=0.06026; max_cols=3; largest=mu; largest_class=matrix; largest_density=0.8333
#> message
#> nlminb convergence code is 0.
#> Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#> Objective and log-likelihood are finite.
#> Maximum absolute fixed gradient is <= 0.001; largest component is beta_mu[3].
#> TMB::sdreport() completed successfully.
#> sdreport reports a positive-definite Hessian.
#> All fixed-effect standard errors are finite.
#> No rows were dropped by model-frame or known-covariance filtering.
#> All fitted scale values are finite and positive.
#> Dense fixed-effect design matrices are modest for this fit.The NB2 fit answers two questions. The mu coefficients
describe abundance rates, because the offset has already accounted for
trap effort. The sigma coefficients describe extra-Poisson
variation in the count component:
coef(fit_nb2, "mu")
#> (Intercept) habitatrestored moisture
#> 0.7603134 0.3511186 0.2422656
coef(fit_nb2, "sigma")
#> (Intercept) habitatrestored
#> -0.4309409 0.2758195
sigma_ratio <- exp(coef(fit_nb2, "sigma")["habitatrestored"])
c(
sigma_ratio_restored_vs_degraded = sigma_ratio,
variance_multiplier_at_same_mu = sigma_ratio^2
)
#> sigma_ratio_restored_vs_degraded.habitatrestored
#> 1.317610
#> variance_multiplier_at_same_mu.habitatrestored
#> 1.736096The second value is the multiplier on the quadratic extra-Poisson part when two traps have the same expected count. It is not the full variance ratio whenever also changes, because NB2 variance contains both and .
When Zeros Are A Separate Process
The paper example of soil invertebrates in patchy habitats maps
naturally to a zero-inflated count model. Bare-soil microsites can be
true absences, while litter microsites can still produce ordinary
sampling zeros from the NB2 count component. Add a zi
formula when the biological question needs that split:
In this equation, is the structural-zero probability. It is not a scale parameter and should not be interpreted as overdispersion.
fit_zinb2 <- drmTMB(
bf(
springtails ~ habitat + moisture + offset(log(trap_nights)),
sigma ~ habitat,
zi ~ surface
),
family = nbinom2(),
data = soil_counts
)
check_drm(fit_zinb2)
#> <drm_check: 10 checks>
#> ok: 10; notes: 0; warnings: 0; errors: 0
#> check status
#> optimizer_convergence ok
#> optimizer_budget ok
#> finite_objective ok
#> fixed_gradient ok
#> sdreport_status ok
#> hessian_positive_definite ok
#> standard_errors_finite ok
#> dropped_rows ok
#> positive_scale ok
#> fixed_effect_design_size ok
#> value
#> 0
#> iterations=31; function=41; gradient=32
#> 1105.
#> max=0.00004223; component=beta_sigma[2]
#> ok
#> TRUE
#> range=[0.03636,0.4202]
#> nobs=360; dropped=0
#> min=0.4904
#> total_mb=0.08898; max_cols=3; largest=mu; largest_class=matrix; largest_density=0.8333
#> message
#> nlminb convergence code is 0.
#> Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#> Objective and log-likelihood are finite.
#> Maximum absolute fixed gradient is <= 0.001; largest component is beta_sigma[2].
#> TMB::sdreport() completed successfully.
#> sdreport reports a positive-definite Hessian.
#> All fixed-effect standard errors are finite.
#> No rows were dropped by model-frame or known-covariance filtering.
#> All fitted scale values are finite and positive.
#> Dense fixed-effect design matrices are modest for this fit.The zi coefficients are on the log-odds scale.
Prediction gives the structural-zero probability on the response
scale:
coef(fit_zinb2, "zi")
#> (Intercept) surfacebare
#> -2.786077 1.215586
zero_grid <- data.frame(
habitat = factor(c("degraded", "degraded"), levels = levels(soil_counts$habitat)),
surface = factor(c("litter", "bare"), levels = levels(soil_counts$surface)),
moisture = 0,
trap_nights = 3
)
data.frame(
surface = zero_grid$surface,
structural_zero_probability = predict(fit_zinb2, newdata = zero_grid, dpar = "zi")
)
#> surface structural_zero_probability
#> 1 litter 0.05808118
#> 2 bare 0.17214634For the fitted zero-inflated model,
predict(fit_zinb2, dpar = "mu") returns the conditional NB2
count mean, sigma(fit_zinb2) returns the conditional NB2
overdispersion scale, and fitted(fit_zinb2) returns the
unconditional response mean
:
new_traps <- data.frame(
habitat = factor(c("degraded", "restored"), levels = levels(soil_counts$habitat)),
surface = factor(c("litter", "litter"), levels = levels(soil_counts$surface)),
moisture = c(0, 0),
trap_nights = c(3, 3)
)
mu_hat <- predict(fit_zinb2, newdata = new_traps, dpar = "mu")
sigma_hat <- predict(fit_zinb2, newdata = new_traps, dpar = "sigma")
zi_hat <- predict(fit_zinb2, newdata = new_traps, dpar = "zi")
data.frame(
habitat = new_traps$habitat,
conditional_mean = mu_hat,
sigma = sigma_hat,
structural_zero_probability = zi_hat,
unconditional_mean = (1 - zi_hat) * mu_hat,
unconditional_variance =
(1 - zi_hat) * (mu_hat + sigma_hat^2 * mu_hat^2) +
zi_hat * (1 - zi_hat) * mu_hat^2
)
#> habitat conditional_mean sigma structural_zero_probability
#> 1 degraded 6.924234 0.4903942 0.05808118
#> 2 restored 10.124436 0.6257923 0.05808118
#> unconditional_mean unconditional_variance
#> 1 6.522067 20.00548
#> 2 9.536397 52.95495Use AIC only as one check, not as a substitute for the design story. A zero-inflated model needs a plausible structural-zero process such as bare soil, unsuitable host tissue, unsurveyable habitat, or true absence.
AIC(fit_nb2, fit_zinb2)
#> df AIC
#> fit_nb2 5 2270.559
#> fit_zinb2 7 2223.058Current Boundary
The implemented count path is still intentionally narrow.
Fixed-effect Poisson, NB2, zero-inflated Poisson, zero-inflated NB2,
zero-truncated NB2, and hurdle NB2 models are fitted. Ordinary
non-zero-inflated Poisson and NB2 mu models can also fit
unlabelled random intercepts and independent numeric random slopes such
as (1 | site) and (0 + effort | site). The
zero-inflated, hurdle, truncated, and scale-side count routes remain
fixed-effect only.
drmTMB(
bf(count ~ habitat + offset(log(effort)), sigma ~ habitat, zi ~ surface),
family = nbinom2(),
data = dat
)Do not teach the following as fitted count examples yet:
- correlated or labelled NB2
murandom-slope blocks; - random effects in NB2
sigmaorzi; - random effects in zero-inflated Poisson, zero-inflated NB2, truncated NB2, or hurdle NB2 count components;
-
sd(group) ~ ...random-effect scale models; -
meta_V(V = V)ormeta_known_V(V = V)with counts; -
phylo(),spatial(),animal(), orrelmat()count models; - bivariate or mixed-response count models such as
family = c(gaussian(), nbinom2()); - COM-Poisson underdispersion models.
Those are useful future routes, but each needs its own likelihood path, simulation recovery, diagnostics, and source-map update before it becomes tutorial syntax.