Objectives
Variational objectives: ELBO, IWAEBound, RenyiBound, VRIWAEBound, paired with the gradient-estimator strategies in Estimators.
objectives
Variational objectives.
An Objective is a callable torch.nn.Module that, given
(model, guide, x, observations), returns a scalar loss whose
backward() populates gradients on the model and guide
parameters. The most common objective is ELBO; the
multi-sample bounds (IWAEBound,
RenyiBound, VRIWAEBound) spend more compute per step in
exchange for a tighter bound.
Every objective accepts a quivers.inference.estimators.GradientEstimator
strategy, which decides how the per-particle log-density tensors
are combined into a scalar surrogate loss whose gradient is the chosen
estimator. The default is quivers.inference.estimators.Reparameterized,
except where an objective documents its own default (IWAEBound uses
DoublyReparameterized).
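To make the role of an estimator strategy concrete, the sketch below compares two textbook gradient estimators for the same single-sample ELBO on a toy problem. It uses only the Python standard library and does not call quivers; the function names and the Gaussian setup are assumptions chosen so that the exact gradient is known in closed form.

```python
import random
import statistics


def elbo_gradient_samples(mu, n, seed=0):
    """Per-sample estimates of d/dmu E_q[log p(z) - log q(z)].

    Toy setup (an assumption for illustration): q(z) = N(mu, 1) and
    p(z) = N(0, 1), so the ELBO is -mu**2 / 2 and its exact gradient is -mu.
    """
    rng = random.Random(seed)
    reparameterized, score_function = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + eps  # reparameterized draw from q
        # Log-weight log p(z) - log q(z); the Gaussian constants cancel.
        log_w = -0.5 * z * z + 0.5 * eps * eps
        # Reparameterized estimator: differentiate log_w through z = mu + eps.
        reparameterized.append(-z)
        # Score-function estimator: log_w * d/dmu log q(z) with z held fixed
        # (= log_w * eps), plus the direct derivative of log_w in mu (= -eps).
        score_function.append(log_w * eps - eps)
    return reparameterized, score_function


def summarize(samples):
    """Return (mean, sample variance) of a list of gradient estimates."""
    return statistics.fmean(samples), statistics.variance(samples)
```

Both estimators target the same gradient; in this toy case the score-function samples have noticeably higher variance, which is the kind of trade-off an estimator strategy lets you choose between.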
Particles occupy a leading tensor axis throughout, with no Python loop
over the Monte Carlo dimension. This matters for performance
when num_particles (K) is moderate (8–64): a Python loop would
multiply the per-step cost by roughly K.
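The difference between a leading particle axis and a Python loop can be sketched in plain PyTorch. The toy model (z ~ N(0, 1), y | z ~ N(z, 1)), the guide N(mu, 1), and both function names are assumptions for illustration; this is not the quivers implementation.

```python
import torch
from torch.distributions import Normal


def log_weights_batched(mu, y, num_particles):
    """Return a (K,) tensor of log p(z_k, y) - log q(z_k) in one pass."""
    q = Normal(mu, 1.0)
    z = q.rsample((num_particles,))  # particles on the leading axis, shape (K,)
    log_joint = Normal(0.0, 1.0).log_prob(z) + Normal(z, 1.0).log_prob(y)
    return log_joint - q.log_prob(z)


def log_weights_looped(mu, y, num_particles):
    """Same quantity, but K separate forward passes (shown for contrast)."""
    return torch.stack(
        [log_weights_batched(mu, y, 1)[0] for _ in range(num_particles)]
    )
```

Both functions return a tensor of shape (K,), but the looped version launches every operation K times, which is where the extra per-step cost comes from.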
References
- Standard ELBO: Kingma & Welling 2013, doi:10.48550/arXiv.1312.6114.
- IWAE: Burda, Grosse & Salakhutdinov 2016, doi:10.48550/arXiv.1509.00519.
- Rényi divergence VI: Li & Turner 2016, doi:10.48550/arXiv.1602.02311.
- VR-IWAE: Daudel, Douc & Roueff 2023, doi:10.48550/arXiv.2210.06226.
Objective
Objective(estimator: GradientEstimator | None = None)
Bases: Module, ABC
Base class for variational objectives.
Subclasses implement forward to return a scalar loss
(negated objective). The estimator attribute is the
gradient-estimation strategy applied to the per-particle
log-densities.
Source code in src/quivers/inference/objectives.py
ELBO
ELBO(num_particles: int = 1, estimator: GradientEstimator | None = None)
Bases: Objective
Evidence lower bound objective.
.. math::

   \mathcal{L}_{\mathrm{ELBO}}
   = \mathbb{E}_{q_\phi(z)} \bigl[ \log p(z, y) - \log q_\phi(z) \bigr].

Returns the negated ELBO so that the output of Objective.forward can be
passed directly to a minimizer. With num_particles > 1 the loss averages
that many independent Monte Carlo estimates; num_particles == 1 is the
standard single-sample ELBO, which uses reparameterization-trick
gradients under the default estimator.
| PARAMETER | DESCRIPTION |
|---|---|
| num_particles | Number of independent guide samples per step. TYPE: int, DEFAULT: 1 |
| estimator | Gradient-estimator strategy. TYPE: GradientEstimator \| None, DEFAULT: None |
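As a worked instance of the formula, the sketch below estimates the negated ELBO for a conjugate toy model where the evidence is known exactly. The model (z ~ N(0, 1), y | z ~ N(z, 1)), the Gaussian guide, and the function names are assumptions for illustration; the Python loop is used only for readability and is not how a vectorized implementation would do it.

```python
import math
import random


def log_normal(x, mean, var):
    """Log-density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def negated_elbo(y, q_mean, q_var, num_particles, seed=0):
    """Monte Carlo estimate of -E_q[log p(z, y) - log q(z)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_particles):
        z = rng.gauss(q_mean, math.sqrt(q_var))  # draw from the guide
        log_joint = log_normal(z, 0.0, 1.0) + log_normal(y, z, 1.0)
        total += log_joint - log_normal(z, q_mean, q_var)
    return -total / num_particles
```

For this model the exact posterior is N(y / 2, 1 / 2) and the evidence is N(y; 0, 2). When the guide equals the posterior, every log-weight equals log p(y), so the estimate is exact with any number of particles; any other guide gives a strictly larger loss in expectation.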
Source code in src/quivers/inference/objectives.py
IWAEBound
IWAEBound(num_particles: int = 8, estimator: GradientEstimator | None = None)
Bases: Objective
Importance-weighted bound (Burda-Grosse-Salakhutdinov 2016).
.. math::

   \mathcal{L}_{\mathrm{IWAE}}
   = \mathbb{E}\Bigl[\log \frac{1}{K} \sum_{k=1}^{K}
   \frac{p(z_k, y)}{q_\phi(z_k)}\Bigr],

a tighter lower bound on :math:`\log p(y)` than the ELBO, where
:math:`K` is num_particles and the :math:`z_k` are independent draws
from :math:`q_\phi`. It approaches the log marginal likelihood as
:math:`K \to \infty`.
The default estimator is DoublyReparameterized
because the naive reparameterized gradient's signal-to-noise
ratio for the inference network shrinks as :math:`K` grows
(Tucker-Lawson-Gu-Maddison 2019).
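Given a batch of per-particle log-weights log p(z_k, y) - log q(z_k), the quantity inside the expectation is a log-mean-exp. A minimal standard-library sketch follows; the helper names are hypothetical and not part of quivers.

```python
import math


def log_mean_exp(values):
    """Numerically stable log( (1/K) * sum_k exp(values[k]) )."""
    m = max(values)  # subtract the max so exp() cannot overflow
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def iwae_estimate(log_w):
    """One-batch IWAE estimate from K log-weights."""
    return log_mean_exp(log_w)


def elbo_estimate(log_w):
    """ELBO estimate from the same K log-weights (plain average)."""
    return sum(log_w) / len(log_w)
```

By Jensen's inequality the IWAE estimate is never below the ELBO estimate on the same log-weights, and the two coincide when K = 1.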
Source code in src/quivers/inference/objectives.py
RenyiBound
RenyiBound(alpha: float = 0.5, num_particles: int = 8, estimator: GradientEstimator | None = None)
Bases: Objective
Rényi α-divergence variational bound (Li-Turner 2016).
.. math::

   \mathcal{L}_\alpha = \frac{1}{1 - \alpha}
   \log \mathbb{E}_q\Bigl[ \bigl(p(z, y) / q_\phi(z)\bigr)^{1-\alpha}\Bigr].

Recovers the ELBO in the limit :math:`\alpha \to 1`. At
:math:`\alpha = 0` the exact bound equals :math:`\log p(y)` when the
guide covers the posterior's support, and its K-sample Monte Carlo
estimate is the IWAE bound. For :math:`\alpha < 0` the exact bound is an
upper bound on :math:`\log p(y)` and the fitted guide tends to be
mass-covering, which can help when the variational family is too narrow.
| PARAMETER | DESCRIPTION |
|---|---|
| alpha | Divergence order. TYPE: float, DEFAULT: 0.5 |
| num_particles | Number of guide samples per step. TYPE: int, DEFAULT: 8 |
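The effect of the divergence order can be seen by evaluating the K-sample estimate of the bound on a fixed batch of log-weights. This is a standard-library sketch with hypothetical function names, not the quivers implementation; alpha = 1 is handled as the limiting case (the plain average).

```python
import math


def log_mean_exp(values):
    """Numerically stable log( (1/K) * sum_k exp(values[k]) )."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def renyi_estimate(log_w, alpha):
    """K-sample estimate (1/(1-alpha)) * log mean_k w_k^(1-alpha)."""
    if alpha == 1.0:
        # Limit alpha -> 1: the estimate reduces to the average log-weight.
        return sum(log_w) / len(log_w)
    scaled = [(1.0 - alpha) * v for v in log_w]
    return log_mean_exp(scaled) / (1.0 - alpha)
```

On any fixed batch the estimate is non-increasing in alpha, so lowering alpha raises the value. Note that the upper-bound property for alpha < 0 holds for the exact expectation; a finite-sample estimate carries no such guarantee.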
Source code in src/quivers/inference/objectives.py
VRIWAEBound
VRIWAEBound(alpha: float = 0.0, num_particles: int = 8, estimator: GradientEstimator | None = None)
Bases: Objective
Variational Rényi-IWAE bound (Daudel-Douc-Roueff 2023).
Unifies ELBO, IWAEBound, and
RenyiBound into a single bound parameterized by
alpha and num_particles:
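.. math::

   \mathcal{L}_{\mathrm{VR\text{-}IWAE}}
   = \frac{1}{1 - \alpha} \,\mathbb{E}\Bigl[\log\,
   \frac{1}{K} \sum_{k=1}^{K}
   \Bigl(\frac{p(z_k, y)}{q_\phi(z_k)}\Bigr)^{1-\alpha}\Bigr].

Special cases:

- alpha = 0, K > 1 → IWAE bound.
- K = 1, any alpha != 1 → ELBO, because the exponent 1 - alpha cancels against the 1 / (1 - alpha) prefactor.
- alpha → 1, any K → ELBO (as a limit).
- alpha != 0, K > 1 → the expected K-sample estimate of the Rényi α bound, which it approaches as K → ∞.

For alpha in (0, 1) and fixed num_particles the bound lies between
the ELBO (alpha → 1, loosest) and the IWAE bound (alpha = 0, tightest).
The per-step cost is set by num_particles, not by alpha.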
Source code in src/quivers/inference/objectives.py
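How the VR-IWAE formula behaves at its corner cases can be checked numerically on a fixed batch of log-weights. The sketch below uses only the standard library; the function name is hypothetical and the code is not taken from quivers.

```python
import math


def vr_iwae_estimate(log_w, alpha):
    """One-batch estimate (1/(1-alpha)) * log( (1/K) * sum_k w_k^(1-alpha) ).

    log_w holds the K log-weights log p(z_k, y) - log q(z_k).
    alpha = 1 is treated as the limiting case (the average log-weight).
    """
    if alpha == 1.0:
        return sum(log_w) / len(log_w)
    scaled = [(1.0 - alpha) * v for v in log_w]
    m = max(scaled)  # stabilize the log-sum-exp
    log_mean = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return log_mean / (1.0 - alpha)
```

With a single particle the estimate equals the log-weight itself for every alpha, so it is the single-sample ELBO estimate. With alpha = 0 it is the IWAE log-mean-exp, and for alpha between 0 and 1 it falls between the ELBO and IWAE estimates computed on the same batch.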