Objectives

Variational objectives: ELBO, IWAEBound, RenyiBound, VRIWAEBound, paired with the gradient-estimator strategies in Estimators.

objectives

Variational objectives.

An Objective is a callable torch.nn.Module that, given (model, guide, x, observations), returns a scalar loss; calling backward() on that loss populates gradients for the model and guide parameters. ELBO is the most common objective. The multi-sample bounds (IWAEBound, RenyiBound, VRIWAEBound) spend more compute per step in exchange for a tighter bound.

Every objective accepts a quivers.inference.estimators.GradientEstimator strategy that decides how the per-particle log-density tensors are turned into a scalar loss whose gradient is the chosen estimator. The default is quivers.inference.estimators.Reparameterized, except for IWAEBound, which defaults to DoublyReparameterized.

Particles occupy a leading tensor axis throughout, so there is no Python loop over the Monte Carlo dimension. This matters for performance when num_particles is moderate (8–64): a Python loop would multiply the per-step cost by roughly K, where K = num_particles.
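The leading-axis convention can be illustrated with a small sketch. It assumes a hypothetical tensor log_w of per-particle log-weights with shape (K, batch); the tensor layout inside quivers is not shown on this page, and both function names are invented for the comparison.

```python
import math

import torch


def log_mean_exp_vectorized(log_w: torch.Tensor) -> torch.Tensor:
    """Reduce the leading particle axis (size K) in a single call."""
    k = log_w.shape[0]
    return torch.logsumexp(log_w, dim=0) - math.log(k)


def log_mean_exp_looped(log_w: torch.Tensor) -> torch.Tensor:
    """Same reduction with an explicit Python loop over particles."""
    k = log_w.shape[0]
    m = log_w.max(dim=0).values  # per-column maximum, used for stability
    acc = torch.zeros_like(m)
    for i in range(k):  # K separate small ops instead of one fused op
        acc = acc + torch.exp(log_w[i] - m)
    return m + torch.log(acc / k)


torch.manual_seed(0)
log_w = torch.randn(16, 5)  # hypothetical: K = 16 particles, batch of 5
vectorized = log_mean_exp_vectorized(log_w)
looped = log_mean_exp_looped(log_w)
```

Both functions return one value per batch element; the vectorized form issues a constant number of tensor operations regardless of K.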

References

  • Standard ELBO: Kingma & Welling 2013, doi:10.48550/arXiv.1312.6114.
  • IWAE: Burda, Grosse & Salakhutdinov 2016, doi:10.48550/arXiv.1509.00519.
  • Rényi divergence VI: Li & Turner 2016, doi:10.48550/arXiv.1602.02311.
  • VR-IWAE: Daudel, Douc & Roueff 2023, doi:10.48550/arXiv.2210.06226.

Objective

Objective(estimator: GradientEstimator | None = None)

Bases: Module, ABC

Base class for variational objectives.

Subclasses implement forward to return a scalar loss (negated objective). The estimator attribute is the gradient-estimation strategy applied to the per-particle log-densities.

Source code in src/quivers/inference/objectives.py
def __init__(self, estimator: GradientEstimator | None = None) -> None:
    super().__init__()
    self.estimator = estimator if estimator is not None else Reparameterized()

ELBO

ELBO(num_particles: int = 1, estimator: GradientEstimator | None = None)

Bases: Objective

Evidence lower bound objective.

.. math::

    \mathcal{L}_{\mathrm{ELBO}}
        = \mathbb{E}_{q_\phi(z)} \bigl[ \log p(z, y) - \log q_\phi(z) \bigr].

forward returns the negated ELBO so that the result can be passed directly to a minimizer. The estimate is averaged over num_particles independent guide samples; with the default Reparameterized estimator, num_particles == 1 is the standard single-sample reparameterization-trick ELBO.

PARAMETER DESCRIPTION
num_particles

Number of independent guide samples per step. Default 1.

TYPE: int DEFAULT: 1

estimator

Gradient-estimator strategy. None (the default) selects Reparameterized.

TYPE: GradientEstimator | None DEFAULT: None
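As a rough illustration of what the negated ELBO estimates, the sketch below uses only the standard library and a toy conjugate model chosen for this example: z ~ N(0, 1), y | z ~ N(z, 1), with a Gaussian guide N(mu, sigma^2). None of these names come from quivers. The particle loop is written out for readability only; the library keeps particles on a tensor axis.

```python
import math
import random


def normal_logpdf(x: float, mean: float, std: float) -> float:
    """Log-density of N(mean, std^2) at x."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def negative_elbo(y: float, mu: float, sigma: float,
                  num_particles: int, rng: random.Random) -> float:
    """Monte Carlo estimate of -ELBO for the toy model (hypothetical helper)."""
    total = 0.0
    for _ in range(num_particles):
        z = mu + sigma * rng.gauss(0.0, 1.0)  # reparameterized guide sample
        log_joint = normal_logpdf(z, 0.0, 1.0) + normal_logpdf(y, z, 1.0)
        log_q = normal_logpdf(z, mu, sigma)
        total += log_joint - log_q
    return -total / num_particles


rng = random.Random(0)
# For this model the marginal is y ~ N(0, 2) and the posterior is N(y/2, 1/2).
log_evidence = normal_logpdf(1.0, 0.0, math.sqrt(2.0))
tight = negative_elbo(1.0, 0.5, math.sqrt(0.5), 4, rng)   # guide = posterior
loose = negative_elbo(1.0, 0.0, 1.0, 2000, rng)           # mismatched guide
```

When the guide equals the exact posterior, every particle's log-weight equals log p(y), so the estimate has zero variance; a mismatched guide yields a strictly larger loss on average.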

Source code in src/quivers/inference/objectives.py
def __init__(
    self,
    num_particles: int = 1,
    estimator: GradientEstimator | None = None,
) -> None:
    super().__init__(estimator=estimator)
    if num_particles < 1:
        raise ValueError(f"ELBO: num_particles must be >= 1, got {num_particles}")
    self.num_particles = num_particles

IWAEBound

IWAEBound(num_particles: int = 8, estimator: GradientEstimator | None = None)

Bases: Objective

Importance-weighted bound (Burda-Grosse-Salakhutdinov 2016).

.. math::

    \mathcal{L}_{\mathrm{IWAE}}
        = \mathbb{E}\Bigl[\log \frac{1}{K} \sum_{k=1}^{K}
            \frac{p(z_k, y)}{q_\phi(z_k)}\Bigr],

where the expectation is over :math:`z_1, \dots, z_K` drawn independently from :math:`q_\phi`. This is a tighter lower bound on :math:`\log p(y)` than the ELBO, and it approaches the log marginal likelihood :math:`\log p(y)` as :math:`K \to \infty`.

The default estimator is DoublyReparameterized because the signal-to-noise ratio of the naive reparameterized gradient for the inference network collapses as :math:`K` grows; the doubly reparameterized estimator (Tucker-Lawson-Gu-Maddison 2019) is designed to avoid this.
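The inner term of the IWAE bound is a log-mean-exp over per-particle log-weights log p(z_k, y) - log q(z_k). A minimal standard-library sketch, with made-up log-weight values and a hypothetical helper name, shows the numerically stable form and its relation to the single-sample average:

```python
import math


def log_mean_exp(log_weights: list[float]) -> float:
    """Stable log((1/K) * sum_k exp(log_w_k)) via max subtraction."""
    m = max(log_weights)
    mean_shifted = sum(math.exp(lw - m) for lw in log_weights) / len(log_weights)
    return m + math.log(mean_shifted)


log_w = [-3.2, -1.0, -2.5, -0.7]            # hypothetical log-weights, K = 4
iwae_estimate = log_mean_exp(log_w)          # importance-weighted estimate
elbo_estimate = sum(log_w) / len(log_w)      # plain average of log-weights
```

By Jensen's inequality the log-mean-exp is never smaller than the plain average on the same particles, and with a single particle the two coincide.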

Source code in src/quivers/inference/objectives.py
def __init__(
    self,
    num_particles: int = 8,
    estimator: GradientEstimator | None = None,
) -> None:
    if estimator is None:
        estimator = DoublyReparameterized()
    super().__init__(estimator=estimator)
    if num_particles < 1:
        raise ValueError(
            f"IWAEBound: num_particles must be >= 1, got {num_particles}"
        )
    self.num_particles = num_particles

RenyiBound

RenyiBound(alpha: float = 0.5, num_particles: int = 8, estimator: GradientEstimator | None = None)

Bases: Objective

Rényi α-divergence variational bound (Li-Turner 2016).

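.. math::

    \mathcal{L}_\alpha = \frac{1}{1 - \alpha}
        \log \mathbb{E}_{q_\phi(z)}\Bigl[ \bigl(p(z, y) / q_\phi(z)\bigr)^{1-\alpha}\Bigr].

Recovers the ELBO in the limit :math:`\alpha \to 1`. At :math:`\alpha = 0` the exact bound equals :math:`\log p(y)`, and its K-sample Monte Carlo estimate is the IWAE bound. For :math:`\alpha < 0` the exact bound is an upper bound on :math:`\log p(y)` and the fit tends to be more mass-covering, which can help when the variational family is too narrow. With finite K the Monte Carlo estimate is biased downward, so the upper-bound property is not guaranteed in practice.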
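The limit at alpha = 1 follows from a short L'Hôpital argument. The shorthand w = p(z, y) / q_\phi(z) and t = 1 - \alpha is introduced here for brevity, and the sketch assumes differentiation can be exchanged with the expectation:

.. math::

    \lim_{\alpha \to 1} \mathcal{L}_\alpha
        = \lim_{t \to 0} \frac{\log \mathbb{E}_{q_\phi}[w^{t}]}{t}
        = \lim_{t \to 0} \frac{\mathbb{E}_{q_\phi}[w^{t} \log w]}{\mathbb{E}_{q_\phi}[w^{t}]}
        = \mathbb{E}_{q_\phi}[\log w]
        = \mathcal{L}_{\mathrm{ELBO}}.

Numerator and denominator both vanish at t = 0, which is why the constructor rejects alpha == 1.0 instead of evaluating the expression directly.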

PARAMETER DESCRIPTION
alpha

Divergence order. Must not equal 1 (the constructor raises ValueError); values close to 1 may be numerically unstable.

TYPE: float DEFAULT: 0.5

num_particles

Number of guide samples per step.

TYPE: int DEFAULT: 8

Source code in src/quivers/inference/objectives.py
def __init__(
    self,
    alpha: float = 0.5,
    num_particles: int = 8,
    estimator: GradientEstimator | None = None,
) -> None:
    super().__init__(estimator=estimator)
    if alpha == 1.0:
        raise ValueError(
            "RenyiBound: alpha == 1.0 recovers the ELBO in the "
            "limit but is numerically singular here. Use the "
            "ELBO objective instead."
        )
    if num_particles < 1:
        raise ValueError(
            f"RenyiBound: num_particles must be >= 1, got {num_particles}"
        )
    self.alpha = alpha
    self.num_particles = num_particles

VRIWAEBound

VRIWAEBound(alpha: float = 0.0, num_particles: int = 8, estimator: GradientEstimator | None = None)

Bases: Objective

Variational Rényi-IWAE bound (Daudel-Douc-Roueff 2023).

Unifies ELBO, IWAEBound, and RenyiBound into a single bound parameterized by alpha and num_particles:

.. math::

    \mathcal{L}_{\mathrm{VR\text{-}IWAE}}
        = \frac{1}{1 - \alpha} \,\mathbb{E}\Bigl[\log\,
          \frac{1}{K} \sum_{k=1}^{K}
          \Bigl(\frac{p(z_k, y)}{q_\phi(z_k)}\Bigr)^{1-\alpha}\Bigr].

Special cases:

  • alpha = 0, K > 1 → IWAE bound.
  • K = 1, any alpha != 1 → ELBO, because the exponent 1 - alpha cancels against the 1 / (1 - alpha) prefactor.
  • alpha != 0, K → ∞ → Rényi α-VI bound.

For alpha between 0 and 1 the bound interpolates between the IWAE bound (alpha = 0, tighter) and the ELBO (alpha → 1, looser). The per-step cost is set by num_particles, not by alpha.
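Evaluating the formula above on a fixed set of log-weights shows how alpha and K interact. The sketch below is a plain-Python stand-in, not the quivers implementation; the function name vr_iwae_estimate and the log-weight values are hypothetical.

```python
import math


def vr_iwae_estimate(log_weights: list[float], alpha: float) -> float:
    """(1 / (1 - alpha)) * log mean_k exp((1 - alpha) * log_w_k), stabilized."""
    if alpha == 1.0:
        raise ValueError("alpha == 1.0 is singular; use the plain average instead")
    scaled = [(1.0 - alpha) * lw for lw in log_weights]
    m = max(scaled)  # subtract the maximum before exponentiating
    log_mean = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return log_mean / (1.0 - alpha)


log_w = [-3.2, -1.0, -2.5, -0.7]  # hypothetical per-particle log-weights

# alpha = 0 reduces to the IWAE log-mean-exp.
at_zero = vr_iwae_estimate(log_w, 0.0)
plain_log_mean_exp = math.log(sum(math.exp(lw) for lw in log_w) / len(log_w))

# With a single particle the estimate is that particle's log-weight.
single = vr_iwae_estimate([-1.3], 0.5)

# Close to alpha = 1 the estimate approaches the plain average of log-weights.
near_one = vr_iwae_estimate(log_w, 0.9999)
average = sum(log_w) / len(log_w)

# On fixed particles the estimate does not increase as alpha grows.
sweep = [vr_iwae_estimate(log_w, a) for a in (-1.0, 0.0, 0.5, 0.9)]
```

On a fixed set of particles the estimate is a log power mean of the weights with exponent 1 - alpha, which is why it decreases monotonically from the alpha < 0 regime, through the IWAE value at alpha = 0, toward the ELBO-style average as alpha approaches 1.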

Source code in src/quivers/inference/objectives.py
def __init__(
    self,
    alpha: float = 0.0,
    num_particles: int = 8,
    estimator: GradientEstimator | None = None,
) -> None:
    super().__init__(estimator=estimator)
    if alpha == 1.0:
        raise ValueError("VRIWAEBound: alpha == 1.0 is singular. Use ELBO instead.")
    if num_particles < 1:
        raise ValueError(
            f"VRIWAEBound: num_particles must be >= 1, got {num_particles}"
        )
    self.alpha = alpha
    self.num_particles = num_particles