`quivers.diagnostics`¶

Adapter to the ArviZ Bayesian-analysis ecosystem. Converts quivers' inference records into xarray.DataTree objects consumable by every ArviZ 1.x function (plot_trace, plot_forest, plot_ppc, loo, compare, hdi, ...).

diagnostics ¶

Adapter to the ArviZ Bayesian-analysis ecosystem.

Quivers' inference layer produces quivers.inference.MCMCResult records and quivers.inference.guides.base.Guide fits. The canonical 2026 Bayesian-analysis library is ArviZ, whose xarray.DataTree-based data model is the lingua franca for posterior-summary, diagnostic, posterior-predictive, model-comparison, and calibration tooling.

The functions in this subpackage convert quivers fits into ArviZ DataTrees and wrap the canonical ArviZ entry points (arviz.compare, arviz.loo, arviz.hdi, arviz.plot_ppc) with shape-aware, quivers-typed signatures. No posterior-analysis primitives are reimplemented here; ArviZ owns the analytics.

to_datatree ¶

to_datatree(posterior: MCMCResult, *, observed_data: Mapping[str, Tensor] | None = None, posterior_predictive: Mapping[str, Tensor] | None = None, log_likelihood: Mapping[str, Tensor] | None = None, constant_data: Mapping[str, Tensor] | None = None, coords: Mapping[str, list] | None = None, dims: Mapping[str, list[str]] | None = None) -> DataTree

Convert an MCMCResult into an ArviZ-style DataTree.

PARAMETER	DESCRIPTION
`posterior`	Sampler output. `posterior.samples[name]` carries an `(num_chains, num_samples, site_shape)` tensor; each becomes a posterior variable. `log_densities` populates `sample_stats/lp`; `acceptance_rates` and `divergence_counts` populate `sample_stats`. TYPE:* `MCMCResult`
`observed_data`	Site name to observed tensor (the original data used at fit time). Becomes the `observed_data` group. TYPE: `Mapping[str, Tensor]` DEFAULT: `None`
`posterior_predictive`	Site name to posterior-predictive draws of shape `(num_chains, num_samples, site_shape)`. Becomes the `posterior_predictive` group. TYPE:* `Mapping[str, Tensor]` DEFAULT: `None`
`log_likelihood`	Site name to per-observation log-likelihood of shape `(num_chains, num_samples, obs_shape)`. Becomes the `log_likelihood` group; required for `arviz.loo` / `arviz.waic`. TYPE:* `Mapping[str, Tensor]` DEFAULT: `None`
`constant_data`	Site name to fixed covariate tensor (e.g. design matrix). Becomes the `constant_data` group. TYPE: `Mapping[str, Tensor]` DEFAULT: `None`
`coords`	Coordinate values per named axis (e.g. `{"Verb": ["eat", "drink", "run"]}`). Forwarded to ArviZ. TYPE: `Mapping[str, list]` DEFAULT: `None`
`dims`	Per-site axis names (e.g. `{"beta": ["Verb"]}`). Forwarded to ArviZ. TYPE: `Mapping[str, list[str]]` DEFAULT: `None`

RETURNS	DESCRIPTION
`DataTree`	Canonical ArviZ DataTree consumable by every plotting and diagnostic function in the ArviZ 1.x API.

Source code in src/quivers/diagnostics/arviz_io.py

def to_datatree(
    posterior: MCMCResult,
    *,
    observed_data: Mapping[str, torch.Tensor] | None = None,
    posterior_predictive: Mapping[str, torch.Tensor] | None = None,
    log_likelihood: Mapping[str, torch.Tensor] | None = None,
    constant_data: Mapping[str, torch.Tensor] | None = None,
    coords: Mapping[str, list] | None = None,
    dims: Mapping[str, list[str]] | None = None,
) -> xr.DataTree:
    """Convert an `MCMCResult` into an ArviZ-style DataTree.

    Parameters
    ----------
    posterior : MCMCResult
        Sampler output.  ``posterior.samples[name]`` carries an
        ``(num_chains, num_samples, *site_shape)`` tensor; each
        becomes a posterior variable.  ``log_densities`` populates
        ``sample_stats/lp``; ``acceptance_rates`` and
        ``divergence_counts`` populate ``sample_stats``.
    observed_data : Mapping[str, torch.Tensor], optional
        Site name to observed tensor (the original data used at
        fit time).  Becomes the ``observed_data`` group.
    posterior_predictive : Mapping[str, torch.Tensor], optional
        Site name to posterior-predictive draws of shape
        ``(num_chains, num_samples, *site_shape)``.  Becomes the
        ``posterior_predictive`` group.
    log_likelihood : Mapping[str, torch.Tensor], optional
        Site name to per-observation log-likelihood of shape
        ``(num_chains, num_samples, *obs_shape)``.  Becomes the
        ``log_likelihood`` group; required for
        `arviz.loo` / `arviz.waic`.
    constant_data : Mapping[str, torch.Tensor], optional
        Site name to fixed covariate tensor (e.g. design matrix).
        Becomes the ``constant_data`` group.
    coords : Mapping[str, list], optional
        Coordinate values per named axis (e.g.
        ``{"Verb": ["eat", "drink", "run"]}``).  Forwarded to ArviZ.
    dims : Mapping[str, list[str]], optional
        Per-site axis names (e.g. ``{"beta": ["Verb"]}``).
        Forwarded to ArviZ.

    Returns
    -------
    xr.DataTree
        Canonical ArviZ DataTree consumable by every plotting and
        diagnostic function in the ArviZ 1.x API.
    """
    data: dict[str, dict] = {}

    posterior_group: dict = {}
    for name, t in posterior.samples.items():
        posterior_group[name] = _tensor_to_numpy(t)
    data["posterior"] = posterior_group

    sample_stats_group: dict = {
        "lp": _tensor_to_numpy(posterior.log_densities),
    }
    # acceptance_rate and diverging are per-chain scalars in
    # quivers; ArviZ expects per-draw arrays.  We broadcast.
    n_chains = posterior.num_chains
    n_samples = posterior.num_samples
    sample_stats_group["mean_acceptance_per_chain"] = (
        posterior.acceptance_rates.detach()
        .cpu()
        .numpy()
        .reshape(n_chains, 1)
        .repeat(n_samples, axis=1)
    )
    sample_stats_group["total_divergences_per_chain"] = (
        posterior.divergence_counts.detach()
        .cpu()
        .numpy()
        .reshape(n_chains, 1)
        .repeat(n_samples, axis=1)
    )
    data["sample_stats"] = sample_stats_group

    if observed_data:
        data["observed_data"] = {
            name: _tensor_to_numpy(t) for name, t in observed_data.items()
        }
    if posterior_predictive:
        data["posterior_predictive"] = {
            name: _tensor_to_numpy(t) for name, t in posterior_predictive.items()
        }
    if log_likelihood:
        data["log_likelihood"] = {
            name: _tensor_to_numpy(t) for name, t in log_likelihood.items()
        }
    if constant_data:
        data["constant_data"] = {
            name: _tensor_to_numpy(t) for name, t in constant_data.items()
        }

    return az.from_dict(
        data,
        coords=dict(coords) if coords is not None else None,
        dims=dict(dims) if dims is not None else None,
    )

to_datatree_any ¶

to_datatree_any(fit, *, observed_data: Mapping[str, Tensor] | None = None, posterior_predictive: Mapping[str, Tensor] | None = None, log_likelihood: Mapping[str, Tensor] | None = None, constant_data: Mapping[str, Tensor] | None = None, coords: Mapping[str, list] | None = None, dims: Mapping[str, list[str]] | None = None) -> DataTree

Dispatch a DataTree conversion on the fit type.

Accepts:

MCMCResult — routes to the original to_datatree chain / draw semantics.
A plain dict of str -> Tensor (SVI / Predictive draws) — routes to to_datatree_from_svi with a single pseudo-chain.
A tuple (samples, log_densities) — same as the dict form but populates sample_stats/lp from the second entry.

Consumers with a custom fit-container type can add a case here or call the specific to_datatree / to_datatree_from_svi entry point directly. The generic dispatch closes issue #43 by giving SVI users a one-line ArviZ export path.

Source code in src/quivers/diagnostics/arviz_io.py

def to_datatree_any(
    fit,
    *,
    observed_data: Mapping[str, torch.Tensor] | None = None,
    posterior_predictive: Mapping[str, torch.Tensor] | None = None,
    log_likelihood: Mapping[str, torch.Tensor] | None = None,
    constant_data: Mapping[str, torch.Tensor] | None = None,
    coords: Mapping[str, list] | None = None,
    dims: Mapping[str, list[str]] | None = None,
) -> xr.DataTree:
    """Dispatch a DataTree conversion on the fit type.

    Accepts:

    * [`MCMCResult`][quivers.inference.MCMCResult] — routes to the
      original `to_datatree` chain / draw semantics.
    * A plain dict of `str -> Tensor` (SVI / Predictive draws) —
      routes to `to_datatree_from_svi` with a single pseudo-chain.
    * A tuple `(samples, log_densities)` — same as the dict form
      but populates `sample_stats/lp` from the second entry.

    Consumers with a custom fit-container type can add a case here
    or call the specific `to_datatree` / `to_datatree_from_svi`
    entry point directly. The generic dispatch closes issue #43 by
    giving SVI users a one-line ArviZ export path.
    """
    if isinstance(fit, MCMCResult):
        return to_datatree(
            fit,
            observed_data=observed_data,
            posterior_predictive=posterior_predictive,
            log_likelihood=log_likelihood,
            constant_data=constant_data,
            coords=coords,
            dims=dims,
        )
    if isinstance(fit, tuple) and len(fit) == 2:
        samples, log_densities = fit
        return to_datatree_from_svi(
            samples,
            log_densities=log_densities,
            observed_data=observed_data,
            posterior_predictive=posterior_predictive,
            log_likelihood=log_likelihood,
            constant_data=constant_data,
            coords=coords,
            dims=dims,
        )
    if isinstance(fit, Mapping):
        return to_datatree_from_svi(
            fit,
            observed_data=observed_data,
            posterior_predictive=posterior_predictive,
            log_likelihood=log_likelihood,
            constant_data=constant_data,
            coords=coords,
            dims=dims,
        )
    raise TypeError(
        f"to_datatree_any: cannot dispatch on fit type "
        f"{type(fit).__name__}; supported: MCMCResult, "
        "Mapping[str, Tensor] (SVI / Predictive), "
        "tuple[Mapping, Tensor] (samples + log-densities)."
    )

to_datatree_from_svi ¶

to_datatree_from_svi(samples: Mapping[str, Tensor], *, log_densities: Tensor | None = None, observed_data: Mapping[str, Tensor] | None = None, posterior_predictive: Mapping[str, Tensor] | None = None, log_likelihood: Mapping[str, Tensor] | None = None, constant_data: Mapping[str, Tensor] | None = None, coords: Mapping[str, list] | None = None, dims: Mapping[str, list[str]] | None = None) -> DataTree

Convert a set of SVI or Predictive draws into an ArviZ-style DataTree with a single pseudo-chain.

Each entry in samples is a (num_samples, *site_shape) tensor (the standard shape produced by Predictive(model, guide, num_samples=...)). A leading chain dimension of size 1 is inserted so the resulting DataTree matches the (chain, draw, *event) shape ArviZ expects. log_densities, when supplied, becomes sample_stats/lp; ArviZ diagnostic functions (loo, waic, plot_trace) work uniformly on the resulting object.

Source code in src/quivers/diagnostics/arviz_io.py

def to_datatree_from_svi(
    samples: Mapping[str, torch.Tensor],
    *,
    log_densities: torch.Tensor | None = None,
    observed_data: Mapping[str, torch.Tensor] | None = None,
    posterior_predictive: Mapping[str, torch.Tensor] | None = None,
    log_likelihood: Mapping[str, torch.Tensor] | None = None,
    constant_data: Mapping[str, torch.Tensor] | None = None,
    coords: Mapping[str, list] | None = None,
    dims: Mapping[str, list[str]] | None = None,
) -> xr.DataTree:
    """Convert a set of SVI or Predictive draws into an ArviZ-style
    DataTree with a single pseudo-chain.

    Each entry in `samples` is a `(num_samples, *site_shape)` tensor
    (the standard shape produced by `Predictive(model, guide,
    num_samples=...)`). A leading chain dimension of size 1 is
    inserted so the resulting DataTree matches the `(chain, draw,
    *event)` shape ArviZ expects. `log_densities`, when supplied,
    becomes `sample_stats/lp`; ArviZ diagnostic functions (`loo`,
    `waic`, `plot_trace`) work uniformly on the resulting object.
    """
    data: dict[str, dict] = {}

    posterior_group: dict = {}
    for name, t in samples.items():
        arr = _tensor_to_numpy(t)
        if arr.ndim == 0:
            arr = arr.reshape(1)
        posterior_group[name] = arr[None, ...]
    data["posterior"] = posterior_group

    if log_densities is not None:
        lp = _tensor_to_numpy(log_densities)
        if lp.ndim == 0:
            lp = lp.reshape(1)
        data["sample_stats"] = {"lp": lp[None, ...]}

    if observed_data:
        data["observed_data"] = {
            name: _tensor_to_numpy(t) for name, t in observed_data.items()
        }
    if posterior_predictive:
        data["posterior_predictive"] = {
            name: _tensor_to_numpy(t)[None, ...]
            for name, t in posterior_predictive.items()
        }
    if log_likelihood:
        data["log_likelihood"] = {
            name: _tensor_to_numpy(t)[None, ...] for name, t in log_likelihood.items()
        }
    if constant_data:
        data["constant_data"] = {
            name: _tensor_to_numpy(t) for name, t in constant_data.items()
        }

    return az.from_dict(
        data,
        coords=dict(coords) if coords is not None else None,
        dims=dict(dims) if dims is not None else None,
    )

compare ¶

compare(fits: Mapping[str, DataTree], *, method: Literal['stacking', 'BB-pseudo-BMA', 'pseudo-BMA'] = 'stacking', var_name: str | None = None, reference: str | None = None) -> object

Rank candidate models by expected log predictive density.

Delegates to arviz.compare, which computes PSIS-LOO via arviz.loo on each fit's log_likelihood group and combines the resulting arviz.stats.ELPDData records into a ranked comparison table.

PARAMETER	DESCRIPTION
`fits`	Per-model fit, each a DataTree produced by `quivers.diagnostics.to_datatree`. Every fit must carry a `log_likelihood` group; without it `arviz.loo` cannot compute elpd. TYPE: `Mapping[str, DataTree]`
`method`	Stacking weight estimator. Default `"stacking"` follows Yao, Vehtari, Simpson, Gelman 2018. TYPE: `"stacking", "BB-pseudo-BMA", or "pseudo-BMA"` DEFAULT: `'stacking'`
`var_name`	Name of the observed variable in `log_likelihood` to compare on; required when a fit's `log_likelihood` group carries multiple variables. TYPE: `str` DEFAULT: `None`
`reference`	Fit name to use as the reference for elpd-difference comparisons. Default is the top-ranked model. TYPE: `str` DEFAULT: `None`

RETURNS	DESCRIPTION
`DataFrame`	ArviZ ranking table with columns `rank, elpd_loo, p_loo, se, weight, ...` and one row per model.

Source code in src/quivers/diagnostics/comparison.py

def compare(
    fits: Mapping[str, xr.DataTree],
    *,
    method: Literal["stacking", "BB-pseudo-BMA", "pseudo-BMA"] = "stacking",
    var_name: str | None = None,
    reference: str | None = None,
) -> object:
    """Rank candidate models by expected log predictive density.

    Delegates to `arviz.compare`, which computes PSIS-LOO
    via `arviz.loo` on each fit's ``log_likelihood`` group
    and combines the resulting `arviz.stats.ELPDData` records
    into a ranked comparison table.

    Parameters
    ----------
    fits : Mapping[str, xr.DataTree]
        Per-model fit, each a DataTree produced by
        [`quivers.diagnostics.to_datatree`][quivers.diagnostics.to_datatree].  Every fit must
        carry a ``log_likelihood`` group; without it
        `arviz.loo` cannot compute elpd.
    method : "stacking", "BB-pseudo-BMA", or "pseudo-BMA"
        Stacking weight estimator.  Default ``"stacking"`` follows
        [Yao, Vehtari, Simpson, Gelman 2018](https://doi.org/10.1214/17-BA1091).
    var_name : str, optional
        Name of the observed variable in ``log_likelihood`` to
        compare on; required when a fit's ``log_likelihood`` group
        carries multiple variables.
    reference : str, optional
        Fit name to use as the reference for elpd-difference
        comparisons.  Default is the top-ranked model.

    Returns
    -------
    pandas.DataFrame
        ArviZ ranking table with columns ``rank, elpd_loo, p_loo,
        se, weight, ...`` and one row per model.
    """
    return az.compare(
        dict(fits),
        method=method,
        var_name=var_name,
        reference=reference,
    )

posterior_predictive_check ¶

posterior_predictive_check(idata: DataTree, *, observed_name: str, statistic: str | Callable[[ndarray], float] = 'mean', by: str | None = None) -> Mapping[str, float | ndarray | str]

Compute a posterior-predictive p-value (PPP-value) for a user-chosen test statistic.

PARAMETER	DESCRIPTION
`idata`	Fit produced by `quivers.diagnostics.to_datatree`, with both `observed_data` and `posterior_predictive` groups populated. TYPE: `DataTree`
`observed_name`	Name of the observed site (must appear in both groups). TYPE: `str`
`statistic`	Either a key into `STATISTICS` or a user-supplied `Callable[[np.ndarray], float]`. TYPE: `str or callable` DEFAULT: `'mean'`
`by`	If given, computes the statistic per group along the named dim (e.g. `by="Verb"`); the PPP-value becomes a numpy array of shape `(\|by\|,)`. TYPE: `str` DEFAULT: `None`

RETURNS	DESCRIPTION
`Mapping[str, float or ndarray]`	`{"observed": T(y), "predictive_mean": E[T(y_rep)], "ppp": Pr(T(y_rep) >= T(y))}`. The PPP-value is the canonical posterior-predictive p-value; values near 0 or 1 indicate model mis-fit on the chosen statistic.

Source code in src/quivers/diagnostics/predictive_checks.py

def posterior_predictive_check(
    idata: xr.DataTree,
    *,
    observed_name: str,
    statistic: str | Callable[[np.ndarray], float] = "mean",
    by: str | None = None,
) -> Mapping[str, float | np.ndarray | str]:
    """Compute a posterior-predictive p-value (PPP-value) for a
    user-chosen test statistic.

    Parameters
    ----------
    idata : xr.DataTree
        Fit produced by [`quivers.diagnostics.to_datatree`][quivers.diagnostics.to_datatree],
        with both ``observed_data`` and ``posterior_predictive``
        groups populated.
    observed_name : str
        Name of the observed site (must appear in both groups).
    statistic : str or callable
        Either a key into `STATISTICS` or a user-supplied
        ``Callable[[np.ndarray], float]``.
    by : str, optional
        If given, computes the statistic per group along the named
        dim (e.g. ``by="Verb"``); the PPP-value becomes a numpy
        array of shape ``(|by|,)``.

    Returns
    -------
    Mapping[str, float or numpy.ndarray]
        ``{"observed": T(y), "predictive_mean": E[T(y_rep)],
        "ppp": Pr(T(y_rep) >= T(y))}``.  The PPP-value is the
        canonical [posterior-predictive p-value](https://en.wikipedia.org/wiki/Posterior_predictive_p-value);
        values near 0 or 1 indicate model mis-fit on the chosen
        statistic.
    """
    if callable(statistic):
        stat_fn = statistic
        stat_name = getattr(statistic, "__name__", "user_statistic")
    else:
        if statistic not in STATISTICS:
            raise ValueError(
                f"unknown statistic {statistic!r}; choices are "
                f"{sorted(STATISTICS)} or a user-supplied callable"
            )
        stat_fn = STATISTICS[statistic]
        stat_name = statistic

    observed = idata["observed_data"][observed_name].values
    pp = idata["posterior_predictive"][observed_name].values
    # pp shape: (chain, draw, *site_shape).  Flatten chains for stat
    # computation across all draws.
    chains, draws = pp.shape[0], pp.shape[1]
    pp_flat = pp.reshape(chains * draws, *pp.shape[2:])

    if by is None:
        t_obs = stat_fn(observed)
        t_rep = np.array([stat_fn(pp_flat[i]) for i in range(pp_flat.shape[0])])
        return {
            "statistic": stat_name,
            "observed": t_obs,
            "predictive_mean": float(t_rep.mean()),
            "ppp": float((t_rep >= t_obs).mean()),
        }

    # Per-group statistic.  Locate the `by` dim in the observed array.
    dims = idata["observed_data"][observed_name].dims
    if by not in dims:
        raise ValueError(
            f"posterior_predictive_check: dim {by!r} not found in "
            f"observed_data/{observed_name} dims {dims}"
        )
    group_axis = dims.index(by)
    observed_groups = np.split(observed, observed.shape[group_axis], axis=group_axis)
    pp_axis = group_axis + 2  # +2 for the leading (chain, draw) axes
    pp_groups = np.split(pp_flat, pp_flat.shape[pp_axis - 1], axis=pp_axis - 1)
    t_obs = np.array([stat_fn(g.squeeze(group_axis)) for g in observed_groups])
    t_rep = np.array(
        [
            np.array([stat_fn(g[i].squeeze(pp_axis - 2)) for i in range(g.shape[0])])
            for g in pp_groups
        ]
    )  # (groups, n_draws)
    return {
        "statistic": stat_name,
        "by": by,
        "observed": t_obs,
        "predictive_mean": t_rep.mean(axis=1),
        "ppp": (t_rep >= t_obs[:, None]).mean(axis=1),
    }

quivers.diagnostics¶

diagnostics ¶

to_datatree ¶

to_datatree_any ¶

to_datatree_from_svi ¶

compare ¶

posterior_predictive_check ¶

`quivers.diagnostics`¶