`quivers.diagnostics.comparison`

PSIS-LOO model comparison via arviz.compare.

comparison

Model comparison via arviz.compare.

Thin wrapper around the canonical ArviZ entry point with quivers-typed inputs. No information-criterion math lives here; ArviZ's implementations of PSIS-LOO, WAIC, stacking, and pseudo-BMA+ are the source of truth. See the ArviZ team's Exploratory Analysis of Bayesian Models textbook for the methodology.

compare

compare(fits: Mapping[str, DataTree], *, method: Literal['stacking', 'BB-pseudo-BMA', 'pseudo-BMA'] = 'stacking', var_name: str | None = None, reference: str | None = None) -> object

Rank candidate models by expected log predictive density.

Delegates to arviz.compare, which computes PSIS-LOO via arviz.loo on each fit's log_likelihood group and combines the resulting arviz.stats.ELPDData records into a ranked comparison table.
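The ranking step described above can be sketched in plain Python. This is an illustration only, not ArviZ's or quivers' implementation: the helper name `rank_by_elpd`, the model names, and the pointwise values are all hypothetical, the PSIS-LOO estimation itself is skipped (the pointwise elpd contributions are taken as given), and the sign convention for `elpd_diff` (top-ranked model minus each model) and the use of the population variance for the standard errors are assumptions of this sketch.

```python
import math
from statistics import pvariance

# Hypothetical pointwise elpd contributions, one value per observation.
# In practice these would come from PSIS-LOO, which is not reproduced here.
pointwise = {
    "model_a": [-1.2, -0.8, -1.5, -0.9],
    "model_b": [-1.0, -0.7, -1.1, -0.8],
}


def rank_by_elpd(pointwise):
    """Rank models by summed pointwise elpd, best (largest) first."""
    totals = {name: sum(values) for name, values in pointwise.items()}
    order = sorted(totals, key=totals.get, reverse=True)
    best = order[0]
    rows = []
    for rank, name in enumerate(order):
        values = pointwise[name]
        n = len(values)
        # Pointwise difference against the top-ranked model.
        diffs = [b - v for b, v in zip(pointwise[best], values)]
        rows.append({
            "name": name,
            "rank": rank,
            "elpd": totals[name],
            "elpd_diff": sum(diffs),
            # Standard errors from the spread of the pointwise terms.
            "se": math.sqrt(n * pvariance(values)),
            "dse": math.sqrt(n * pvariance(diffs)),
        })
    return rows


rows = rank_by_elpd(pointwise)
for row in rows:
    print(row)
```

The top-ranked row always has an `elpd_diff` and `dse` of zero in this sketch, because it is compared against itself.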

PARAMETER	DESCRIPTION
`fits`	Mapping from model name to fit. Each fit is a DataTree produced by `quivers.diagnostics.to_datatree` and must carry a `log_likelihood` group; without it `arviz.loo` cannot compute elpd. TYPE: `Mapping[str, DataTree]`
`method`	Model-weight estimator. The default `"stacking"` follows Yao, Vehtari, Simpson, and Gelman (2018). TYPE: `"stacking", "BB-pseudo-BMA", or "pseudo-BMA"` DEFAULT: `'stacking'`
`var_name`	Name of the observed variable in `log_likelihood` to compare on; required when a fit's `log_likelihood` group carries multiple variables. TYPE: `str | None` DEFAULT: `None`
`reference`	Name of the fit to use as the reference for elpd-difference comparisons. Defaults to the top-ranked model. TYPE: `str | None` DEFAULT: `None`
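
To give a feel for what the `method` choice controls, here is a minimal sketch of plain pseudo-BMA weighting: each model's weight is proportional to the exponential of its elpd estimate. The helper name `pseudo_bma_weights` and the elpd values are hypothetical, and this is not ArviZ's code. The `"BB-pseudo-BMA"` variant additionally uses a Bayesian bootstrap, and `"stacking"` solves an optimization problem over the pointwise predictive densities; neither is shown here.

```python
import math


def pseudo_bma_weights(elpd):
    """Return weights proportional to exp(elpd), normalized to sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(elpd.values())
    unnormalized = {name: math.exp(value - top) for name, value in elpd.items()}
    total = sum(unnormalized.values())
    return {name: u / total for name, u in unnormalized.items()}


# Hypothetical elpd estimates for two candidate models.
weights = pseudo_bma_weights({"model_a": -4.4, "model_b": -3.6})
print(weights)
```

The model with the higher elpd receives the larger weight, and the weights depend only on elpd differences between models, not on their absolute values.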

RETURNS	DESCRIPTION
`DataFrame`	ArviZ ranking table with columns `rank, elpd_loo, p_loo, se, weight, ...` and one row per model.

Source code in src/quivers/diagnostics/comparison.py

def compare(
    fits: Mapping[str, xr.DataTree],
    *,
    method: Literal["stacking", "BB-pseudo-BMA", "pseudo-BMA"] = "stacking",
    var_name: str | None = None,
    reference: str | None = None,
) -> object:
    """Rank candidate models by expected log predictive density.

    Delegates to `arviz.compare`, which computes PSIS-LOO
    via `arviz.loo` on each fit's ``log_likelihood`` group
    and combines the resulting `arviz.stats.ELPDData` records
    into a ranked comparison table.

    Parameters
    ----------
    fits : Mapping[str, xr.DataTree]
        Per-model fit, each a DataTree produced by
        [`quivers.diagnostics.to_datatree`][quivers.diagnostics.to_datatree].  Every fit must
        carry a ``log_likelihood`` group; without it
        `arviz.loo` cannot compute elpd.
    method : "stacking", "BB-pseudo-BMA", or "pseudo-BMA"
        Model-weight estimator.  Default ``"stacking"`` follows
        [Yao, Vehtari, Simpson, Gelman 2018](https://doi.org/10.1214/17-BA1091).
    var_name : str, optional
        Name of the observed variable in ``log_likelihood`` to
        compare on; required when a fit's ``log_likelihood`` group
        carries multiple variables.
    reference : str, optional
        Fit name to use as the reference for elpd-difference
        comparisons.  Default is the top-ranked model.

    Returns
    -------
    pandas.DataFrame
        ArviZ ranking table with columns ``rank, elpd_loo, p_loo,
        se, weight, ...`` and one row per model.
    """
    return az.compare(
        dict(fits),
        method=method,
        var_name=var_name,
        reference=reference,
    )
