quivers.diagnostics.comparison

PSIS-LOO model comparison via arviz.compare.

comparison

Model comparison via arviz.compare.

Thin wrapper around the canonical ArviZ entry point with quivers-typed inputs. No information-criterion math lives here; ArviZ's implementations of PSIS-LOO, WAIC, stacking, and pseudo-BMA+ are the source of truth. See the ArviZ team's Exploratory Analysis of Bayesian Models textbook for the methodology.

compare

compare(fits: Mapping[str, DataTree], *, method: Literal['stacking', 'BB-pseudo-BMA', 'pseudo-BMA'] = 'stacking', var_name: str | None = None, reference: str | None = None) -> object

Rank candidate models by expected log predictive density.

Delegates to arviz.compare, which computes PSIS-LOO via arviz.loo on each fit's log_likelihood group and combines the resulting arviz.stats.ELPDData records into a ranked comparison table.
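To make the shape of the ranking step concrete, here is a minimal pure-Python sketch of what "rank by elpd" means. It is not how ArviZ or quivers implements anything: `rank_by_elpd` is a hypothetical helper, the elpd values are made up, the sign convention for `elpd_diff` (reference minus model) is chosen for illustration, and the weights are the plain pseudo-BMA form (proportional to `exp(elpd)`) rather than stacking.

```python
import math


def rank_by_elpd(elpd, reference=None):
    """Rank models by elpd (higher is better).

    Hypothetical illustration only; the real table comes from arviz.compare.
    """
    if not elpd:
        raise ValueError("need at least one model")

    # Sort model names from highest to lowest elpd.
    ordered = sorted(elpd.items(), key=lambda kv: kv[1], reverse=True)

    # Differences are taken against the reference model,
    # which defaults to the top-ranked one.
    ref_name = ordered[0][0] if reference is None else reference
    if ref_name not in elpd:
        raise KeyError(f"unknown reference model: {ref_name!r}")
    ref_value = elpd[ref_name]

    # Plain pseudo-BMA weights: softmax of elpd, shifted by the
    # maximum for numerical stability.
    best = ordered[0][1]
    unnorm = {name: math.exp(value - best) for name, value in ordered}
    total = sum(unnorm.values())

    return [
        {
            "name": name,
            "rank": rank,
            "elpd": value,
            "elpd_diff": ref_value - value,
            "weight": unnorm[name] / total,
        }
        for rank, (name, value) in enumerate(ordered)
    ]


# Made-up elpd values for two candidate models.
table = rank_by_elpd({"pooled": -102.0, "hierarchical": -100.0})
for row in table:
    print(row)
```

The model with the higher elpd takes rank 0 and an `elpd_diff` of zero; the other rows report how far they fall behind the reference.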

PARAMETER DESCRIPTION
fits

Per-model fit, each a DataTree produced by quivers.diagnostics.to_datatree. Every fit must carry a log_likelihood group; without it arviz.loo cannot compute elpd.

TYPE: Mapping[str, DataTree]

method

Method used to estimate model weights. Default "stacking" follows Yao, Vehtari, Simpson, Gelman 2018.

TYPE: "stacking", "BB-pseudo-BMA", or "pseudo-BMA" DEFAULT: 'stacking'

var_name

Name of the observed variable in log_likelihood to compare on; required when a fit's log_likelihood group carries multiple variables.

TYPE: str DEFAULT: None

reference

Fit name to use as the reference for elpd-difference comparisons. Default is the top-ranked model.

TYPE: str DEFAULT: None

RETURNS DESCRIPTION
DataFrame

ArviZ ranking table with columns rank, elpd_loo, p_loo, se, weight, ... and one row per model.
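Because every fit must carry a log_likelihood group, it can be convenient to check for it before calling compare. The sketch below is an assumption-laden illustration, not part of quivers: `missing_log_likelihood` is a hypothetical helper, and it assumes each fit supports `in` for group names the way a mapping does. Plain dicts stand in for DataTree objects so the example runs without xarray installed.

```python
from collections.abc import Mapping


def missing_log_likelihood(fits: Mapping[str, Mapping]) -> list[str]:
    """Return the names of fits that lack a ``log_likelihood`` group.

    Hypothetical helper; assumes ``"log_likelihood" in fit`` reports
    whether the group is present.
    """
    return [name for name, fit in fits.items() if "log_likelihood" not in fit]


# Stand-in fits: plain dicts keyed by group name (not real DataTrees).
fits = {
    "pooled": {"posterior": {}, "log_likelihood": {}},
    "hierarchical": {"posterior": {}},
}

missing = missing_log_likelihood(fits)
if missing:
    print(f"fits without log_likelihood: {missing}")
```

Failing early with the offending fit names tends to be easier to act on than an error raised from inside the comparison call.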

Source code in src/quivers/diagnostics/comparison.py
def compare(
    fits: Mapping[str, xr.DataTree],
    *,
    method: Literal["stacking", "BB-pseudo-BMA", "pseudo-BMA"] = "stacking",
    var_name: str | None = None,
    reference: str | None = None,
) -> object:
    """Rank candidate models by expected log predictive density.

    Delegates to `arviz.compare`, which computes PSIS-LOO
    via `arviz.loo` on each fit's ``log_likelihood`` group
    and combines the resulting `arviz.stats.ELPDData` records
    into a ranked comparison table.

    Parameters
    ----------
    fits : Mapping[str, xr.DataTree]
        Per-model fit, each a DataTree produced by
        [`quivers.diagnostics.to_datatree`][quivers.diagnostics.to_datatree].  Every fit must
        carry a ``log_likelihood`` group; without it
        `arviz.loo` cannot compute elpd.
    method : "stacking", "BB-pseudo-BMA", or "pseudo-BMA"
        Method used to estimate model weights.  Default ``"stacking"`` follows
        [Yao, Vehtari, Simpson, Gelman 2018](https://doi.org/10.1214/17-BA1091).
    var_name : str, optional
        Name of the observed variable in ``log_likelihood`` to
        compare on; required when a fit's ``log_likelihood`` group
        carries multiple variables.
    reference : str, optional
        Fit name to use as the reference for elpd-difference
        comparisons.  Default is the top-ranked model.

    Returns
    -------
    pandas.DataFrame
        ArviZ ranking table with columns ``rank, elpd_loo, p_loo,
        se, weight, ...`` and one row per model.
    """
    return az.compare(
        dict(fits),
        method=method,
        var_name=var_name,
        reference=reference,
    )