bead.behavioral

Behavioral data analytics and extraction utilities for judgment experiment responses.

Analytics

analytics

Behavioral analytics models.

Per-judgment behavioral metrics and participant-level summaries linking slopit behavioral data to bead's item-based experimental structure.

The slopit metric classes (KeystrokeMetrics, FocusMetrics, TimingMetrics, AnalysisFlag) are Pydantic models defined in an upstream package; they appear here as dict[str, JsonValue] payloads preserving the slopit field names. Consumers that need attribute access should re-validate a payload with the corresponding slopit model, e.g. slopit.schemas.KeystrokeMetrics.model_validate(payload).
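The dict-versus-model distinction can be sketched as follows. Since slopit may not be installed, a frozen dataclass stands in for the upstream Pydantic model, and the field names shown (total_keystrokes, mean_iki_ms, backspace_count) are illustrative, not the actual slopit schema:

```python
from dataclasses import dataclass

# Hypothetical keystroke payload as it appears on a JudgmentAnalytics record:
# a plain dict preserving slopit field names (names here are illustrative).
payload = {"total_keystrokes": 112, "mean_iki_ms": 184.5, "backspace_count": 9}

# Without re-validation, only key access works:
assert payload["total_keystrokes"] == 112

# Minimal stand-in for the upstream Pydantic model; with slopit installed
# you would call slopit.schemas.KeystrokeMetrics.model_validate(payload).
@dataclass(frozen=True)
class KeystrokeMetricsStandIn:
    total_keystrokes: int
    mean_iki_ms: float
    backspace_count: int

metrics = KeystrokeMetricsStandIn(**payload)
assert metrics.mean_iki_ms == 184.5  # attribute access now works
```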

AnalysisFlag

Bases: BeadBaseModel

A single behavioral analysis flag.

Attributes:

Name Type Description
type str

Flag identifier.

severity Severity

Severity level.

message str | None

Human-readable description.

metadata dict[str, JsonValue]

Flag-specific metadata.

JudgmentAnalytics

Bases: BeadBaseModel

Behavioral analytics for a single judgment.

Attributes:

Name Type Description
item_id UUID

Item being judged.

participant_id str

Participant identifier.

trial_index int

Position in the session (>= 0).

session_id str

Slopit session identifier.

response_value JsonValue

Participant's response value.

response_time_ms int

Response time in milliseconds.

keystroke_metrics dict[str, JsonValue] | None

Slopit keystroke-derived metrics.

focus_metrics dict[str, JsonValue] | None

Slopit focus / visibility metrics.

timing_metrics dict[str, JsonValue] | None

Slopit timing metrics.

paste_event_count int

Number of paste events.

flags tuple[AnalysisFlag, ...]

Behavioral flags from slopit analyzers.

max_severity Severity | None

Maximum severity across flags.

has_paste_events: bool property

Whether the judgment had any paste events.

is_flagged: bool property

Whether any analysis flags are present.

get_flag_types() -> tuple[str, ...]

Return the flag-type identifiers.
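The derived members above follow directly from the stored fields. A minimal sketch using stand-in dataclasses (the flag type "rapid_paste" is invented for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flag:  # stand-in for AnalysisFlag
    type: str
    severity: str

@dataclass(frozen=True)
class Judgment:  # stand-in for JudgmentAnalytics
    paste_event_count: int = 0
    flags: tuple[Flag, ...] = ()

    @property
    def has_paste_events(self) -> bool:
        # True when at least one paste event was recorded.
        return self.paste_event_count > 0

    @property
    def is_flagged(self) -> bool:
        # True when any analyzer attached a flag.
        return len(self.flags) > 0

    def get_flag_types(self) -> tuple[str, ...]:
        # Flag-type identifiers in flag order.
        return tuple(f.type for f in self.flags)

j = Judgment(paste_event_count=2, flags=(Flag("rapid_paste", "high"),))
assert j.has_paste_events and j.is_flagged
assert j.get_flag_types() == ("rapid_paste",)
```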

ParticipantBehavioralSummary

Bases: BeadBaseModel

Aggregated behavioral metrics for one participant.

Attributes:

Name Type Description
participant_id str

Participant identifier.

session_id str

Slopit session identifier.

total_judgments int

Total judgments analyzed.

flagged_judgments int

Number of judgments with at least one flag.

mean_response_time_ms float

Mean response time in milliseconds.

mean_iki float | None

Mean inter-keystroke interval.

total_keystrokes int

Total keystrokes.

total_paste_events int

Total paste events.

total_blur_events int

Total window-blur events.

total_blur_duration_ms float

Total time spent with the window blurred (ms).

flag_counts dict[str, int]

Flag-type histogram.

max_severity Severity | None

Maximum severity across the participant's flags.

flag_rate: float property

Proportion of judgments that were flagged (0.0-1.0).

has_paste_events: bool property

Whether the participant had any paste events.
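The flag_rate property is the ratio of flagged_judgments to total_judgments; a sketch of that computation (returning 0.0 for an empty session is an assumption, not documented behavior):

```python
def flag_rate(flagged_judgments: int, total_judgments: int) -> float:
    """Proportion of judgments flagged, in [0.0, 1.0]."""
    if total_judgments == 0:
        return 0.0  # assumed behavior when no judgments were analyzed
    return flagged_judgments / total_judgments

assert flag_rate(3, 30) == 0.1
assert flag_rate(0, 0) == 0.0
```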

AnalyticsCollection

Bases: BeadBaseModel

Collection of judgment analytics.

Attributes:

Name Type Description
name str

Collection name.

analytics tuple[JudgmentAnalytics, ...]

Per-judgment records.

__len__() -> int

Return the number of analytics records.

with_analytics(analytics: JudgmentAnalytics) -> Self

Return a new collection with analytics appended.

with_many(analytics_list: tuple[JudgmentAnalytics, ...] | list[JudgmentAnalytics]) -> Self

Return a new collection with each record appended.
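Both methods return a new collection rather than mutating in place. A sketch of that copy-on-write pattern over a tuple field, with ints standing in for JudgmentAnalytics records:

```python
from dataclasses import dataclass, replace
from typing import Iterable

@dataclass(frozen=True)
class Collection:  # stand-in for AnalyticsCollection
    name: str
    analytics: tuple[int, ...] = ()  # ints stand in for records

    def with_analytics(self, record: int) -> "Collection":
        # Append one record, returning a new frozen instance.
        return replace(self, analytics=self.analytics + (record,))

    def with_many(self, records: Iterable[int]) -> "Collection":
        # Append each record, returning a new frozen instance.
        return replace(self, analytics=self.analytics + tuple(records))

c0 = Collection(name="pilot")
c1 = c0.with_many([1, 2, 3])
assert len(c0.analytics) == 0  # original is untouched
assert len(c1.analytics) == 3  # new collection holds the records
```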

get_by_participant(participant_id: str) -> tuple[JudgmentAnalytics, ...]

Return analytics belonging to participant_id.

get_by_item(item_id: UUID) -> tuple[JudgmentAnalytics, ...]

Return analytics for item_id.

filter_flagged(min_severity: Severity | None = None, exclude_flagged: bool = False) -> AnalyticsCollection

Filter analytics by flag status.

Parameters:

Name Type Description Default
min_severity Severity | None

Include only analytics with at least one flag at this severity or higher. Severity order is info < low < medium < high.

None
exclude_flagged bool

If true, include only unflagged analytics.

False

get_participant_ids() -> tuple[str, ...]

Return the unique participant identifiers in the collection.

get_participant_summaries() -> tuple[ParticipantBehavioralSummary, ...]

Generate one ParticipantBehavioralSummary per participant.

to_jsonl(path: Path | str) -> None

Write analytics to path as JSON Lines (one record per line).

from_jsonl(path: Path | str, name: str = 'loaded_analytics') -> AnalyticsCollection classmethod

Load analytics from a JSON Lines file.
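The JSON Lines layout behind this pair is one JSON object per line; a stdlib sketch of the round-trip (the record fields shown are illustrative):

```python
import json
import tempfile
from pathlib import Path

records = [
    {"participant_id": "p1", "item_id": "uuid1", "response_time_ms": 2400},
    {"participant_id": "p1", "item_id": "uuid2", "response_time_ms": 1800},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "analytics.jsonl"
    # Write: serialize each record onto its own line.
    path.write_text("\n".join(json.dumps(r) for r in records) + "\n")
    # Read: parse each non-empty line back into a dict.
    loaded = [json.loads(line) for line in path.read_text().splitlines() if line]

assert loaded == records
```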

to_dataframe(backend: Literal['pandas', 'polars'] = 'pandas', include_metrics: bool = True, include_flags: bool = True) -> DataFrame

Render the collection as a pandas or polars DataFrame.

Extraction

extraction

Behavioral data extraction from slopit sessions.

This module provides functions for extracting per-judgment behavioral analytics from slopit session data, using slopit's IO loaders and analysis pipeline.

extract_from_trial(trial: SlopitTrial, session: SlopitSession, item_id_key: str = 'item_id') -> JudgmentAnalytics | None

Extract behavioral analytics from a single slopit trial.

Parameters:

Name Type Description Default
trial SlopitTrial

Slopit trial data.

required
session SlopitSession

Parent session for participant context.

required
item_id_key str

Key in platform_data containing the item UUID.

'item_id'

Returns:

Type Description
JudgmentAnalytics | None

Analytics record, or None if item_id not found in trial.

extract_from_session(session: SlopitSession, item_id_key: str = 'item_id') -> list[JudgmentAnalytics]

Extract behavioral analytics from all trials in a slopit session.

Parameters:

Name Type Description Default
session SlopitSession

Slopit session containing trial data.

required
item_id_key str

Key in platform_data containing the item UUID.

'item_id'

Returns:

Type Description
list[JudgmentAnalytics]

Analytics records for trials with valid item_id.

extract_from_file(path: Path | str, item_id_key: str = 'item_id') -> list[JudgmentAnalytics]

Extract behavioral analytics from a slopit session file.

Uses slopit's load_session() to detect the file format automatically.

Parameters:

Name Type Description Default
path Path | str

Path to session file (JSON or JATOS format).

required
item_id_key str

Key in platform_data containing the item UUID.

'item_id'

Returns:

Type Description
list[JudgmentAnalytics]

Analytics records from the session.

Examples:

>>> analytics = extract_from_file("data/session_001.json")
>>> len(analytics)
50

extract_from_directory(path: Path | str, pattern: str = '*', item_id_key: str = 'item_id', name: str | None = None) -> AnalyticsCollection

Extract behavioral analytics from all session files in a directory.

Uses slopit's load_sessions() to load all files.

Parameters:

Name Type Description Default
path Path | str

Directory containing session files.

required
pattern str

Glob pattern for file matching (default: "*").

'*'
item_id_key str

Key in platform_data containing the item UUID.

'item_id'
name str | None

Name for the collection. Defaults to directory name.

None

Returns:

Type Description
AnalyticsCollection

Collection of analytics from all sessions.

Examples:

>>> collection = extract_from_directory("data/jatos_export/")
>>> print(f"Extracted {len(collection)} analytics records")

analyze_sessions(sessions: list[SlopitSession], analyzers: list[Analyzer] | None = None) -> list[SlopitSession]

Run slopit behavioral analyzers on sessions.

Uses slopit's AnalysisPipeline to process sessions with the specified analyzers.

Parameters:

Name Type Description Default
sessions list[SlopitSession]

Sessions to analyze.

required
analyzers list[Analyzer] | None

Analyzers to run. If None, uses default set: KeystrokeAnalyzer, FocusAnalyzer, PasteAnalyzer, TimingAnalyzer.

None

Returns:

Type Description
list[SlopitSession]

Sessions with analysis flags added.

Examples:

>>> from slopit import load_sessions
>>> sessions = load_sessions("data/")
>>> analyzed = analyze_sessions(sessions)
>>> # Sessions now have analysis flags populated

extract_with_analysis(path: Path | str, pattern: str = '*', item_id_key: str = 'item_id', analyzers: list[Analyzer] | None = None, name: str | None = None) -> AnalyticsCollection

Load sessions, run analysis, and extract analytics in one step.

Convenience function that combines loading, analysis, and extraction.

Parameters:

Name Type Description Default
path Path | str

Path to session file or directory.

required
pattern str

Glob pattern for directory (default: "*").

'*'
item_id_key str

Key in platform_data containing the item UUID.

'item_id'
analyzers list[Analyzer] | None

Analyzers to run. If None, uses default set.

None
name str | None

Name for the collection.

None

Returns:

Type Description
AnalyticsCollection

Collection with analyzed behavioral data.

Examples:

>>> collection = extract_with_analysis("data/jatos_export/")
>>> summaries = collection.get_participant_summaries()
>>> for s in summaries:
...     if s.flag_rate > 0.1:
...         print(f"Participant {s.participant_id}: {s.flag_rate:.1%} flagged")

Merging

merging

Utilities for merging behavioral analytics with judgment data.

This module provides functions for joining behavioral analytics with judgment DataFrames for analysis. All functions support both pandas and polars DataFrames, preserving the input type.

merge_behavioral_analytics(judgments_df: DataFrame, analytics: AnalyticsCollection, item_id_column: str = 'item_id', participant_id_column: str = 'participant_id', include_metrics: bool = True, include_flags: bool = True, how: str = 'left') -> DataFrame

Merge behavioral analytics into a judgments DataFrame.

Preserves input DataFrame type (pandas in -> pandas out, polars in -> polars out).

Parameters:

Name Type Description Default
judgments_df DataFrame

DataFrame containing judgment data.

required
analytics AnalyticsCollection

Collection of behavioral analytics.

required
item_id_column str

Column in judgments_df containing item IDs (default: "item_id").

'item_id'
participant_id_column str

Column in judgments_df containing participant IDs.

'participant_id'
include_metrics bool

If True, include flattened behavioral metrics columns.

True
include_flags bool

If True, include flag-related columns.

True
how str

Merge type: "left", "inner", "outer" (default: "left").

'left'

Returns:

Type Description
DataFrame

Merged DataFrame with behavioral analytics columns added.

Examples:

>>> import pandas as pd
>>> judgments = pd.DataFrame({
...     "item_id": ["uuid1", "uuid2"],
...     "participant_id": ["p1", "p1"],
...     "response": [5, 3],
... })
>>> # merged = merge_behavioral_analytics(judgments, analytics_collection)
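The merge is keyed on the (participant_id, item_id) pair. A pure-Python sketch of the left-join semantics (the real function delegates to pandas or polars, and the analytics columns shown are illustrative):

```python
judgments = [
    {"item_id": "uuid1", "participant_id": "p1", "response": 5},
    {"item_id": "uuid2", "participant_id": "p1", "response": 3},
]
# Analytics indexed by the composite join key.
analytics_rows = {
    ("p1", "uuid1"): {"response_time_ms": 2400, "is_flagged": False},
}

# Left join: every judgment survives; missing analytics become None.
merged = [
    {**j, **analytics_rows.get(
        (j["participant_id"], j["item_id"]),
        {"response_time_ms": None, "is_flagged": None},
    )}
    for j in judgments
]

assert merged[0]["response_time_ms"] == 2400
assert merged[1]["response_time_ms"] is None
```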

filter_flagged_judgments(judgments_df: DataFrame, analytics: AnalyticsCollection, item_id_column: str = 'item_id', participant_id_column: str = 'participant_id', min_severity: Severity | None = None, exclude_flagged: bool = True) -> DataFrame

Filter judgments based on behavioral flags.

Preserves input DataFrame type.

Parameters:

Name Type Description Default
judgments_df DataFrame

DataFrame containing judgment data.

required
analytics AnalyticsCollection

Collection of behavioral analytics.

required
item_id_column str

Column containing item IDs.

'item_id'
participant_id_column str

Column containing participant IDs.

'participant_id'
min_severity Severity | None

Minimum severity level for filtering. If None, any flag counts.

None
exclude_flagged bool

If True, exclude flagged judgments (default). If False, keep only flagged judgments.

True

Returns:

Type Description
DataFrame

Filtered DataFrame.

Examples:

>>> # Keep only unflagged judgments
>>> clean_df = filter_flagged_judgments(judgments, analytics, exclude_flagged=True)
>>> # Keep only high-severity flagged judgments for review
>>> flagged_df = filter_flagged_judgments(
...     judgments, analytics, min_severity="high", exclude_flagged=False
... )

create_analysis_dataframe_with_behavior(judgments_df: DataFrame, participants: ParticipantCollection, analytics: AnalyticsCollection, id_mappings: IDMappingCollection | None = None, external_id_column: str | None = None, participant_id_column: str = 'participant_id', item_id_column: str = 'item_id', metadata_columns: list[str] | None = None, include_metrics: bool = True, include_flags: bool = True) -> DataFrame

Create analysis-ready DataFrame with metadata and behavioral analytics.

Combines both participant and behavioral merging in one step. Preserves input DataFrame type.

Parameters:

Name Type Description Default
judgments_df DataFrame

Raw judgment data.

required
participants ParticipantCollection

Participant collection with metadata.

required
analytics AnalyticsCollection

Behavioral analytics collection.

required
id_mappings IDMappingCollection | None

ID mappings (required if external_id_column is provided).

None
external_id_column str | None

Column with external IDs to resolve.

None
participant_id_column str

Column with participant IDs (after resolution).

'participant_id'
item_id_column str

Column with item IDs.

'item_id'
metadata_columns list[str] | None

Participant metadata columns to include.

None
include_metrics bool

If True, include behavioral metrics columns.

True
include_flags bool

If True, include flag columns.

True

Returns:

Type Description
DataFrame

Analysis-ready DataFrame with both metadata and behavioral data.

Examples:

>>> analysis_df = create_analysis_dataframe_with_behavior(
...     judgments,
...     participants,
...     analytics,
...     id_mappings=mappings,
...     external_id_column="PROLIFIC_PID",
... )

get_exclusion_list(analytics: AnalyticsCollection, min_flag_rate: float = 0.1, min_severity: Severity | None = None) -> list[str]

Get list of participant IDs that should be excluded based on flags.

Identifies participants with flag rates above the threshold.

Parameters:

Name Type Description Default
analytics AnalyticsCollection

Behavioral analytics collection.

required
min_flag_rate float

Minimum proportion of flagged judgments for exclusion (default: 0.1).

0.1
min_severity Severity | None

Only count flags at or above this severity.

None

Returns:

Type Description
list[str]

Participant IDs recommended for exclusion.

Examples:

>>> exclude = get_exclusion_list(analytics, min_flag_rate=0.2)
>>> clean_df = judgments_df[~judgments_df["participant_id"].isin(exclude)]
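The selection reduces to a threshold test on each participant's flag rate; a sketch over (participant_id, flagged, total) triples, where treating the threshold as inclusive (>=) is an assumption:

```python
def exclusion_list(summaries: list[tuple[str, int, int]],
                   min_flag_rate: float = 0.1) -> list[str]:
    """Sketch: return participant IDs whose flag rate meets the threshold."""
    return [
        pid for pid, flagged, total in summaries
        if total > 0 and flagged / total >= min_flag_rate
    ]

# p1: 4/20 = 0.20 meets the 0.2 threshold; p2: 1/20 = 0.05 does not.
assert exclusion_list([("p1", 4, 20), ("p2", 1, 20)], min_flag_rate=0.2) == ["p1"]
```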