bead.behavioral
Behavioral data analytics and extraction utilities for judgment experiment responses.
Analytics
analytics
Behavioral analytics models.
This module defines per-judgment behavioral metrics and participant-level summaries that link slopit behavioral data to bead's item-based experimental structure.
The slopit metric classes (KeystrokeMetrics, FocusMetrics, TimingMetrics, AnalysisFlag) are Pydantic models defined in an upstream package. Here they appear as dict[str, JsonValue] payloads that preserve the slopit field names. To access a payload by attribute, validate the dict against the corresponding slopit model, for example slopit.schemas.KeystrokeMetrics.model_validate(payload).
AnalysisFlag
Bases: BeadBaseModel
A single behavioral analysis flag.
Attributes:
| Name | Type | Description |
|---|---|---|
| `type` | `str` | Flag identifier. |
| `severity` | `Severity` | Severity level. |
| `message` | `str \| None` | Human-readable description. |
| `metadata` | `dict[str, JsonValue]` | Flag-specific metadata. |
JudgmentAnalytics
Bases: BeadBaseModel
Behavioral analytics for a single judgment.
Attributes:
| Name | Type | Description |
|---|---|---|
| `item_id` | `UUID` | Item being judged. |
| `participant_id` | `str` | Participant identifier. |
| `trial_index` | `int` | Position in the session (>= 0). |
| `session_id` | `str` | Slopit session identifier. |
| `response_value` | `JsonValue` | Participant's response value. |
| `response_time_ms` | `int` | Response time in milliseconds. |
| `keystroke_metrics` | `dict[str, JsonValue] \| None` | Slopit keystroke-derived metrics. |
| `focus_metrics` | `dict[str, JsonValue] \| None` | Slopit focus / visibility metrics. |
| `timing_metrics` | `dict[str, JsonValue] \| None` | Slopit timing metrics. |
| `paste_event_count` | `int` | Number of paste events. |
| `flags` | `tuple[AnalysisFlag, ...]` | Behavioral flags from slopit analyzers. |
| `max_severity` | `Severity \| None` | Maximum severity across flags. |
ParticipantBehavioralSummary
Bases: BeadBaseModel
Aggregated behavioral metrics for one participant.
Attributes:
| Name | Type | Description |
|---|---|---|
| `participant_id` | `str` | Participant identifier. |
| `session_id` | `str` | Slopit session identifier. |
| `total_judgments` | `int` | Total judgments analyzed. |
| `flagged_judgments` | `int` | Number of judgments with at least one flag. |
| `mean_response_time_ms` | `float` | Mean response time in milliseconds. |
| `mean_iki` | `float \| None` | Mean inter-keystroke interval. |
| `total_keystrokes` | `int` | Total keystrokes. |
| `total_paste_events` | `int` | Total paste events. |
| `total_blur_events` | `int` | Total window-blur events. |
| `total_blur_duration_ms` | `float` | Total time spent with the window blurred (ms). |
| `flag_counts` | `dict[str, int]` | Flag-type histogram. |
| `max_severity` | `Severity \| None` | Maximum severity across the participant's flags. |
AnalyticsCollection
Bases: BeadBaseModel
Collection of judgment analytics.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Collection name. |
| `analytics` | `tuple[JudgmentAnalytics, ...]` | Per-judgment records. |
__len__() -> int
Return the number of analytics records.
with_analytics(analytics: JudgmentAnalytics) -> Self
Return a new collection with analytics appended.
with_many(analytics_list: tuple[JudgmentAnalytics, ...] | list[JudgmentAnalytics]) -> Self
Return a new collection with each record appended.
get_by_participant(participant_id: str) -> tuple[JudgmentAnalytics, ...]
Return analytics belonging to participant_id.
get_by_item(item_id: UUID) -> tuple[JudgmentAnalytics, ...]
Return analytics for item_id.
filter_flagged(min_severity: Severity | None = None, exclude_flagged: bool = False) -> AnalyticsCollection
Filter analytics by flag status.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `min_severity` | `Severity \| None` | Include only analytics with at least one flag at this severity or higher. Severity order is | `None` |
| `exclude_flagged` | `bool` | If true, include only unflagged analytics. | `False` |
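The filter semantics described above can be sketched with plain stand-in types. This is an illustrative sketch only, not bead's implementation: `Record`, `filter_records`, and `SEVERITY_RANK` are hypothetical names, the ordering low < medium < high is an assumption (the severity order is not stated in this reference), and the behavior when both arguments keep their defaults is also assumed.

```python
from dataclasses import dataclass

# Assumed ordering for illustration only; check bead's Severity definition.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Record:
    """Stand-in for JudgmentAnalytics: an id plus the severities of its flags."""

    record_id: str
    flag_severities: tuple[str, ...] = ()


def filter_records(records, min_severity=None, exclude_flagged=False):
    """Hypothetical re-creation of the documented filter semantics."""
    if exclude_flagged:
        # Keep only records that carry no flags at all.
        return [r for r in records if not r.flag_severities]
    if min_severity is None:
        # Assumption: without a threshold, any flagged record qualifies.
        return [r for r in records if r.flag_severities]
    threshold = SEVERITY_RANK[min_severity]
    # Keep records with at least one flag at or above the threshold.
    return [
        r
        for r in records
        if any(SEVERITY_RANK[s] >= threshold for s in r.flag_severities)
    ]


records = [
    Record("a"),
    Record("b", ("low",)),
    Record("c", ("low", "high")),
]
at_least_medium = filter_records(records, min_severity="medium")
unflagged = filter_records(records, exclude_flagged=True)
```

A record qualifies as soon as one of its flags meets the threshold, so `c` passes a "medium" threshold even though it also carries a "low" flag.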
get_participant_ids() -> tuple[str, ...]
Return the unique participant identifiers in the collection.
get_participant_summaries() -> tuple[ParticipantBehavioralSummary, ...]
Generate one ParticipantBehavioralSummary per participant.
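As a rough illustration of this kind of per-participant aggregation, the sketch below groups plain dicts by participant and computes a few of the summary fields listed under ParticipantBehavioralSummary. The row layout and the flag type `"paste"` are made up for the example; bead's actual aggregation may differ.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical per-judgment rows; "flags" holds flag-type strings.
rows = [
    {"participant_id": "p1", "response_time_ms": 1000, "paste_event_count": 0, "flags": []},
    {"participant_id": "p1", "response_time_ms": 2000, "paste_event_count": 1, "flags": ["paste"]},
    {"participant_id": "p2", "response_time_ms": 1500, "paste_event_count": 0, "flags": []},
]

# Group the rows by participant.
by_participant = defaultdict(list)
for row in rows:
    by_participant[row["participant_id"]].append(row)

# Build one summary dict per participant.
summaries = {}
for pid, group in by_participant.items():
    summaries[pid] = {
        "total_judgments": len(group),
        "flagged_judgments": sum(1 for r in group if r["flags"]),
        "mean_response_time_ms": mean(r["response_time_ms"] for r in group),
        "total_paste_events": sum(r["paste_event_count"] for r in group),
        # Histogram of flag types across all of the participant's judgments.
        "flag_counts": dict(Counter(f for r in group for f in r["flags"])),
    }
```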
to_jsonl(path: Path | str) -> None
Write analytics to path as JSONLines.
from_jsonl(path: Path | str, name: str = 'loaded_analytics') -> AnalyticsCollection
classmethod
Load analytics from a JSONLines file.
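The JSON Lines layout stores one JSON object per line. A minimal standard-library sketch of that round trip follows, with plain dicts standing in for JudgmentAnalytics records; the helpers `write_jsonl` and `read_jsonl` are hypothetical and are not part of bead.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records: list[dict], path: Path) -> None:
    """Write one JSON object per line (hypothetical helper, not bead's API)."""
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


def read_jsonl(path: Path) -> list[dict]:
    """Read the records back, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


records = [
    {"participant_id": "p1", "trial_index": 0, "response_time_ms": 1520},
    {"participant_id": "p1", "trial_index": 1, "response_time_ms": 980},
]

# Round-trip through a temporary file so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "analytics.jsonl"
    write_jsonl(records, target)
    loaded = read_jsonl(target)
```

Because each line is an independent JSON document, files in this layout can be appended to or streamed record by record.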
to_dataframe(backend: Literal['pandas', 'polars'] = 'pandas', include_metrics: bool = True, include_flags: bool = True) -> DataFrame
Render the collection as a pandas or polars DataFrame.
Extraction
extraction
Behavioral data extraction from slopit sessions.
This module provides functions for extracting per-judgment behavioral analytics from slopit session data, using slopit's IO loaders and analysis pipeline.
extract_from_trial(trial: SlopitTrial, session: SlopitSession, item_id_key: str = 'item_id') -> JudgmentAnalytics | None
Extract behavioral analytics from a single slopit trial.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `trial` | `SlopitTrial` | Slopit trial data. | required |
| `session` | `SlopitSession` | Parent session for participant context. | required |
| `item_id_key` | `str` | Key in platform_data containing the item UUID. | `'item_id'` |
Returns:
| Type | Description |
|---|---|
| `JudgmentAnalytics \| None` | Analytics record, or None if item_id not found in trial. |
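The lookup that decides whether a trial yields a record can be pictured as below. This is a sketch under stated assumptions: `find_item_id` is a hypothetical helper, `platform_data` is modeled as a plain dict, and treating a malformed value like a missing one is an assumption rather than documented behavior.

```python
from __future__ import annotations

from uuid import UUID


def find_item_id(platform_data: dict, item_id_key: str = "item_id") -> UUID | None:
    """Return the item UUID stored under item_id_key, or None if unavailable."""
    raw = platform_data.get(item_id_key)
    if raw is None:
        # No item reference on this trial, so there is nothing to link to.
        return None
    try:
        return UUID(str(raw))
    except ValueError:
        # Assumption: a malformed value is treated like a missing item_id.
        return None


linked = find_item_id({"item_id": "12345678-1234-5678-1234-567812345678"})
unlinked = find_item_id({"stimulus": "practice"})
custom_key = find_item_id(
    {"bead_item": "12345678-1234-5678-1234-567812345678"}, item_id_key="bead_item"
)
```

Trials without an item reference, such as practice or instruction screens, would be skipped under this reading.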
extract_from_session(session: SlopitSession, item_id_key: str = 'item_id') -> list[JudgmentAnalytics]
Extract behavioral analytics from all trials in a slopit session.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `session` | `SlopitSession` | Slopit session containing trial data. | required |
| `item_id_key` | `str` | Key in platform_data containing the item UUID. | `'item_id'` |
Returns:
| Type | Description |
|---|---|
| `list[JudgmentAnalytics]` | Analytics records for trials with valid item_id. |
extract_from_file(path: Path | str, item_id_key: str = 'item_id') -> list[JudgmentAnalytics]
Extract behavioral analytics from a slopit session file.
Uses slopit's load_session() to automatically detect format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path \| str` | Path to session file (JSON or JATOS format). | required |
| `item_id_key` | `str` | Key in platform_data containing the item UUID. | `'item_id'` |
Returns:
| Type | Description |
|---|---|
| `list[JudgmentAnalytics]` | Analytics records from the session. |
Examples:
>>> analytics = extract_from_file("data/session_001.json")
>>> len(analytics)
50
extract_from_directory(path: Path | str, pattern: str = '*', item_id_key: str = 'item_id', name: str | None = None) -> AnalyticsCollection
Extract behavioral analytics from all session files in a directory.
Uses slopit's load_sessions() to load all files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path \| str` | Directory containing session files. | required |
| `pattern` | `str` | Glob pattern for file matching (default: "*"). | `'*'` |
| `item_id_key` | `str` | Key in platform_data containing the item UUID. | `'item_id'` |
| `name` | `str \| None` | Name for the collection. Defaults to directory name. | `None` |
Returns:
| Type | Description |
|---|---|
| `AnalyticsCollection` | Collection of analytics from all sessions. |
Examples:
>>> collection = extract_from_directory("data/jatos_export/")
>>> print(f"Extracted {len(collection)} analytics records")
analyze_sessions(sessions: list[SlopitSession], analyzers: list[Analyzer] | None = None) -> list[SlopitSession]
Run slopit behavioral analyzers on sessions.
Uses slopit's AnalysisPipeline to process sessions with the specified analyzers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sessions` | `list[SlopitSession]` | Sessions to analyze. | required |
| `analyzers` | `list[Analyzer] \| None` | Analyzers to run. If None, uses default set: KeystrokeAnalyzer, FocusAnalyzer, PasteAnalyzer, TimingAnalyzer. | `None` |
Returns:
| Type | Description |
|---|---|
| `list[SlopitSession]` | Sessions with analysis flags added. |
Examples:
>>> from slopit import load_sessions
>>> sessions = load_sessions("data/")
>>> analyzed = analyze_sessions(sessions)
>>> # Sessions now have analysis flags populated
extract_with_analysis(path: Path | str, pattern: str = '*', item_id_key: str = 'item_id', analyzers: list[Analyzer] | None = None, name: str | None = None) -> AnalyticsCollection
Load sessions, run analysis, and extract analytics in one step.
Convenience function that combines loading, analysis, and extraction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path \| str` | Path to session file or directory. | required |
| `pattern` | `str` | Glob pattern for directory (default: "*"). | `'*'` |
| `item_id_key` | `str` | Key in platform_data containing the item UUID. | `'item_id'` |
| `analyzers` | `list[Analyzer] \| None` | Analyzers to run. If None, uses default set. | `None` |
| `name` | `str \| None` | Name for the collection. | `None` |
Returns:
| Type | Description |
|---|---|
| `AnalyticsCollection` | Collection with analyzed behavioral data. |
Examples:
>>> collection = extract_with_analysis("data/jatos_export/")
>>> summaries = collection.get_participant_summaries()
>>> for s in summaries:
...     if s.flag_rate > 0.1:
...         print(f"Participant {s.participant_id}: {s.flag_rate:.1%} flagged")
Merging
merging
Utilities for merging behavioral analytics with judgment data.
This module provides functions for joining behavioral analytics with judgment DataFrames for analysis. All functions support both pandas and polars DataFrames, preserving the input type.
merge_behavioral_analytics(judgments_df: DataFrame, analytics: AnalyticsCollection, item_id_column: str = 'item_id', participant_id_column: str = 'participant_id', include_metrics: bool = True, include_flags: bool = True, how: str = 'left') -> DataFrame
Merge behavioral analytics into a judgments DataFrame.
Preserves input DataFrame type (pandas in -> pandas out, polars in -> polars out).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `judgments_df` | `DataFrame` | DataFrame containing judgment data. | required |
| `analytics` | `AnalyticsCollection` | Collection of behavioral analytics. | required |
| `item_id_column` | `str` | Column in judgments_df containing item IDs (default: "item_id"). | `'item_id'` |
| `participant_id_column` | `str` | Column in judgments_df containing participant IDs. | `'participant_id'` |
| `include_metrics` | `bool` | If True, include flattened behavioral metrics columns. | `True` |
| `include_flags` | `bool` | If True, include flag-related columns. | `True` |
| `how` | `str` | Merge type: "left", "inner", "outer" (default: "left"). | `'left'` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Merged DataFrame with behavioral analytics columns added. |
Examples:
>>> import pandas as pd
>>> judgments = pd.DataFrame({
...     "item_id": ["uuid1", "uuid2"],
...     "participant_id": ["p1", "p1"],
...     "response": [5, 3],
... })
>>> # merged = merge_behavioral_analytics(judgments, analytics_collection)
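To make the default "left" merge concrete, here is a standard-library sketch of a keyed left join. It assumes the join key is the (item ID, participant ID) pair, which the two column parameters suggest but this reference does not state outright. The analytics column names and values are illustrative, and plain lists of dicts stand in for DataFrames.

```python
# Judgment rows, mirroring the DataFrame in the example above.
judgments = [
    {"item_id": "uuid1", "participant_id": "p1", "response": 5},
    {"item_id": "uuid2", "participant_id": "p1", "response": 3},
]

# Hypothetical analytics rows; only the first judgment has a match.
analytics_rows = [
    {"item_id": "uuid1", "participant_id": "p1", "response_time_ms": 1520, "paste_event_count": 0},
]

# Index analytics by the composite key so each judgment finds its match directly.
index = {(a["item_id"], a["participant_id"]): a for a in analytics_rows}
analytics_columns = ["response_time_ms", "paste_event_count"]

merged = []
for row in judgments:
    match = index.get((row["item_id"], row["participant_id"]), {})
    # Left join: keep every judgment; unmatched rows get None in the analytics columns.
    merged.append({**row, **{col: match.get(col) for col in analytics_columns}})
```

With an "inner" merge the second judgment would be dropped instead of being padded with missing values.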
filter_flagged_judgments(judgments_df: DataFrame, analytics: AnalyticsCollection, item_id_column: str = 'item_id', participant_id_column: str = 'participant_id', min_severity: Severity | None = None, exclude_flagged: bool = True) -> DataFrame
Filter judgments based on behavioral flags.
Preserves input DataFrame type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `judgments_df` | `DataFrame` | DataFrame containing judgment data. | required |
| `analytics` | `AnalyticsCollection` | Collection of behavioral analytics. | required |
| `item_id_column` | `str` | Column containing item IDs. | `'item_id'` |
| `participant_id_column` | `str` | Column containing participant IDs. | `'participant_id'` |
| `min_severity` | `Severity \| None` | Minimum severity level for filtering. If None, any flag counts. | `None` |
| `exclude_flagged` | `bool` | If True, exclude flagged judgments (default). If False, keep only flagged judgments. | `True` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Filtered DataFrame. |
Examples:
>>> # Keep only unflagged judgments
>>> clean_df = filter_flagged_judgments(judgments, analytics, exclude_flagged=True)
>>> # Keep only high-severity flagged judgments for review
>>> flagged_df = filter_flagged_judgments(
...     judgments, analytics, min_severity="high", exclude_flagged=False
... )
create_analysis_dataframe_with_behavior(judgments_df: DataFrame, participants: ParticipantCollection, analytics: AnalyticsCollection, id_mappings: IDMappingCollection | None = None, external_id_column: str | None = None, participant_id_column: str = 'participant_id', item_id_column: str = 'item_id', metadata_columns: list[str] | None = None, include_metrics: bool = True, include_flags: bool = True) -> DataFrame
Create analysis-ready DataFrame with metadata and behavioral analytics.
Combines both participant and behavioral merging in one step. Preserves input DataFrame type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `judgments_df` | `DataFrame` | Raw judgment data. | required |
| `participants` | `ParticipantCollection` | Participant collection with metadata. | required |
| `analytics` | `AnalyticsCollection` | Behavioral analytics collection. | required |
| `id_mappings` | `IDMappingCollection \| None` | ID mappings (required if external_id_column is provided). | `None` |
| `external_id_column` | `str \| None` | Column with external IDs to resolve. | `None` |
| `participant_id_column` | `str` | Column with participant IDs (after resolution). | `'participant_id'` |
| `item_id_column` | `str` | Column with item IDs. | `'item_id'` |
| `metadata_columns` | `list[str] \| None` | Participant metadata columns to include. | `None` |
| `include_metrics` | `bool` | If True, include behavioral metrics columns. | `True` |
| `include_flags` | `bool` | If True, include flag columns. | `True` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Analysis-ready DataFrame with both metadata and behavioral data. |
Examples:
>>> analysis_df = create_analysis_dataframe_with_behavior(
...     judgments,
...     participants,
...     analytics,
...     id_mappings=mappings,
...     external_id_column="PROLIFIC_PID",
... )
get_exclusion_list(analytics: AnalyticsCollection, min_flag_rate: float = 0.1, min_severity: Severity | None = None) -> list[str]
Get list of participant IDs that should be excluded based on flags.
Identifies participants with flag rates above the threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `analytics` | `AnalyticsCollection` | Behavioral analytics collection. | required |
| `min_flag_rate` | `float` | Minimum proportion of flagged judgments for exclusion (default: 0.1). | `0.1` |
| `min_severity` | `Severity \| None` | Only count flags at or above this severity. | `None` |
Returns:
| Type | Description |
|---|---|
| `list[str]` | Participant IDs recommended for exclusion. |
Examples:
>>> exclude = get_exclusion_list(analytics, min_flag_rate=0.2)
>>> clean_df = judgments_df[~judgments_df["participant_id"].isin(exclude)]
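The flag-rate rule behind such an exclusion list can be sketched as follows. The helper `exclusion_list` and the input pairs are hypothetical. The comparison uses `>=`, reading `min_flag_rate` as an inclusive minimum; that is an assumption, since the description above also speaks of rates "above the threshold".

```python
from collections import defaultdict

# (participant_id, is_flagged) pairs standing in for per-judgment analytics.
judgment_flags = [
    ("p1", True), ("p1", False), ("p1", False), ("p1", False),
    ("p2", False), ("p2", False),
    ("p3", True), ("p3", True),
]


def exclusion_list(pairs, min_flag_rate=0.1):
    """Return participants whose share of flagged judgments meets the threshold."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for pid, is_flagged in pairs:
        totals[pid] += 1
        flagged[pid] += int(is_flagged)
    # Assumption: the threshold is inclusive (rate >= min_flag_rate).
    return sorted(pid for pid in totals if flagged[pid] / totals[pid] >= min_flag_rate)


exclude = exclusion_list(judgment_flags, min_flag_rate=0.2)
```

Here `p1` has a flag rate of 0.25 and `p3` a rate of 1.0, so both meet a 0.2 threshold, while `p2` has no flagged judgments and is kept.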