bead.participants

Participant metadata models and collection management.

Models

models

Participant data models.

Participant stores demographic and session metadata. ParticipantIDMapping records the link between an external participant identifier (e.g. Prolific PID) and an internal UUID. The mapping is stored separately so the external id can be deleted for privacy compliance while the internal UUID is retained for analysis.

Participant

Bases: BeadBaseModel

A study participant.

Attributes:

Name Type Description
participant_metadata dict[str, JsonValue]

Demographic and other participant attributes.

study_id str | None

Study identifier.

session_ids tuple[str, ...]

Session identifiers (for longitudinal studies).

consent_timestamp datetime | None

When the participant provided consent.

notes str | None

Free-text notes.

validate_against_spec(spec: ParticipantMetadataSpec) -> tuple[bool, list[str]]

Validate participant_metadata against spec.

Returns (is_valid, error_messages).

get_attribute(key: str, default: JsonValue = None) -> JsonValue

Return participant_metadata[key] if present, else default.

with_attribute(key: str, value: JsonValue) -> Self

Return a new participant with participant_metadata[key] = value.

with_session(session_id: str) -> Self

Return a new participant with session_id appended to session_ids.
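The with_* methods return new instances instead of mutating the receiver. A minimal sketch of that pattern follows. ParticipantSketch is a hypothetical frozen dataclass written for illustration, not the bead class, and BeadBaseModel may implement copying differently.

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class ParticipantSketch:
    """Hypothetical stand-in for Participant (assumption, not the bead class)."""

    participant_metadata: dict[str, Any] = field(default_factory=dict)
    session_ids: tuple[str, ...] = ()

    def get_attribute(self, key: str, default: Any = None) -> Any:
        return self.participant_metadata.get(key, default)

    def with_attribute(self, key: str, value: Any) -> "ParticipantSketch":
        # Build a fresh dict so the original instance is left untouched.
        updated = {**self.participant_metadata, key: value}
        return replace(self, participant_metadata=updated)

    def with_session(self, session_id: str) -> "ParticipantSketch":
        return replace(self, session_ids=(*self.session_ids, session_id))


p0 = ParticipantSketch()
p1 = p0.with_attribute("age", 25).with_session("session-1")
```

Here p0 still has no "age" attribute and no sessions, so earlier references to it stay valid.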

ParticipantIDMapping

Bases: BeadBaseModel

Mapping between an external participant ID and an internal UUID.

Attributes:

Name Type Description
external_id str

External participant identifier (e.g. Prolific PID).

external_source str

Source of the external id ("prolific", "mturk", etc.).

participant_id UUID

Internal participant UUID.

mapping_timestamp datetime

When the mapping was created.

is_active bool

Whether the mapping is active (used for soft-delete).

deactivated() -> Self

Return a new mapping with is_active=False.

Collection

collection

Participant collection with JSONL I/O and DataFrame support.

ParticipantCollection and IDMappingCollection group multiple Participant and ParticipantIDMapping instances, respectively. Both support JSONL serialization; ParticipantCollection also converts to and from pandas / polars DataFrames.

ParticipantCollection

Bases: BeadBaseModel

Collection of participants.

Attributes:

Name Type Description
name str

Collection name.

participants tuple[Participant, ...]

Member participants.

metadata_spec_name str | None

Name of the metadata spec applied (for documentation).

__len__() -> int

Return the number of participants.

with_participant(participant: Participant) -> Self

Return a new collection with participant appended.

with_participants(participants: tuple[Participant, ...] | list[Participant]) -> Self

Return a new collection with each participant appended.

get_by_id(participant_id: UUID) -> Participant | None

Return the participant whose id matches, or None.

get_by_attribute(key: str, value: JsonValue) -> tuple[Participant, ...]

Return participants whose participant_metadata[key] == value.

validate_all(spec: ParticipantMetadataSpec) -> dict[UUID, list[str]]

Validate every participant against spec.

Returns a mapping from the id of each participant that failed validation to its error messages.

to_jsonl(path: Path | str) -> None

Write each participant to path as a JSONL line.

from_jsonl(path: Path | str, name: str = 'loaded_participants') -> ParticipantCollection (classmethod)

Load participants from path as JSONL.

to_dataframe(backend: Literal['pandas', 'polars'] = 'pandas', include_fields: tuple[str, ...] | None = None, exclude_fields: tuple[str, ...] | None = None, flatten_metadata: bool = True) -> DataFrame

Render the collection as a DataFrame.

Always emits participant_id, created_at, and study_id columns; participant_metadata is flattened by default.
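As a rough illustration of the flattening described above, the sketch below builds a single row as a plain dict. participant_row is a hypothetical helper, not part of bead. How bead represents unflattened metadata, and how it handles a metadata key that clashes with a fixed column, is not stated here, so the nested participant_metadata column is an assumption.

```python
from typing import Any, Optional


def participant_row(
    participant_id: str,
    created_at: str,
    study_id: Optional[str],
    metadata: dict[str, Any],
    flatten_metadata: bool = True,
) -> dict[str, Any]:
    """Build one row; the three fixed columns are always present."""
    row: dict[str, Any] = {
        "participant_id": participant_id,
        "created_at": created_at,
        "study_id": study_id,
    }
    if flatten_metadata:
        # One column per metadata key.
        row.update(metadata)
    else:
        # Assumption: keep the metadata as a single nested value.
        row["participant_metadata"] = dict(metadata)
    return row


flat = participant_row("p-1", "2024-01-01T00:00:00Z", "study-a", {"age": 25})
nested = participant_row(
    "p-1", "2024-01-01T00:00:00Z", "study-a", {"age": 25}, flatten_metadata=False
)
```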

from_dataframe(df: DataFrame, name: str, id_column: str = 'participant_id', metadata_columns: tuple[str, ...] | None = None) -> ParticipantCollection (classmethod)

Build a collection from a DataFrame of participant rows.

Each row becomes a Participant. The id_column is consumed as the participant UUID when present and parseable; otherwise a new UUID is generated. metadata_columns (if given) restricts which columns flow into participant_metadata.
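The parse-or-generate rule for id_column can be sketched with the standard uuid module. coerce_participant_id and split_row are hypothetical helpers operating on plain dict rows. Which columns flow into the metadata when metadata_columns is None is an assumption (every column other than id_column).

```python
from typing import Any, Optional
from uuid import UUID, uuid4


def coerce_participant_id(raw: Any) -> UUID:
    """Return raw as a UUID when parseable; otherwise generate a new one."""
    if isinstance(raw, UUID):
        return raw
    try:
        return UUID(str(raw))
    except ValueError:
        return uuid4()


def split_row(
    row: dict[str, Any],
    id_column: str = "participant_id",
    metadata_columns: Optional[tuple[str, ...]] = None,
) -> tuple[UUID, dict[str, Any]]:
    """Split one row into (participant UUID, metadata dict)."""
    participant_id = coerce_participant_id(row.get(id_column))
    if metadata_columns is None:
        # Assumption: everything except the id column becomes metadata.
        metadata_columns = tuple(k for k in row if k != id_column)
    metadata = {k: row[k] for k in metadata_columns if k in row}
    return participant_id, metadata


known = "12345678-1234-5678-1234-567812345678"
pid, meta = split_row(
    {"participant_id": known, "age": 25, "notes": "x"}, metadata_columns=("age",)
)
new_pid, new_meta = split_row({"participant_id": "not-a-uuid", "age": 30})
```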

IDMappingCollection

Bases: BeadBaseModel

Collection of external-to-internal participant ID mappings.

Stored separately from participant data for IRB / privacy compliance.

Attributes:

Name Type Description
name str

Collection name.

source str

Primary external ID source (e.g. "prolific").

mappings tuple[ParticipantIDMapping, ...]

Member mappings.

__len__() -> int

Return the number of mappings.

with_mapping(external_id: str, participant_id: UUID, external_source: str | None = None) -> tuple[Self, ParticipantIDMapping]

Return (new_collection, mapping) with one new mapping appended.

get_participant_id(external_id: str) -> UUID | None

Return the internal UUID for external_id if an active mapping exists, else None.

get_external_id(participant_id: UUID) -> str | None

Return the external id for participant_id if an active mapping exists, else None.

deactivated_all() -> tuple[Self, int]

Return (new_collection, count_deactivated) with every active mapping deactivated.
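The soft-delete behaviour can be sketched with frozen dataclasses. MappingSketch and MappingCollectionSketch are hypothetical stand-ins with only the fields needed here. The sketch assumes that lookups skip inactive mappings and that the returned count covers only mappings that were active beforehand.

```python
from dataclasses import dataclass, replace
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class MappingSketch:
    external_id: str
    participant_id: UUID
    is_active: bool = True


@dataclass(frozen=True)
class MappingCollectionSketch:
    mappings: tuple[MappingSketch, ...] = ()

    def with_mapping(
        self, external_id: str, participant_id: UUID
    ) -> tuple["MappingCollectionSketch", MappingSketch]:
        mapping = MappingSketch(external_id, participant_id)
        return replace(self, mappings=(*self.mappings, mapping)), mapping

    def get_participant_id(self, external_id: str) -> Optional[UUID]:
        # Inactive mappings are skipped, so a soft-deleted id no longer resolves.
        for mapping in self.mappings:
            if mapping.is_active and mapping.external_id == external_id:
                return mapping.participant_id
        return None

    def deactivated_all(self) -> tuple["MappingCollectionSketch", int]:
        count = sum(1 for mapping in self.mappings if mapping.is_active)
        retired = tuple(replace(m, is_active=False) for m in self.mappings)
        return replace(self, mappings=retired), count


pid = uuid4()
collection, _ = MappingCollectionSketch().with_mapping("ABC123", pid)
retired, count = collection.deactivated_all()
```

After the call, retired.get_participant_id("ABC123") returns None while the original collection still resolves the id.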

to_jsonl(path: Path | str) -> None

Write each mapping to path as a JSONL line.

from_jsonl(path: Path | str, name: str = 'loaded_mappings', source: str = 'unknown') -> IDMappingCollection (classmethod)

Load mappings from path as JSONL.

Merging

merging

Utilities for merging participant metadata with judgment data.

This module provides functions for joining participant metadata with judgment DataFrames for analysis. The DataFrame functions accept both pandas and polars DataFrames and preserve the input type.

merge_participant_metadata(judgments_df: DataFrame, participants: ParticipantCollection, id_column: str = 'participant_id', metadata_columns: list[str] | None = None, how: str = 'left') -> DataFrame

Merge participant metadata into a judgments DataFrame.

Preserves input DataFrame type (pandas in -> pandas out, polars in -> polars out).

Parameters:

Name Type Default Description

judgments_df DataFrame (required)

DataFrame containing judgment data with participant IDs.

participants ParticipantCollection (required)

Collection of participants with metadata.

id_column str = 'participant_id'

Column in judgments_df containing participant IDs.

metadata_columns list[str] | None = None

Specific metadata columns to include. If None, includes all.

how str = 'left'

Merge type: "left", "inner", or "outer".

Returns:

Type Description
DataFrame

Merged DataFrame with participant metadata columns added.

Examples:

>>> import pandas as pd
>>> from bead.participants.collection import ParticipantCollection
>>> from bead.participants.models import Participant
>>> judgments = pd.DataFrame({
...     "participant_id": ["uuid1", "uuid2"],
...     "response": [5, 3],
... })
>>> collection = ParticipantCollection(name="test")
>>> # ... add participants ...
>>> # merged = merge_participant_metadata(judgments, collection)
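To make the effect of how concrete, here is a sketch of the join semantics over plain dict rows instead of DataFrames. merge_rows is a hypothetical helper, not the bead function. It covers only "left" and "inner", and it assumes unmatched rows in a left join receive None for each metadata column.

```python
from typing import Any, Optional

Row = dict[str, Any]


def merge_rows(
    judgments: list[Row],
    metadata_by_id: dict[str, Row],
    id_column: str = "participant_id",
    metadata_columns: Optional[list[str]] = None,
    how: str = "left",
) -> list[Row]:
    """Join metadata onto judgment rows; supports only "left" and "inner"."""
    if how not in ("left", "inner"):
        raise ValueError(f"unsupported how in this sketch: {how!r}")
    if metadata_columns is None:
        # Default: every metadata key seen on any participant.
        metadata_columns = sorted({k for meta in metadata_by_id.values() for k in meta})
    merged: list[Row] = []
    for row in judgments:
        meta = metadata_by_id.get(row[id_column])
        if meta is None and how == "inner":
            continue  # inner join drops judgments without a known participant
        meta = meta or {}
        # Left join keeps the row and fills missing metadata with None.
        merged.append({**row, **{col: meta.get(col) for col in metadata_columns}})
    return merged


judgments = [
    {"participant_id": "p-1", "response": 5},
    {"participant_id": "p-2", "response": 3},
]
metadata_by_id = {"p-1": {"age": 25}}

left = merge_rows(judgments, metadata_by_id)
inner = merge_rows(judgments, metadata_by_id, how="inner")
```

With these inputs, left keeps both judgments (the second with age set to None) and inner keeps only the judgment from p-1.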

resolve_external_ids(df: DataFrame, id_mappings: IDMappingCollection, external_id_column: str = 'PROLIFIC_PID', output_column: str = 'participant_id', drop_unresolved: bool = False) -> DataFrame

Resolve external IDs to internal participant UUIDs.

Preserves input DataFrame type.

Parameters:

Name Type Default Description

df DataFrame (required)

DataFrame with external participant IDs.

id_mappings IDMappingCollection (required)

Collection of ID mappings.

external_id_column str = 'PROLIFIC_PID'

Column containing external IDs.

output_column str = 'participant_id'

Column name for resolved UUIDs.

drop_unresolved bool = False

If True, drop rows with unresolved IDs.

Returns:

Type Description
DataFrame

DataFrame with resolved participant UUIDs.

Examples:

>>> import pandas as pd
>>> from uuid import uuid4
>>> from bead.participants.collection import IDMappingCollection
>>> from bead.participants.merging import resolve_external_ids
>>> raw_data = pd.DataFrame({
...     "PROLIFIC_PID": ["ABC123", "DEF456"],
...     "response": [5, 3],
... })
>>> mappings = IDMappingCollection(name="test", source="prolific")
>>> pid = uuid4()
>>> # with_mapping returns (new_collection, mapping); keep the new collection
>>> mappings, _ = mappings.with_mapping("ABC123", pid)
>>> resolved = resolve_external_ids(raw_data, mappings)
>>> "participant_id" in resolved.columns
True
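The effect of drop_unresolved can be sketched over plain dict rows. resolve_rows is a hypothetical helper, not the bead function. It assumes that unresolved rows, when kept, receive None in the output column and that the external id column is left in place. Neither detail is confirmed above.

```python
from typing import Any, Optional
from uuid import UUID, uuid4

Row = dict[str, Any]


def resolve_rows(
    rows: list[Row],
    id_by_external: dict[str, UUID],
    external_id_column: str = "PROLIFIC_PID",
    output_column: str = "participant_id",
    drop_unresolved: bool = False,
) -> list[Row]:
    """Add output_column to each row, optionally dropping unresolved rows."""
    resolved: list[Row] = []
    for row in rows:
        internal: Optional[UUID] = id_by_external.get(row[external_id_column])
        if internal is None and drop_unresolved:
            continue
        resolved.append({**row, output_column: internal})
    return resolved


pid = uuid4()
rows = [
    {"PROLIFIC_PID": "ABC123", "response": 5},
    {"PROLIFIC_PID": "DEF456", "response": 3},
]
kept = resolve_rows(rows, {"ABC123": pid})
dropped = resolve_rows(rows, {"ABC123": pid}, drop_unresolved=True)
```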

create_analysis_dataframe(judgments_df: DataFrame, participants: ParticipantCollection, id_mappings: IDMappingCollection | None = None, external_id_column: str | None = None, participant_id_column: str = 'participant_id', metadata_columns: list[str] | None = None) -> DataFrame

Create analysis-ready DataFrame with resolved IDs and metadata.

Convenience function that:

1. Resolves external IDs to internal UUIDs (if id_mappings is provided).
2. Merges participant metadata.
3. Returns a clean DataFrame ready for analysis.

Preserves input DataFrame type.

Parameters:

Name Type Default Description

judgments_df DataFrame (required)

Raw judgment data.

participants ParticipantCollection (required)

Participant collection with metadata.

id_mappings IDMappingCollection | None = None

ID mappings (required if external_id_column is provided).

external_id_column str | None = None

Column with external IDs to resolve.

participant_id_column str = 'participant_id'

Column with participant IDs (after resolution).

metadata_columns list[str] | None = None

Metadata columns to include.

Returns:

Type Description
DataFrame

Analysis-ready DataFrame.

Examples:

>>> import pandas as pd
>>> from bead.participants.collection import (
...     ParticipantCollection, IDMappingCollection
... )
>>> raw_judgments = pd.DataFrame({
...     "PROLIFIC_PID": ["ABC123"],
...     "response": [5],
... })
>>> participants = ParticipantCollection(name="test")
>>> mappings = IDMappingCollection(name="test", source="prolific")
>>> # analysis_df = create_analysis_dataframe(
>>> #     raw_judgments,
>>> #     participants,
>>> #     id_mappings=mappings,
>>> #     external_id_column="PROLIFIC_PID",
>>> # )

build_participant_lookup(participants: ParticipantCollection, key_field: str | None = None) -> dict[str, dict[str, str | int | float | bool | None]]

Build a lookup dictionary from participant collection.

Useful for manual merging or custom processing.

Parameters:

Name Type Default Description

participants ParticipantCollection (required)

Collection of participants.

key_field str | None = None

If provided, use this metadata field as the key instead of the participant UUID.

Returns:

Type Description
dict[str, dict[str, str | int | float | bool | None]]

Lookup from participant ID (or key_field) to metadata dict.

Examples:

>>> from bead.participants.collection import ParticipantCollection
>>> from bead.participants.merging import build_participant_lookup
>>> from bead.participants.models import Participant
>>> collection = ParticipantCollection(name="test")
>>> p = Participant(participant_metadata={"age": 25})
>>> # with_participant returns a new collection; rebind the name
>>> collection = collection.with_participant(p)
>>> lookup = build_participant_lookup(collection)
>>> str(p.id) in lookup
True
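The role of key_field can be sketched with plain tuples of (UUID, metadata). build_lookup is a hypothetical helper, not the bead function. Skipping participants that lack key_field is an assumption; the real function may raise or behave differently.

```python
from typing import Any, Optional
from uuid import UUID, uuid4

Metadata = dict[str, Any]


def build_lookup(
    participants: list[tuple[UUID, Metadata]],
    key_field: Optional[str] = None,
) -> dict[str, Metadata]:
    """Map str(UUID), or the key_field value, to each metadata dict."""
    lookup: dict[str, Metadata] = {}
    for participant_id, metadata in participants:
        if key_field is None:
            key = str(participant_id)
        elif key_field in metadata:
            key = str(metadata[key_field])
        else:
            continue  # assumption: skip participants that lack key_field
        lookup[key] = dict(metadata)
    return lookup


pid = uuid4()
participants = [(pid, {"age": 25, "worker_code": "W-7"})]
by_uuid = build_lookup(participants)
by_code = build_lookup(participants, key_field="worker_code")
```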

Metadata Specification

metadata_spec

Metadata specification for participant attributes.

FieldSpec defines the constraints and display properties for a single participant metadata field. ParticipantMetadataSpec is the schema for the full set of participant fields.

FieldSpec

Bases: BeadBaseModel

Specification for a single metadata field.

Attributes:

Name Type Description
name str

Field name. Must be a valid Python identifier.

field_type Literal['int', 'float', 'str', 'bool']

Data type for the field.

required bool

Whether the field must be supplied.

allowed_values tuple[str | int | float | bool, ...] | None

Exhaustive list of allowed values (categorical fields). None means any value of the correct type is accepted.

range Range[float] | None

Numeric range constraint (numeric fields). A Range[int] is also accepted, because int is treated as a subtype of float for bound checking.

label str | None

Display label for forms. If None, the label defaults to name, title-cased, with underscores replaced by spaces.

description str | None

Help text for the field.

validate_value(value: str | int | float | bool | None) -> bool

Return whether value satisfies this spec's constraints.

get_display_label() -> str

Return the display label, falling back to a title-cased name.
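A sketch of the two FieldSpec helpers as free functions follows. display_label and validate_value here are hypothetical reimplementations for illustration. Inclusive bounds, treating None as valid unless the field is required, and rejecting bool for non-bool fields are all assumptions about the real behaviour.

```python
from typing import Optional, Sequence, Union

Scalar = Union[str, int, float, bool, None]
_PY_TYPES = {"int": int, "float": (int, float), "str": str, "bool": bool}


def display_label(name: str, label: Optional[str] = None) -> str:
    """Use label when given; otherwise derive one from name."""
    if label is not None:
        return label
    return name.replace("_", " ").title()


def validate_value(
    value: Scalar,
    field_type: str,
    required: bool = False,
    allowed_values: Optional[Sequence[Scalar]] = None,
    bounds: Optional[tuple[float, float]] = None,
) -> bool:
    """Check one value against type, allowed values, and numeric bounds."""
    if value is None:
        return not required  # assumption: None passes unless required
    if isinstance(value, bool) and field_type != "bool":
        return False  # bool is an int subclass in Python; keep it distinct
    if not isinstance(value, _PY_TYPES[field_type]):
        return False
    if allowed_values is not None and value not in allowed_values:
        return False
    if bounds is not None and field_type in ("int", "float"):
        low, high = bounds
        return low <= value <= high  # assumption: bounds are inclusive
    return True


label = display_label("native_language")
adult = validate_value(25, "int", bounds=(18, 99))
minor = validate_value(17, "int", bounds=(18, 99))
```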

ParticipantMetadataSpec

Bases: BeadBaseModel

Schema for participant metadata.

Attributes:

Name Type Description
name str

Spec name (e.g. "prolific_demographics").

version str

Spec version.

fields tuple[FieldSpec, ...]

Field specifications.

get_field(name: str) -> FieldSpec | None

Return the field spec whose name matches, or None if there is no such field.

get_required_fields() -> tuple[FieldSpec, ...]

Return the required field specs.

validate_metadata(metadata: dict[str, str | int | float | bool | None]) -> tuple[bool, list[str]]

Validate metadata against this spec.

Returns (is_valid, error_messages).
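The (is_valid, error_messages) return shape can be sketched as below. This validate_metadata is a hypothetical free function that takes the spec as two plain arguments. The error message wording and the choice to ignore fields outside the spec are assumptions.

```python
from typing import Any

_PY_TYPES = {"int": int, "float": (int, float), "str": str, "bool": bool}


def validate_metadata(
    metadata: dict[str, Any],
    field_types: dict[str, str],
    required: set[str],
) -> tuple[bool, list[str]]:
    """Collect every problem instead of stopping at the first one."""
    errors: list[str] = []
    for name in sorted(required):
        if metadata.get(name) is None:
            errors.append(f"missing required field: {name}")
    for name, value in metadata.items():
        if name not in field_types or value is None:
            continue  # assumption: fields outside the spec are ignored
        expected = field_types[name]
        if not isinstance(value, _PY_TYPES[expected]):
            errors.append(f"{name}: expected {expected}, got {type(value).__name__}")
    return (not errors, errors)


field_types = {"age": "int", "native_language": "str"}
ok, ok_errors = validate_metadata(
    {"age": 25, "native_language": "en"}, field_types, {"age"}
)
bad, bad_errors = validate_metadata({"native_language": 7}, field_types, {"age"})
```

Returning the full list lets a caller report all problems for a participant in one pass.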

to_demographics_config() -> DemographicsConfig

Render the spec as a DemographicsConfig for jsPsych deployment.

validate_field_spec(spec: FieldSpec) -> None

Raise ValueError if spec's constraints contradict its type.

Validates that range is only used with numeric types and that every value in allowed_values matches field_type.
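The two consistency checks can be sketched as follows. check_field_spec and is_consistent are hypothetical helpers that take the relevant FieldSpec attributes as plain arguments. The error messages, and the choice to reject bool values for int fields, are assumptions.

```python
from typing import Optional, Sequence, Union

Scalar = Union[str, int, float, bool]
_NUMERIC = ("int", "float")


def _matches(value: Scalar, field_type: str) -> bool:
    if isinstance(value, bool):
        return field_type == "bool"  # bool is an int subclass; keep it separate
    if field_type == "int":
        return isinstance(value, int)
    if field_type == "float":
        return isinstance(value, (int, float))
    if field_type == "str":
        return isinstance(value, str)
    return False


def check_field_spec(
    field_type: str,
    allowed_values: Optional[Sequence[Scalar]] = None,
    has_range: bool = False,
) -> None:
    """Raise ValueError when the constraints contradict field_type."""
    if has_range and field_type not in _NUMERIC:
        raise ValueError(f"range is not valid for field_type {field_type!r}")
    for value in allowed_values or ():
        if not _matches(value, field_type):
            raise ValueError(
                f"allowed value {value!r} does not match field_type {field_type!r}"
            )


def is_consistent(field_type: str, **constraints) -> bool:
    """Convenience wrapper that reports the outcome as a bool."""
    try:
        check_field_spec(field_type, **constraints)
    except ValueError:
        return False
    return True


numeric_ok = is_consistent("int", allowed_values=(1, 2, 3), has_range=True)
range_on_str = is_consistent("str", has_range=True)
```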