bead.participants¶

Participant metadata models and collection management.

Models¶

`models` ¶

Participant data models.

This module provides Participant and ParticipantIDMapping models for storing participant information with privacy-preserving external ID mapping.

`Participant` ¶

Bases: BeadBaseModel

A study participant with demographic and session metadata.

Inherits UUID, timestamps, version, and metadata from BeadBaseModel. The internal id (UUID) is used for all analysis; external IDs (e.g., Prolific IDs) are stored separately for privacy.

Attributes:

Name	Type	Description
`id`	`UUID`	Internal unique identifier (UUIDv7, inherited from BeadBaseModel).
`created_at`	`datetime`	When participant record was created (inherited).
`modified_at`	`datetime`	When participant record was last modified (inherited).
`participant_metadata`	`dict[str, JsonValue]`	Demographic and other participant attributes (e.g., age, education). Keys should match a ParticipantMetadataSpec for validation.
`study_id`	`str \| None`	Optional study identifier this participant belongs to.
`session_ids`	`list[str]`	Session identifiers for this participant (for longitudinal studies).
`consent_timestamp`	`datetime \| None`	When participant provided consent.
`notes`	`str \| None`	Free-text notes about this participant.

Examples:

>>> participant = Participant(
...     participant_metadata={
...         "age": 25,
...         "education": "bachelors",
...         "native_speaker": True,
...     },
...     study_id="study_001",
... )
>>> participant.participant_metadata["age"]
25
>>> str(participant.id)
'019...'  # UUIDv7

`validate_against_spec(spec: ParticipantMetadataSpec) -> tuple[bool, list[str]]` ¶

Validate participant_metadata against a specification.

Parameters:

Name	Type	Description	Default
`spec`	`ParticipantMetadataSpec`	Specification to validate against.	required

Returns:

Type	Description
`tuple[bool, list[str]]`	(is_valid, list of error messages)

Examples:

>>> from bead.participants.metadata_spec import (
...     FieldSpec, ParticipantMetadataSpec
... )
>>> spec = ParticipantMetadataSpec(
...     name="test",
...     fields=[FieldSpec(name="age", field_type="int", required=True)]
... )
>>> p = Participant(participant_metadata={"age": 25})
>>> p.validate_against_spec(spec)
(True, [])

`get_attribute(key: str, default: JsonValue = None) -> JsonValue` ¶

Get a metadata attribute with optional default.

Parameters:

Name	Type	Description	Default
`key`	`str`	Attribute name.	required
`default`	`JsonValue`	Default value if attribute not found.	`None`

Returns:

Type	Description
`JsonValue`	Attribute value or default.

Examples:

>>> p = Participant(participant_metadata={"age": 25})
>>> p.get_attribute("age")
25
>>> p.get_attribute("unknown", default="N/A")
'N/A'

`set_attribute(key: str, value: JsonValue) -> None` ¶

Set a metadata attribute.

Parameters:

Name	Type	Description	Default
`key`	`str`	Attribute name.	required
`value`	`JsonValue`	Attribute value.	required

Examples:

>>> p = Participant()
>>> p.set_attribute("age", 25)
>>> p.participant_metadata["age"]
25

`add_session(session_id: str) -> None` ¶

Add a session ID to this participant.

Parameters:

Name	Type	Description	Default
`session_id`	`str`	Session identifier to add.	required

Examples:

>>> p = Participant()
>>> p.add_session("session_001")
>>> p.session_ids
['session_001']

`ParticipantIDMapping` ¶

Bases: BeadBaseModel

Mapping between external participant IDs and internal UUIDs.

This model is stored SEPARATELY from participant data for IRB/privacy compliance. The external ID (e.g., Prolific PID) can be deleted while retaining the internal UUID for analysis.

Attributes:

Name	Type	Description
`id`	`UUID`	Unique identifier for this mapping record (inherited).
`external_id`	`str`	External participant identifier (e.g., Prolific PID).
`external_source`	`str`	Source of the external ID (e.g., "prolific", "mturk", "sona").
`participant_id`	`UUID`	Internal participant UUID (references Participant.id).
`mapping_timestamp`	`datetime`	When this mapping was created.
`is_active`	`bool`	Whether this mapping is active (for soft deletion).

Examples:

>>> from uuid import UUID
>>> mapping = ParticipantIDMapping(
...     external_id="PROLIFIC_ABC123",
...     external_source="prolific",
...     participant_id=UUID("01234567-89ab-cdef-0123-456789abcdef"),
... )
>>> mapping.external_source
'prolific'

`validate_non_empty(v: str) -> str` `classmethod` ¶

Validate string fields are non-empty.

Parameters:

Name	Type	Description	Default
`v`	`str`	String value to validate.	required

Returns:

Type	Description
`str`	Validated string.

Raises:

Type	Description
`ValueError`	If string is empty or whitespace only.

`deactivate() -> None` ¶

Soft-delete this mapping (for privacy compliance).

Sets is_active to False without deleting the record. This allows the mapping to be retained for audit purposes while marking it as no longer valid.

Examples:

>>> from uuid import uuid4
>>> mapping = ParticipantIDMapping(
...     external_id="ABC123",
...     external_source="prolific",
...     participant_id=uuid4(),
... )
>>> mapping.is_active
True
>>> mapping.deactivate()
>>> mapping.is_active
False
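
The soft-deletion pattern described above can be sketched with a plain dataclass. This is an illustration only, not bead's implementation: `MappingSketch` and `active_external_ids` are hypothetical names, and whether bead's own lookups skip inactive mappings is not stated in this reference.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass
class MappingSketch:
    """Hypothetical stand-in for ParticipantIDMapping (not the bead class)."""

    external_id: str
    external_source: str
    participant_id: UUID
    is_active: bool = True

    def deactivate(self) -> None:
        # Flag the record instead of deleting it, so it stays auditable.
        self.is_active = False


def active_external_ids(mappings):
    """Return external IDs of mappings that have not been deactivated."""
    return [m.external_id for m in mappings if m.is_active]


mappings = [
    MappingSketch("ABC123", "prolific", uuid4()),
    MappingSketch("DEF456", "prolific", uuid4()),
]
mappings[0].deactivate()
remaining = active_external_ids(mappings)
```

The deactivated record is still present in `mappings`; only code that filters on `is_active` stops seeing it.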

Collection¶

`collection` ¶

Participant collection with JSONL I/O and DataFrame support.

This module provides ParticipantCollection and IDMappingCollection for managing multiple participants with JSONL serialization and pandas/polars DataFrame conversion for analysis.

`ParticipantCollection` ¶

Bases: BeadBaseModel

Collection of participants with JSONL I/O and DataFrame support.

Provides methods for managing multiple participants, saving/loading from JSONL files, and converting to pandas/polars DataFrames for analysis.

Attributes:

Name	Type	Description
`name`	`str`	Name of this collection.
`participants`	`list[Participant]`	List of participants.
`metadata_spec_name`	`str \| None`	Name of the metadata spec used (for documentation).

Examples:

>>> collection = ParticipantCollection(name="study_001_participants")
>>> participant = Participant(
...     participant_metadata={"age": 25, "education": "bachelors"}
... )
>>> collection.add_participant(participant)
>>> len(collection.participants)
1
>>> collection.to_jsonl("participants.jsonl")

`validate_name(v: str) -> str` `classmethod` ¶

Validate name is non-empty.

Parameters:

Name	Type	Description	Default
`v`	`str`	Collection name to validate.	required

Returns:

Type	Description
`str`	Validated collection name.

Raises:

Type	Description
`ValueError`	If name is empty or whitespace only.

`__len__() -> int` ¶

Return number of participants.

Returns:

Type	Description
`int`	Number of participants in the collection.

`add_participant(participant: Participant) -> None` ¶

Add a participant to the collection.

Parameters:

Name	Type	Description	Default
`participant`	`Participant`	Participant to add.	required

Examples:

>>> collection = ParticipantCollection(name="test")
>>> p = Participant(participant_metadata={"age": 25})
>>> collection.add_participant(p)
>>> len(collection)
1

`add_participants(participants: list[Participant]) -> None` ¶

Add multiple participants to the collection.

Parameters:

Name	Type	Description	Default
`participants`	`list[Participant]`	Participants to add.	required

Examples:

>>> collection = ParticipantCollection(name="test")
>>> ps = [Participant(), Participant()]
>>> collection.add_participants(ps)
>>> len(collection)
2

`get_by_id(participant_id: UUID) -> Participant | None` ¶

Get participant by UUID.

Parameters:

Name	Type	Description	Default
`participant_id`	`UUID`	Participant UUID to find.	required

Returns:

Type	Description
`Participant \| None`	Participant if found, None otherwise.

Examples:

>>> collection = ParticipantCollection(name="test")
>>> p = Participant()
>>> collection.add_participant(p)
>>> found = collection.get_by_id(p.id)
>>> found is not None
True

`get_by_attribute(key: str, value: JsonValue) -> list[Participant]` ¶

Get participants by metadata attribute value.

Parameters:

Name	Type	Description	Default
`key`	`str`	Attribute name.	required
`value`	`JsonValue`	Value to match.	required

Returns:

Type	Description
`list[Participant]`	Participants with matching attribute.

Examples:

>>> collection = ParticipantCollection(name="test")
>>> p1 = Participant(participant_metadata={"age": 25})
>>> p2 = Participant(participant_metadata={"age": 30})
>>> collection.add_participants([p1, p2])
>>> matches = collection.get_by_attribute("age", 25)
>>> len(matches)
1

`validate_all(spec: ParticipantMetadataSpec) -> dict[UUID, list[str]]` ¶

Validate all participants against a specification.

Parameters:

Name	Type	Description	Default
`spec`	`ParticipantMetadataSpec`	Specification to validate against.	required

Returns:

Type	Description
`dict[UUID, list[str]]`	Mapping from participant ID to list of validation errors. Empty dict if all valid.

Examples:

>>> from bead.participants.metadata_spec import (
...     FieldSpec, ParticipantMetadataSpec
... )
>>> spec = ParticipantMetadataSpec(
...     name="test",
...     fields=[FieldSpec(name="age", field_type="int", required=True)]
... )
>>> collection = ParticipantCollection(name="test")
>>> p = Participant(participant_metadata={"age": 25})
>>> collection.add_participant(p)
>>> errors = collection.validate_all(spec)
>>> len(errors)
0

`to_jsonl(path: Path | str) -> None` ¶

Write participants to JSONL file.

Parameters:

Name	Type	Description	Default
`path`	`Path \| str`	Path to output file.	required

Examples:

>>> collection = ParticipantCollection(name="test")
>>> collection.add_participant(Participant())
>>> collection.to_jsonl("/tmp/participants.jsonl")

`from_jsonl(path: Path | str, name: str = 'loaded_participants') -> ParticipantCollection` `classmethod` ¶

Load participants from JSONL file.

Parameters:

Name	Type	Description	Default
`path`	`Path \| str`	Path to JSONL file.	required
`name`	`str`	Name for the collection.	`'loaded_participants'`

Returns:

Type	Description
`ParticipantCollection`	Collection with loaded participants.

Examples:

>>> collection = ParticipantCollection.from_jsonl(
...     "participants.jsonl"
... )
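
JSONL stores one JSON object per line, so a collection can be written and read record by record. The following stdlib-only sketch shows that round trip. It is not bead's serialization code: the helpers `write_jsonl` and `read_jsonl` and the record layout are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path: Path) -> None:
    """Write one JSON object per line (hypothetical helper, not bead API)."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


def read_jsonl(path: Path) -> list:
    """Read records back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


records = [
    {"study_id": "study_001", "participant_metadata": {"age": 25}},
    {"study_id": "study_001", "participant_metadata": {"age": 30}},
]
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "participants.jsonl"
    write_jsonl(records, target)
    loaded = read_jsonl(target)
```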

`to_dataframe(backend: Literal['pandas', 'polars'] = 'pandas', include_fields: list[str] | None = None, exclude_fields: list[str] | None = None, flatten_metadata: bool = True) -> DataFrame` ¶

Convert to pandas or polars DataFrame.

Parameters:

Name	Type	Description	Default
`backend`	`Literal['pandas', 'polars']`	DataFrame backend to use (default: "pandas").	`'pandas'`
`include_fields`	`list[str] \| None`	If provided, only include these metadata fields.	`None`
`exclude_fields`	`list[str] \| None`	If provided, exclude these metadata fields.	`None`
`flatten_metadata`	`bool`	If True, flatten participant_metadata into top-level columns.	`True`

Returns:

Type	Description
`DataFrame`	pandas or polars DataFrame with participant data. Always includes 'participant_id' column (as string).

Examples:

>>> collection = ParticipantCollection(name="test")
>>> p = Participant(participant_metadata={"age": 25})
>>> collection.add_participant(p)
>>> df = collection.to_dataframe()
>>> "participant_id" in df.columns
True
>>> "age" in df.columns
True
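
To make the effect of `flatten_metadata`, `include_fields`, and `exclude_fields` concrete, here is a minimal sketch that flattens participant records into row dictionaries. It uses plain dicts in place of `Participant` objects, and `flatten_rows` is a hypothetical helper; how bead orders columns or fills missing keys is not covered by this reference.

```python
from uuid import uuid4


def flatten_rows(participants, include_fields=None, exclude_fields=None):
    """Lift metadata keys to top-level columns, one row per participant."""
    rows = []
    for participant in participants:
        # The ID is stored as a string, matching the documented column type.
        row = {"participant_id": str(participant["id"])}
        for key, value in participant["participant_metadata"].items():
            if include_fields is not None and key not in include_fields:
                continue
            if exclude_fields is not None and key in exclude_fields:
                continue
            row[key] = value
        rows.append(row)
    return rows


participants = [
    {"id": uuid4(), "participant_metadata": {"age": 25, "education": "bachelors"}},
    {"id": uuid4(), "participant_metadata": {"age": 30, "education": "masters"}},
]
rows = flatten_rows(participants, exclude_fields=["education"])
```

A list of row dictionaries like `rows` is a shape that both pandas and polars can build a DataFrame from.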

`from_dataframe(df: DataFrame, name: str, id_column: str = 'participant_id', metadata_columns: list[str] | None = None) -> ParticipantCollection` `classmethod` ¶

Create collection from pandas or polars DataFrame.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	pandas or polars DataFrame with participant data.	required
`name`	`str`	Name for the collection.	required
`id_column`	`str`	Column containing participant IDs (default: "participant_id"). If column exists, uses those UUIDs; otherwise generates new ones.	`'participant_id'`
`metadata_columns`	`list[str] \| None`	Columns to include in participant_metadata. If None, includes all columns except id_column.	`None`

Returns:

Type	Description
`ParticipantCollection`	Collection with participants from DataFrame.

Examples:

>>> import pandas as pd
>>> df = pd.DataFrame({
...     "age": [25, 30],
...     "education": ["bachelors", "masters"]
... })
>>> collection = ParticipantCollection.from_dataframe(df, "test")
>>> len(collection)
2

`IDMappingCollection` ¶

Bases: BeadBaseModel

Collection of ID mappings (stored separately for privacy).

This collection should be stored in a SEPARATE file from participant data for IRB/privacy compliance.

Attributes:

Name	Type	Description
`name`	`str`	Name of this mapping collection.
`mappings`	`list[ParticipantIDMapping]`	List of ID mappings.
`source`	`str`	Primary source of external IDs (e.g., "prolific").

Examples:

>>> from uuid import uuid4
>>> collection = IDMappingCollection(name="study_001", source="prolific")
>>> mapping = collection.add_mapping("PROLIFIC_ABC123", uuid4())
>>> collection.get_participant_id("PROLIFIC_ABC123") is not None
True

`validate_non_empty(v: str) -> str` `classmethod` ¶

Validate string fields are non-empty.

Parameters:

Name	Type	Description	Default
`v`	`str`	String to validate.	required

Returns:

Type	Description
`str`	Validated string.

Raises:

Type	Description
`ValueError`	If string is empty or whitespace only.

`__len__() -> int` ¶

Return number of mappings.

Returns:

Type	Description
`int`	Number of mappings in the collection.

`add_mapping(external_id: str, participant_id: UUID, external_source: str | None = None) -> ParticipantIDMapping` ¶

Create and add a new ID mapping.

Parameters:

Name	Type	Description	Default
`external_id`	`str`	External participant ID.	required
`participant_id`	`UUID`	Internal participant UUID.	required
`external_source`	`str \| None`	Source of external ID (defaults to collection's source).	`None`

Returns:

Type	Description
`ParticipantIDMapping`	The created mapping.

Examples:

>>> from uuid import uuid4
>>> collection = IDMappingCollection(name="test", source="prolific")
>>> mapping = collection.add_mapping("ABC123", uuid4())
>>> mapping.external_source
'prolific'

`get_participant_id(external_id: str) -> UUID | None` ¶

Look up internal participant ID from external ID.

Parameters:

Name	Type	Description	Default
`external_id`	`str`	External ID to look up.	required

Returns:

Type	Description
`UUID \| None`	Internal participant ID if found, None otherwise.

Examples:

>>> from uuid import uuid4
>>> collection = IDMappingCollection(name="test", source="prolific")
>>> pid = uuid4()
>>> _ = collection.add_mapping("ABC123", pid)
>>> collection.get_participant_id("ABC123") == pid
True
>>> collection.get_participant_id("UNKNOWN") is None
True

`get_external_id(participant_id: UUID) -> str | None` ¶

Look up external ID from internal participant ID.

Parameters:

Name	Type	Description	Default
`participant_id`	`UUID`	Internal participant ID to look up.	required

Returns:

Type	Description
`str \| None`	External ID if found, None otherwise.

Examples:

>>> from uuid import uuid4
>>> collection = IDMappingCollection(name="test", source="prolific")
>>> pid = uuid4()
>>> _ = collection.add_mapping("ABC123", pid)
>>> collection.get_external_id(pid)
'ABC123'

`deactivate_all() -> int` ¶

Deactivate all mappings (for bulk privacy removal).

Returns:

Type	Description
`int`	Number of mappings deactivated.

Examples:

>>> from uuid import uuid4
>>> collection = IDMappingCollection(name="test", source="prolific")
>>> _ = collection.add_mapping("ABC123", uuid4())
>>> _ = collection.add_mapping("DEF456", uuid4())
>>> count = collection.deactivate_all()
>>> count
2

`to_jsonl(path: Path | str) -> None` ¶

Write mappings to JSONL file.

Parameters:

Name	Type	Description	Default
`path`	`Path \| str`	Path to output file.	required

Examples:

>>> from uuid import uuid4
>>> collection = IDMappingCollection(name="test", source="prolific")
>>> _ = collection.add_mapping("ABC123", uuid4())
>>> collection.to_jsonl("/tmp/mappings.jsonl")

`from_jsonl(path: Path | str, name: str = 'loaded_mappings', source: str = 'unknown') -> IDMappingCollection` `classmethod` ¶

Load mappings from JSONL file.

Parameters:

Name	Type	Description	Default
`path`	`Path \| str`	Path to JSONL file.	required
`name`	`str`	Name for the collection.	`'loaded_mappings'`
`source`	`str`	External ID source.	`'unknown'`

Returns:

Type	Description
`IDMappingCollection`	Collection with loaded mappings.

Examples:

>>> collection = IDMappingCollection.from_jsonl(
...     "mappings.jsonl", source="prolific"
... )

Merging¶

`merging` ¶

Utilities for merging participant metadata with judgment data.

This module provides functions for joining participant metadata with judgment DataFrames for analysis. All functions support both pandas and polars DataFrames, preserving the input type.

`merge_participant_metadata(judgments_df: DataFrame, participants: ParticipantCollection, id_column: str = 'participant_id', metadata_columns: list[str] | None = None, how: str = 'left') -> DataFrame` ¶

Merge participant metadata into a judgments DataFrame.

Preserves input DataFrame type (pandas in -> pandas out, polars in -> polars out).

Parameters:

Name	Type	Description	Default
`judgments_df`	`DataFrame`	DataFrame containing judgment data with participant IDs.	required
`participants`	`ParticipantCollection`	Collection of participants with metadata.	required
`id_column`	`str`	Column in judgments_df containing participant IDs (default: "participant_id").	`'participant_id'`
`metadata_columns`	`list[str] \| None`	Specific metadata columns to include. If None, includes all.	`None`
`how`	`str`	Merge type: "left", "inner", "outer" (default: "left").	`'left'`

Returns:

Type	Description
`DataFrame`	Merged DataFrame with participant metadata columns added.

Examples:

>>> import pandas as pd
>>> from bead.participants.collection import ParticipantCollection
>>> from bead.participants.models import Participant
>>> judgments = pd.DataFrame({
...     "participant_id": ["uuid1", "uuid2"],
...     "response": [5, 3],
... })
>>> collection = ParticipantCollection(name="test")
>>> # ... add participants ...
>>> # merged = merge_participant_metadata(judgments, collection)

`resolve_external_ids(df: DataFrame, id_mappings: IDMappingCollection, external_id_column: str = 'PROLIFIC_PID', output_column: str = 'participant_id', drop_unresolved: bool = False) -> DataFrame` ¶

Resolve external IDs to internal participant UUIDs.

Preserves input DataFrame type.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame with external participant IDs.	required
`id_mappings`	`IDMappingCollection`	Collection of ID mappings.	required
`external_id_column`	`str`	Column containing external IDs (default: "PROLIFIC_PID").	`'PROLIFIC_PID'`
`output_column`	`str`	Column name for resolved UUIDs (default: "participant_id").	`'participant_id'`
`drop_unresolved`	`bool`	If True, drop rows with unresolved IDs (default: False).	`False`

Returns:

Type	Description
`DataFrame`	DataFrame with resolved participant UUIDs.

Examples:

>>> import pandas as pd
>>> from uuid import uuid4
>>> from bead.participants.collection import IDMappingCollection
>>> from bead.participants.merging import resolve_external_ids
>>> raw_data = pd.DataFrame({
...     "PROLIFIC_PID": ["ABC123", "DEF456"],
...     "response": [5, 3],
... })
>>> mappings = IDMappingCollection(name="test", source="prolific")
>>> pid = uuid4()
>>> _ = mappings.add_mapping("ABC123", pid)
>>> resolved = resolve_external_ids(raw_data, mappings)
>>> "participant_id" in resolved.columns
True
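
The role of `drop_unresolved` is easiest to see on a small input. The following sketch resolves external IDs in row dictionaries using a plain dict as the mapping. `resolve_rows` is a hypothetical helper, and two behaviors are assumptions rather than documented facts: unresolved rows receive `None`, and the external ID column is left in place.

```python
from uuid import uuid4


def resolve_rows(
    rows,
    mapping,
    external_id_column="PROLIFIC_PID",
    output_column="participant_id",
    drop_unresolved=False,
):
    """Add an internal-ID column by looking up each external ID."""
    resolved = []
    for row in rows:
        internal = mapping.get(row[external_id_column])
        if internal is None and drop_unresolved:
            continue  # no mapping for this external ID: skip the row
        new_row = dict(row)
        new_row[output_column] = None if internal is None else str(internal)
        resolved.append(new_row)
    return resolved


pid = uuid4()
mapping = {"ABC123": pid}
raw_rows = [
    {"PROLIFIC_PID": "ABC123", "response": 5},
    {"PROLIFIC_PID": "DEF456", "response": 3},
]
kept = resolve_rows(raw_rows, mapping)
dropped = resolve_rows(raw_rows, mapping, drop_unresolved=True)
```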

`create_analysis_dataframe(judgments_df: DataFrame, participants: ParticipantCollection, id_mappings: IDMappingCollection | None = None, external_id_column: str | None = None, participant_id_column: str = 'participant_id', metadata_columns: list[str] | None = None) -> DataFrame` ¶

Create analysis-ready DataFrame with resolved IDs and metadata.

Convenience function that:

1. Resolves external IDs to internal UUIDs (if id_mappings provided)
2. Merges participant metadata
3. Returns a clean DataFrame ready for analysis

Preserves input DataFrame type.

Parameters:

Name	Type	Description	Default
`judgments_df`	`DataFrame`	Raw judgment data.	required
`participants`	`ParticipantCollection`	Participant collection with metadata.	required
`id_mappings`	`IDMappingCollection \| None`	ID mappings (required if external_id_column is provided).	`None`
`external_id_column`	`str \| None`	Column with external IDs to resolve.	`None`
`participant_id_column`	`str`	Column with participant IDs (after resolution).	`'participant_id'`
`metadata_columns`	`list[str] \| None`	Metadata columns to include.	`None`

Returns:

Type	Description
`DataFrame`	Analysis-ready DataFrame.

Examples:

>>> import pandas as pd
>>> from bead.participants.collection import (
...     ParticipantCollection, IDMappingCollection
... )
>>> raw_judgments = pd.DataFrame({
...     "PROLIFIC_PID": ["ABC123"],
...     "response": [5],
... })
>>> participants = ParticipantCollection(name="test")
>>> mappings = IDMappingCollection(name="test", source="prolific")
>>> # analysis_df = create_analysis_dataframe(
>>> #     raw_judgments,
>>> #     participants,
>>> #     id_mappings=mappings,
>>> #     external_id_column="PROLIFIC_PID",
>>> # )

`build_participant_lookup(participants: ParticipantCollection, key_field: str | None = None) -> dict[str, dict[str, str | int | float | bool | None]]` ¶

Build a lookup dictionary from participant collection.

Useful for manual merging or custom processing.

Parameters:

Name	Type	Description	Default
`participants`	`ParticipantCollection`	Collection of participants.	required
`key_field`	`str \| None`	If provided, use this metadata field as the key instead of UUID.	`None`

Returns:

Type	Description
`dict[str, dict[str, str \| int \| float \| bool \| None]]`	Lookup from participant ID (or key_field) to metadata dict.

Examples:

>>> from bead.participants.collection import ParticipantCollection
>>> from bead.participants.merging import build_participant_lookup
>>> from bead.participants.models import Participant
>>> collection = ParticipantCollection(name="test")
>>> p = Participant(participant_metadata={"age": 25})
>>> collection.add_participant(p)
>>> lookup = build_participant_lookup(collection)
>>> str(p.id) in lookup
True

Metadata Specification¶

`metadata_spec` ¶

Metadata specification for participant attributes.

This module provides FieldSpec and ParticipantMetadataSpec for defining configurable metadata fields with validation constraints (allowed values, ranges).

`FieldSpec` ¶

Bases: BaseModel

Specification for a single metadata field.

Defines the constraints and display properties for a participant metadata field. Used for validation and for generating demographics forms.

Attributes:

Name	Type	Description
`name`	`str`	Field name (e.g., "age", "education"). Must be valid Python identifier.
`field_type`	`Literal['int', 'float', 'str', 'bool']`	Data type for the field.
`required`	`bool`	Whether this field is required (default: False).
`allowed_values`	`list[str \| int \| float \| bool] \| None`	Exhaustive list of allowed values (for categorical fields). If None, any value of the correct type is accepted.
`range`	`Range[int] \| Range[float] \| None`	Numeric range constraint (for int/float fields).
`label`	`str \| None`	Display label for forms. If None, uses name with underscores replaced by spaces and title-cased.
`description`	`str \| None`	Help text / description for the field.

Examples:

>>> age_spec = FieldSpec(
...     name="age",
...     field_type="int",
...     required=True,
...     range=Range[int](min=18, max=100),
...     label="Age",
...     description="Your age in years"
... )
>>> education_spec = FieldSpec(
...     name="education",
...     field_type="str",
...     required=True,
...     allowed_values=["high_school", "bachelors", "masters", "phd"],
...     label="Highest Education Level"
... )

`validate_name(v: str) -> str` `classmethod` ¶

Validate field name is non-empty and valid identifier.

Parameters:

Name	Type	Description	Default
`v`	`str`	Field name to validate.	required

Returns:

Type	Description
`str`	Validated field name.

Raises:

Type	Description
`ValueError`	If field name is empty or not a valid Python identifier.
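
As a rough illustration of the check described above, a name can be tested with the built-in `str.isidentifier()`. The function names and error messages below are assumptions, not bead's validator.

```python
def check_field_name(v: str) -> str:
    """Return the name unchanged, or raise ValueError if it is unusable."""
    if not v.strip():
        raise ValueError("Field name must be non-empty")
    if not v.isidentifier():
        raise ValueError(f"Field name is not a valid Python identifier: {v!r}")
    return v


def is_valid_field_name(v: str) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        check_field_name(v)
    except ValueError:
        return False
    return True


candidates = ["age", "native_speaker", "2nd_language", "first name", ""]
results = {name: is_valid_field_name(name) for name in candidates}
```

Note that `str.isidentifier()` also accepts reserved words such as `class`; a stricter check could additionally consult `keyword.iskeyword`.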

`validate_constraints() -> FieldSpec` ¶

Validate that constraints are consistent with field_type.

Returns:

Type	Description
`FieldSpec`	The validated FieldSpec instance.

Raises:

Type	Description
`ValueError`	If constraints are inconsistent with field_type.

`validate_value(value: str | int | float | bool | None) -> bool` ¶

Check if a value satisfies this field's constraints.

Parameters:

Name	Type	Description	Default
`value`	`str \| int \| float \| bool \| None`	Value to validate.	required

Returns:

Type	Description
`bool`	True if value is valid, False otherwise.

Examples:

>>> spec = FieldSpec(
...     name="age",
...     field_type="int",
...     range=Range[int](min=18, max=100)
... )
>>> spec.validate_value(25)
True
>>> spec.validate_value(10)
False

`get_display_label() -> str` ¶

Get display label for forms.

Returns:

Type	Description
`str`	The label if set, otherwise name with underscores replaced by spaces and title-cased.

Examples:

>>> spec = FieldSpec(name="native_speaker", field_type="bool")
>>> spec.get_display_label()
'Native Speaker'
>>> spec = FieldSpec(name="age", field_type="int", label="Your Age")
>>> spec.get_display_label()
'Your Age'
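
The fallback rule (underscores replaced by spaces, then title-cased) corresponds to a one-line string transformation. This sketch uses a hypothetical free function, `display_label`, rather than the `FieldSpec` method.

```python
def display_label(name: str, label=None) -> str:
    """Prefer an explicit label; otherwise derive one from the field name."""
    if label is not None:
        return label
    return name.replace("_", " ").title()


derived = display_label("native_speaker")
explicit = display_label("age", label="Your Age")
```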

`ParticipantMetadataSpec` ¶

Bases: BaseModel

Specification for participant metadata schema.

Defines the allowed fields and their constraints for participant metadata. Used to validate participant data on ingestion and to generate demographics forms for experiments.

Attributes:

Name	Type	Description
`name`	`str`	Name of this specification (e.g., "prolific_demographics").
`version`	`str`	Version string for this spec.
`fields`	`list[FieldSpec]`	List of field specifications.

Examples:

>>> spec = ParticipantMetadataSpec(
...     name="standard_demographics",
...     version="1.0.0",
...     fields=[
...         FieldSpec(
...             name="age",
...             field_type="int",
...             range=Range[int](min=18, max=100)
...         ),
...         FieldSpec(
...             name="education",
...             field_type="str",
...             allowed_values=["high_school", "bachelors", "masters", "phd"]
...         ),
...         FieldSpec(name="native_speaker", field_type="bool", required=True),
...     ]
... )
>>> spec.get_field("age").range.min
18

`validate_name(v: str) -> str` `classmethod` ¶

Validate spec name is non-empty.

Parameters:

Name	Type	Description	Default
`v`	`str`	Spec name to validate.	required

Returns:

Type	Description
`str`	Validated spec name.

Raises:

Type	Description
`ValueError`	If name is empty.

`validate_unique_field_names(v: list[FieldSpec]) -> list[FieldSpec]` `classmethod` ¶

Validate all field names are unique.

Parameters:

Name	Type	Description	Default
`v`	`list[FieldSpec]`	List of field specs to validate.	required

Returns:

Type	Description
`list[FieldSpec]`	Validated list of field specs.

Raises:

Type	Description
`ValueError`	If duplicate field names found.

`get_field(name: str) -> FieldSpec | None` ¶

Get a field specification by name.

Parameters:

Name	Type	Description	Default
`name`	`str`	Field name to look up.	required

Returns:

Type	Description
`FieldSpec \| None`	The field spec if found, None otherwise.

Examples:

>>> spec = ParticipantMetadataSpec(
...     name="test",
...     fields=[FieldSpec(name="age", field_type="int")]
... )
>>> spec.get_field("age").field_type
'int'
>>> spec.get_field("unknown") is None
True

`get_required_fields() -> list[FieldSpec]` ¶

Get all required field specifications.

Returns:

Type	Description
`list[FieldSpec]`	List of required fields.

Examples:

>>> spec = ParticipantMetadataSpec(
...     name="test",
...     fields=[
...         FieldSpec(name="age", field_type="int", required=True),
...         FieldSpec(name="nickname", field_type="str", required=False),
...     ]
... )
>>> [f.name for f in spec.get_required_fields()]
['age']

`validate_metadata(metadata: dict[str, str | int | float | bool | None]) -> tuple[bool, list[str]]` ¶

Validate metadata against this specification.

Parameters:

Name	Type	Description	Default
`metadata`	`dict[str, str \| int \| float \| bool \| None]`	Metadata dictionary to validate.	required

Returns:

Type	Description
`tuple[bool, list[str]]`	(is_valid, list of error messages). Empty list if valid.

Examples:

>>> spec = ParticipantMetadataSpec(
...     name="test",
...     fields=[
...         FieldSpec(name="age", field_type="int", required=True),
...     ]
... )
>>> spec.validate_metadata({"age": 25})
(True, [])
>>> spec.validate_metadata({})
(False, ['Missing required field: age'])
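
The `(is_valid, errors)` contract can be sketched in plain Python. In this illustration, field specs are plain dicts standing in for `FieldSpec` objects and ranges are inclusive `(min, max)` tuples. Only the "Missing required field" message comes from the example above. The other messages, the inclusive bounds, and the function name are assumptions.

```python
# Python types accepted for each field_type (ints are allowed for "float").
TYPES = {"int": int, "float": (int, float), "str": str, "bool": bool}


def validate_metadata_sketch(metadata, fields):
    """Return (is_valid, errors) for a metadata dict."""
    errors = []
    for field in fields:
        name = field["name"]
        if name not in metadata:
            if field.get("required", False):
                errors.append(f"Missing required field: {name}")
            continue
        value = metadata[name]
        # bool is a subclass of int, so reject it explicitly for non-bool fields.
        wrong_bool = isinstance(value, bool) and field["field_type"] != "bool"
        if wrong_bool or not isinstance(value, TYPES[field["field_type"]]):
            errors.append(f"Wrong type for field: {name}")  # assumed wording
            continue
        bounds = field.get("range")
        if bounds is not None and not bounds[0] <= value <= bounds[1]:
            errors.append(f"Value out of range for field: {name}")  # assumed wording
        allowed = field.get("allowed_values")
        if allowed is not None and value not in allowed:
            errors.append(f"Value not allowed for field: {name}")  # assumed wording
    return (not errors, errors)


fields = [
    {"name": "age", "field_type": "int", "required": True, "range": (18, 100)},
    {"name": "education", "field_type": "str",
     "allowed_values": ["high_school", "bachelors", "masters", "phd"]},
]
ok = validate_metadata_sketch({"age": 25, "education": "masters"}, fields)
missing = validate_metadata_sketch({}, fields)
too_young = validate_metadata_sketch({"age": 10}, fields)
```

Collecting every error before returning, rather than stopping at the first, lets a caller report all problems with a record at once.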

`to_demographics_config() -> DemographicsConfig` ¶

Convert this spec to a DemographicsConfig for deployment.

Creates a demographics form configuration that can be used in experiment deployment to collect participant data.

Returns:

Type	Description
`DemographicsConfig`	Demographics configuration for jsPsych deployment.

Examples:

>>> spec = ParticipantMetadataSpec(
...     name="test",
...     fields=[
...         FieldSpec(name="age", field_type="int", required=True),
...     ]
... )
>>> config = spec.to_demographics_config()
>>> config.enabled
True
