
schema

Core data schema definitions for the reward modeling data pipeline. Provides structured data models for samples, rewards, and datasets with validation.

BaseDataSet

Bases: BaseModel

Container for managing collections of data samples with metadata.

Provides standardized interface for dataset operations including indexing, iteration, and serialization for consistent data handling across the pipeline.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| datasamples | List[DataSample] | Collection of data samples in the dataset |
| name | str | Human-readable identifier for the dataset |
| metadata | Dict[str, Any] | Additional information about dataset origin and processing |

Source code in rm_gallery/core/data/schema.py
class BaseDataSet(BaseModel):
    """
    Container for managing collections of data samples with metadata.

    Provides standardized interface for dataset operations including indexing,
    iteration, and serialization for consistent data handling across the pipeline.

    Attributes:
        datasamples: Collection of data samples in the dataset
        name: Human-readable identifier for the dataset
        metadata: Additional information about dataset origin and processing
    """

    datasamples: List[DataSample] = Field(
        default_factory=list, description="List of data items"
    )
    name: str = Field(..., description="dataset name")
    metadata: Dict[str, Any] = Field(default_factory=dict, description="metadata")

    def __len__(self) -> int:
        """
        Return the number of samples in the dataset.

        Returns:
            Integer count of data samples
        """
        return len(self.datasamples)

    def __getitem__(
        self, index: Union[int, slice]
    ) -> Union[DataSample, List[DataSample]]:
        """
        Enable index-based and slice-based access to dataset samples.

        Args:
            index: Integer index or slice object for data access

        Returns:
            Single DataSample for integer index, list for slice
        """
        return self.datasamples[index]

    def get_data_samples(self) -> List[DataSample]:
        """
        Retrieve all data samples from the dataset.

        Returns:
            Complete list of DataSample objects in the dataset
        """
        return [data for data in self.datasamples]

    def to_dict(self) -> Dict[str, Any]:
        """
        Convert dataset to dictionary format for serialization.

        Returns:
            Dictionary representation with name, metadata, and serialized samples
        """
        return {
            "name": self.name,
            "metadata": self.metadata,
            "datasamples": [data.model_dump() for data in self.datasamples],
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "BaseDataSet":
        """
        Create dataset instance from dictionary representation.

        Args:
            data: Dictionary with dataset structure and sample data

        Returns:
            New BaseDataSet instance with restored data
        """
        return cls(
            name=data["name"],
            metadata=data.get("metadata", {}),
            datasamples=[DataSample(**item) for item in data["datasamples"]],
        )

    class Config:
        arbitrary_types_allowed = True

__getitem__(index)

Enable index-based and slice-based access to dataset samples.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| index | Union[int, slice] | Integer index or slice object for data access | required |

Returns:

| Type | Description |
| --- | --- |
| Union[DataSample, List[DataSample]] | Single DataSample for integer index, list for slice |

Source code in rm_gallery/core/data/schema.py
def __getitem__(
    self, index: Union[int, slice]
) -> Union[DataSample, List[DataSample]]:
    """
    Enable index-based and slice-based access to dataset samples.

    Args:
        index: Integer index or slice object for data access

    Returns:
        Single DataSample for integer index, list for slice
    """
    return self.datasamples[index]
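
Because `__getitem__` delegates to the underlying list, an integer index yields one sample and a slice yields a plain list rather than a new dataset. The sketch below illustrates this with standard-library dataclasses; `Sample` and `MiniDataSet` are hypothetical stand-ins for `DataSample` and `BaseDataSet`, used only so the example runs without pydantic or rm_gallery installed.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Sample:
    """Hypothetical stand-in for DataSample; only unique_id is modeled."""

    unique_id: str


@dataclass
class MiniDataSet:
    """Hypothetical stand-in mirroring BaseDataSet's __len__ and __getitem__."""

    name: str
    datasamples: List[Sample] = field(default_factory=list)

    def __len__(self) -> int:
        return len(self.datasamples)

    def __getitem__(self, index: Union[int, slice]) -> Union[Sample, List[Sample]]:
        # Delegates to the list, so ordinary list indexing rules apply.
        return self.datasamples[index]


dataset = MiniDataSet(name="demo", datasamples=[Sample(f"s{i}") for i in range(4)])

first = dataset[0]   # integer index: a single sample
last = dataset[-1]   # negative indices behave as they do on lists
head = dataset[:2]   # slice: a plain list of samples, not a MiniDataSet
```

An out-of-range integer index raises `IndexError`, exactly as it would on the list itself, while an out-of-range slice simply returns a shorter (possibly empty) list.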

__len__()

Return the number of samples in the dataset.

Returns:

| Type | Description |
| --- | --- |
| int | Integer count of data samples |

Source code in rm_gallery/core/data/schema.py
def __len__(self) -> int:
    """
    Return the number of samples in the dataset.

    Returns:
        Integer count of data samples
    """
    return len(self.datasamples)

from_dict(data) classmethod

Create dataset instance from dictionary representation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | Dict[str, Any] | Dictionary with dataset structure and sample data | required |

Returns:

| Type | Description |
| --- | --- |
| BaseDataSet | New BaseDataSet instance with restored data |

Source code in rm_gallery/core/data/schema.py
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "BaseDataSet":
    """
    Create dataset instance from dictionary representation.

    Args:
        data: Dictionary with dataset structure and sample data

    Returns:
        New BaseDataSet instance with restored data
    """
    return cls(
        name=data["name"],
        metadata=data.get("metadata", {}),
        datasamples=[DataSample(**item) for item in data["datasamples"]],
    )

get_data_samples()

Retrieve all data samples from the dataset.

Returns:

| Type | Description |
| --- | --- |
| List[DataSample] | Complete list of DataSample objects in the dataset |

Source code in rm_gallery/core/data/schema.py
def get_data_samples(self) -> List[DataSample]:
    """
    Retrieve all data samples from the dataset.

    Returns:
        Complete list of DataSample objects in the dataset
    """
    return [data for data in self.datasamples]
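
The list comprehension returns a new list that holds the same sample objects, which amounts to a shallow copy. A minimal sketch of the consequence, using hypothetical dataclass stand-ins (`Sample`, `MiniDataSet`) instead of the real pydantic models:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """Hypothetical stand-in for DataSample."""

    unique_id: str


@dataclass
class MiniDataSet:
    """Hypothetical stand-in for BaseDataSet."""

    datasamples: List[Sample] = field(default_factory=list)

    def get_data_samples(self) -> List[Sample]:
        # Same comprehension as the source above: new list, same elements.
        return [data for data in self.datasamples]


dataset = MiniDataSet(datasamples=[Sample("a"), Sample("b")])
samples = dataset.get_data_samples()

samples.append(Sample("c"))        # only the returned list grows
samples[0].unique_id = "changed"   # the element itself is shared with the dataset
```

Appending to or reordering the returned list leaves the dataset untouched, but mutating a sample in it is visible through the dataset as well.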

to_dict()

Convert dataset to dictionary format for serialization.

Returns:

| Type | Description |
| --- | --- |
| Dict[str, Any] | Dictionary representation with name, metadata, and serialized samples |

Source code in rm_gallery/core/data/schema.py
def to_dict(self) -> Dict[str, Any]:
    """
    Convert dataset to dictionary format for serialization.

    Returns:
        Dictionary representation with name, metadata, and serialized samples
    """
    return {
        "name": self.name,
        "metadata": self.metadata,
        "datasamples": [data.model_dump() for data in self.datasamples],
    }
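
`to_dict` and `from_dict` are designed to round-trip: the dictionary produced by one is accepted by the other. The following sketch reproduces that contract with hypothetical dataclass stand-ins, where `dataclasses.asdict` plays the role that `model_dump()` plays in the real class. It is an illustration under those assumptions, not the library's implementation.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Sample:
    """Hypothetical stand-in for DataSample."""

    unique_id: str
    source: str = ""


@dataclass
class MiniDataSet:
    """Hypothetical stand-in for BaseDataSet."""

    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    datasamples: List[Sample] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        return {
            "name": self.name,
            "metadata": self.metadata,
            # asdict() stands in for pydantic's model_dump() here.
            "datasamples": [asdict(s) for s in self.datasamples],
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MiniDataSet":
        return cls(
            name=data["name"],                   # required key
            metadata=data.get("metadata", {}),   # optional key, defaults to {}
            datasamples=[Sample(**item) for item in data["datasamples"]],
        )


original = MiniDataSet(
    name="demo",
    metadata={"split": "train"},
    datasamples=[Sample("s0", "unit-test")],
)
payload = original.to_dict()
restored = MiniDataSet.from_dict(payload)

# "metadata" may be omitted; "name" and "datasamples" may not.
minimal = MiniDataSet.from_dict({"name": "empty", "datasamples": []})
```

As in the source above, a dictionary without `"name"` or `"datasamples"` raises `KeyError`, because both keys are read with plain subscripting.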

DataOutput

Bases: BaseModel

Output response structure containing the final answer and optional reasoning steps.

Encapsulates both the final response and any intermediate reasoning steps for comprehensive evaluation and analysis.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| answer | Step | Final response step with complete answer |
| steps | Optional[List[Step]] | Optional list of intermediate reasoning steps |

Source code in rm_gallery/core/data/schema.py
class DataOutput(BaseModel):
    """
    Output response structure containing the final answer and optional reasoning steps.

    Encapsulates both the final response and any intermediate reasoning steps
    for comprehensive evaluation and analysis.

    Attributes:
        answer: Final response step with complete answer
        steps: Optional list of intermediate reasoning steps
    """

    answer: Step = Field(default=...)
    steps: Optional[List[Step]] = Field(default=None, description="steps")

DataSample

Bases: BaseModel

Complete data sample structure for reward modeling training and evaluation.

Represents a single interaction with input context, multiple possible outputs, and associated metadata for comprehensive reward model training.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| unique_id | str | Unique identifier for tracking and deduplication |
| input | List[ChatMessage] | Conversation context as list of chat messages |
| output | List[DataOutput] | List of possible responses with evaluations |
| task_category | Optional[str] | Optional categorization for task-specific analysis |
| source | Optional[str] | Origin dataset or system that generated this sample |
| created_at | datetime | Timestamp for temporal tracking |
| metadata | Optional[Dict] | Additional context and debugging information |

Source code in rm_gallery/core/data/schema.py
class DataSample(BaseModel):
    """
    Complete data sample structure for reward modeling training and evaluation.

    Represents a single interaction with input context, multiple possible outputs,
    and associated metadata for comprehensive reward model training.

    Attributes:
        unique_id: Unique identifier for tracking and deduplication
        input: Conversation context as list of chat messages
        output: List of possible responses with evaluations
        task_category: Optional categorization for task-specific analysis
        source: Origin dataset or system that generated this sample
        created_at: Timestamp for temporal tracking
        metadata: Additional context and debugging information
    """

    unique_id: str = Field(..., description="Unique identifier for the data")
    input: List[ChatMessage] = Field(default_factory=list, description="input")
    output: List[DataOutput] = Field(default_factory=list, description="output")
    task_category: Optional[str] = Field(default=None, description="task category")
    source: Optional[str] = Field(default=None, description="source")
    created_at: datetime = Field(default_factory=datetime.now, description="createdAt")
    metadata: Optional[Dict] = Field(default=None, description="metadata")

    def update(self, sample: "DataSample") -> "DataSample":
        """
        Merge another sample's data into this sample for combining evaluations.

        Updates additional_kwargs and reward details from the source sample
        while preserving the original structure.

        Args:
            sample: Source sample to merge data from

        Returns:
            Self with updated data for method chaining
        """
        self.input[-1].additional_kwargs.update(sample.input[-1].additional_kwargs)
        for i, output in enumerate(self.output):
            output.answer.additional_kwargs.update(
                sample.output[i].answer.additional_kwargs
            )
            output.answer.reward.details.extend(sample.output[i].answer.reward.details)

            if output.steps:
                for j, step in enumerate(output.steps):
                    step.additional_kwargs.update(
                        sample.output[i].steps[j].additional_kwargs
                    )
                    step.reward.details.extend(sample.output[i].steps[j].reward.details)
        return self

    class Config:
        arbitrary_types_allowed = True
        json_encoders = {datetime: lambda v: v.isoformat()}
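
Two details of `created_at` are worth illustrating: `default_factory=datetime.now` stamps each instance at construction time, and the configured encoder writes the timestamp as an ISO 8601 string. The sketch below shows both with a hypothetical dataclass stand-in (`StampedSample`); it does not exercise pydantic's own encoder machinery.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class StampedSample:
    """Hypothetical stand-in modeling only unique_id and created_at."""

    unique_id: str
    # Called once per instance, so each sample gets its own timestamp.
    created_at: datetime = field(default_factory=datetime.now)


sample = StampedSample(unique_id="s0")

encoded = sample.created_at.isoformat()   # what the encoder lambda produces
decoded = datetime.fromisoformat(encoded)  # parses the string back losslessly
```

Note that `datetime.now` returns a naive local time, so the encoded string carries no timezone offset.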

update(sample)

Merge another sample's data into this sample for combining evaluations.

Updates additional_kwargs and reward details from the source sample while preserving the original structure.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| sample | DataSample | Source sample to merge data from | required |

Returns:

| Type | Description |
| --- | --- |
| DataSample | Self with updated data for method chaining |

Source code in rm_gallery/core/data/schema.py
def update(self, sample: "DataSample") -> "DataSample":
    """
    Merge another sample's data into this sample for combining evaluations.

    Updates additional_kwargs and reward details from the source sample
    while preserving the original structure.

    Args:
        sample: Source sample to merge data from

    Returns:
        Self with updated data for method chaining
    """
    self.input[-1].additional_kwargs.update(sample.input[-1].additional_kwargs)
    for i, output in enumerate(self.output):
        output.answer.additional_kwargs.update(
            sample.output[i].answer.additional_kwargs
        )
        output.answer.reward.details.extend(sample.output[i].answer.reward.details)

        if output.steps:
            for j, step in enumerate(output.steps):
                step.additional_kwargs.update(
                    sample.output[i].steps[j].additional_kwargs
                )
                step.reward.details.extend(sample.output[i].steps[j].reward.details)
    return self
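
To make the merge semantics concrete, here is a small sketch of the same position-by-position merge on hypothetical dataclass stand-ins (`Message`, `Step`, `Output`, `Sample`, and a `Reward` whose details are plain strings). It pairs each step with its index via `enumerate` and assumes, as the method does, that both samples have a non-empty input and the same number of outputs and steps.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Reward:
    # Plain strings stand in for RewardDimensionWithScore objects.
    details: List[str] = field(default_factory=list)


@dataclass
class Message:
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Step:
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)
    reward: Reward = field(default_factory=Reward)


@dataclass
class Output:
    answer: Step = field(default_factory=Step)
    steps: Optional[List[Step]] = None


@dataclass
class Sample:
    input: List[Message] = field(default_factory=list)
    output: List[Output] = field(default_factory=list)

    def update(self, other: "Sample") -> "Sample":
        # Only the last input message is merged.
        self.input[-1].additional_kwargs.update(other.input[-1].additional_kwargs)
        for i, output in enumerate(self.output):
            theirs = other.output[i]
            output.answer.additional_kwargs.update(theirs.answer.additional_kwargs)
            output.answer.reward.details.extend(theirs.answer.reward.details)
            if output.steps:
                for j, step in enumerate(output.steps):
                    step.additional_kwargs.update(theirs.steps[j].additional_kwargs)
                    step.reward.details.extend(theirs.steps[j].reward.details)
        return self


def make(tag: str, detail: str) -> Sample:
    """Build a one-output, one-step sample tagged by a hypothetical evaluator."""
    return Sample(
        input=[Message(additional_kwargs={tag: True})],
        output=[
            Output(
                answer=Step(additional_kwargs={"by": tag}, reward=Reward([detail])),
                steps=[Step(reward=Reward([detail + "/step"]))],
            )
        ],
    )


a = make("judge_a", "helpfulness")
b = make("judge_b", "safety")
merged = a.update(b)  # mutates and returns a
```

Keys present in both `additional_kwargs` dictionaries are overwritten by the source sample, whereas reward details are appended. Calling `update` twice with the same source therefore duplicates the details.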

Reward

Bases: BaseModel

Reward evaluation result for data samples with detailed scoring breakdown.

Stores both overall score and dimension-specific details for comprehensive reward evaluation tracking and analysis.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| score | float | Overall aggregated reward score (typically weighted average) |
| details | List[RewardDimensionWithScore] | List of individual reward dimensions with their scores |

Source code in rm_gallery/core/data/schema.py
class Reward(BaseModel):
    """
    Reward evaluation result for data samples with detailed scoring breakdown.

    Stores both overall score and dimension-specific details for comprehensive
    reward evaluation tracking and analysis.

    Attributes:
        score: Overall aggregated reward score (typically weighted average)
        details: List of individual reward dimensions with their scores
    """

    score: float = Field(default=0.0, description="score")
    details: List[RewardDimensionWithScore] = Field(
        default_factory=list, description="details"
    )
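
The docstring describes `score` as typically a weighted average of the dimension scores. One way such an aggregate could be computed is sketched below. The `DimensionScore` class and its `weight` field are assumptions made for illustration, since the fields of `RewardDimensionWithScore` are not shown on this page, and `weighted_average` is a hypothetical helper rather than part of the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DimensionScore:
    """Hypothetical stand-in; the real RewardDimensionWithScore may differ."""

    name: str
    score: float
    weight: float = 1.0


def weighted_average(details: List[DimensionScore]) -> float:
    """Aggregate dimension scores into one overall score."""
    total_weight = sum(d.weight for d in details)
    if total_weight == 0:
        # No usable dimensions: fall back to 0.0, the default of Reward.score.
        return 0.0
    return sum(d.score * d.weight for d in details) / total_weight


details = [
    DimensionScore("helpfulness", 0.8, weight=3.0),
    DimensionScore("safety", 0.4, weight=1.0),
]
overall = weighted_average(details)  # (0.8 * 3 + 0.4 * 1) / 4, about 0.7
```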

Step

Bases: ChatMessage

Individual reasoning step in a multi-step process with evaluation metadata.

Extends ChatMessage to include step-specific labels and reward evaluations. Used for tracking intermediate steps in complex reasoning tasks.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| label | Optional[Dict[str, Any]] | Additional labeling information for the step |
| reward | Reward | Reward evaluation specific to this step |

Source code in rm_gallery/core/data/schema.py
class Step(ChatMessage):
    """
    Individual reasoning step in a multi-step process with evaluation metadata.

    Extends ChatMessage to include step-specific labels and reward evaluations.
    Used for tracking intermediate steps in complex reasoning tasks.

    Attributes:
        label: Additional labeling information for the step
        reward: Reward evaluation specific to this step
    """

    label: Optional[Dict[str, Any]] = Field(default={}, description="label")
    reward: Reward = Field(default=Reward(), description="reward")