schema
Core data schema definitions for the reward modeling data pipeline. Provides structured data models for samples, rewards, and datasets with validation.
BaseDataSet
Bases: BaseModel
Container for managing collections of data samples with metadata.
Provides a standardized interface for dataset operations, including indexing, iteration, and serialization, so that data is handled consistently across the pipeline.
Attributes:
Name | Type | Description |
---|---|---|
datasamples | List[DataSample] | Collection of data samples in the dataset |
name | str | Human-readable identifier for the dataset |
metadata | Dict[str, Any] | Additional information about dataset origin and processing |
Source code in rm_gallery/core/data/schema.py
__getitem__(index)
Enable index-based and slice-based access to dataset samples.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | Union[int, slice] | Integer index or slice object for data access | required |
Returns:
Type | Description |
---|---|
Union[DataSample, List[DataSample]] | Single DataSample for integer index, list for slice |
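The integer-versus-slice behaviour described above can be illustrated with a minimal sketch. It uses stdlib dataclasses as stand-ins rather than the pydantic models documented here; `SampleStub` and `DataSetSketch` are hypothetical names, and delegating to the underlying list is an assumption about how such a method could work, not the library's source.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class SampleStub:
    """Hypothetical stand-in for DataSample; carries only an id."""
    unique_id: str


@dataclass
class DataSetSketch:
    """Hypothetical stand-in for BaseDataSet (the documented class is a pydantic BaseModel)."""
    datasamples: List[SampleStub]
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __getitem__(self, index: Union[int, slice]) -> Union[SampleStub, List[SampleStub]]:
        # Assumption: delegate to the list, which yields one sample
        # for an int and a list of samples for a slice.
        return self.datasamples[index]


dataset = DataSetSketch(
    datasamples=[SampleStub(unique_id=f"s{i}") for i in range(4)],
    name="demo",
)
first = dataset[0]   # a single SampleStub
head = dataset[:2]   # a list holding the first two samples
```

Because a slice returns a plain list in this sketch, the result does not carry the dataset's `name` or `metadata`.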
Source code in rm_gallery/core/data/schema.py
__len__()
Return the number of samples in the dataset.
Returns:
Type | Description |
---|---|
int | Integer count of data samples |
Source code in rm_gallery/core/data/schema.py
from_dict(data) classmethod
Create dataset instance from dictionary representation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Any] | Dictionary with dataset structure and sample data | required |
Returns:
Type | Description |
---|---|
BaseDataSet | New BaseDataSet instance with restored data |
Source code in rm_gallery/core/data/schema.py
get_data_samples()
Retrieve all data samples from the dataset.
Returns:
Type | Description |
---|---|
List[DataSample] | Complete list of DataSample objects in the dataset |
Source code in rm_gallery/core/data/schema.py
to_dict()
Convert dataset to dictionary format for serialization.
Returns:
Type | Description |
---|---|
Dict[str, Any] | Dictionary representation with name, metadata, and serialized samples |
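A serialization round trip through `to_dict` and `from_dict` might look like the sketch below. It is built on stdlib dataclasses, not the real pydantic classes. The names `SampleStub` and `DataSetSketch` are hypothetical, and the dictionary keys (`"name"`, `"metadata"`, `"datasamples"`) are an assumption inferred from the attribute names; the actual key layout may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class SampleStub:
    """Hypothetical stand-in for DataSample."""
    unique_id: str


@dataclass
class DataSetSketch:
    """Hypothetical stand-in for BaseDataSet."""
    datasamples: List[SampleStub]
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Assumed layout: name, metadata, and each sample serialized to a dict.
        return {
            "name": self.name,
            "metadata": self.metadata,
            "datasamples": [asdict(sample) for sample in self.datasamples],
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "DataSetSketch":
        # Rebuild each sample from its dict form, then the container.
        return cls(
            datasamples=[SampleStub(**item) for item in data["datasamples"]],
            name=data["name"],
            metadata=data.get("metadata", {}),
        )


original = DataSetSketch([SampleStub("a"), SampleStub("b")], "demo", {"split": "train"})
payload = original.to_dict()
restored = DataSetSketch.from_dict(payload)
```

In this sketch the payload contains only built-in types, so it can be passed to `json.dumps` directly.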
Source code in rm_gallery/core/data/schema.py
DataOutput
Bases: BaseModel
Output response structure containing the final answer and optional reasoning steps.
Encapsulates both the final response and any intermediate reasoning steps for comprehensive evaluation and analysis.
Attributes:
Name | Type | Description |
---|---|---|
answer | Step | Final response step with complete answer |
steps | Optional[List[Step]] | Optional list of intermediate reasoning steps |
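To make the answer-plus-steps shape concrete, here is a minimal sketch using stdlib dataclasses as stand-ins. `StepStub` and `OutputSketch` are hypothetical names, the `role` and `content` fields are an assumption about what a chat message carries, and defaulting `steps` to `None` is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepStub:
    """Hypothetical stand-in for Step; role/content fields are assumed."""
    role: str
    content: str


@dataclass
class OutputSketch:
    """Hypothetical stand-in for DataOutput."""
    answer: StepStub                        # final response
    steps: Optional[List[StepStub]] = None  # intermediate reasoning, if any


# An output with only a final answer.
direct = OutputSketch(answer=StepStub("assistant", "4"))

# An output that also records the reasoning that led to the answer.
reasoned = OutputSketch(
    answer=StepStub("assistant", "4"),
    steps=[
        StepStub("assistant", "The question asks for 2 + 2."),
        StepStub("assistant", "Adding 2 and 2 gives 4."),
    ],
)
```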
Source code in rm_gallery/core/data/schema.py
DataSample
Bases: BaseModel
Complete data sample structure for reward modeling training and evaluation.
Represents a single interaction with input context, multiple possible outputs, and associated metadata for comprehensive reward model training.
Attributes:
Name | Type | Description |
---|---|---|
unique_id | str | Unique identifier for tracking and deduplication |
input | List[ChatMessage] | Conversation context as list of chat messages |
output | List[DataOutput] | List of possible responses with evaluations |
task_category | Optional[str] | Optional categorization for task-specific analysis |
source | Optional[str] | Origin dataset or system that generated this sample |
created_at | datetime | Timestamp for temporal tracking |
metadata | Optional[Dict] | Additional context and debugging information |
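Since `unique_id` is described as the key for deduplication, a short sketch can show how a sample with these fields could be deduplicated. It uses a stdlib dataclass, not the real pydantic model. `MessageStub`, `SampleSketch`, and `deduplicate` are hypothetical names, the field defaults are assumptions, and `output` is simplified to a list of strings.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class MessageStub:
    """Hypothetical stand-in for ChatMessage; role/content fields are assumed."""
    role: str
    content: str


@dataclass
class SampleSketch:
    """Hypothetical stand-in for DataSample; defaults are assumptions."""
    unique_id: str
    input: List[MessageStub] = field(default_factory=list)
    output: List[str] = field(default_factory=list)  # simplified: candidate answers as text
    task_category: Optional[str] = None
    source: Optional[str] = None
    created_at: datetime = field(default_factory=datetime.now)
    metadata: Optional[Dict[str, Any]] = None


def deduplicate(samples: List[SampleSketch]) -> List[SampleSketch]:
    """Keep the first sample seen for each unique_id, preserving order."""
    seen: Dict[str, SampleSketch] = {}
    for sample in samples:
        seen.setdefault(sample.unique_id, sample)
    return list(seen.values())


question = [MessageStub("user", "What is 2 + 2?")]
samples = [
    SampleSketch("a", input=question, output=["4", "5"], source="demo"),
    SampleSketch("b", input=question, output=["four"]),
    SampleSketch("a", input=question, output=["4"]),  # duplicate id, dropped
]
unique = deduplicate(samples)
```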
Source code in rm_gallery/core/data/schema.py
update(sample)
Merge another sample's data into this sample for combining evaluations.
Updates additional_kwargs and reward details from the source sample while preserving the original structure.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | DataSample | Source sample to merge data from | required |
Returns:
Type | Description |
---|---|
DataSample | Self with updated data for method chaining |
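The exact merge rule is not spelled out here, so the following is only one plausible reading, sketched with stdlib dataclasses. It assumes answers are paired by position, `additional_kwargs` is merged key by key, and reward details are appended. `RewardStub`, `AnswerStub`, and `SampleSketch` are hypothetical names, and placing `additional_kwargs` on the answer is an assumption, not the library's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class RewardStub:
    """Hypothetical stand-in for Reward; details simplified to labels."""
    details: List[str] = field(default_factory=list)


@dataclass
class AnswerStub:
    """Hypothetical stand-in for an output's answer step."""
    content: str
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)
    reward: RewardStub = field(default_factory=RewardStub)


@dataclass
class SampleSketch:
    """Hypothetical stand-in for DataSample."""
    unique_id: str
    answers: List[AnswerStub]

    def update(self, sample: "SampleSketch") -> "SampleSketch":
        # Assumption: answers are matched by position.
        for mine, theirs in zip(self.answers, sample.answers):
            # Merge extra keyword data; keys from the source win on conflict.
            mine.additional_kwargs.update(theirs.additional_kwargs)
            # Append the source's reward details to the existing ones.
            mine.reward.details.extend(theirs.reward.details)
        return self  # returning self allows chained calls


base = SampleSketch("s1", [AnswerStub("4", {"judge": "a"}, RewardStub(["helpfulness"]))])
extra = SampleSketch("s1", [AnswerStub("4", {"latency_ms": 12}, RewardStub(["accuracy"]))])
merged = base.update(extra)
```

Here `merged` is the same object as `base`; the answer text is left alone and only the auxiliary data grows.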
Source code in rm_gallery/core/data/schema.py
Reward
Bases: BaseModel
Reward evaluation result for data samples with detailed scoring breakdown.
Stores both overall score and dimension-specific details for comprehensive reward evaluation tracking and analysis.
Attributes:
Name | Type | Description |
---|---|---|
score | float | Overall aggregated reward score (typically weighted average) |
details | List[RewardDimensionWithScore] | List of individual reward dimensions with their scores |
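As a worked illustration of a score that is "typically" a weighted average, the sketch below derives an overall score from per-dimension scores. It uses stdlib dataclasses as stand-ins. `DimensionStub`, `RewardSketch`, and `weighted_average` are hypothetical names, and the `name`, `score`, and `weight` fields are assumptions about what a reward dimension carries.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DimensionStub:
    """Hypothetical stand-in for RewardDimensionWithScore; fields are assumed."""
    name: str
    score: float
    weight: float = 1.0


@dataclass
class RewardSketch:
    """Hypothetical stand-in for Reward."""
    score: float = 0.0
    details: List[DimensionStub] = field(default_factory=list)


def weighted_average(details: List[DimensionStub]) -> float:
    """Sum of score * weight divided by the sum of weights (0.0 if no weight)."""
    total_weight = sum(d.weight for d in details)
    if total_weight == 0:
        return 0.0
    return sum(d.score * d.weight for d in details) / total_weight


details = [
    DimensionStub("helpfulness", score=1.0, weight=3.0),
    DimensionStub("safety", score=0.5, weight=1.0),
]
# (1.0 * 3.0 + 0.5 * 1.0) / (3.0 + 1.0) = 0.875
reward = RewardSketch(score=weighted_average(details), details=details)
```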
Source code in rm_gallery/core/data/schema.py
Step
Bases: ChatMessage
Individual reasoning step in a multi-step process with evaluation metadata.
Extends ChatMessage to include step-specific labels and reward evaluations. Used for tracking intermediate steps in complex reasoning tasks.
Attributes:
Name | Type | Description |
---|---|---|
label | Optional[Dict[str, Any]] | Additional labeling information for the step |
reward | Reward | Reward evaluation specific to this step |
Source code in rm_gallery/core/data/schema.py