Data Format | Twinkle

Message

Mon, 01 Jan 0001 00:00:00 +0000

A message represents a single round of information in a model conversation. The message definition is:


from typing import Any, Dict, List, Literal, Optional, TypedDict, Union

class FunctionCall(TypedDict, total=False):
    name: str
    arguments: Union[str, Dict[str, Any]]

class ToolCall(TypedDict, total=False):
    id: str
    type: Literal['function']
    function: FunctionCall

class Message(TypedDict, total=False):
    role: Literal['system', 'user', 'assistant', 'tool']
    type: str
    content: Union[str, List[Dict[str, str]]]
    tool_calls: List[ToolCall]
    tool_call_id: str
    reasoning_content: str
    images: Optional[List[Union[str, Any]]]
    videos: Optional[List[Union[str, Any]]]
    audios: Optional[List[Union[str, Any]]]

Message is essentially a Dict. Of its fields, the following matter most to developers:

role: Message type, one of 'system', 'user', 'assistant', 'tool'.
- system: System instruction message; appears only as the first message (index 0)
- user: User input message
- assistant: Model reply message
- tool: Tool call result; fed to the model as input, much like a user message
content: Message body. If it contains multimodal information, placeholders mark where each item belongs:
- <image>: Image placeholder
- <video>: Video placeholder
- <audio>: Audio placeholder

<image>The image shows a grassland with three rabbits on it.

tool_calls: List of tool calls requested by the model, usually parsed from the content of the assistant message.
- ToolCall matches the OpenAI chat-completion schema: the outer dict is {type: "function", function: {...}}, with the tool name at function.name. arguments must be a dict at chat-template render time (dispatch also accepts a JSON string).
images: Original image information contained in the message
videos: Original video information contained in the message
audios: Original audio information contained in the message
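
The fields above can be combined as in the following minimal sketch. The TypedDict definitions are trimmed local copies of the ones shown earlier so that the snippet runs on its own. `normalize_arguments` is a hypothetical helper, not part of Twinkle; it only illustrates turning JSON-string `arguments` into a dict before chat-template rendering. The tool name `get_weather` and the file name `photo.jpg` are made up for illustration.

```python
import json
from typing import Any, Dict, List, Literal, TypedDict, Union


class FunctionCall(TypedDict, total=False):
    name: str
    arguments: Union[str, Dict[str, Any]]


class ToolCall(TypedDict, total=False):
    id: str
    type: Literal['function']
    function: FunctionCall


class Message(TypedDict, total=False):
    role: Literal['system', 'user', 'assistant', 'tool']
    content: Union[str, List[Dict[str, str]]]
    tool_calls: List[ToolCall]
    tool_call_id: str
    images: List[Any]


def normalize_arguments(call: ToolCall) -> ToolCall:
    """Hypothetical helper: return a copy whose arguments are a dict."""
    function = dict(call['function'])
    if isinstance(function.get('arguments'), str):
        function['arguments'] = json.loads(function['arguments'])
    normalized = dict(call)
    normalized['function'] = function
    return normalized


messages: List[Message] = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user',
     'content': '<image>What is the weather like where this photo was taken?',
     'images': ['photo.jpg']},
    {'role': 'assistant', 'content': '', 'tool_calls': [{
        'id': 'call_0',
        'type': 'function',
        'function': {'name': 'get_weather', 'arguments': '{"city": "Hangzhou"}'},
    }]},
    {'role': 'tool', 'tool_call_id': 'call_0', 'content': '{"weather": "sunny"}'},
]

# Make sure every tool call carries dict arguments before rendering.
for message in messages:
    if 'tool_calls' in message:
        message['tool_calls'] = [normalize_arguments(c) for c in message['tool_calls']]
```

Because every class uses `total=False`, each message carries only the keys it needs; the system message, for instance, has no `tool_calls` key at all.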

Trajectory


After dataset ETL, the raw data structure passed to Template is a Trajectory. The name follows agentic RL conventions and denotes the actual record of a model's multi-turn conversation.

class Trajectory(TypedDict, total=False):
    messages: List[Message]
    tools: List[Tool]
    user_data: List[Tuple[str, Any]]

messages: A list of Message objects representing the multi-turn conversation the model actually carried out, usually alternating between user and assistant.
tools: A list of all tools available to the model in this call
user_data: User-defined data, such as labels in KTO training

For preference alignment training such as DPO, preprocessors return a dict of the form {'positive': List[Trajectory], 'negative': List[Trajectory]}.

Trajectory is the standard interface for all dataset preprocessing outputs and template inputs in Twinkle. The format conversion goes from the original dataset to Trajectory, and then to InputFeature.
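
As a rough illustration of the preprocessing step, the sketch below turns one raw dataset row into the preference-pair format described above and builds a single Trajectory carrying `user_data`. Everything outside the Trajectory fields is an assumption: the row keys (`question`, `chosen`, `rejected`), the helper `row_to_preference_pair`, the `('label', True)` entry, and the use of plain dicts for `Message` and `Tool`, whose real definitions live elsewhere.

```python
from typing import Any, Dict, List, Tuple, TypedDict

Message = Dict[str, Any]  # stand-in for the Message TypedDict
Tool = Dict[str, Any]     # assumption: Tool's definition is not shown here


class Trajectory(TypedDict, total=False):
    messages: List[Message]
    tools: List[Tool]
    user_data: List[Tuple[str, Any]]


def row_to_preference_pair(row: Dict[str, str]) -> Dict[str, List[Trajectory]]:
    """Hypothetical preprocessor for a row with question/chosen/rejected keys."""
    prompt: Message = {'role': 'user', 'content': row['question']}

    def build(answer: str) -> Trajectory:
        # Copy the prompt so the two trajectories do not share one dict.
        return Trajectory(messages=[dict(prompt), {'role': 'assistant', 'content': answer}])

    return {'positive': [build(row['chosen'])], 'negative': [build(row['rejected'])]}


# A single trajectory with user-defined data; the key name 'label' is illustrative.
kto_sample = Trajectory(
    messages=[{'role': 'user', 'content': '1 + 1 = ?'},
              {'role': 'assistant', 'content': '2'}],
    user_data=[('label', True)],
)

pair = row_to_preference_pair({'question': '1 + 1 = ?', 'chosen': '2', 'rejected': '3'})
```

Both branches of `pair` share the same prompt text and differ only in the final assistant message.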

Model Input


The class used by Twinkle to represent model input is InputFeature, which is adapted to model structures such as transformers/megatron.

InputType = Union[List[List[int]], List[int], np.ndarray, Any]

class InputFeature(TypedDict, total=False):
    # Text-related fields
    input_ids: InputType
    attention_mask: InputType
    position_ids: InputType
    labels: InputType

InputFeature is essentially a Dict. Its input comes from the output of the Template component.

input_ids: Token IDs produced by applying the template to the List[Message]
attention_mask: Attention mask
position_ids: Position IDs, also used to tell samples apart
labels: Training labels, already shifted left by one token

In the case of packing or padding_free, fields such as input_ids are concatenated from lists of multiple samples. In multimodal scenarios, InputFeature contains other multimodal fields.

InputFeature is the standard interface for all template outputs and model inputs in Twinkle.
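
The one-token left shift and the concatenation used for packing can be sketched with plain lists. This is an illustration under stated assumptions rather than Twinkle's Template code: `build_feature` and `pack` are hypothetical helpers, `-100` is assumed as the ignore value (the usual PyTorch cross-entropy convention), and restarting `position_ids` at 0 for each sample is one common way to distinguish samples inside a packed sequence.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # assumption: the usual PyTorch cross-entropy ignore value


def build_feature(input_ids: List[int], num_prompt_tokens: int) -> Dict[str, List[int]]:
    """Build one sample; only tokens after the prompt contribute to the loss."""
    unshifted = [IGNORE_INDEX] * num_prompt_tokens + input_ids[num_prompt_tokens:]
    # One-token left shift: labels[i] is the token that follows input_ids[i].
    labels = unshifted[1:] + [IGNORE_INDEX]
    return {
        'input_ids': input_ids,
        'attention_mask': [1] * len(input_ids),
        'position_ids': list(range(len(input_ids))),
        'labels': labels,
    }


def pack(features: List[Dict[str, List[int]]]) -> Dict[str, List[int]]:
    """Concatenate several samples field by field into one sequence."""
    packed: Dict[str, List[int]] = {key: [] for key in features[0]}
    for feature in features:
        for key, values in feature.items():
            packed[key].extend(values)
    return packed


first = build_feature([11, 12, 13, 14], num_prompt_tokens=2)
second = build_feature([21, 22, 23], num_prompt_tokens=1)
packed = pack([first, second])

print(first['labels'])         # [-100, 13, 14, -100]
print(packed['position_ids'])  # [0, 1, 2, 3, 0, 1, 2]
```

Because the labels are shifted before packing, the last position of each sample is ignored and never asked to predict the first token of the next sample.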

Model Output


The class used by Twinkle to represent model output is ModelOutput, which is adapted to model structures such as transformers/megatron.

class ModelOutput(TypedDict, total=False):
    logits: OutputType
    loss: OutputType

ModelOutput is essentially a Dict. Its fields come from the model’s output and loss calculation.

logits: Generally of shape [BatchSize, SequenceLength, VocabSize]; used together with labels to compute the loss
loss: The computed loss value

ModelOutput is the standard interface for all model outputs in Twinkle.
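
To make the relationship between `logits`, `labels`, and `loss` concrete, here is a minimal pure-Python sketch of token-level cross-entropy. It is not Twinkle's loss implementation: `cross_entropy` is a hypothetical helper, `-100` is an assumed ignore value, nested lists stand in for tensors, and the labels are taken to be already shifted, so no further shift is applied.

```python
import math
from typing import List

IGNORE_INDEX = -100  # assumption: positions with this label are skipped


def cross_entropy(logits: List[List[List[float]]], labels: List[List[int]]) -> float:
    """Mean negative log-likelihood over positions whose label is not ignored."""
    total, count = 0.0, 0
    for sample_logits, sample_labels in zip(logits, labels):
        for position_logits, label in zip(sample_logits, sample_labels):
            if label == IGNORE_INDEX:
                continue
            # Log of the softmax normalizer, with the max subtracted for stability.
            peak = max(position_logits)
            log_norm = peak + math.log(sum(math.exp(x - peak) for x in position_logits))
            total += log_norm - position_logits[label]
            count += 1
    return total / max(count, 1)


# logits has shape [BatchSize=1, SequenceLength=2, VocabSize=3]
logits = [[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
labels = [[0, IGNORE_INDEX]]

# A ModelOutput-like dict holding both fields.
output = {'logits': logits, 'loss': cross_entropy(logits, labels)}
```

Only the first position contributes here, so the loss equals `log(e^2 + 2) - 2`.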

Sampling Output


Sampling output is a data format used to represent input parameters and return results of the sampling process.

SamplingParams

Sampling parameters are used to control the model’s sampling behavior.

@dataclass
class SamplingParams:
    max_tokens: Optional[int] = None
    seed: Optional[int] = None
    stop: Union[str, Sequence[str], Sequence[int], None] = None
    temperature: float = 1.0
    top_k: int = -1
    top_p: float = 1.0
    repetition_penalty: float = 1.0

max_tokens: Maximum number of tokens to generate
seed: Random seed
stop: Stop sequences, can be a string, sequence of strings, or sequence of token ids
temperature: Temperature parameter controlling sampling randomness. 0 means greedy sampling
top_k: Top-K sampling parameter, -1 means not used
top_p: Top-P (nucleus) sampling parameter
repetition_penalty: Repetition penalty coefficient
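
For readers unfamiliar with these knobs, the sketch below shows how `temperature`, `top_k`, `top_p`, and `seed` typically act on one step of decoding. It is a simplified illustration, not the behavior of any particular inference engine: `sample_token` is a hypothetical function, `repetition_penalty` and `stop` are left out, and real engines may order or implement the filters differently.

```python
import math
import random
from typing import List, Optional


def sample_token(logits: List[float], temperature: float = 1.0, top_k: int = -1,
                 top_p: float = 1.0, seed: Optional[int] = None) -> int:
    """Pick one token index from raw logits."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-K: keep only the k most likely tokens (-1 disables the filter).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    # Top-P: keep the smallest prefix whose renormalized mass reaches top_p.
    mass = sum(probs[i] for i in order)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index] / mass
        if cumulative >= top_p:
            break

    rng = random.Random(seed)
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


greedy = sample_token([0.1, 3.0, 0.2], temperature=0)
narrowed = sample_token([0.1, 3.0, 0.2], top_k=1, seed=0)
```

Both calls return index 1: the first because greedy decoding takes the arg-max, the second because `top_k=1` leaves only one candidate.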

Conversion Methods

SamplingParams provides conversion methods to adapt to different inference engines:

# Convert to vLLM's SamplingParams
vllm_params = params.to_vllm(num_samples=4, logprobs=True, prompt_logprobs=0)

# Convert to transformers' generate parameters
gen_kwargs = params.to_transformers(tokenizer=tokenizer)

SampleResponse

Sample response is the result data structure returned by the sampler.

@dataclass
class SampleResponse:
    trajectories: List[Trajectory]
    logprobs: Optional[List[List[float]]] = None
    prompt_logprobs: Optional[List[List[float]]] = None
    stop_reason: Optional[List[StopReason]] = None

trajectories: List of generated trajectories
logprobs: Log probabilities of generated tokens
prompt_logprobs: Log probabilities of prompt tokens
stop_reason: Stop reason, can be “length” (reached max length) or “stop” (encountered stop sequence)
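
A typical consumer of these fields sums the token log probabilities per sample and checks whether generation was cut off. The sketch below assumes that `logprobs[i]` and `stop_reason[i]` describe `trajectories[i]`, which the text above does not state explicitly. The dataclass is a local copy so the snippet runs on its own, `StopReason` is modeled as a `Literal` of the two values listed above, and `summarize` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Literal, Optional

Trajectory = Dict[str, Any]             # stand-in for the Trajectory TypedDict
StopReason = Literal['length', 'stop']  # assumption: based on the two listed values


@dataclass
class SampleResponse:
    trajectories: List[Trajectory]
    logprobs: Optional[List[List[float]]] = None
    prompt_logprobs: Optional[List[List[float]]] = None
    stop_reason: Optional[List[StopReason]] = None


def summarize(response: SampleResponse) -> List[Dict[str, Any]]:
    """Per-sample sequence log probability and truncation flag."""
    rows = []
    for i in range(len(response.trajectories)):
        token_logprobs = response.logprobs[i] if response.logprobs else None
        reason = response.stop_reason[i] if response.stop_reason else None
        rows.append({
            'index': i,
            # Sum of token log probabilities = log probability of the sequence.
            'sequence_logprob': sum(token_logprobs) if token_logprobs is not None else None,
            # "length" means the sample hit the maximum length.
            'truncated': reason == 'length' if reason is not None else None,
        })
    return rows


response = SampleResponse(
    trajectories=[{'messages': []}, {'messages': []}],
    logprobs=[[-0.5, -0.25], [-1.0]],
    stop_reason=['stop', 'length'],
)
rows = summarize(response)
```

Optional fields that were not requested stay `None`, so the helper reports `None` for them instead of failing.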

Usage example:

from twinkle.data_format import SamplingParams, SampleResponse
from twinkle.sampler import vLLMSampler

sampler = vLLMSampler(model_id='ms://Qwen/Qwen3.5-4B')
params = SamplingParams(max_tokens=512, temperature=0.7, top_p=0.9)
# `trajectories` is a List[Trajectory] of prompts prepared beforehand
response: SampleResponse = sampler.sample(trajectories, sampling_params=params, num_samples=4)

# Access generated trajectories; Trajectory is a TypedDict, so use key access
for traj in response.trajectories:
    print(traj['messages'])

Model Output


Detailed type definition for model output.

OutputType

OutputType defines the data types supported by model output:

OutputType = Union[np.ndarray, 'torch.Tensor', List[Any]]

Supports NumPy arrays, PyTorch tensors, or lists of any type.

ModelOutput

ModelOutput is the standard class used by Twinkle to represent model output. This class is adapted for model structures such as transformers/megatron.

Usage example:

from twinkle.data_format import ModelOutput

# In the model's forward method
def forward(self, inputs):
    ...
    return ModelOutput(
        logits=logits,
        loss=loss,
    )

Note: ModelOutput is defined using TypedDict, meaning it’s a regular dict at runtime but provides type hints during type checking.
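
The runtime behavior described in the note can be checked directly. The snippet below uses a local copy of the class, with `List[Any]` standing in for `OutputType` so that it needs neither NumPy nor PyTorch.

```python
from typing import Any, List, TypedDict


class ModelOutput(TypedDict, total=False):
    logits: List[Any]  # stand-in for OutputType
    loss: List[Any]


# Calling the class builds an ordinary dict; total=False lets keys be omitted.
output = ModelOutput(loss=[0.25])

is_plain_dict = type(output) is dict
has_logits = 'logits' in output

# TypedDict classes cannot be used with isinstance(); Python raises TypeError.
try:
    isinstance(output, ModelOutput)
except TypeError:
    supports_isinstance = False
else:
    supports_isinstance = True
```

Since the type information exists only for static checkers, code that needs to validate an output at runtime should test for `dict` and for the keys it expects.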