<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>LLM Tools | Twinkle</title><link>https://modelscope.github.io/twinkle-web/tags/llm-tools/</link><atom:link href="https://modelscope.github.io/twinkle-web/tags/llm-tools/index.xml" rel="self" type="application/rss+xml"/><description>LLM Tools</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 01 Jun 2026 00:00:00 +0000</lastBuildDate><image><url>https://modelscope.github.io/twinkle-web/media/logo_hu_fedc6a0bfe689b18.png</url><title>LLM Tools</title><link>https://modelscope.github.io/twinkle-web/tags/llm-tools/</link></image><item><title>TUI &amp; Auto-Research: An AI Agent for Training Control</title><link>https://modelscope.github.io/twinkle-web/blog/tui-auto-research/</link><pubDate>Mon, 01 Jun 2026 00:00:00 +0000</pubDate><guid>https://modelscope.github.io/twinkle-web/blog/tui-auto-research/</guid><description>&lt;p&gt;Twinkle ships a terminal-based UI (TUI) powered by an embedded LLM agent that can autonomously start, monitor, pause, and debug ML training runs. This post covers the architecture of the TUI, the agent loop, and the tool system that makes &amp;ldquo;auto-research&amp;rdquo; possible.&lt;/p&gt;
&lt;h2 id="architecture-overview"&gt;Architecture Overview&lt;/h2&gt;
&lt;p&gt;The TUI is built on Textual
and consists of four panels in a grid of 2 columns and 3 rows:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Panel&lt;/th&gt;
&lt;th&gt;Position&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;StatusBar&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Top, full width&lt;/td&gt;
&lt;td&gt;Run ID, model, step counter, training state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MetricsPanel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Middle left&lt;/td&gt;
&lt;td&gt;Real-time loss/reward/grad_norm charts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LogPanel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Right, spanning 2 rows&lt;/td&gt;
&lt;td&gt;Streaming stdout from training process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ChatPanel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bottom left&lt;/td&gt;
&lt;td&gt;Natural language interaction with the agent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-css" data-lang="css"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;Screen&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;grid-size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;grid-rows&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;auto&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;fr&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;fr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;grid-columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;fr&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;fr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="the-agent-loop"&gt;The Agent Loop&lt;/h2&gt;
&lt;p&gt;At the heart of the TUI is &lt;code&gt;AgentLoop&lt;/code&gt; — an async tool-calling agent that uses any &lt;strong&gt;OpenAI-compatible API&lt;/strong&gt; (local Ollama, cloud API, etc.):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AgentLoop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;llm_base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;http://localhost:11434/v1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;llm_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;qwen3.5&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;llm_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;not-needed&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The loop follows a standard ReAct pattern:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;User sends a message via ChatPanel&lt;/li&gt;
&lt;li&gt;Agent calls LLM with conversation history + tool schemas&lt;/li&gt;
&lt;li&gt;LLM either responds directly or generates tool calls&lt;/li&gt;
&lt;li&gt;Tools are executed, results fed back to LLM&lt;/li&gt;
&lt;li&gt;Repeat until LLM produces a final text response (max 10 rounds)&lt;/li&gt;
&lt;/ol&gt;
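&lt;p&gt;The steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not Twinkle&amp;rsquo;s actual &lt;code&gt;AgentLoop&lt;/code&gt;: &lt;code&gt;run_agent&lt;/code&gt;, &lt;code&gt;call_llm&lt;/code&gt;, the message dictionaries, and the &lt;code&gt;tools&lt;/code&gt; mapping are hypothetical names, and the LLM is replaced by a scripted function so the example runs offline.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;import json

MAX_ROUNDS = 10  # round cap taken from the list above


def run_agent(user_message, call_llm, tools, max_rounds=MAX_ROUNDS):
    """Hypothetical sketch of a tool-calling loop; not Twinkle's AgentLoop."""
    history = [{'role': 'user', 'content': user_message}]
    for _ in range(max_rounds):
        reply = call_llm(history)  # assumed to return a message dict
        history.append(reply)
        calls = reply.get('tool_calls') or []
        if not calls:
            return reply['content']  # final text response ends the loop
        for call in calls:
            # Run each requested tool and feed its result back to the LLM.
            result = tools[call['name']](**call['arguments'])
            history.append({'role': 'tool', 'name': call['name'],
                            'content': json.dumps(result)})
    return 'Stopped: reached the maximum number of tool rounds.'


def scripted_llm(history):
    """Stand-in for a real LLM: one tool call, then a text answer."""
    if history[-1]['role'] == 'user':
        return {'role': 'assistant', 'content': None,
                'tool_calls': [{'name': 'get_training_status',
                                'arguments': {'run_id': 'demo'}}]}
    return {'role': 'assistant', 'content': 'Run demo is at step 42.'}


# The tool signature here is an assumption made for the example.
tools = {'get_training_status': lambda run_id: {'run_id': run_id, 'step': 42}}
answer = run_agent('How is my run doing?', scripted_llm, tools)
print(answer)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;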
&lt;p&gt;Key design decisions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Streaming&lt;/strong&gt;: Tokens are streamed to the UI in real time. If tool calls are detected mid-stream, &lt;code&gt;on_stream_reset&lt;/code&gt; discards the partial output.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;History pruning&lt;/strong&gt;: Conversation history is capped at 50 messages (excluding the system prompt), and cuts always fall on &lt;code&gt;user&lt;/code&gt; message boundaries to avoid breaking tool-call sequences.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Async skills loading&lt;/strong&gt;: Skills load in the background, so the agent is usable immediately; they are injected via &lt;code&gt;inject_skills()&lt;/code&gt; once ready.&lt;/li&gt;
&lt;/ul&gt;
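&lt;p&gt;The history-pruning rule is the easiest of these to get subtly wrong, so here is one way it could look. This is a sketch, not the project&amp;rsquo;s code: &lt;code&gt;prune_history&lt;/code&gt; and the message shape are assumptions. Only the 50-message cap and the cut at &lt;code&gt;user&lt;/code&gt; boundaries come from the description above.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;MAX_MESSAGES = 50  # cap stated above; the system prompt is not counted


def prune_history(messages, cap=MAX_MESSAGES):
    """Hypothetical sketch: keep the system prompt, drop the oldest turns,
    and start the kept tail at a 'user' message."""
    system = [m for m in messages if m['role'] == 'system']
    rest = [m for m in messages if m['role'] != 'system']
    overflow = max(0, len(rest) - cap)
    if overflow == 0:
        return list(messages)  # already within the cap
    tail = rest[overflow:]
    # Advance to the first user message so that no tool result is kept
    # without the assistant tool call that produced it.
    start = next((i for i, m in enumerate(tail) if m['role'] == 'user'),
                 len(tail))
    return system + tail[start:]


# Build 30 turns of user / tool call / tool result / answer.
history = [{'role': 'system', 'content': 'You control training runs.'}]
for turn in range(30):
    history.append({'role': 'user', 'content': f'question {turn}'})
    history.append({'role': 'assistant', 'content': None})
    history.append({'role': 'tool', 'content': 'result'})
    history.append({'role': 'assistant', 'content': f'answer {turn}'})

pruned = prune_history(history)
print(len(pruned), pruned[1]['content'])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Cutting at a &lt;code&gt;user&lt;/code&gt; boundary can leave slightly fewer than 50 messages, which is the price of never splitting a tool call from its result.&lt;/p&gt;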
&lt;h2 id="tool-system"&gt;Tool System&lt;/h2&gt;
&lt;p&gt;The agent has access to 15+ tools organized into categories:&lt;/p&gt;
&lt;h3 id="training-lifecycle"&gt;Training Lifecycle&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;start_server&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Launch Ray cluster + Twinkle Server (GPU partition, config generation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;shutdown_server&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stop server and release GPU resources&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;start_training&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Write training script, launch process, begin monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pause_training&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SIGKILL client process (server retains state)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;resume_training&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Re-launch client script from saved state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stop_training&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Graceful stop with checkpoint saving&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;update_script&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Archive current script, write new version&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="discovery--search"&gt;Discovery &amp;amp; Search&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_training_runs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List active and historical runs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_training_status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Get run state + recent metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;search_models&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search ModelScope Hub for models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;search_datasets&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search ModelScope Hub for datasets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_supported_models&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Query server for available models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_cluster_info&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Detect GPU resources (Ray or nvidia-smi)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="visualization"&gt;Visualization&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;zoom_metrics&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pan/zoom the metrics chart&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;select_metrics&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Choose which metrics to display (max 4)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;select_run&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch monitoring to a different run&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="server-startup-pipeline"&gt;Server Startup Pipeline&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;start_server&lt;/code&gt; tool orchestrates a complete server deployment:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Hardware detection&lt;/strong&gt; — &lt;code&gt;nvidia-smi&lt;/code&gt; GPU count&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GPU allocation&lt;/strong&gt; — Partition GPUs between training model and sampler/teacher models&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Config generation&lt;/strong&gt; — Auto-generate &lt;code&gt;server_config.yaml&lt;/code&gt; with Ray Serve applications&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray cluster start&lt;/strong&gt; — Multi-node GPU partitioning with separate raylets per role&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Server launch&lt;/strong&gt; — &lt;code&gt;python -m twinkle.server launch --config ...&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Health check&lt;/strong&gt; — Poll &lt;code&gt;/api/v1/healthz&lt;/code&gt; + sampler engine readiness&lt;/li&gt;
&lt;/ol&gt;
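&lt;p&gt;The final health-check step amounts to polling until the server answers. The sketch below shows the general shape under stated assumptions: &lt;code&gt;wait_until_healthy&lt;/code&gt;, &lt;code&gt;http_probe&lt;/code&gt;, the attempt count, and the interval are hypothetical, and the server address is left as a parameter because the post does not state it. Only the &lt;code&gt;/api/v1/healthz&lt;/code&gt; path comes from the list above, and the sampler readiness check is not covered.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;import time
import urllib.error
import urllib.request


def wait_until_healthy(probe, attempts=30, interval_s=1.0):
    """Hypothetical sketch: call probe() until it returns True or attempts run out."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(interval_s)
    return False


def http_probe(base_url):
    """Build a probe that treats HTTP 200 from /api/v1/healthz as healthy."""
    def probe():
        try:
            with urllib.request.urlopen(base_url + '/api/v1/healthz',
                                        timeout=2) as resp:
                return resp.status == 200
        except (urllib.error.URLError, OSError):
            return False  # server not reachable yet
    return probe


# Offline demo: a fake probe that reports healthy on its third call.
calls = {'n': 0}


def fake_probe():
    calls['n'] += 1
    return calls['n'] == 3


ready = wait_until_healthy(fake_probe, attempts=5, interval_s=0.0)
print(ready, calls['n'])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Against a running server, the probe would be built with &lt;code&gt;http_probe(base_url)&lt;/code&gt;, where &lt;code&gt;base_url&lt;/code&gt; is whatever address the server was launched on.&lt;/p&gt;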
&lt;p&gt;The config generator supports &lt;strong&gt;multi-model topology&lt;/strong&gt;: one training model plus N sampler/teacher models. Models are sorted by placement group (PG) size so that the largest PG deploys first, which avoids a scheduling deadlock.&lt;/p&gt;
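&lt;p&gt;The largest-first ordering could be expressed as a single sort. This is an illustrative sketch only: the &lt;code&gt;deployment_order&lt;/code&gt; helper, the model dictionaries, and the &lt;code&gt;num_gpus&lt;/code&gt; field are assumptions, not the generator&amp;rsquo;s real data model.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;def deployment_order(models):
    """Hypothetical sketch: order models so the largest GPU request deploys first."""
    return sorted(models, key=lambda m: m['num_gpus'], reverse=True)


# One training model plus two auxiliary models (names are made up).
models = [
    {'name': 'sampler', 'role': 'sampler', 'num_gpus': 2},
    {'name': 'policy', 'role': 'train', 'num_gpus': 4},
    {'name': 'teacher', 'role': 'teacher', 'num_gpus': 1},
]
ordered = [m['name'] for m in deployment_order(models)]
print(ordered)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;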
&lt;h2 id="skills-system"&gt;Skills System&lt;/h2&gt;
&lt;p&gt;The TUI supports extensible &lt;strong&gt;skills&lt;/strong&gt; — pluggable capabilities loaded from three sources:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Bundled skills&lt;/strong&gt; — shipped with the &lt;code&gt;twinkle_client&lt;/code&gt; package&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Local skills&lt;/strong&gt; — user-defined in &lt;code&gt;~/.cache/twinkle/tui/skills/local/&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Community skills&lt;/strong&gt; — fetched from ModelScope (with 10s timeout)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Skills are loaded asynchronously after the agent starts, so the TUI is interactive immediately.&lt;/p&gt;
&lt;h2 id="trainingruntime-script-side-integration"&gt;TrainingRuntime: Script-Side Integration&lt;/h2&gt;
&lt;p&gt;Training scripts integrate with the TUI via &lt;code&gt;TrainingRuntime&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;twinkle_client.tui.runtime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TrainingRuntime&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;rt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TrainingRuntime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;grpo-gsm8k&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Qwen/Qwen3.5-4B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;register_graceful_shutdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dataloader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataloader&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# train() is assumed to return both values&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Step &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;, loss=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s1"&gt;.4f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;finish&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Key features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;metrics.jsonl&lt;/strong&gt;: structured metrics with automatic timestamps, streamed to the TUI in real time&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Graceful shutdown&lt;/strong&gt;: a SIGTERM handler saves a checkpoint (LoRA weights, optimizer state, and dataloader position)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Auto-resume&lt;/strong&gt;: &lt;code&gt;get_resume_info()&lt;/code&gt; reads the last saved step from &lt;code&gt;meta.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Script archival&lt;/strong&gt;: each &lt;code&gt;update_script&lt;/code&gt; call archives &lt;code&gt;train.py&lt;/code&gt; as &lt;code&gt;train_v{N}.py&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
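&lt;p&gt;To make the &lt;code&gt;metrics.jsonl&lt;/code&gt; idea concrete, the sketch below appends one timestamped JSON object per line and reads the file back. It is an assumption about the general shape of such a file, not the format &lt;code&gt;TrainingRuntime&lt;/code&gt; actually writes: the standalone &lt;code&gt;log_metrics&lt;/code&gt; and &lt;code&gt;read_metrics&lt;/code&gt; helpers and the &lt;code&gt;timestamp&lt;/code&gt; field name are hypothetical.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;import json
import tempfile
import time
from pathlib import Path


def log_metrics(path, step, **metrics):
    """Hypothetical sketch: append one JSON object per line with a timestamp."""
    record = {'step': step, 'timestamp': time.time(), **metrics}
    with open(path, 'a', encoding='utf-8') as f:
        f.write(json.dumps(record) + '\n')


def read_metrics(path):
    """Read every record back, as a consumer such as a TUI panel might."""
    with open(path, encoding='utf-8') as f:
        return [json.loads(line) for line in f if line.strip()]


# Write three steps into a temporary run directory, then read them back.
run_dir = Path(tempfile.mkdtemp())
metrics_file = run_dir / 'metrics.jsonl'
for step, loss in enumerate([0.9, 0.7, 0.5]):
    log_metrics(metrics_file, step=step, loss=loss, reward=1.0 - loss)

records = read_metrics(metrics_file)
print(records[-1]['step'], records[-1]['loss'])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;An append-only, line-delimited file suits this use: a reader can tail it while training is still writing, and a partially written last line affects only that record.&lt;/p&gt;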
&lt;h2 id="getting-started"&gt;Getting Started&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Start TUI with local LLM&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;twinkle tui --llm-base-url http://localhost:11434/v1 --llm-model qwen3.5
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Or with a specific run&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;twinkle tui --run-id my-grpo-run
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The TUI turns ML training into a conversation — describe what you want to train, and the agent handles server setup, script writing, monitoring, and troubleshooting.&lt;/p&gt;</description></item></channel></rss>