LLM Response Refinement Tutorial
1. Overview
This tutorial demonstrates how to use the LLMRefinement class to iteratively improve LLM responses using reward model feedback.
For more advanced usage, such as iterative refinement with comprehensive evaluation to correct data samples, see data_correction.
Key Concepts:
- Iterative Refinement: Repeatedly improve responses through feedback loops
- Reward Model Feedback: Use reward model assessments to guide improvements
- Response Evolution: Maintain response history to enable refinement
- Dynamic Prompting: Construct prompts based on feedback and history
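The concepts above can be sketched as a small, library-independent loop. This is a minimal illustration under stated assumptions, not how LLMRefinement is implemented: `refine_loop`, `toy_generate`, and `toy_critique` are hypothetical names, and the toy functions stand in for a real LLM call and a real reward model.

```python
from typing import Callable, List, Tuple

# (response, feedback) pairs collected so far
History = List[Tuple[str, str]]


def refine_loop(
    generate: Callable[[str, History], str],
    critique: Callable[[str, str], str],
    query: str,
    max_iterations: int = 3,
) -> List[str]:
    """Return every response produced, oldest first."""
    history: History = []
    responses = [generate(query, history)]  # initial, unrefined answer
    for _ in range(max_iterations):
        # Ask the critic (a reward model in practice) what to improve.
        feedback = critique(query, responses[-1])
        # Keep the history so later generations can see earlier attempts.
        history.append((responses[-1], feedback))
        responses.append(generate(query, history))
    return responses


# Hypothetical stand-ins: a real setup would call an LLM and a reward model.
def toy_generate(query: str, history: History) -> str:
    return f"answer v{len(history)} to: {query}"


def toy_critique(query: str, response: str) -> str:
    return f"'{response}' could be more concrete"


all_responses = refine_loop(toy_generate, toy_critique, "What is a qubit?", max_iterations=2)
for text in all_responses:
    print(text)
```

With `max_iterations=2` the loop yields three responses: the initial answer plus one per refinement round.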
2. Setup
First, import the necessary modules:
# Import core modules
import sys
sys.path.append("../../..")
from rm_gallery.core.data.schema import DataSample, DataOutput, Step, ChatMessage
from rm_gallery.core.model.message import MessageRole
from rm_gallery.core.model.openai_llm import OpenaiLLM
from rm_gallery.core.reward.registry import RewardRegistry
from rm_gallery.core.reward.base import BaseLLMReward
from rm_gallery.core.reward.refinement import LLMRefinement
from loguru import logger
import uuid
3. Create Sample Input
Let's start by creating a sample input to work with.
# Create a sample input
sample = DataSample(
    unique_id="refinement_demo",
    input=[
        ChatMessage(
            role=MessageRole.USER,
            content="Explain quantum computing in simple terms"
        )
    ],
    output=[]  # We'll generate responses later
)
4. Initialize Reward
Next, we create the LLM and the reward model. The same LLM instance generates responses and is passed to the reward.
# Initialize LLM for response generation
llm = OpenaiLLM(model="qwen3-8b", enable_thinking=True)
# Initialize reward model
reward: BaseLLMReward = RewardRegistry.get("base_helpfulness_listwise")(
    name="helpfulness",
    llm=llm
)
5. Run Refinement Process
There are two ways to run the refinement process: through the reward's refine method, or through an LLMRefinement instance.
5.1. Run in Reward
result = reward.refine(sample, max_iterations=3)
print("\n🏆 Final Refined Response:")
print(result)
5.2. Run in Refinement
# Create refinement module
refiner = LLMRefinement(
    llm=llm,
    reward=reward,
    max_iterations=3
)
result = refiner.run(sample)
print("\n🏆 Final Refined Response:")
print(result)
6. Detailed Analysis
Let's look at what happens during each iteration of the refinement process.
def detailed_run(sample: DataSample, max_iterations: int = 3):
    """Run refinement process with detailed output for each iteration."""
    # Note: relies on the `llm` and `refiner` objects created in the earlier steps.
    # Initial response generation
    response = llm.chat(sample.input)
    sample.output.append(DataOutput(answer=Step(
        role=MessageRole.ASSISTANT,
        content=response.content
    )))
    print("Initial Response:")
    print(response.content)
    print("\n" + "-" * 50 + "\n")
    # Iterative refinement loop
    for i in range(max_iterations):
        # Generate feedback (underscore-prefixed methods are internal helpers,
        # called directly here only to expose each step)
        feedback = refiner._generate_feedback(sample)
        # Print iteration details
        print(f"Iteration {i+1}/{max_iterations}:")
        print("Feedback Received:", feedback)
        # Generate refined response
        sample = refiner._generate_response(sample, feedback)
        print("Refined Response:")
        print(sample.output[-1].answer.content)
        print("\n" + "-" * 50 + "\n")
    return sample.output[-1].answer.content
# Run with detailed analysis
sample = DataSample(
    unique_id="detailed_run_demo",
    input=[
        ChatMessage(
            role=MessageRole.USER,
            content="What are the benefits of regular exercise?"
        )
    ],
    output=[]  # We'll generate responses later
)
detailed_run(sample)
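The walkthrough above does not show how feedback and earlier responses end up in the next prompt. As a rough illustration of the dynamic prompting idea, here is one way such a prompt could be assembled. The function `build_refinement_prompt` and its template wording are hypothetical; the actual template used by LLMRefinement is not shown in this tutorial and may differ.

```python
from typing import List, Tuple


def build_refinement_prompt(query: str, history: List[Tuple[str, str]]) -> str:
    """Assemble a prompt from the query plus all (response, feedback) pairs.

    Hypothetical template, for illustration only.
    """
    lines = [f"Question: {query}", ""]
    for turn, (response, feedback) in enumerate(history, start=1):
        lines.append(f"Attempt {turn}: {response}")
        lines.append(f"Feedback {turn}: {feedback}")
        lines.append("")
    lines.append("Write an improved answer that addresses the feedback above.")
    return "\n".join(lines)


prompt = build_refinement_prompt(
    "What are the benefits of regular exercise?",
    [("It is good for you.", "Too vague; name specific benefits.")],
)
print(prompt)
```

Including every earlier attempt keeps the model from repeating a mistake it was already told about, at the cost of a prompt that grows with each iteration.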
7. Real-world Applications
The refinement approach can be applied in various scenarios such as:
- Academic writing assistance
- Technical documentation improvement
- Educational content creation
- Code explanation refinement
- Research summarization
- Business communication optimization
For production environments, you might want to:
- Implement caching for intermediate responses
- Add comprehensive error handling
- Set up detailed logging
- Implement batch processing capabilities
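The production suggestions above can be combined in a thin wrapper layer. The sketch below uses only the Python standard library and treats refinement as an opaque function from a query string to a refined answer. All names here (`make_cached`, `with_retries`, `refine_batch`, `flaky_refine`) are hypothetical and not part of rm_gallery; in a real setup the wrapped function would call something like refiner.run on a DataSample.

```python
import logging
import time
from typing import Callable, Dict, Iterable, List, Optional

logger = logging.getLogger("refinement")


def make_cached(refine_fn: Callable[[str], str]) -> Callable[[str], str]:
    """Cache successful results in memory, keyed by query text."""
    cache: Dict[str, str] = {}

    def cached(query: str) -> str:
        if query in cache:
            logger.info("cache hit for %r", query)
        else:
            cache[query] = refine_fn(query)  # failures raise and are not cached
        return cache[query]

    return cached


def with_retries(fn: Callable[[str], str], attempts: int = 3, delay_s: float = 0.0) -> Callable[[str], str]:
    """Retry a failing call up to `attempts` times, logging each failure."""

    def wrapped(query: str) -> str:
        for attempt in range(1, attempts + 1):
            try:
                return fn(query)
            except Exception as exc:
                logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
                if attempt == attempts:
                    raise
                time.sleep(delay_s)
        raise ValueError("attempts must be at least 1")

    return wrapped


def refine_batch(queries: Iterable[str], refine_fn: Callable[[str], str]) -> List[Optional[str]]:
    """Process queries one by one; a failed query yields None instead of aborting the batch."""
    results: List[Optional[str]] = []
    for query in queries:
        try:
            results.append(refine_fn(query))
        except Exception:
            logger.exception("giving up on %r", query)
            results.append(None)
    return results


# Hypothetical stand-in for a refinement call that fails once, then succeeds.
calls = {"count": 0}


def flaky_refine(query: str) -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("simulated transient failure")
    return query.upper()


safe_refine = make_cached(with_retries(flaky_refine, attempts=2))
results = refine_batch(["explain qubits", "explain qubits"], safe_refine)
print(results)
```

The cache wraps the retry layer rather than the other way around, so only successful results are stored and a repeated query skips the underlying call entirely.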