This guide will help you get started with ReMe using the Model Context Protocol (MCP) interface for seamless integration with MCP-compatible clients.
🚀 What You'll Learn
- How to set up and configure ReMe MCP server
- How to connect to the server using Python MCP clients
- How to use task memory operations through MCP
- How to build memory-enhanced agents with MCP integration
📋 Prerequisites
- Python 3.12+
- LLM API access (OpenAI or compatible)
- Embedding model API access
- MCP-compatible client (Claude Desktop, or custom MCP client)
🛠️ Installation
Option 1: Install from PyPI (Recommended)
pip install reme-ai
Option 2: Install from Source
git clone https://github.com/modelscope/ReMe.git
cd ReMe
pip install .
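After installing, you can confirm the package is present with pip (the package name is reme-ai, as used above):

# Should print the package name, version, and install location
pip show reme-ai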
⚙️ Environment Setup
Create a .env file in your project directory:
FLOW_EMBEDDING_API_KEY=sk-xxxx
FLOW_EMBEDDING_BASE_URL=https://xxxx/v1
FLOW_LLM_API_KEY=sk-xxxx
FLOW_LLM_BASE_URL=https://xxxx/v1
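These variables are read at startup. A quick sketch to confirm they are visible to Python, using python-dotenv (which is also installed in the client section below):

from dotenv import load_dotenv
import os

# Load variables from .env into the process environment
load_dotenv()

# Should print your configured base URL rather than None
print(os.getenv("FLOW_LLM_BASE_URL"))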
🚀 Building an MCP Server with ReMe
ReMe provides a flexible framework for building MCP servers that can communicate using either STDIO or SSE (Server-Sent Events) transport protocols.
Starting the MCP Server
Option 1: STDIO Transport (Recommended for MCP clients)
reme \
backend=mcp \
mcp.transport=stdio \
llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
embedding_model.default.model_name=text-embedding-v4 \
vector_store.default.backend=local
Option 2: SSE Transport (Server-Sent Events)
reme \
backend=mcp \
mcp.transport=sse \
http_service.port=8002 \
llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
embedding_model.default.model_name=text-embedding-v4 \
vector_store.default.backend=local
The SSE server will start on http://localhost:8002/sse
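To confirm the endpoint is up, you can open a streaming connection with curl (-N disables buffering so the SSE stream stays open):

# Should hold the connection open and stream events while the server is running
curl -N http://localhost:8002/sse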
Configuring MCP Server for Claude Desktop
To integrate with Claude Desktop, add the following configuration to your claude_desktop_config.json:
{
"mcpServers": {
"reme": {
"command": "reme",
"args": [
"backend=mcp",
"mcp.transport=stdio",
"llm.default.model_name=qwen3-30b-a3b-thinking-2507",
"embedding_model.default.model_name=text-embedding-v4",
"vector_store.default.backend=local_file"
]
}
}
}
This configuration:
- Registers a new MCP server named "reme"
- Specifies the command to launch the server (reme)
- Configures the server to use STDIO transport
- Sets the LLM and embedding models to use
- Configures the vector store backend
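On macOS, claude_desktop_config.json typically lives at ~/Library/Application Support/Claude/claude_desktop_config.json, and on Windows at %APPDATA%\Claude\claude_desktop_config.json; restart Claude Desktop after editing it so the new server is picked up.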
Advanced Server Configuration Options
For more advanced use cases, you can configure the server with additional parameters:
# Full configuration example
reme \
backend=mcp \
mcp.transport=stdio \
http_service.host=0.0.0.0 \
http_service.port=8002 \
llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
embedding_model.default.model_name=text-embedding-v4 \
vector_store.default.backend=elasticsearch
🔌 Using Python Client to Call MCP Services
The ReMe framework provides a Python client for interacting with MCP services. This section focuses specifically on using the summary_task_memory and retrieve_task_memory tools.
Setting Up the Python MCP Client
First, install the required packages:
pip install fastmcp python-dotenv
Then, create a basic client connection:
import asyncio
from fastmcp import Client
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# MCP server URL (for SSE transport)
MCP_URL = "http://localhost:8002/sse"
WORKSPACE_ID = "my_workspace"

async def main():
    async with Client(MCP_URL) as client:
        # Your MCP operations will go here
        pass

if __name__ == "__main__":
    asyncio.run(main())
Using the Task Memory Summarizer
The summary_task_memory tool transforms conversation trajectories into valuable task memories:
import json  # add alongside the other imports at the top of the file

async def run_summary(client, messages):
    """
    Summarize conversation messages and create task memories.

    Args:
        client: MCP client instance
        messages: List of message objects from a conversation

    Returns:
        None
    """
    try:
        result = await client.call_tool(
            "summary_task_memory",
            arguments={
                "workspace_id": WORKSPACE_ID,
                "trajectories": [
                    {"messages": messages, "score": 1.0}
                ]
            }
        )

        # Parse the JSON payload from the first content block of the result
        response_data = json.loads(result.content[0].text)

        # Extract the memory list from the response metadata
        memory_list = response_data.get("metadata", {}).get("memory_list", [])
        print(f"Created memories: {memory_list}")

        # Optionally save memories to file (one JSON object per line)
        with open("task_memory.jsonl", "w") as f:
            for memory in memory_list:
                f.write(json.dumps(memory, ensure_ascii=False) + "\n")
    except Exception as e:
        print(f"Error running summary: {e}")
Using the Task Memory Retriever
The retrieve_task_memory tool allows you to retrieve relevant memories based on a query:
async def run_retrieve(client, query):
    """
    Retrieve relevant task memories based on a query.

    Args:
        client: MCP client instance
        query: The query used to find relevant memories

    Returns:
        String containing the retrieved memory answer
    """
    try:
        result = await client.call_tool(
            "retrieve_task_memory",
            arguments={
                "workspace_id": WORKSPACE_ID,
                "query": query,
            }
        )

        # Parse the JSON payload from the first content block of the result
        response_data = json.loads(result.content[0].text)

        # Extract and return the answer
        answer = response_data.get("answer", "")
        print(f"Retrieved memory: {answer}")
        return answer
    except Exception as e:
        print(f"Error retrieving memory: {e}")
        return ""
Complete Memory-Augmented Agent Example
Here's a complete example showing how to build a memory-augmented agent using the MCP client:
import json
import asyncio
from fastmcp import Client
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# API configuration
MCP_URL = "http://localhost:8002/sse"
WORKSPACE_ID = "test_workspace"

async def run_agent(client, query):
    """Run the agent with a specific query"""
    result = await client.call_tool(
        "react",
        arguments={"query": query}
    )
    response_data = json.loads(result.content[0].text)
    return response_data.get("messages", [])

async def run_summary(client, messages):
    """Generate task memories from a conversation"""
    result = await client.call_tool(
        "summary_task_memory",
        arguments={
            "workspace_id": WORKSPACE_ID,
            "trajectories": [
                {"messages": messages, "score": 1.0}
            ]
        }
    )
    response_data = json.loads(result.content[0].text)
    return response_data.get("metadata", {}).get("memory_list", [])

async def run_retrieve(client, query):
    """Retrieve relevant task memories"""
    result = await client.call_tool(
        "retrieve_task_memory",
        arguments={
            "workspace_id": WORKSPACE_ID,
            "query": query,
        }
    )
    response_data = json.loads(result.content[0].text)
    return response_data.get("answer", "")

async def memory_augmented_workflow():
    """Complete memory-augmented agent workflow"""
    query1 = "Analyze Xiaomi Corporation"
    query2 = "Analyze the company Tesla."

    async with Client(MCP_URL) as client:
        # Step 1: Build initial memories with query2
        print(f"Building memories with: '{query2}'")
        messages = await run_agent(client, query=query2)

        # Step 2: Summarize the conversation to create memories
        print("Creating memories from conversation")
        memory_list = await run_summary(client, messages)
        print(f"Created {len(memory_list)} memories")

        # Step 3: Retrieve relevant memories for query1
        print(f"Retrieving memories for: '{query1}'")
        retrieved_memory = await run_retrieve(client, query1)

        # Step 4: Run the agent with a memory-augmented query
        print("Running memory-augmented agent")
        augmented_query = f"{retrieved_memory}\n\nUser Question:\n{query1}"
        final_messages = await run_agent(client, query=augmented_query)

        # Extract the agent's final answer (the last assistant message with content)
        final_answer = ""
        for msg in reversed(final_messages):
            if msg.get("role") == "assistant" and msg.get("content"):
                final_answer = msg.get("content")
                break
        print(f"Memory-augmented response: {final_answer}")

# Run the workflow
if __name__ == "__main__":
    asyncio.run(memory_augmented_workflow())
Managing the Vector Store with MCP
You can also manage your vector store through MCP:
async def manage_vector_store(client):
    # Dump the workspace's memories to disk (backup)
    await client.call_tool(
        "vector_store",
        arguments={
            "workspace_id": WORKSPACE_ID,
            "action": "dump",
            "path": "./backups/",
        }
    )

    # Load memories from disk (restore)
    await client.call_tool(
        "vector_store",
        arguments={
            "workspace_id": WORKSPACE_ID,
            "action": "load",
            "path": "./backups/",
        }
    )

    # Delete the workspace
    await client.call_tool(
        "vector_store",
        arguments={
            "workspace_id": WORKSPACE_ID,
            "action": "delete",
        }
    )
🐛 Common Issues and Troubleshooting
MCP Server Won't Start
- Check if the required ports are available (for SSE transport)
- Verify your API keys in the .env file
- Ensure Python version is 3.12+
- Check MCP transport configuration
MCP Client Connection Issues
- For STDIO: Ensure the command path is correct in your MCP client config
- For SSE: Verify the server URL and port accessibility (a quick connectivity check is shown below)
- Check firewall settings for SSE connections
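A minimal sketch for checking SSE connectivity, using the fastmcp client to list the server's tools (MCP_URL as in the examples above):

import asyncio
from fastmcp import Client

MCP_URL = "http://localhost:8002/sse"

async def check_connection():
    try:
        async with Client(MCP_URL) as client:
            tools = await client.list_tools()
            print("Connected. Available tools:", [tool.name for tool in tools])
    except Exception as e:
        print(f"Connection failed: {e}")

asyncio.run(check_connection())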
No Memories Retrieved
- Make sure you've run the summarizer tool first to create memories
- Check if workspace_id matches between operations
- Verify vector store backend is properly configured
API Connection Errors
- Confirm FLOW_LLM_BASE_URL, FLOW_EMBEDDING_BASE_URL, and the matching API keys are correct
- Test API access independently (see the curl example below)
- Check network connectivity
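Because the configured endpoints are OpenAI-compatible, you can test them directly with curl. For example, listing the available models (assuming FLOW_LLM_BASE_URL already includes the /v1 prefix, as in the .env example above):

# Should return a JSON list of models if the key and URL are valid
curl -H "Authorization: Bearer $FLOW_LLM_API_KEY" "$FLOW_LLM_BASE_URL/models"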