Give Your AI Agent Ears

One command turns FunASR into a speech backend for any AI framework: an OpenAI-compatible API, an MCP server, fully self-hosted, at up to 170x realtime on GPU.

Quick Start

Two commands install and start the server; after that, any OpenAI SDK client can use it for speech recognition in your agent stack:

# Install & start
$ pip install funasr fastapi uvicorn python-multipart
$ funasr-server --device cuda --port 8000

# Now any OpenAI SDK client works
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
result = client.audio.transcriptions.create(
    model="sensevoice",
    file=open("audio.wav", "rb"),
)
print(result.text)

Integration Methods

REST API

OpenAI-Compatible Server

Drop-in replacement for /v1/audio/transcriptions. Works with LangChain, AutoGen, CrewAI, Dify, Flowise, Open WebUI, and any other framework that uses the OpenAI audio API.

funasr-server --device cuda --port 8000
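For clients that cannot use the OpenAI SDK, the same endpoint can be reached with a plain multipart POST. The sketch below uses only the Python standard library and assumes the server accepts the same `file` and `model` form fields as the OpenAI transcription API and returns JSON with a `text` key, as the "drop-in replacement" description suggests. The helper names `build_transcription_request` and `transcribe` are hypothetical and not part of FunASR.

```python
import io
import json
import os
import urllib.request
import uuid


def build_transcription_request(audio_bytes, filename="audio.wav",
                                model="sensevoice",
                                base_url="http://localhost:8000/v1"):
    """Build a multipart/form-data POST for the /audio/transcriptions route."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # Form field: model name (assumed to match the OpenAI API field name)
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(model.encode() + b"\r\n")

    # Form field: the audio file itself
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio_bytes + b"\r\n")

    # Closing boundary
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",
        data=body.getvalue(),
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Authorization": "Bearer not-needed",
        },
        method="POST",
    )


def transcribe(path, **kwargs):
    """Send one audio file to a running server and return the recognized text."""
    with open(path, "rb") as f:
        request = build_transcription_request(
            f.read(), filename=os.path.basename(path), **kwargs
        )
    with urllib.request.urlopen(request) as response:
        # Assumes an OpenAI-style JSON response: {"text": "..."}
        return json.loads(response.read())["text"]
```

With the server from the command above running, `transcribe("audio.wav")` would return the transcript as a string.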
MCP

Model Context Protocol

Add FunASR as a tool in Claude Code, Cursor, or Windsurf. Your AI assistant can transcribe any audio file directly.

// claude_desktop_config.json
{
  "mcpServers": {
    "funasr": {
      "command": "python",
      "args": ["funasr_mcp.py"]
    }
  }
}
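If the config file already lists other MCP servers, the FunASR entry has to be merged in rather than pasted over them. A minimal sketch of that merge, assuming the config is plain JSON with a top-level `mcpServers` object as shown above; `add_mcp_server` and `update_config_file` are hypothetical helper names, and the path to the config file depends on your client.

```python
import json
from pathlib import Path


def add_mcp_server(config, name, command, args):
    """Return a copy of `config` with one MCP server entry added or replaced."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


def update_config_file(path):
    """Merge the FunASR entry into the JSON config at `path` (created if missing)."""
    config_path = Path(path)
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config = add_mcp_server(config, "funasr", "python", ["funasr_mcp.py"])
    config_path.write_text(json.dumps(config, indent=2) + "\n")
```

Other servers and unrelated settings in the file are left untouched; only the `funasr` key is added or overwritten.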
WebSocket

Streaming Server

Real-time voice input via WebSocket. In 2Pass mode the server streams partial results immediately, then replaces them with a higher-accuracy final correction. Suited to voice agents that need low latency.

docker run -p 10095:10095 \
  funasr-runtime-sdk-online-cpu
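On the client side, 2Pass means keeping provisional text separate from committed text. The sketch below shows that bookkeeping only, with no network code. It assumes each server message is a JSON object with a `text` field and a `mode` field, that first-pass partials arrive incrementally with mode `2pass-online`, and that corrected segments arrive with mode `2pass-offline`. Check these names against the server version you run; `TwoPassTranscript` is a hypothetical class, not part of FunASR.

```python
import json


class TwoPassTranscript:
    """Accumulates 2Pass results: partials are provisional, finals are committed."""

    def __init__(self):
        self.committed = []  # corrected segments from the second pass
        self.pending = ""    # partial text for the segment still being spoken

    def handle(self, raw_message):
        """Apply one JSON message from the server and return the text to display."""
        message = json.loads(raw_message)
        text = message.get("text", "")
        if message.get("mode") == "2pass-offline":
            # Second pass: the corrected segment replaces the provisional text.
            self.committed.append(text)
            self.pending = ""
        else:
            # First pass: partials are assumed incremental, so append them.
            self.pending += text
        return self.display()

    def display(self):
        """Committed segments followed by whatever is still provisional."""
        return "".join(self.committed) + self.pending
```

A voice agent would show `display()` to the user as it changes, but act only on the committed segments, since partials may still be corrected.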
Python

Direct SDK

Use FunASR directly as a Python function in your agent code. No server needed for single-process applications.

from funasr import AutoModel

model = AutoModel(model="iic/SenseVoiceSmall")
text = model.generate(input="a.wav")[0]["text"]
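In agent code it is usually convenient to expose this as a single tool function and to load the model only once, on first use. One possible sketch, assuming `generate` returns a list of dicts with a `text` key as in the snippet above; `load_model` and `make_transcribe_tool` are hypothetical names introduced here for illustration.

```python
from functools import lru_cache


def load_model():
    """Load SenseVoice through the FunASR SDK."""
    # Imported lazily so that importing this module stays cheap.
    from funasr import AutoModel
    return AutoModel(model="iic/SenseVoiceSmall")


def make_transcribe_tool(model_loader=load_model):
    """Return a transcribe(path) function that loads the model on first call."""

    @lru_cache(maxsize=1)
    def get_model():
        # Cached: the loader runs once, later calls reuse the same model.
        return model_loader()

    def transcribe(audio_path: str) -> str:
        """Transcribe an audio file and return the recognized text."""
        results = get_model().generate(input=audio_path)
        return results[0]["text"] if results else ""

    return transcribe
```

The returned `transcribe` function takes a file path and returns a string, which is the shape most agent frameworks expect from a tool. Passing a different `model_loader` makes it easy to swap models or substitute a stub in tests.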

Compatible Frameworks

| Framework | Stars | Integration | Method |
|---|---|---|---|
| LangChain | 137K | OpenAI audio tool | Change base_url |
| Dify | 142K | STT provider | OpenAI-compatible endpoint |
| Open WebUI | 138K | Speech-to-text | OpenAI-compatible endpoint |
| AutoGen | 47K | Agent tool function | OpenAI SDK |
| Flowise | 30K | STT node | OpenAI-compatible endpoint |
| Claude / Cursor | - | Audio transcription tool | MCP Server |
| Pipecat | 12K | STT service | WebSocket / OpenAI |
| LiveKit Agents | - | STT plugin | WebSocket streaming |

Available Models

| Model | Speed (GPU) | Speed (CPU) | Languages | Best For |
|---|---|---|---|---|
| sensevoice | 170x realtime | 17x realtime | zh/en/ja/ko/yue | General + emotion |
| paraformer | 120x realtime | 15x realtime | zh/en | Chinese production |
| fun-asr-nano | 17x realtime | 3.6x realtime | 31 languages | Multilingual + LLM |
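To read the speed figures: "170x realtime" means audio is processed 170 times faster than it plays, so one hour of audio takes about 21 seconds with sensevoice on GPU. The small helper below does that arithmetic from the numbers in the table above. It is a rough estimate only; it ignores model loading and I/O, and actual throughput depends on hardware. `processing_seconds` is a hypothetical name.

```python
# Realtime factors copied from the table above: (GPU, CPU)
REALTIME_FACTORS = {
    "sensevoice": (170, 17),
    "paraformer": (120, 15),
    "fun-asr-nano": (17, 3.6),
}


def processing_seconds(audio_seconds, model, device="gpu"):
    """Estimate transcription time: audio duration divided by the realtime factor."""
    gpu_factor, cpu_factor = REALTIME_FACTORS[model]
    factor = gpu_factor if device == "gpu" else cpu_factor
    return audio_seconds / factor


one_hour = 3600
for name in REALTIME_FACTORS:
    gpu = processing_seconds(one_hour, name, "gpu")
    cpu = processing_seconds(one_hour, name, "cpu")
    print(f"{name}: {gpu:.0f} s on GPU, {cpu:.0f} s on CPU per hour of audio")
```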

Get Started

Two commands. No config files. No Docker required.

pip install funasr fastapi uvicorn python-multipart
funasr-server --device cuda