Use Cases

Pick the shortest path from evaluation to production. FunASR covers local transcription, private OpenAI-compatible APIs, Kubernetes services, agent voice input, streaming services, vLLM acceleration, subtitles, and batch processing.

Choose the right path

Goal	Start here	Why it matters
Try FunASR in a browser	Colab quickstart	Run a public sample and upload your own audio before setting up a local environment.
Transcribe one file locally	Tutorial · Model selection	Verify install, model choice, and model download in minutes.
Compare accuracy and speed	Benchmark report	Review long-audio speed and CER before choosing a model.
Migrate from Whisper/cloud ASR	Migration guide · Benchmark example	Map existing pipelines to FunASR, benchmark representative audio, and plan a safe rollout.
Build a private speech API	OpenAI-compatible API · Kubernetes template · JS/TS recipes · Gradio demo · Security guide · Workflow recipes	Reuse OpenAI-style clients, Dify, n8n, HTTP workflow nodes, and a Gradio browser UI while planning TLS, auth, upload limits, and logs at the service boundary.
Add speech input to agents	MCP server	Connect local ASR to Claude, Cursor, desktop tools, and internal assistants.
Choose a deployment path	Deployment matrix	Compare Python API, OpenAI API, Docker Compose, Kubernetes, WebSocket, vLLM, MCP, subtitles, batch jobs, and Triton.
Serve streaming ASR	Realtime examples	Handle live captioning, meetings, and call-center-style workloads.
Accelerate LLM-based ASR	vLLM guide	Use tensor-parallel decoding and streaming service support for Fun-ASR-Nano.
Generate subtitles	Subtitle generator	Create SRT/VTT files from audio or video, with speaker labels when needed.
Process many recordings	Batch ASR example	Build repeatable offline jobs for archives, meetings, and datasets.

Production recipes

Private transcription API

Use this path when an application already speaks OpenAI-style APIs or when audio cannot leave your environment.

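# Install FunASR and the web server dependencies
pip install funasr fastapi uvicorn python-multipart
# Start the server with the SenseVoice model on GPU
funasr-server --model sensevoice --device cuda

# From another terminal, send a sample file to the transcription endpoint
curl http://localhost:8000/v1/audio/transcriptions \
  -F file=@sample.wav \
  -F model=sensevoice \
  -F response_format=verbose_json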
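
The same request can be sent from Python without extra dependencies. This is a minimal sketch using only the standard library: it reuses the endpoint, form fields, and model name from the curl call above, while `build_multipart` and `transcribe` are hypothetical helper names, and the `text` key in the usage comment is assumed from the OpenAI-style response shape rather than confirmed here.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url="http://localhost:8000", model="sensevoice"):
    """POST one audio file to /v1/audio/transcriptions and return the parsed JSON."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"model": model, "response_format": "verbose_json"},
        "file",
        os.path.basename(path),
        audio,
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage, with the server from the commands above running:
#   result = transcribe("sample.wav")
#   print(result.get("text"))  # "text" is an assumed key, check the actual response
```

Applications that already use an OpenAI client library can usually skip this and point the client's base URL at the local server instead.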

Agent speech input

Start from the MCP server when coding agents, internal assistants, or workflow tools need speech input from local ASR.

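pip install funasr
# The script path is relative, so this assumes a checkout of the FunASR repository
python examples/mcp_server/funasr_mcp.py

# For GPU inference, set FUNASR_DEVICE=cuda before starting the server:
# FUNASR_DEVICE=cuda python examples/mcp_server/funasr_mcp.py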

Streaming workloads

Pair ASR with VAD, punctuation, and speaker diarization when partial transcripts need to be readable by humans.

Validate with real audio that includes background noise, long silences, overlapping speakers, and varying microphone quality.
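
One way to run that validation against a streaming service is to replay recorded files as if they were live, in fixed-duration chunks. The sketch below does this with the standard-library `wave` module. The names `iter_pcm_chunks` and `replay`, the `send` callback, and the 600 ms chunk size are illustrative choices, not part of FunASR; wire `send` to whatever client your streaming service expects.

```python
import time
import wave


def iter_pcm_chunks(wav_file, chunk_ms=600):
    """Yield raw PCM chunks of about chunk_ms each from a WAV path or file object."""
    with wave.open(wav_file, "rb") as wav:
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk


def replay(wav_file, send, chunk_ms=600, realtime=True):
    """Feed a recording to send() chunk by chunk and return the chunk count.

    send is a hypothetical callback, for example a WebSocket client's
    binary send method. With realtime=True the loop sleeps between chunks
    so the service sees audio at roughly the pace a microphone produces it.
    """
    sent = 0
    for chunk in iter_pcm_chunks(wav_file, chunk_ms):
        send(chunk)
        sent += 1
        if realtime:
            time.sleep(chunk_ms / 1000)
    return sent
```

Replaying the same recordings at several chunk sizes makes it easier to see how chunking and endpointing affect partial transcripts.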

Benchmark before migration

Compare FunASR against Whisper or cloud ASR using your own sample set. Track throughput, CPU viability, download size, and deployment complexity together.

Open the migration guide · Public benchmark
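
When scoring your own sample set, two numbers are commonly computed per file: character error rate (CER) against a reference transcript, and real-time factor as a speed measure. The following is a minimal sketch of both, not the method used in the public benchmark. The function names are hypothetical, and stripping whitespace before scoring is an assumption that you should align with your own text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions, and substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(
                min(
                    prev[j] + 1,             # delete r from the reference
                    curr[j - 1] + 1,         # insert h into the reference
                    prev[j - 1] + (r != h),  # substitute, free if equal
                )
            )
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length.

    Whitespace is removed from both strings first (an assumption).
    """
    ref = "".join(reference.split())
    hyp = "".join(hypothesis.split())
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time over audio duration; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds
```

Run the same files through each system under comparison, with identical normalization, so the CER values stay comparable.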

Model selection hints

For a deeper comparison of SenseVoice, Paraformer, Fun-ASR-Nano, streaming runtime, and OpenAI API aliases, use the model selection guide.

Need	Good first choice	Notes
Fast multilingual transcription	SenseVoice-Small	Strong default for local demos and private APIs.
Mandarin production ASR	Paraformer-Large	Mature choice for Chinese speech recognition.
LLM-based ASR experiments	Fun-ASR-Nano	Pair with vLLM when throughput matters.
Speaker-aware transcripts	SenseVoice or Paraformer with `spk_model="cam++"`	Useful for meetings, interviews, and customer calls.
Live audio	Runtime WebSocket service	Validate chunking, VAD, and endpointing with real traffic.