Use Cases

Pick the shortest path from evaluation to production. FunASR covers local transcription, private OpenAI-compatible APIs, Kubernetes services, agent voice input, streaming services, vLLM acceleration, subtitles, and batch processing.

Choose the right path

| Goal | Start here | Why it matters |
| --- | --- | --- |
| Try FunASR in a browser | Colab quickstart | Run a public sample and upload your own audio before setting up a local environment. |
| Transcribe one file locally | Tutorial · Model selection | Verify install, model choice, and model download in minutes. |
| Compare accuracy and speed | Benchmark report | Review long-audio speed and CER before choosing a model. |
| Migrate from Whisper/cloud ASR | Migration guide · Benchmark example | Map existing pipelines to FunASR, benchmark representative audio, and plan a safe rollout. |
| Build a private speech API | OpenAI-compatible API · Kubernetes template · JS/TS recipes · Gradio demo · Security guide · Workflow recipes | Reuse OpenAI-style clients, Dify, n8n, HTTP workflow nodes, and a Gradio browser UI while planning TLS, auth, upload limits, and logs at the service boundary. |
| Add speech input to agents | MCP server | Connect local ASR to Claude, Cursor, desktop tools, and internal assistants. |
| Choose a deployment path | Deployment matrix | Compare Python API, OpenAI API, Docker Compose, Kubernetes, WebSocket, vLLM, MCP, subtitles, batch jobs, and Triton. |
| Serve streaming ASR | Realtime examples | Handle live captioning, meetings, and call-center style workloads. |
| Accelerate LLM-based ASR | vLLM guide | Use tensor parallel decoding and streaming service support for Fun-ASR-Nano. |
| Generate subtitles | Subtitle generator | Create SRT/VTT files from audio or video, with speaker labels when needed. |
| Process many recordings | Batch ASR example | Build repeatable offline jobs for archives, meetings, and datasets. |

Production recipes

Private transcription API

Use this path when an application already speaks OpenAI-style APIs or when audio cannot leave your environment.

pip install funasr fastapi uvicorn python-multipart
funasr-server --model sensevoice --device cuda

curl http://localhost:8000/v1/audio/transcriptions \
  -F file=@sample.wav \
  -F model=sensevoice \
  -F response_format=verbose_json
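
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `localhost:8000` and that the JSON response carries a `text` field, as OpenAI-style transcription responses usually do. The helper names `build_multipart` and `transcribe` are illustrative and not part of FunASR.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f'{value}\r\n').encode()
        )
    parts.append(
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f'Content-Type: application/octet-stream\r\n\r\n').encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url="http://localhost:8000", model="sensevoice"):
    """POST an audio file to the transcription endpoint and return parsed JSON."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": model, "response_format": "verbose_json"},
        "file",
        audio.name,
        audio.read_bytes(),
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Long recordings can take a while, so the timeout is generous (assumed value).
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)
```

With the server running, `transcribe("sample.wav").get("text")` should return the transcript if the response follows the assumed shape. An existing OpenAI client library pointed at the same base URL is an alternative when the application already depends on one.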

Agent speech input

Start from the MCP server when you want voice input for coding agents, internal assistants, or workflow tools.

pip install funasr
python examples/mcp_server/funasr_mcp.py

# Set FUNASR_DEVICE=cuda for GPU inference

Streaming workloads

Pair ASR with VAD, punctuation, and speaker diarization when partial transcripts need to be readable by humans.

Validate with real audio that includes background noise, long silences, overlapping speakers, and varying microphone quality.
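
One way to exercise a streaming service with recorded test audio is to replay it in fixed-duration chunks, the way a live client would send it. The sketch below is plain Python and does not call FunASR; the function name `iter_chunks`, the 16 kHz sample rate, and the 600 ms chunk length are assumptions chosen for illustration.

```python
def iter_chunks(samples, sample_rate=16000, chunk_ms=600):
    """Yield (chunk, is_final) pairs of fixed duration from a sample sequence.

    `samples` can be any sliceable sequence (list, bytes, NumPy array).
    The last chunk may be shorter and is flagged with is_final=True so the
    consumer knows to flush its decoder state.
    """
    if sample_rate <= 0 or chunk_ms <= 0:
        raise ValueError("sample_rate and chunk_ms must be positive")
    # Number of samples per chunk; never below 1 to keep the loop advancing.
    stride = max(1, sample_rate * chunk_ms // 1000)
    total = len(samples)
    for start in range(0, total, stride):
        end = min(start + stride, total)
        yield samples[start:end], end == total
```

Feeding each chunk to the service with a short delay between sends approximates live traffic, which makes it easier to observe how partial transcripts behave on long silences or overlapping speech.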

Benchmark before migration

Compare FunASR against Whisper or cloud ASR using your own sample set. Track throughput, CPU viability, download size, and deployment complexity together.

Open the migration guide · Public benchmark
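
When comparing systems on your own sample set, two numbers are commonly tracked: character error rate (CER) for accuracy and real-time factor (RTF) for throughput. The following is a minimal sketch in plain Python; the function names are illustrative, and stripping spaces before scoring is an assumption that may not match the normalization used in the public benchmark.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    # Assumption: spaces are ignored so tokenization differences do not count.
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """RTF below 1.0 means the system transcribes faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds
```

Apply the same text normalization (punctuation, casing, number formatting) to every system's output before scoring, otherwise the CER differences reflect formatting rather than recognition quality.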

Model selection hints

For a deeper comparison of SenseVoice, Paraformer, Fun-ASR-Nano, streaming runtime, and OpenAI API aliases, use the model selection guide.

| Need | Good first choice | Notes |
| --- | --- | --- |
| Fast multilingual transcription | SenseVoice-Small | Strong default for local demos and private APIs. |
| Mandarin production ASR | Paraformer-Large | Mature choice for Chinese speech recognition. |
| LLM-based ASR experiments | Fun-ASR-Nano | Pair with vLLM when throughput matters. |
| Speaker-aware transcripts | SenseVoice or Paraformer with spk_model="cam++" | Useful for meetings, interviews, and customer calls. |
| Live audio | Runtime WebSocket service | Validate chunking, VAD, and endpointing with real traffic. |
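
For the speaker-aware row, a configuration along the following lines is one possible starting point. It is a sketch, not a verified recipe: the model aliases `paraformer-zh`, `fsmn-vad`, and `ct-punc` are assumed companions to `spk_model="cam++"`, and the `spk` and `text` keys read by the hypothetical helper `format_speaker_lines` are an assumption about the output shape that should be checked against your installed version.

```python
def build_speaker_aware_model(device="cpu"):
    """Create an ASR pipeline with VAD, punctuation, and speaker labels."""
    # Imported lazily so this file still loads where funasr is not installed.
    from funasr import AutoModel

    return AutoModel(
        model="paraformer-zh",   # assumed alias for the Paraformer ASR model
        vad_model="fsmn-vad",    # assumed voice activity detection model
        punc_model="ct-punc",    # assumed punctuation restoration model
        spk_model="cam++",       # speaker model named in the table above
        device=device,
    )


def format_speaker_lines(sentences):
    """Render sentence dicts as 'Speaker N: text' lines.

    Assumes each item has 'spk' and 'text' keys; items without text are skipped.
    """
    lines = []
    for sentence in sentences:
        text = str(sentence.get("text", "")).strip()
        if text:
            lines.append(f"Speaker {sentence.get('spk', '?')}: {text}")
    return lines
```

A typical call would be `model.generate(input="meeting.wav")` followed by passing the per-sentence results to `format_speaker_lines`. Inspect the raw result first, since the key that holds per-sentence entries may differ between releases.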

Share your result

If FunASR works well in your project, share the use case, model, device, processing speed, audio domain, and a public demo or benchmark summary when possible.

Share a showcase issue, submit a migration benchmark report, or start a discussion. Concrete usage and benchmark reports help new users choose the right path and help maintainers prioritize docs and examples.