Model Registration Guide
How to add new models to FunASR using the registry system.
Architecture Overview
FunASR uses a registry pattern to decouple model discovery from model implementation. The core flow is:
User: AutoModel(model="ModelName")
  → download_model(): fetch from ModelScope/HuggingFace, read config.yaml
  → tables.model_classes["ModelName"]: look up the registered class
  → model_class(**config): instantiate the model
  → load_pretrained_model(): load weights from model.pt
  → model.eval().to(device): ready for inference
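The lookup-then-instantiate part of this flow can be illustrated with a toy registry. This is a minimal sketch of the pattern, not FunASR's actual implementation: `REGISTRY`, `register`, `build_model`, and `ToyModel` are hypothetical names, and the way the config is unpacked into the constructor is an assumption made for illustration.

```python
# Toy stand-in for the registry pattern; NOT FunASR's real code.
from collections import defaultdict

# category name -> {registered key -> class}
REGISTRY = defaultdict(dict)


def register(category, key):
    """Return a class decorator that stores the class under REGISTRY[category][key]."""
    def decorator(cls):
        REGISTRY[category][key] = cls
        return cls
    return decorator


@register("model_classes", "ToyModel")
class ToyModel:
    def __init__(self, model_conf=None, **kwargs):
        # Assumption: architecture options arrive under "model_conf".
        conf = model_conf or {}
        self.hidden_size = conf.get("hidden_size", 512)
        self.num_layers = conf.get("num_layers", 6)


def build_model(config):
    """Look up the class named by config["model"] and instantiate it with the config."""
    name = config["model"]
    try:
        model_class = REGISTRY["model_classes"][name]
    except KeyError:
        raise KeyError(f"{name} is not registered") from None
    return model_class(**config)


config = {"model": "ToyModel", "model_conf": {"hidden_size": 256, "num_layers": 4}}
model = build_model(config)
print(type(model).__name__, model.hidden_size, model.num_layers)
```

Because registration happens when the decorator runs, the class only becomes visible to the lookup once the module defining it has been imported.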
Registry Categories
| Category | Purpose | Example |
|---|---|---|
| model_classes | ASR/VAD/PUNC/SPK models | Paraformer, SenseVoiceSmall, FsmnVADStreaming |
| frontend_classes | Audio feature extraction | WavFrontend, WavFrontendOnline |
| tokenizer_classes | Text tokenization | SentencepiecesTokenizer, CharTokenizer |
| encoder_classes | Encoder modules | SANMEncoder, ConformerEncoder |
| decoder_classes | Decoder modules | ParaformerSANMDecoder |
| dataset_classes | Training datasets | AudioDataset, SenseVoiceCTCDataset |
| batch_sampler_classes | Batch sampling strategies | DynamicBatchLocalShuffleSampler |
Viewing the Registry
from funasr.register import tables
# Print all registered classes
tables.print()
# Print specific category
tables.print("model")
Step-by-Step: Register a New Model
1. Create Model Directory
Create a directory for your model under funasr/models/, for example funasr/models/your_model/:
funasr/models/your_model/
├── __init__.py      # empty
├── model.py         # main model class
└── (other files)    # encoder, decoder, utils, etc.
2. Implement the Model Class
Your model class must implement 3 methods: __init__, forward, inference.
import torch.nn as nn
from funasr.register import tables


@tables.register("model_classes", "YourModelName")
class YourModel(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()
        # Build your model architecture
        # kwargs contains everything from config.yaml + runtime params

    def forward(self, speech, speech_lengths, text, text_lengths, **kwargs):
        # Training forward pass
        # Return: (loss, stats_dict, weight)
        ...

    def inference(self, data_in, data_lengths=None, key=None,
                  tokenizer=None, frontend=None, **kwargs):
        # Inference: process audio, return results
        # data_in: list of audio (numpy/tensor/path)
        # Return: ([{"key": ..., "text": ..., ...}], meta_data_dict)
        ...
3. Create config.yaml
This defines model architecture and all components:
# config.yaml
model: YourModelName  # must match @tables.register key
model_conf:
  hidden_size: 512
  num_layers: 6

frontend: WavFrontend  # reuse existing frontend
frontend_conf:
  fs: 16000
  n_mels: 80
  frame_length: 25
  frame_shift: 10
  cmvn_file: null

tokenizer: SentencepiecesTokenizer
tokenizer_conf:
  bpemodel: null

# Training config (optional)
dataset: AudioDataset
dataset_conf:
  batch_size: 32
  batch_type: example
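The config above follows a visible naming convention: each component is named by a `<name>` entry and configured by a matching `<name>_conf` block. The sketch below pairs them up so the structure can be inspected before a model is built. It is an illustration only: `split_components` is a hypothetical helper, and the config is written as a Python dict so the example needs no YAML library.

```python
# Parsed equivalent of the config.yaml above, written as a dict for illustration.
config = {
    "model": "YourModelName",
    "model_conf": {"hidden_size": 512, "num_layers": 6},
    "frontend": "WavFrontend",
    "frontend_conf": {"fs": 16000, "n_mels": 80, "frame_length": 25,
                      "frame_shift": 10, "cmvn_file": None},
    "tokenizer": "SentencepiecesTokenizer",
    "tokenizer_conf": {"bpemodel": None},
    "dataset": "AudioDataset",
    "dataset_conf": {"batch_size": 32, "batch_type": "example"},
}


def split_components(cfg):
    """Pair every `<name>` entry with its `<name>_conf` block (hypothetical helper)."""
    components = {}
    for key, value in cfg.items():
        if key.endswith("_conf"):
            continue  # conf blocks are attached to their component below
        components[key] = {"class": value, "conf": cfg.get(f"{key}_conf", {})}
    return components


for name, spec in split_components(config).items():
    print(f"{name}: {spec['class']} with {len(spec['conf'])} option(s)")
```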
4. Create configuration.json (for Hub distribution)
This file tells AutoModel how to resolve relative paths when the model is downloaded from the Hub:
{
  "framework": "pytorch",
  "task": "auto-speech-recognition",
  "model": {"type": "funasr"},
  "file_path_metas": {
    "init_param": "model.pt",
    "config": "config.yaml",
    "tokenizer_conf": {"bpemodel": "tokenizer.model"},
    "frontend_conf": {"cmvn_file": "am.mvn"}
  }
}
file_path_metas maps config fields to filenames. AutoModel prepends the model download directory to each path automatically.
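The path-prepending behaviour described above can be sketched as a small recursive walk. This is an assumption about the general idea rather than FunASR's actual code: `resolve_paths` is a hypothetical helper, and the download directory used here is a placeholder.

```python
import os


def resolve_paths(metas, model_dir):
    """Recursively prepend model_dir to every filename leaf in file_path_metas."""
    resolved = {}
    for key, value in metas.items():
        if isinstance(value, dict):
            # Nested blocks such as tokenizer_conf are resolved field by field.
            resolved[key] = resolve_paths(value, model_dir)
        else:
            resolved[key] = os.path.join(model_dir, value)
    return resolved


file_path_metas = {
    "init_param": "model.pt",
    "config": "config.yaml",
    "tokenizer_conf": {"bpemodel": "tokenizer.model"},
    "frontend_conf": {"cmvn_file": "am.mvn"},
}

resolved = resolve_paths(file_path_metas, "/path/to/your-model-repo")
print(resolved["init_param"])
print(resolved["tokenizer_conf"]["bpemodel"])
```

Under this reading, a field left as null in config.yaml (such as bpemodel or cmvn_file) receives its real location only after download, which is why the filenames are listed here rather than hard-coded in the config.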
5. Upload to Hub
Upload to ModelScope or HuggingFace with this directory structure:
your-model-repo/
├── config.yaml
├── configuration.json
├── model.pt # trained weights
├── am.mvn # CMVN file (if needed)
├── tokenizer.model # BPE model (if needed)
└── example/
└── test.wav # example audio for demo
6. Test
from funasr import AutoModel

# From Hub
model = AutoModel(model="your-org/your-model", device="cuda:0")
res = model.generate(input="test.wav")
print(res)

# From local path
model = AutoModel(model="/path/to/your-model-repo", device="cuda:0")
res = model.generate(input="test.wav")
print(res)
Inference Method Contract
The inference() method is called by AutoModel. It must follow this contract:
Input
| Parameter | Type | Description |
|---|---|---|
| data_in | list | Batch of audio data (numpy arrays, tensors, or file paths) |
| data_lengths | tensor/None | Length of each sample (optional) |
| key | list | Identifier for each sample |
| tokenizer | object | Tokenizer instance from config |
| frontend | object | Frontend instance from config |
| **kwargs | dict | All config params + user params from generate() |
Output
Must return a tuple: (results_list, meta_data_dict)
# results_list: list of dicts, one per sample
[
{"key": "sample_id", "text": "recognized text", "timestamp": [[0, 100], ...]},
...
]
# meta_data_dict: timing info for RTF calculation
{"batch_data_time": 5.5, "load_data": "0.01", "extract_feat": "0.02"}
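A quick way to catch contract violations during development is to check the return value of inference() before wiring the model into AutoModel. The sketch below is illustrative only: `check_inference_output` and `EchoModel` are hypothetical names, the check is deliberately loose (it only requires a "key" field per result), and the stand-in model does no real recognition.

```python
def check_inference_output(output, keys):
    """Return a list of contract violations; an empty list means the output looks valid."""
    if not (isinstance(output, tuple) and len(output) == 2):
        return ["inference() must return a tuple (results_list, meta_data_dict)"]
    problems = []
    results, meta_data = output
    if not isinstance(results, list):
        problems.append("results_list must be a list")
    elif len(results) != len(keys):
        problems.append(f"expected {len(keys)} results, got {len(results)}")
    else:
        for i, item in enumerate(results):
            if not isinstance(item, dict) or "key" not in item:
                problems.append(f"result {i} must be a dict with a 'key' field")
    if not isinstance(meta_data, dict):
        problems.append("meta_data_dict must be a dict")
    return problems


class EchoModel:
    """Stand-in model (no torch) that returns a placeholder transcript per sample."""

    def inference(self, data_in, data_lengths=None, key=None,
                  tokenizer=None, frontend=None, **kwargs):
        key = key or [f"sample_{i}" for i in range(len(data_in))]
        results = [{"key": k, "text": f"<transcript of {k}>"} for k in key]
        meta_data = {"batch_data_time": 0.0}
        return results, meta_data


output = EchoModel().inference(["a.wav", "b.wav"], key=["a", "b"])
print(check_inference_output(output, ["a", "b"]))
```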
Standalone Repository (Remote Code)
To keep your code private or to release it independently of FunASR, your model can live in a separate repo:
from funasr import AutoModel
# trust_remote_code=True loads model class from remote_code path
model = AutoModel(
    model="your-org/your-model",
    trust_remote_code=True,
    remote_code="./model.py",  # local or URL
    hub="hf",
)
res = model.generate(input="audio.wav")
With trust_remote_code=True, FunASR dynamically loads the model class from the file given in remote_code. The model does not need to be integrated into the FunASR source tree.
Direct Inference (without AutoModel)
from model import YourModel

m, kwargs = YourModel.from_pretrained(model="your-org/your-model")
m.eval()
res = m.inference(data_in=["audio.wav"], **kwargs)
print(res)
Real Examples
Integrated Model (in FunASR source tree)
| Model | Code | Config |
|---|---|---|
| Paraformer | funasr/models/paraformer/model.py | config.yaml in Hub |
| SenseVoice | funasr/models/sense_voice/model.py | config.yaml in Hub |
| FSMN-VAD | funasr/models/fsmn_vad_streaming/model.py | config.yaml in Hub |
| CAM++ | funasr/models/campplus/model.py | config.yaml in Hub |
| Qwen3-ASR | funasr/models/qwen3_asr/model.py | Uses qwen-asr package |
Standalone Model (separate repo)
| Model | Repo | Usage |
|---|---|---|
| Fun-ASR-Nano | FunAudioLLM/Fun-ASR | trust_remote_code=True, remote_code="./model.py" |
| SenseVoice (standalone) | FunAudioLLM/SenseVoice | trust_remote_code=True, remote_code="./model.py" |
Key Rules
- Model isolation: Each model in its own directory. No cross-model imports.
- Reuse shared components: Frontend, tokenizer, dataset — use existing registered ones when possible.
- Don't modify existing models: Register new ones instead.
- Always call super().__init__(): Required for PyTorch nn.Module.
- inference() must return a tuple: (results_list, meta_data)
- Support batch_size=1: At minimum, handle single-sample inference.
Troubleshooting
"ModelName is not registered"
The model file was not imported. Debug by importing directly:
from funasr.models.your_model.model import *
# If this fails, fix the import error first
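Since a registration decorator only runs when its module is imported, a failed or skipped import leaves the registry without your model. The sketch below wraps the import so the full traceback is shown instead of being swallowed. `try_import` is a hypothetical helper; "json" is used only so the example runs anywhere.

```python
import importlib
import traceback


def try_import(module_name):
    """Import a module by name; print the full traceback and return False on failure."""
    try:
        importlib.import_module(module_name)
    except Exception:
        traceback.print_exc()
        return False
    return True


# Replace "json" with your own module path, e.g. "funasr.models.your_model.model".
print(try_import("json"))
```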
"module 'xxx' not found"
Missing dependency. Add it to your requirements or install it.
Weights mismatch warnings
Warning, miss key in ckpt: ... means your model class defines layers that aren't in the checkpoint. This is OK if those layers are optional (e.g., CTC decoder not in open-source weights).
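To judge whether a mismatch is harmless, it helps to list exactly which parameter names differ. The sketch below compares two sets of key names; `diff_keys` is a hypothetical helper and the key names are made up for illustration. In practice the model-side names would come from `model.state_dict().keys()`, and the checkpoint-side names from the loaded checkpoint (whose exact layout is not specified here).

```python
def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected): keys absent from the checkpoint, and keys absent from the model."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # defined by the model, not in the checkpoint
    unexpected = sorted(ckpt_keys - model_keys)   # in the checkpoint, not used by the model
    return missing, unexpected


# Illustrative names only: a model with an optional CTC head the checkpoint lacks.
model_keys = ["encoder.weight", "decoder.weight", "ctc.weight"]
ckpt_keys = ["encoder.weight", "decoder.weight"]

missing, unexpected = diff_keys(model_keys, ckpt_keys)
print("missing from checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
```

If every missing key belongs to a layer you know is optional, the warning can be ignored; missing keys in the encoder or decoder usually point to a naming or architecture mismatch.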