video_captioning_from_summarizer_mapper
Mapper that generates video captions by summarizing several kinds of generated texts (captions from video/audio/frames, tags from audio/frames, …).
Type: mapper
Tags: cpu, hf, multimodal
🔧 Parameter Configuration
| name | type | default | desc |
|---|---|---|---|
| | <class 'str'> | | the summarizer model used to summarize texts |
| | <class 'bool'> | | |
| | <class 'bool'> | | whether to consider the video |
| | <class 'bool'> | | whether to consider the video |
| | <class 'bool'> | | whether to consider the |
| | <class 'bool'> | | whether to consider the video |
| | <class 'bool'> | | whether to consider the video |
| | typing.Optional[typing.Dict] | | the arg dict for video captioning from |
| | typing.Optional[typing.Dict] | | the arg dict for video captioning from |
| | typing.Optional[typing.Dict] | | the arg dict for video tagging from audio |
| | typing.Optional[typing.Dict] | | the arg dict for video tagging from |
| | typing.Annotated[int, Gt(gt=0)] | | max number N of tags from sampled frames to keep. |
| | <class 'bool'> | | whether to keep the original sample. If |
| | | | extra args |
| | | | extra args |
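
The parameters above describe a pipeline that gathers texts from several optional sources, caps the number of frame tags at N, and passes the combined text to a summarizer. The following is a minimal sketch of that gathering step only, not the operator's actual implementation. The function names, the `max_tags` argument, the source keys, and the rule of keeping the most frequent tags are all assumptions made for illustration; the summarizer call itself is omitted.

```python
from collections import Counter
from typing import Dict, List


def keep_top_tags(tags: List[str], max_tags: int) -> List[str]:
    """Keep at most `max_tags` tags, preferring the most frequent ones.

    Assumption: "most frequent" is the selection rule; the reference
    above only states that at most N tags are kept.
    """
    if max_tags <= 0:
        raise ValueError("max_tags must be greater than 0")
    return [tag for tag, _ in Counter(tags).most_common(max_tags)]


def build_summarizer_input(
    texts_by_source: Dict[str, List[str]],
    enabled: Dict[str, bool],
) -> str:
    """Join the texts of every enabled source into one string.

    Sources missing from `enabled` are treated as enabled (hypothetical
    default chosen for this sketch).
    """
    parts = [
        ", ".join(texts)
        for source, texts in texts_by_source.items()
        if enabled.get(source, True) and texts
    ]
    return ". ".join(parts)


# Hypothetical inputs: one caption per modality plus tags from sampled frames.
frame_tags = ["dog", "park", "dog", "ball", "park", "dog"]
texts = {
    "video_caption": ["a dog runs in a park"],
    "audio_caption": ["barking sounds"],
    "frame_tags": keep_top_tags(frame_tags, max_tags=2),
}
summarizer_input = build_summarizer_input(texts, enabled={"audio_caption": False})
```

The resulting string would then be handed to whichever summarizer model is configured. Capping the tags before joining keeps a long tail of rare tags from dominating the text to be summarized.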
📊 Effect demonstration
Not available.