data_juicer.utils.asset_utils module
- data_juicer.utils.asset_utils.load_words_asset(words_dir: str, words_type: str)
Load words from an asset file named words_type. If no valid asset file is found, download it from ASSET_LINKS, which point to copies cached by the data_juicer team.
- Parameters:
words_dir -- directory that stores asset file(s)
words_type -- name of the target words asset
- Returns:
a dict of words assets whose keys are language names and whose values are lists of words
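
The load-or-download behavior described above can be illustrated with a minimal, self-contained sketch. This is not the data_juicer implementation: the function name `load_words_asset_sketch`, the JSON file format, the file-name matching rule, and the `fetch` callback (standing in for a download from ASSET_LINKS) are all assumptions made for illustration.

```python
import json
import os
import tempfile


def load_words_asset_sketch(words_dir, words_type, fetch=None):
    """Illustrative stand-in only; NOT the data_juicer implementation.

    Assumes assets are JSON files shaped like {"<language>": ["word", ...]}.
    """
    words = {}
    os.makedirs(words_dir, exist_ok=True)

    # Merge every JSON file in words_dir whose name mentions words_type.
    for name in sorted(os.listdir(words_dir)):
        if words_type in name and name.endswith(".json"):
            path = os.path.join(words_dir, name)
            with open(path, encoding="utf-8") as f:
                for lang, items in json.load(f).items():
                    words.setdefault(lang, []).extend(items)

    if not words:
        # No valid local asset: fall back to the (hypothetical) downloader.
        if fetch is None:
            raise FileNotFoundError(
                f"no '{words_type}' asset in {words_dir} and no fetch callback"
            )
        words = fetch(words_type)
        # Cache the result so the next call reads it from disk.
        cache_path = os.path.join(words_dir, f"{words_type}.json")
        with open(cache_path, "w", encoding="utf-8") as f:
            json.dump(words, f, ensure_ascii=False)

    return words


def fake_fetch(words_type):
    # Hypothetical replacement for a network download from ASSET_LINKS.
    return {"en": ["the", "a"], "zh": ["的"]}


with tempfile.TemporaryDirectory() as tmp_dir:
    # First call finds nothing locally, so it uses the fallback and caches.
    first = load_words_asset_sketch(tmp_dir, "stopwords", fetch=fake_fetch)
    # Second call needs no fallback: the cached file is read from disk.
    second = load_words_asset_sketch(tmp_dir, "stopwords")
```

In this sketch the return value has the shape the documentation describes: a dict keyed by language name, with a list of words per language. The asset name `"stopwords"` is only a placeholder chosen for the demonstration.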