data_juicer.config
- data_juicer.config.init_configs(args: List[str] | None = None, which_entry: object = None, load_configs_only=False)
- Initialize the jsonargparse parser and parse configs from one of:
POSIX-style command-line args;
config files in YAML (JSON and Jsonnet supersets);
environment variables;
hard-coded defaults.
- Parameters:
args -- list of params, e.g., ['--config', 'cfg.yaml']; defaults to None.
which_entry -- which entry point to initialize the configs for (executor/analyzer).
load_configs_only -- whether to only load the configs, skipping the backup of config files, the display of the configs, and the logger setup.
- Returns:
a global cfg object used by the DefaultExecutor or Analyzer
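The idea of resolving one setting from several sources can be sketched with the standard library alone. This is an illustration only, not Data-Juicer's or jsonargparse's implementation: the function `load_layered_config`, the `DEMO_` environment prefix, the two default keys, and the precedence order (later sources override earlier ones) are all assumptions of the sketch, and it reads a JSON file rather than YAML to avoid third-party imports.

```python
import argparse
import json
import os

# Hypothetical defaults for illustration; not Data-Juicer's real defaults.
DEFAULTS = {"project_name": "demo", "np": 1}


def load_layered_config(args=None, env=None):
    """Resolve settings from defaults, a JSON config file, env vars, and CLI args."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", default=None)
    parser.add_argument("--project_name", default=None)
    parser.add_argument("--np", type=int, default=None)
    ns = parser.parse_args(args)

    # 1. Start from the hard-coded defaults.
    cfg = dict(DEFAULTS)

    # 2. Overlay values from the config file, if one was given.
    if ns.config:
        with open(ns.config, encoding="utf-8") as f:
            cfg.update(json.load(f))

    # 3. Overlay environment variables (the DEMO_ prefix is made up).
    for key, default in DEFAULTS.items():
        env_key = "DEMO_" + key.upper()
        if env_key in env:
            cfg[key] = type(default)(env[env_key])

    # 4. Overlay options passed explicitly on the command line.
    for key in DEFAULTS:
        value = getattr(ns, key)
        if value is not None:
            cfg[key] = value

    return argparse.Namespace(**cfg)
```

Calling `load_layered_config(['--config', 'cfg.json', '--np', '8'])` would, under these assumptions, take `project_name` from the file and `np` from the command line.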
- data_juicer.config.get_init_configs(cfg: Namespace | Dict, load_configs_only: bool = True)
Set the initial data-juicer configs for cfg.
- data_juicer.config.export_config(cfg: Namespace, path: str, format: str = 'yaml', skip_none: bool = True, skip_check: bool = True, overwrite: bool = False, multifile: bool = True)
Save the config object. Some of the parameters come from jsonargparse.
- Parameters:
cfg -- cfg object to save (Namespace type)
path -- the save path
format -- the output format, one of 'yaml', 'json', 'json_indented', 'parser_mode'
skip_none -- Whether to exclude entries whose value is None.
skip_check -- Whether to skip parser checking.
overwrite -- Whether to overwrite existing files.
multifile -- Whether to save multiple config files by using the __path__ metas.
- Returns:
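The effect of the skip_none and overwrite flags can be illustrated with a small standard-library sketch. The helper `save_config_sketch` is hypothetical and is not how export_config is implemented: it writes indented JSON only, uses `argparse.Namespace` in place of the jsonargparse one, and raising `FileExistsError` when overwrite is False is an assumption of this sketch.

```python
import json
import os
from argparse import Namespace


def save_config_sketch(cfg, path, skip_none=True, overwrite=False):
    """Write cfg to path as indented JSON, imitating two of the flags above."""
    # Refuse to replace an existing file unless the caller opts in.
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(f"{path} already exists; pass overwrite=True to replace it")

    data = dict(vars(cfg))
    if skip_none:
        # Drop top-level entries whose value is None.
        data = {key: value for key, value in data.items() if value is not None}

    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    return data
```

With `cfg = Namespace(project_name="demo", export_path=None)`, the default call writes only `project_name`, while `skip_none=False` keeps `export_path` as a null entry.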
- data_juicer.config.merge_config(ori_cfg: Namespace, new_cfg: Namespace)
Merge the configuration from new_cfg into ori_cfg.
- Parameters:
ori_cfg -- the original configuration object, expected to be a jsonargparse Namespace
new_cfg -- the configuration object to be merged, expected to be a dict or a jsonargparse Namespace
- Returns:
cfg_after_merge
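As a rough picture of what merging one configuration into another means, the sketch below overlays the entries of new_cfg on a copy of ori_cfg. The helper `merge_sketch` is hypothetical, it uses `argparse.Namespace` rather than the jsonargparse type, and the rule that new_cfg wins on a key clash is an assumption; the library's actual merge rules may be more involved.

```python
from argparse import Namespace


def merge_sketch(ori_cfg, new_cfg):
    """Return a new Namespace with new_cfg's entries laid over ori_cfg's."""
    # Accept either a dict or a Namespace for new_cfg.
    new_items = new_cfg if isinstance(new_cfg, dict) else vars(new_cfg)

    merged = dict(vars(ori_cfg))  # copy, so ori_cfg itself is left untouched
    merged.update(new_items)      # on a key clash, the value from new_cfg wins
    return Namespace(**merged)
```

For example, merging `{"np": 4}` into `Namespace(project_name="demo", np=1)` keeps `project_name` and replaces `np`.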