data_juicer.utils.lazy_loader module
A LazyLoader class for on-demand module loading with uv integration.
- class data_juicer.utils.lazy_loader.LazyLoader(module_name: str, package_name: str = None, package_url: str = None, auto_install: bool = True)[source]
Bases: ModuleType
Lazily import a module, mainly to avoid pulling in large dependencies. Uses uv for fast dependency installation when available.
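To illustrate the idea of a lazily imported module, here is a minimal sketch of a `ModuleType` subclass that defers the real import until an attribute is first accessed. `MiniLazyLoader` is a hypothetical stand-in written for this page, not the actual `LazyLoader` implementation, and it omits package installation entirely.

```python
import importlib
import types


class MiniLazyLoader(types.ModuleType):
    """Hypothetical stand-in: defers the real import until first attribute access."""

    def __init__(self, module_name: str):
        super().__init__(module_name)
        # Stored in the instance __dict__, so normal lookup finds it and
        # __getattr__ is not triggered for this attribute.
        self._real_module = None

    def _load(self):
        # Import the real module once and cache it.
        if self._real_module is None:
            self._real_module = importlib.import_module(self.__name__)
        return self._real_module

    def __getattr__(self, item):
        # Called only when normal lookup fails, i.e. for attributes that
        # belong to the real module rather than to this proxy.
        return getattr(self._load(), item)


# Usage: the standard-library json module is imported on the first call.
lazy_json = MiniLazyLoader("json")
not_loaded_yet = lazy_json._real_module is None
encoded = lazy_json.dumps({"a": 1})
```

Subclassing `ModuleType` lets the proxy be bound to a module-level name and used like an ordinary imported module, while the cost of the import is paid only by code paths that actually touch it.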
- classmethod get_package_name(module_name: str) -> str [source]
Convert a module name to its corresponding package name.
- Parameters:
module_name -- The name of the module (e.g., 'cv2', 'PIL')
- Returns:
The corresponding package name (e.g., 'opencv-python', 'Pillow')
- Return type:
str
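The module-to-package conversion can be pictured as a table lookup. The sketch below is an assumption about the general approach, not the library's code: `MODULE_TO_PACKAGE` and `guess_package_name` are hypothetical names, only the 'cv2' and 'PIL' entries come from the documentation above, and the fallback to the base module name is assumed here.

```python
# Hypothetical lookup table; only these two entries are taken from the
# documented examples.
MODULE_TO_PACKAGE = {
    "cv2": "opencv-python",
    "PIL": "Pillow",
}


def guess_package_name(module_name: str) -> str:
    """Map an import name to a pip package name (illustrative only)."""
    # Use the top-level module, so 'PIL.Image' is looked up as 'PIL'.
    base = module_name.split(".")[0]
    # Assumed fallback: if no mapping is known, reuse the base module name.
    return MODULE_TO_PACKAGE.get(base, base)
```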
- classmethod get_all_dependencies()[source]
Get all dependencies, prioritizing uv.lock if available. Falls back to pyproject.toml if uv.lock is not found or fails to parse.
- Returns:
- A dictionary mapping module names to their full package specifications
e.g. {'numpy': 'numpy>=1.26.4,<2.0.0', 'pandas': 'pandas>=2.0.0'}
- Return type:
dict
- classmethod check_packages(package_specs, pip_args=None)[source]
Check if packages are installed and install them if needed.
- Parameters:
package_specs -- A list of package specifications to check/install. Can be package names or URLs (e.g., 'torch' or 'git+https://github.com/...')
pip_args -- Optional list of additional arguments to pass to pip install command (e.g., ['--no-deps', '--upgrade'])
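A check-then-install step of this kind might look like the sketch below. It is not the library's implementation: `is_installed` and `build_install_command` are hypothetical helpers, the choice of `uv pip install` when uv is on the PATH is an assumption based on the module description, and the command is only constructed, never executed.

```python
import shutil
import sys
from importlib import metadata
from typing import List, Optional


def is_installed(distribution: str) -> bool:
    """Return True if the named distribution has installed metadata."""
    try:
        metadata.version(distribution)
        return True
    except metadata.PackageNotFoundError:
        return False


def build_install_command(
    spec: str, pip_args: Optional[List[str]] = None
) -> List[str]:
    """Build (but do not run) an install command for one package spec."""
    extra = list(pip_args or [])
    # Assumed preference: use uv's pip-compatible interface when available.
    if shutil.which("uv"):
        return ["uv", "pip", "install", *extra, spec]
    # Otherwise fall back to pip in the current interpreter.
    return [sys.executable, "-m", "pip", "install", *extra, spec]


cmd = build_install_command("torch", ["--no-deps"])
```

A caller would pass the resulting list to `subprocess.run` only for specs where `is_installed` returns False.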
- __init__(module_name: str, package_name: str = None, package_url: str = None, auto_install: bool = True)[source]
Initialize the LazyLoader.
- Parameters:
module_name -- The name of the module to import (e.g., 'cv2', 'ray.data', 'torchvision.models')
package_name -- The name of the pip package to install (e.g., 'opencv-python', 'ray', 'torchvision'). If None, the base module name is used (e.g., 'ray' for 'ray.data').
package_url -- The URL to install the package from (e.g., git+https://github.com/...)
auto_install -- Whether to automatically install missing dependencies
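A possible usage pattern, based only on the signature documented above, is sketched here. It assumes data_juicer is installed, so the import is placed inside a hypothetical helper function (`build_loaders`) to keep the snippet free of side effects until it is called; the exact moment at which the underlying modules are imported or installed is not specified on this page.

```python
def build_loaders():
    """Create lazy module handles (requires data_juicer to be installed)."""
    from data_juicer.utils.lazy_loader import LazyLoader

    # Import name and pip package name differ, so both are given.
    cv2 = LazyLoader("cv2", package_name="opencv-python")
    # package_name omitted: per the parameter docs, the base module name
    # ('ray') is used as the package name.
    ray_data = LazyLoader("ray.data")
    # Opt out of automatic installation of missing dependencies.
    json_mod = LazyLoader("json", auto_install=False)
    return cv2, ray_data, json_mod
```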