base
Base data module framework providing abstract interfaces for data pipeline components. Defines common structure and behavior for all data processing modules in the system.
BaseDataModule
Bases: BaseModule
Abstract base class for all data processing modules in the pipeline.
Provides common interface and metadata management for data operations. All concrete data modules must inherit from this class and implement the run method.
Attributes:
Name | Type | Description |
---|---|---|
module_type | DataModuleType | Type classification of the data module from DataModuleType enum |
name | str | Unique identifier for the module instance |
config | Optional[Dict[str, Any]] | Module-specific configuration parameters |
metadata | Optional[Dict[str, Any]] | Additional metadata for tracking and debugging |
Source code in rm_gallery/core/data/base.py
get_module_info()
Retrieve comprehensive module information for debugging and monitoring.
Returns:
Type | Description |
---|---|
Dict[str, Any] | Dict containing module type, name, configuration, and metadata; used for pipeline introspection and debugging |
Source code in rm_gallery/core/data/base.py
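As a rough illustration of the return value described above, the sketch below builds a comparable dictionary from the four documented attributes. `ModuleStub`, the dictionary key names, the enum value `"process"`, and the `threshold` config entry are all assumptions made for this example; the actual keys returned by `get_module_info()` are not shown in this page and may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class DataModuleType(Enum):
    # Hypothetical stand-in for the library enum; the value is assumed.
    PROCESS = "process"


@dataclass
class ModuleStub:
    """Stand-in that carries the four documented attributes."""

    module_type: DataModuleType
    name: str
    config: Optional[Dict[str, Any]] = None
    metadata: Optional[Dict[str, Any]] = None

    def get_module_info(self) -> Dict[str, Any]:
        # Assumed key names; the real method may use different ones.
        return {
            "type": self.module_type.value,
            "name": self.name,
            "config": self.config,
            "metadata": self.metadata,
        }


stub = ModuleStub(DataModuleType.PROCESS, "dedup", config={"threshold": 0.9})
info = stub.get_module_info()
print(info)
```

A dictionary of this shape is convenient to log once per pipeline stage, since it can be serialized without touching the module's data.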
run(input_data, **kwargs)
abstractmethod
Execute the module's data processing logic.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_data | Union[BaseDataSet, List[DataSample]] | Input dataset or list of data samples to process | required |
**kwargs |  | Additional runtime parameters specific to the module | {} |
Returns:
Type | Description |
---|---|
Union[BaseDataSet, List[DataSample]] | Processed data in the form of BaseDataSet or List[DataSample] |
Raises:
Type | Description |
---|---|
NotImplementedError | If not implemented by concrete subclass |
Source code in rm_gallery/core/data/base.py
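To make the `run` contract concrete, here is a minimal sketch of a concrete module, written against stand-in classes so that it runs without the library installed. `BaseDataModuleSketch`, the dict-based `DataSample`, `MinLengthFilter`, and the `min_len` option are hypothetical names introduced for illustration; only the idea of subclassing and implementing `run(input_data, **kwargs)` comes from the documentation above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class DataSample(dict):
    """Hypothetical stand-in for the library's DataSample."""


class BaseDataModuleSketch(ABC):
    """Stand-in mirroring the documented interface, not the real class."""

    def __init__(self, name: str, config: Optional[Dict[str, Any]] = None):
        self.name = name
        self.config = config or {}

    @abstractmethod
    def run(self, input_data: List[DataSample], **kwargs) -> List[DataSample]:
        """Process the input samples and return the result."""


class MinLengthFilter(BaseDataModuleSketch):
    """Hypothetical module that drops samples whose text is too short."""

    def run(self, input_data, **kwargs):
        # A runtime keyword argument overrides the configured default.
        min_len = kwargs.get("min_len", self.config.get("min_len", 1))
        return [s for s in input_data if len(s.get("text", "")) >= min_len]


samples = [DataSample(text="hi"), DataSample(text="a longer sample")]
module = MinLengthFilter(name="min_length", config={"min_len": 5})

kept = module.run(samples)                 # uses min_len from config
kept_all = module.run(samples, min_len=1)  # overrides it at call time
```

In this sketch the stand-in base class relies on `abc.ABC`, so instantiating it directly fails with a `TypeError`; how the real base class enforces the requirement is documented above only as raising `NotImplementedError`.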
DataModuleType
Bases: Enum
Enumeration of supported data module types for categorizing processing components.
Each type represents a distinct stage in the data pipeline:
- BUILD: Orchestrates the entire data pipeline workflow
- LOAD: Ingests data from external sources
- GENERATE: Creates new data samples programmatically
- PROCESS: Transforms and filters existing data
- ANNOTATION: Adds labels and metadata to data
- EXPORT: Outputs data to various target formats
Source code in rm_gallery/core/data/base.py
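The six module types could be modeled and used roughly as follows. This is a sketch, not the library's definition: the member names are taken from the list above, but the string values, the `stages_of` helper, and the example pipeline entries are assumptions made for illustration.

```python
from enum import Enum


class DataModuleType(Enum):
    # Member names come from the documentation; the string values are assumed.
    BUILD = "build"
    LOAD = "load"
    GENERATE = "generate"
    PROCESS = "process"
    ANNOTATION = "annotation"
    EXPORT = "export"


def stages_of(modules, wanted):
    """Return the names of (name, type) pairs whose type matches `wanted`."""
    return [name for name, module_type in modules if module_type is wanted]


# Hypothetical pipeline description as (module name, module type) pairs.
pipeline = [
    ("reader", DataModuleType.LOAD),
    ("dedup", DataModuleType.PROCESS),
    ("filter", DataModuleType.PROCESS),
]

process_names = stages_of(pipeline, DataModuleType.PROCESS)
print(process_names)
```

Comparing enum members with `is` works because each member of an `Enum` is a singleton.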