data_juicer.ops.mapper.python_lambda_mapper module
- class data_juicer.ops.mapper.python_lambda_mapper.PythonLambdaMapper(lambda_str: str = '', batched: bool = False, **kwargs)
Bases: Mapper
Mapper for applying a Python lambda function to data samples.
This operator lets users define a custom transformation as a Python lambda function, supplied as a string. The lambda is applied to each sample and must return a dictionary. If the batched parameter is set to True, the lambda receives a batch of samples at once instead of a single sample. If no lambda string is provided, the identity function is used, which returns the input unchanged. Before use, the operator validates that the lambda takes exactly one argument and then compiles it.
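The validation described above can be approximated with the standard-library ast module. The following is a minimal sketch under stated assumptions, not the operator's actual source: the helper names compile_single_arg_lambda and apply_to_sample are hypothetical and exist only for illustration.

```python
import ast
import builtins


def compile_single_arg_lambda(lambda_str: str):
    """Hypothetical helper: parse, validate, and compile a lambda string."""
    if not lambda_str:
        # Empty string: fall back to the identity function.
        return lambda sample: sample

    # Parse the string as a single expression (raises SyntaxError if invalid).
    tree = ast.parse(lambda_str, mode="eval")

    # The expression itself must be a lambda, not a call or a bare name.
    if not isinstance(tree.body, ast.Lambda):
        raise ValueError("Input string must be a lambda expression.")

    # Require exactly one positional argument.
    if len(tree.body.args.args) != 1:
        raise ValueError("Lambda function must take exactly one argument.")

    # Compile the validated AST and evaluate it to get the function object.
    code = compile(tree, "<lambda_str>", "eval")
    return eval(code, {"__builtins__": builtins})


def apply_to_sample(func, sample: dict) -> dict:
    """Hypothetical helper: apply func and enforce the dict-result rule."""
    result = func(sample)
    if not isinstance(result, dict):
        raise ValueError(
            f"Lambda must return a dict, got {type(result).__name__}."
        )
    return result


upper = compile_single_arg_lambda("lambda s: {'text': s['text'].upper()}")
upper_out = apply_to_sample(upper, {"text": "hi"})
```

In this sketch the AST check constrains only the shape of the expression. The lambda body can still call any built-in, so lambda strings are best taken from trusted configuration.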
- __init__(lambda_str: str = '', batched: bool = False, **kwargs)
Initialization method.
- Parameters:
lambda_str – A string representation of the lambda function to be executed on data samples. If empty, the identity function is used.
batched – A boolean indicating whether to process input data in batches.
kwargs – Additional keyword arguments passed to the parent class.
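To illustrate how lambda_str might differ between the two modes, here is a small standalone sketch. It does not import data_juicer. The field names text and length are made up for the example, and it assumes that a batch arrives as a dictionary mapping each field name to a list of values; that layout is an assumption, not something this page states.

```python
# Per-sample mode (batched=False): the argument is a single sample dict.
per_sample_str = "lambda sample: {**sample, 'length': len(sample['text'])}"

# Batched mode (batched=True): the argument is assumed to be a
# column-oriented batch, i.e. a dict of lists (assumption, see lead-in).
batched_str = (
    "lambda batch: {**batch, 'length': [len(t) for t in batch['text']]}"
)

# Evaluate the strings directly so the example runs without data_juicer.
# With the operator, they would be passed through the documented signature:
#   PythonLambdaMapper(lambda_str=per_sample_str)
#   PythonLambdaMapper(lambda_str=batched_str, batched=True)
per_sample_fn = eval(per_sample_str)
batched_fn = eval(batched_str)

per_sample_out = per_sample_fn({"text": "hello"})
batched_out = batched_fn({"text": ["hello", "hi"]})

print(per_sample_out)  # {'text': 'hello', 'length': 5}
print(batched_out)     # {'text': ['hello', 'hi'], 'length': [5, 2]}
```

Both lambdas return a dictionary, which is the result type the operator requires in either mode.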