data_juicer.ops.mapper.python_lambda_mapper module

class data_juicer.ops.mapper.python_lambda_mapper.PythonLambdaMapper(lambda_str: str = '', batched: bool = False, **kwargs)[source]

Bases: Mapper

Mapper that executes a Python lambda function on data samples.

__init__(lambda_str: str = '', batched: bool = False, **kwargs)[source]

Initialization method.

Parameters:
  • lambda_str – A string representation of the lambda function to be executed on data samples. If empty, the identity function is used.

  • batched – A boolean indicating whether to process input data in batches.

  • kwargs – Additional keyword arguments passed to the parent class.
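As a rough illustration of how a lambda string could be turned into a callable, here is a minimal sketch. The helper name `compile_lambda` and its validation rules (exactly one lambda, exactly one positional argument) are assumptions made for this example, not the documented implementation of `PythonLambdaMapper`. Only the empty-string-means-identity behavior comes from the parameter description above.

```python
import ast


def compile_lambda(lambda_str: str = ""):
    """Hypothetical helper: build a callable from a lambda string."""
    # An empty string falls back to the identity function.
    if not lambda_str:
        return lambda sample: sample

    # Parse as a single expression and check that it is a lambda.
    node = ast.parse(lambda_str, mode="eval")
    if not isinstance(node.body, ast.Lambda):
        raise ValueError("lambda_str must be a single lambda expression")

    # Assumption for this sketch: the lambda takes exactly one argument.
    if len(node.body.args.args) != 1:
        raise ValueError("the lambda must accept exactly one argument")

    # Compile the validated AST and evaluate it to get the function object.
    return eval(compile(node, "<lambda_str>", "eval"))


# Usage: add a derived field to a sample dict.
add_len = compile_lambda("lambda d: {**d, 'n_chars': len(d['text'])}")
result = add_len({"text": "hello"})

# Usage: the empty string yields the identity function.
identity = compile_lambda()
```

Checking the parsed AST before evaluating rejects strings that are not lambda expressions, but it does not sandbox the lambda body. Lambda strings should still come only from trusted configuration.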

process_single(sample)[source]

Sample-level processing: takes one sample and returns one processed sample (sample -> sample).

Parameters:
  • sample – The sample to process.

Returns:
  The processed sample.

process_batched(samples)[source]
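The difference between the two modes can be sketched with plain Python lambdas. This example assumes that a single sample is a dict of field values and that a batch is a dict mapping each field name to a list of values (a columnar layout). Neither shape is stated on this page, and the field name `text` is purely illustrative.

```python
# Assumed shape of one sample: a dict of field values.
single_sample = {"text": "hello"}

# Assumed shape of a batch: a dict of column lists.
batch = {"text": ["hello", "world"]}

# A lambda written for sample-level processing (batched=False).
single_fn = lambda d: {"text": d["text"].upper()}

# A lambda written for batch-level processing (batched=True):
# it must transform every entry in the column.
batched_fn = lambda b: {"text": [t.upper() for t in b["text"]]}

single_out = single_fn(single_sample)
batched_out = batched_fn(batch)
```

Under these assumptions, a lambda written for one mode will generally not work in the other, so the `batched` flag and the lambda body need to agree.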