data_juicer.ops.filter.general_field_filter module
- class data_juicer.ops.filter.general_field_filter.GeneralFieldFilter(filter_condition: str = '', *args, **kwargs)
Bases: Filter
Filter that keeps samples satisfying a general field filter condition. The condition is a string that can combine logical operators with chained comparisons.
- __init__(filter_condition: str = '', *args, **kwargs)
Initialization method.
- Parameters:
filter_condition – The filter condition as a string. It can include logical operators (and/or) and chained comparisons. For example: "10 < num <= 30 and text != 'nothing here' and __dj__meta__.a == 3".
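To illustrate how a condition string like the one above reads, here is a minimal, self-contained sketch that evaluates it against a plain dict. It is not the Data-Juicer implementation: the helper names `evaluate_condition` and `_eval` are hypothetical, and the sketch assumes that dotted names such as `__dj__meta__.a` refer to keys of a nested dict in the sample.

```python
import ast
import operator

# Comparison operators supported by this sketch (an assumption, not the
# library's documented set).
_CMP_OPS = {
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def evaluate_condition(condition: str, sample: dict) -> bool:
    """Hypothetical helper: evaluate a condition string against a sample."""
    tree = ast.parse(condition, mode="eval")
    return bool(_eval(tree.body, sample))


def _eval(node, sample):
    # "a and b" / "a or b": combine sub-results, short-circuiting.
    if isinstance(node, ast.BoolOp):
        values = (_eval(v, sample) for v in node.values)
        return all(values) if isinstance(node.op, ast.And) else any(values)
    # "10 < num <= 30": check each adjacent pair from left to right.
    if isinstance(node, ast.Compare):
        left = _eval(node.left, sample)
        for op, comparator in zip(node.ops, node.comparators):
            right = _eval(comparator, sample)
            if not _CMP_OPS[type(op)](left, right):
                return False
            left = right
        return True
    # Bare name: look the field up in the sample.
    if isinstance(node, ast.Name):
        return sample[node.id]
    # Dotted name: assumed here to index into a nested dict.
    if isinstance(node, ast.Attribute):
        return _eval(node.value, sample)[node.attr]
    # Literal such as 10 or 'nothing here'.
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError(f"Unsupported expression: {ast.dump(node)}")


sample = {"num": 20, "text": "hello", "__dj__meta__": {"a": 3}}
condition = "10 < num <= 30 and text != 'nothing here' and __dj__meta__.a == 3"
keep = evaluate_condition(condition, sample)
```

With this sample, all three clauses hold, so `keep` is `True`; changing `num` to 31 breaks the chained comparison and the condition evaluates to `False`. Walking the syntax tree rather than calling `eval` keeps the sketch limited to comparisons, field lookups, and literals.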
- compute_stats_single(sample, context=False)
Compute stats for the sample; these serve as the metric for deciding whether to filter the sample out.
- Parameters:
sample – input sample.
context – whether to store context information of intermediate vars in the sample temporarily.
- Returns:
sample with computed stats