data_juicer.ops.mapper.remove_table_text_mapper module
- class data_juicer.ops.mapper.remove_table_text_mapper.RemoveTableTextMapper(min_col: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=2), Le(le=20)])] = 2, max_col: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=2), Le(le=20)])] = 20, *args, **kwargs)[source]
Bases:
Mapper
Mapper that removes table text from text samples.
A regular expression is used to remove tables whose number of columns falls within the range from min_col to max_col.
- __init__(min_col: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=2), Le(le=20)])] = 2, max_col: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=2), Le(le=20)])] = 20, *args, **kwargs)[source]
Initialization method.
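- Parameters:
min_col -- The minimum number of columns a table must have to be removed. Must be between 2 and 20; defaults to 2.
max_col -- The maximum number of columns a table may have to be removed. Must be between 2 and 20; defaults to 20.
args -- Extra positional arguments.
kwargs -- Extra keyword arguments.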
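To make the column-range idea concrete, here is a minimal standalone sketch of regex-based table removal. It is not the operator's actual implementation: the helper name `remove_table_text`, the pattern, and the definition of a "table" (two or more consecutive lines with the same number of tokens, separated by single spaces or tabs) are all assumptions chosen for illustration. Only the parameter names, defaults, and the 2 to 20 bounds come from the signature above.

```python
import re


def remove_table_text(text: str, min_col: int = 2, max_col: int = 20) -> str:
    """Hypothetical helper: drop table-like runs of lines from `text`."""
    # Mirror the documented bounds on both parameters (2 to 20).
    for name, value in (("min_col", min_col), ("max_col", max_col)):
        if not 2 <= value <= 20:
            raise ValueError(f"{name} must be between 2 and 20, got {value}")

    for n_cols in range(min_col, max_col + 1):
        # Assumed definition: a "row" is a whole line of exactly n_cols tokens,
        # separated by single spaces or tabs and ending in a newline.
        row = r"^\S+(?:[ \t]\S+){%d}\n" % (n_cols - 1)
        # Assumed definition: a "table" is two or more such rows back to back.
        table = re.compile(r"(?:%s){2,}" % row, flags=re.MULTILINE)
        text = table.sub("", text)
    return text


sample = (
    "Quarterly results are listed below.\n"
    "Region Sales\n"
    "North 120\n"
    "South 95\n"
    "All figures are in thousands of dollars.\n"
)

# The three 2-column lines form a table and are removed.
cleaned = remove_table_text(sample)
print(cleaned)

# With min_col=3 the 2-column table is outside the range and is kept.
untouched = remove_table_text(sample, min_col=3)
print(untouched == sample)
```

A heuristic like this cannot tell a real table from ordinary consecutive lines that happen to share a token count, so narrowing the range with min_col and max_col is the main way to limit false positives.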