data_juicer.ops.mapper.whitespace_normalization_mapper module
- class data_juicer.ops.mapper.whitespace_normalization_mapper.WhitespaceNormalizationMapper(*args, **kwargs)[source]
Bases: Mapper
Mapper that normalizes the different kinds of whitespace characters in text samples to the ASCII space ' ' (0x20).
The different kinds of whitespace characters are listed at: https://en.wikipedia.org/wiki/Whitespace_character
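The normalization described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the `WHITESPACE_CHARS` set and the `normalize_whitespace` helper are hypothetical names chosen for illustration, and the set covers only a few of the code points listed on the Wikipedia page.

```python
# Hypothetical subset of Unicode whitespace code points (an assumption for
# this sketch; the mapper's own list may differ).
WHITESPACE_CHARS = {
    "\u00a0",  # no-break space
    "\u2002",  # en space
    "\u2003",  # em space
    "\u2009",  # thin space
    "\u200a",  # hair space
    "\u202f",  # narrow no-break space
    "\u205f",  # medium mathematical space
    "\u3000",  # ideographic space
}


def normalize_whitespace(text: str) -> str:
    """Replace each character in WHITESPACE_CHARS with an ASCII space (0x20)."""
    return "".join(" " if ch in WHITESPACE_CHARS else ch for ch in text)


# A sample shaped like a dict with a "text" field, assumed here for illustration.
sample = {"text": "hello\u00a0world\u3000!"}
sample["text"] = normalize_whitespace(sample["text"])
print(repr(sample["text"]))
```

A character-by-character replacement keeps the text length unchanged, so any offsets computed on the original text remain valid after normalization.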