Data Juicer


data_juicer.ops.mapper.clean_copyright_mapper module

class data_juicer.ops.mapper.clean_copyright_mapper.CleanCopyrightMapper(*args, **kwargs)

Bases: Mapper

Mapper to clean copyright comments at the beginning of the text samples.
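
This page does not describe how the cleaning is done, so the following is only a minimal, standalone sketch of the general idea, not the library's implementation. The function name `strip_leading_copyright`, the regular expressions, and the set of line-comment prefixes are all assumptions made for illustration.

```python
import re

# Hypothetical patterns, chosen for illustration only (not taken from Data-Juicer).
LEADING_BLOCK = re.compile(r"\s*/\*.*?\*/", re.DOTALL)  # /* ... */ at the very start
COPYRIGHT = re.compile(r"copyright", re.IGNORECASE)
LINE_COMMENT_PREFIXES = ("//", "#", "--")


def strip_leading_copyright(text: str) -> str:
    """Drop a copyright comment found at the beginning of `text`."""
    # Case 1: the text opens with a block comment that mentions "copyright".
    block = LEADING_BLOCK.match(text)
    if block and COPYRIGHT.search(block.group(0)):
        return text[block.end():].lstrip("\n")

    # Case 2: the text opens with a run of line comments (and blank lines).
    lines = text.split("\n")
    skip = 0
    for line in lines:
        stripped = line.strip()
        if stripped == "" or stripped.startswith(LINE_COMMENT_PREFIXES):
            skip += 1
        else:
            break

    # Only remove the leading run if it actually mentions "copyright".
    header = "\n".join(lines[:skip])
    if skip and COPYRIGHT.search(header):
        return "\n".join(lines[skip:])
    return text


c_source = "/* Copyright 2024 Example */\nint main() { return 0; }\n"
py_source = "# Copyright 2024\n# License: MIT\n\nprint('hi')\n"
print(strip_leading_copyright(c_source))
print(strip_leading_copyright(py_source))
```

In this sketch, a leading comment that does not mention "copyright" is left in place, and comments later in the text are never touched. The real operator may make different choices.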

__init__(*args, **kwargs)

Initialization method.

Parameters:
  • args – extra positional arguments

  • kwargs – extra keyword arguments

process_batched(samples)
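
The shape of `samples` is not documented on this page. The sketch below assumes a column-oriented batch (a dict mapping field names to equal-length lists) with the text stored under a `"text"` key. Both that layout and the helper names `process_text_column` and `drop_leading_hash_comments` are hypothetical and only illustrate what a batched text mapper might do.

```python
from typing import Any, Callable, Dict, List


def process_text_column(
    samples: Dict[str, List[Any]],
    clean: Callable[[str], str],
    text_key: str = "text",  # assumed field name, not confirmed by this page
) -> Dict[str, List[Any]]:
    """Apply `clean` to each entry of the text column; other columns are untouched."""
    samples[text_key] = [clean(text) for text in samples[text_key]]
    return samples


def drop_leading_hash_comments(text: str) -> str:
    """Toy stand-in for a real cleaner: remove leading lines that start with '#'."""
    lines = text.split("\n")
    while lines and lines[0].startswith("#"):
        lines.pop(0)
    return "\n".join(lines)


batch = {
    "text": ["# Copyright 2024\nx = 1", "y = 2"],
    "meta": [{"id": 1}, {"id": 2}],
}
result = process_text_column(batch, drop_leading_hash_comments)
print(result["text"])
```

Keeping the per-sample cleaner separate from the batch loop means the same function can be tested on single strings and reused across batch layouts.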

© Copyright 2024, Data-Juicer Team.

Created using Sphinx 9.0.4.

Built with the PyData Sphinx Theme 0.16.1.