data_juicer.ops.grouper.key_value_grouper module
- class data_juicer.ops.grouper.key_value_grouper.KeyValueGrouper(group_by_keys: List[str] | None = None, *args, **kwargs)
Bases: Grouper
Group samples into batched samples according to the values of the given keys.
- __init__(group_by_keys: List[str] | None = None, *args, **kwargs)
Initialization method.
- Parameters:
group_by_keys – group samples according to the values of these keys. Nested keys such as '__dj__stats__.text_len' are supported. Defaults to [self.text_key].
args – extra args
kwargs – extra args
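The grouping behavior described above can be illustrated with a small standalone sketch. It does not use or reproduce the Data-Juicer implementation: the helpers `get_nested` and `group_by_key_values` are hypothetical names introduced here, and the shape of a "batched sample" (one dict mapping each field to a list of values) is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def get_nested(sample: Dict[str, Any], key: str) -> Any:
    """Hypothetical helper: resolve a dot-separated key against nested dicts."""
    value: Any = sample
    for part in key.split("."):
        value = value[part]
    return value


def group_by_key_values(
    samples: List[Dict[str, Any]], group_by_keys: List[str]
) -> List[Dict[str, List[Any]]]:
    """Hypothetical helper: group samples that share the same values for all keys."""
    groups: Dict[tuple, List[Dict[str, Any]]] = defaultdict(list)
    for sample in samples:
        # The group identity is the tuple of values found under each key.
        # This sketch requires those values to be hashable.
        group_id = tuple(get_nested(sample, key) for key in group_by_keys)
        groups[group_id].append(sample)

    # Assumption: a batched sample maps each field to the list of that
    # field's values across the group's members.
    batched = []
    for members in groups.values():
        batched.append({field: [m[field] for m in members] for field in members[0]})
    return batched


samples = [
    {"text": "a", "__dj__stats__": {"text_len": 1}},
    {"text": "bb", "__dj__stats__": {"text_len": 2}},
    {"text": "c", "__dj__stats__": {"text_len": 1}},
]
batches = group_by_key_values(samples, ["__dj__stats__.text_len"])
for batch in batches:
    print(batch["text"])
```

In this sketch the two samples with text_len 1 end up in one batch and the remaining sample in another. Passing several keys groups on the combination of their values.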