data_juicer.ops.mapper.imgdiff_difference_caption_generator_mapper module
- class data_juicer.ops.mapper.imgdiff_difference_caption_generator_mapper.Difference_Caption_Generator_Mapper(mllm_mapper_args: Dict | None = {}, image_text_matching_filter_args: Dict | None = {}, text_pair_similarity_filter_args: Dict | None = {}, *args, **kwargs)
Bases: Mapper
A fused operator for OPs that is used to run sequential OPs on the same batch to allow fine-grained control on data processing.
- __init__(mllm_mapper_args: Dict | None = {}, image_text_matching_filter_args: Dict | None = {}, text_pair_similarity_filter_args: Dict | None = {}, *args, **kwargs)
Base class that conducts data editing.
- Parameters:
text_key – the key name of field that stores sample texts to be processed.
image_key – the key name of field that stores sample image list to be processed
audio_key – the key name of field that stores sample audio list to be processed
video_key – the key name of field that stores sample video list to be processed
query_key – the key name of field that stores sample queries
response_key – the key name of field that stores responses
history_key – the key name of field that stores history of queries and responses
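The idea of running sequential OPs on the same batch, with each stage configured by its own argument dict and reading fields through configurable key names, can be sketched in plain Python. This is only an illustration and does not use or reproduce Data-Juicer's implementation: the helpers `make_min_images_filter`, `make_caption_stage`, and `run_fused`, the `min_images` option, and the default key names `"text"` and `"images"` are all hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, List, Optional

Sample = Dict[str, Any]
Batch = List[Sample]
Stage = Callable[[Batch], Batch]


def make_min_images_filter(args: Optional[Dict] = None) -> Stage:
    """Hypothetical filter stage: keep samples with at least `min_images` images."""
    args = args or {}  # a None default avoids sharing one mutable dict across calls
    image_key = args.get("image_key", "images")  # assumed default key name
    min_images = args.get("min_images", 1)  # hypothetical option

    def stage(batch: Batch) -> Batch:
        return [s for s in batch if len(s.get(image_key, [])) >= min_images]

    return stage


def make_caption_stage(args: Optional[Dict] = None) -> Stage:
    """Hypothetical mapper stage: write a placeholder caption into the text field."""
    args = args or {}
    text_key = args.get("text_key", "text")  # assumed default key name
    image_key = args.get("image_key", "images")

    def stage(batch: Batch) -> Batch:
        # Build new dicts so the input samples are left unmodified.
        return [
            {**s, text_key: f"caption for {len(s.get(image_key, []))} image(s)"}
            for s in batch
        ]

    return stage


def run_fused(stages: List[Stage], batch: Batch) -> Batch:
    """Apply each stage in order to the same batch, feeding its output to the next."""
    for stage in stages:
        batch = stage(batch)
    return batch


batch = [
    {"text": "", "images": ["a.jpg", "b.jpg"]},
    {"text": "", "images": ["c.jpg"]},
]
stages = [make_min_images_filter({"min_images": 2}), make_caption_stage()]
result = run_fused(stages, batch)
print(result)
```

In this sketch the filter drops the single-image sample before the caption stage runs, so only the two-image sample receives a caption. Each factory takes its own argument dict, loosely mirroring how the constructor above accepts one dict per sub-operator.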