data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper module
- data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper.compare_text_index(text1, text2)
- data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper.iou_filter(samples, iou_thresh)
- class data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper.Difference_Area_Generator_Mapper(image_pair_similarity_filter_args: Dict | None = {}, image_segment_mapper_args: Dict | None = {}, image_text_matching_filter_args: Dict | None = {}, *args, **kwargs)
Bases: Mapper
A fused operator that runs sequential OPs on the same batch, allowing fine-grained control over data processing.
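The idea of running sequential OPs on one batch can be sketched in plain Python. This is only an illustration under stated assumptions: `FusedSketch`, `strip_text`, and `drop_empty` are hypothetical names invented here, and the code does not reproduce how data_juicer implements this class.

```python
from typing import Any, Callable, Dict, List

# A batch is assumed here to be a dict of equally long columns.
Batch = Dict[str, List[Any]]


class FusedSketch:
    """Hypothetical stand-in: applies each op in order to the same batch."""

    def __init__(self, ops: List[Callable[[Batch], Batch]]):
        self.ops = ops

    def process(self, batch: Batch) -> Batch:
        # The output of one op becomes the input of the next.
        for op in self.ops:
            batch = op(batch)
        return batch


def strip_text(batch: Batch) -> Batch:
    # Mapper-style step: edit every text in place of the original.
    return {**batch, "text": [t.strip() for t in batch["text"]]}


def drop_empty(batch: Batch) -> Batch:
    # Filter-style step: keep only the rows whose text is non-empty.
    keep = [i for i, t in enumerate(batch["text"]) if t]
    return {key: [column[i] for i in keep] for key, column in batch.items()}


fused = FusedSketch([strip_text, drop_empty])
result = fused.process({"text": ["  a cat ", "   ", "a dog"]})
```

Because every step sees the batch left by the previous one, the order of the ops matters: here the whitespace-only row is dropped only because stripping ran first.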
- __init__(image_pair_similarity_filter_args: Dict | None = {}, image_segment_mapper_args: Dict | None = {}, image_text_matching_filter_args: Dict | None = {}, *args, **kwargs)
Base class that conducts data editing.
- Parameters:
text_key – the key name of the field that stores the sample texts to be processed.
image_key – the key name of the field that stores the sample image list to be processed.
audio_key – the key name of the field that stores the sample audio list to be processed.
video_key – the key name of the field that stores the sample video list to be processed.
query_key – the key name of the field that stores the sample queries.
response_key – the key name of the field that stores the responses.
history_key – the key name of the field that stores the history of queries and responses.
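The key parameters above name the fields an operator reads from each sample. A minimal sketch of that lookup follows, assuming dict-shaped samples; the helper `read_fields`, the default key names, and the sample contents are all hypothetical and are not taken from data_juicer.

```python
from typing import Any, Dict, List, Tuple


def read_fields(
    sample: Dict[str, Any],
    text_key: str = "text",      # assumed default, for illustration only
    image_key: str = "images",   # assumed default, for illustration only
) -> Tuple[str, List[str]]:
    """Hypothetical helper: fetch the text and image list by key name."""
    return sample.get(text_key, ""), sample.get(image_key, [])


# A dataset whose fields use different names than the assumed defaults.
sample = {"caption": "two similar photos", "imgs": ["a.jpg", "b.jpg"]}
text, images = read_fields(sample, text_key="caption", image_key="imgs")
```

Passing the key names explicitly lets the same operator work on datasets whose fields are named differently.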