data_juicer.ops.mapper.dialog_llm_input_utils module#
Helpers for dialog LLM mappers (intent / topic / sentiment / intensity).
- data_juicer.ops.mapper.dialog_llm_input_utils.build_dialog_turns_for_prompt(sample: dict, *, history_key: str, query_key: str, response_key: str) List[Tuple[str, str]][source]#
Build (user, assistant) turns for dialog LLM mappers.
Does not mutate sample. Merge rules match dialog_quality_llm_utils._normalize_dialog_tail: after normalization, the last turn lives in both dialog_history[-1] and query/response, so those fields must not be appended again (doing so would duplicate the final exchange; older code that mutated dialog_history in place corrupted downstream rows).
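The no-mutation and no-duplication rules above can be sketched as follows. This is a minimal illustration, not the library's actual implementation: the function name carries a `_sketch` suffix, and the guard that compares the history tail against query/response is an assumption about how the normalized layout is deduplicated.

```python
from typing import Dict, List, Tuple


def build_dialog_turns_for_prompt_sketch(
    sample: Dict,
    *,
    history_key: str = "history",
    query_key: str = "query",
    response_key: str = "response",
) -> List[Tuple[str, str]]:
    """Sketch: collect (user, assistant) turns without mutating ``sample``.

    Assumes the normalized layout described above, where the final exchange
    already lives in ``sample[history_key][-1]`` as well as in the
    query/response fields; those fields are only appended when history is
    empty or its last turn differs, so the final exchange is never doubled.
    """
    # Copy into a fresh list so the caller's sample is left untouched.
    turns: List[Tuple[str, str]] = [
        (str(u), str(a)) for u, a in (sample.get(history_key) or [])
    ]
    tail = (str(sample.get(query_key, "")), str(sample.get(response_key, "")))
    # Append query/response only when it is not already the history tail.
    if any(tail) and (not turns or turns[-1] != tail):
        turns.append(tail)
    return turns
```

A sample whose history already ends with the query/response pair yields each exchange exactly once, while a history-less sample still contributes its final turn.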
- data_juicer.ops.mapper.dialog_llm_input_utils.clip_text_for_dialog_prompt(text: str, max_chars: int, note: str = 'truncated') str[source]#
Truncate long text for API prompts when max_chars > 0. Agent traces often concatenate tool outputs into response; formatter limits elsewhere do not apply to these mappers' history_key payloads.
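A minimal sketch of the clipping behavior described above. The exact truncation-marker format is an assumption; only the documented contract is modeled: a non-positive max_chars disables clipping, and the note parameter labels the cut.

```python
def clip_text_for_dialog_prompt_sketch(
    text: str, max_chars: int, note: str = "truncated"
) -> str:
    """Sketch: clip ``text`` to at most ``max_chars`` characters.

    A non-positive ``max_chars`` disables clipping, matching the documented
    "when max_chars > 0" condition. The appended marker format is a
    hypothetical choice for this sketch, not the library's actual output.
    """
    if max_chars <= 0 or len(text) <= max_chars:
        return text  # nothing to do: clipping disabled or text short enough
    return text[:max_chars] + f"...[{note}]"
```

Keeping a visible marker (rather than clipping silently) lets downstream readers of the prompt tell that a long tool-output payload was cut.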