data_juicer.utils.agent_output_locale module
Preferred output language helpers for agent / dialog LLM operators.
YAML configs and op kwargs set preferred_output_lang (e.g. zh, en, zh-CN).
The value is normalized to zh or en for prompt selection. JSON keys stay
English where required for parsing; free-text fields follow this locale.
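As a rough illustration of the normalization described above, the sketch below maps locale-like strings to `zh` or `en`. It is not the library's implementation: the name `normalize_lang_sketch` and the matching rules (case-insensitive, keyed on the primary subtag before `-` or `_`) are assumptions. Only the fallback to `en` for missing or unknown values is taken from this page.

```python
from typing import Optional


def normalize_lang_sketch(value: Optional[str]) -> str:
    """Hypothetical stand-in for normalize_preferred_output_lang."""
    # Missing or empty input falls back to English.
    if not value:
        return "en"
    # Assumption: compare only the primary subtag, e.g. "zh-CN" -> "zh".
    primary = value.strip().lower().replace("_", "-").split("-")[0]
    # Anything that is not Chinese is treated as unknown and mapped to "en".
    return "zh" if primary == "zh" else "en"
```

Under these assumptions, `zh`, `zh-CN`, and `ZH_tw` all select the Chinese prompts, while `None`, an empty string, or an unsupported tag such as `fr` select the English ones.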
- data_juicer.utils.agent_output_locale.normalize_preferred_output_lang(value: str | None) → str
Return zh or en (default en if missing/unknown).
- data_juicer.utils.agent_output_locale.dialog_score_json_instruction(lang: str) → str
Instruction block requesting JSON with a 1–5 score and a reason (used by dialog / trace quality mappers).
- data_juicer.utils.agent_output_locale.rubric_reason_language_clause(lang: str) → str
Clause to append to the system prompt: the rubric may be in English; reason follows the locale.
- data_juicer.utils.agent_output_locale.llm_filter_free_text_language_appendix(lang: str | None) → str
Appendix for the LLMAnalysisFilter system_prompt that sets the language of the rationale / tags.
- data_juicer.utils.agent_output_locale.agent_insight_system_prompt(lang: str) → str
System prompt for agent_insight_llm_mapper.
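The convention these helpers share (English JSON keys, locale-dependent free text) can be sketched as follows. Everything in this block is hypothetical: the function names, the prompt wording, and the `score` / `reason` key names are assumptions chosen for illustration, and they do not reproduce the strings the module actually returns.

```python
import json
from typing import Tuple

# Assumed mapping from normalized locale to a language name used in the prompt.
_LANGUAGE_NAMES = {"zh": "Chinese", "en": "English"}


def score_instruction_sketch(lang: str) -> str:
    """Hypothetical instruction: fixed English keys, localized free text."""
    language = _LANGUAGE_NAMES.get(lang, "English")
    example = json.dumps({"score": 3, "reason": "..."})
    return (
        "Reply with a single JSON object shaped like " + example + ". "
        'Keep the keys in English; "score" is an integer from 1 to 5. '
        f'Write the "reason" value in {language}.'
    )


def parse_score_reply(raw: str) -> Tuple[int, str]:
    """Parse a reply; works for any locale because the keys never change."""
    data = json.loads(raw)
    return int(data["score"]), str(data["reason"])
```

Keeping the keys fixed means one parser can handle replies in either language; only the value of the free-text field changes with the locale.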