data_juicer.ops.mapper.agent_insight_llm_mapper module
- class data_juicer.ops.mapper.agent_insight_llm_mapper.AgentInsightLLMMapper(*args, **kwargs)[source]
Bases: Mapper
Synthesize stats + LLM eval text into meta.agent_insight_llm (JSON).
Intended to run after filters/mappers that populate stats and agent_bad_case_signal_mapper. Use run_for_tiers to limit API cost.
Output is best-effort JSON; raw model text is stored in meta.agent_insight_llm_raw if parsing fails.
- __init__(api_model: str = 'gpt-4o', *, api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, query_key: str = 'query', response_key: str = 'response', query_preview_max_chars: int = 500, response_preview_max_chars: int = 500, run_for_tiers: List[str] | None = None, try_num: Annotated[int, Gt(gt=0)] = 2, model_params: Dict = {}, sampling_params: Dict = {}, preferred_output_lang: str = 'en', **kwargs)[source]
Base class that conducts data editing.
- Parameters:
text_key -- the key name of field that stores sample texts to be processed.
image_key -- the key name of field that stores sample image list to be processed
audio_key -- the key name of field that stores sample audio list to be processed
video_key -- the key name of field that stores sample video list to be processed
image_bytes_key -- the key name of field that stores sample image bytes list to be processed
query_key -- the key name of field that stores sample queries
response_key -- the key name of field that stores responses
history_key -- the key name of field that stores history of queries and responses
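The "best-effort JSON" behaviour described for this class can be illustrated with a minimal sketch. This is not the library's implementation: the helper names parse_best_effort and store_insight are hypothetical, and the fallback strategy (trying the outermost {...} span when the full text does not parse) is an assumption. Only the two meta keys, agent_insight_llm and agent_insight_llm_raw, come from the description above.

```python
import json
from typing import Any, Dict, Optional


def parse_best_effort(text: str) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: try to recover a JSON object from model text."""
    candidates = [text]
    # Assumption: models often wrap JSON in extra prose, so also try the
    # outermost {...} span as a second candidate.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            value = json.loads(candidate)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue
        if isinstance(value, dict):
            return value
    return None


def store_insight(meta: Dict[str, Any], model_text: str) -> Dict[str, Any]:
    """Hypothetical helper: write parsed JSON, or the raw text if parsing fails."""
    parsed = parse_best_effort(model_text)
    if parsed is not None:
        meta["agent_insight_llm"] = parsed
    else:
        # Keep the raw model output so nothing is lost when parsing fails.
        meta["agent_insight_llm_raw"] = model_text
    return meta
```

A downstream consumer would then check for agent_insight_llm first and fall back to inspecting agent_insight_llm_raw only for samples where the structured field is absent.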