data_juicer.ops.mapper.agent_skill_insight_mapper module
- class data_juicer.ops.mapper.agent_skill_insight_mapper.AgentSkillInsightMapper(*args, **kwargs)[source]
Bases: Mapper
Summarize agent_tool_types and agent_skill_types into insights via LLM.
Reads meta[agent_tool_types] and meta[agent_skill_types] (from agent_dialog_normalize_mapper), calls the API for 3–5 concrete capability phrases (about 10 Chinese characters or roughly 4–8 English words each, avoiding vague labels such as 'read/write' or 'processing'), and stores them in meta[agent_skill_insights]. Run this mapper after agent_dialog_normalize_mapper. Override system_prompt for a locale-specific label style.
- __init__(api_model: str = 'gpt-4o', *, tool_types_key: str = 'agent_tool_types', skill_types_key: str = 'agent_skill_types', insights_key: str = 'agent_skill_insights', api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, try_num: Annotated[int, Gt(gt=0)] = 2, model_params: Dict = {}, sampling_params: Dict = {}, preferred_output_lang: str = 'en', **kwargs)[source]
Base class that conducts data editing.
- Parameters:
text_key -- the key name of field that stores sample texts to be processed.
image_key -- the key name of field that stores sample image list to be processed
audio_key -- the key name of field that stores sample audio list to be processed
video_key -- the key name of field that stores sample video list to be processed
image_bytes_key -- the key name of field that stores sample image bytes list to be processed
query_key -- the key name of field that stores sample queries
response_key -- the key name of field that stores responses
history_key -- the key name of field that stores history of queries and responses
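The data flow described for AgentSkillInsightMapper (read the two type lists from meta, ask an LLM for a few capability phrases, store the result under the insights key) can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the function `summarize_skill_insights`, the `call_llm` callable, the prompt wording, the reply parsing, and the empty-input behaviour are all assumptions made for illustration. Only the three key names and the `try_num` default come from the signature above.

```python
from typing import Callable, Dict, List

# Key names mirror the defaults in the __init__ signature above.
TOOL_TYPES_KEY = "agent_tool_types"
SKILL_TYPES_KEY = "agent_skill_types"
INSIGHTS_KEY = "agent_skill_insights"


def summarize_skill_insights(
    meta: Dict,
    call_llm: Callable[[str], str],  # hypothetical stand-in for the API call
    try_num: int = 2,
) -> Dict:
    """Illustrative sketch only; the real mapper may behave differently."""
    tools = meta.get(TOOL_TYPES_KEY) or []
    skills = meta.get(SKILL_TYPES_KEY) or []
    if not tools and not skills:
        # Assumed behaviour: nothing to summarize, so store an empty list.
        meta[INSIGHTS_KEY] = []
        return meta

    # Assumed prompt wording; the real system prompt is not shown in the docs.
    prompt = (
        "Summarize these into 3-5 concrete capability phrases, one per line.\n"
        f"Tool types: {', '.join(str(t) for t in tools)}\n"
        f"Skill types: {', '.join(str(s) for s in skills)}"
    )

    insights: List[str] = []
    for _ in range(try_num):
        try:
            reply = call_llm(prompt)
        except Exception:
            continue  # retry when the (hypothetical) API call fails
        # Keep non-empty lines, strip list bullets, cap at five phrases.
        lines = [ln.strip(" -*\t") for ln in reply.splitlines()]
        insights = [ln for ln in lines if ln][:5]
        if insights:
            break

    meta[INSIGHTS_KEY] = insights
    return meta


def fake_llm(prompt: str) -> str:
    """Canned reply so the sketch runs without network access."""
    return "- query SQL databases\n- parse PDF invoices\n- schedule calendar events"


sample_meta = {
    "agent_tool_types": ["sql", "pdf_reader", "calendar"],
    "agent_skill_types": ["data_lookup", "document_parsing"],
}
result = summarize_skill_insights(sample_meta, fake_llm)
print(result["agent_skill_insights"])
```

In an actual pipeline the operator would be configured rather than reimplemented; the sketch is only meant to make the read, call, and store steps concrete, including why a retry count such as `try_num` is useful when the API call can fail.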