data_juicer_agents.tools.op_manager.op_retrieval module

data_juicer_agents.tools.op_manager.op_retrieval.fast_text_encoder(text: str) → str

Fast encoding using xxHash algorithm
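As a rough illustration of what a fast, non-cryptographic text hash looks like, the sketch below maps a string to a short hex digest. The helper name `text_fingerprint`, the 64-bit digest width, and the hashlib fallback are assumptions for illustration only; the digest width and output format of `fast_text_encoder` are not stated in this reference.

```python
import hashlib

try:
    import xxhash  # third-party package; optional for this sketch
except ImportError:
    xxhash = None


def text_fingerprint(text: str) -> str:
    """Return a 16-character hex digest of `text` (hypothetical helper)."""
    data = text.encode("utf-8")
    if xxhash is not None:
        # 64-bit xxHash: fast and non-cryptographic
        return xxhash.xxh64(data).hexdigest()
    # Fallback when xxhash is not installed. This is NOT xxHash;
    # it only keeps the sketch runnable with the standard library.
    return hashlib.blake2b(data, digest_size=8).hexdigest()


print(text_fingerprint("filter out short texts"))
```

A digest like this is typically useful as a compact, deterministic key, for example to detect whether a piece of text has changed since it was last processed.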

data_juicer_agents.tools.op_manager.op_retrieval.init_dj_func_info()

Initialize dj_func_info at agent startup

data_juicer_agents.tools.op_manager.op_retrieval.refresh_dj_func_info()

Refresh dj_func_info during agent runtime (for manual updates)

data_juicer_agents.tools.op_manager.op_retrieval.get_dj_func_info()

Get current dj_func_info (lifecycle-aware)
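The init / refresh / get trio above resembles a common module-level cache pattern. The following is a minimal sketch of that pattern, not the module's actual code: the names `_load_func_info`, `init_info`, `refresh_info`, and `get_info`, the record fields, and the lazy initialization inside the getter are all assumptions.

```python
from typing import Optional

# Module-level cache; None means "not initialized yet".
_func_info: Optional[list] = None


def _load_func_info() -> list:
    """Hypothetical loader; a real one would scan the available operators."""
    return [{"name": "example_op", "desc": "placeholder description"}]


def init_info() -> None:
    """Populate the cache once, e.g. at agent startup."""
    global _func_info
    _func_info = _load_func_info()


def refresh_info() -> None:
    """Rebuild the cache on demand, e.g. after operators were updated."""
    global _func_info
    _func_info = _load_func_info()


def get_info() -> list:
    """Return the cached data, initializing it lazily if needed (assumed)."""
    if _func_info is None:
        init_info()
    return _func_info


first = get_info()        # triggers lazy initialization
again = get_info()        # served from the cache, same object
refresh_info()            # manual update replaces the cached object
print(first is again, get_info() is first)
```

Routing all reads through a single getter keeps callers independent of when the data was loaded or last refreshed.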

async data_juicer_agents.tools.op_manager.op_retrieval.retrieve_ops_lm(user_query, limit=20)

Tool retrieval using language model - returns list of tool names

data_juicer_agents.tools.op_manager.op_retrieval.retrieve_ops_vector(user_query, limit=20)

Tool retrieval using vector search with smart caching - returns list of tool names

async data_juicer_agents.tools.op_manager.op_retrieval.retrieve_ops(user_query: str, limit: int = 20, mode: str = 'auto') → list

Tool retrieval with configurable mode

Parameters:
  • user_query – User query string

  • limit – Maximum number of tools to retrieve

  • mode – Retrieval mode: “llm”, “vector”, or “auto” (default: “auto”).

      - “llm”: Use the language model only.
      - “vector”: Use vector search only.
      - “auto”: Try the LLM first; fall back to vector search on failure.

Returns:

List of tool names
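The mode dispatch described above can be sketched with stand-in retrievers. This is an illustrative sketch, not the module's implementation: `lm_retrieve`, `vector_retrieve`, and `retrieve` are hypothetical names, the LLM stand-in always fails so that the fallback path is exercised, and raising `ValueError` for an unknown mode is an assumption.

```python
import asyncio


async def lm_retrieve(query: str, limit: int) -> list:
    """Stand-in for the LLM-based retriever; always fails in this sketch."""
    raise RuntimeError("LLM backend unavailable")


def vector_retrieve(query: str, limit: int) -> list:
    """Stand-in for the vector-search retriever; returns fixed names."""
    return ["op_a", "op_b", "op_c"][:limit]


async def retrieve(query: str, limit: int = 20, mode: str = "auto") -> list:
    if mode == "llm":
        return await lm_retrieve(query, limit)
    if mode == "vector":
        return vector_retrieve(query, limit)
    if mode == "auto":
        try:
            # Prefer the LLM result when it is available.
            return await lm_retrieve(query, limit)
        except Exception:
            # Any LLM failure falls back to vector search.
            return vector_retrieve(query, limit)
    # Assumption: unknown modes are rejected explicitly.
    raise ValueError(f"unknown mode: {mode!r}")


print(asyncio.run(retrieve("remove duplicate samples", limit=2)))
```

Because `retrieve_ops` is declared `async`, it has to be awaited from a coroutine or driven with `asyncio.run(...)` as shown here.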