Data-Juicer Agents: Towards Agentic Data Processing#
A Suite of Agents for Agentic Data Processing. Built on Data-Juicer (DJ) and AgentScope.
简体中文 | English
Overview Doc • Quick Start Doc • CLI Doc • Tools Doc • Roadmap
News#
[2026-03-11] Major refactor and upgrade of data_juicer_agents completed. The project architecture and CLI/session capabilities were comprehensively redesigned for better maintainability and extensibility.
Overview | Quick Start | CLI Doc | Tools | Roadmap
Try processing data by chatting with the agent!
[2026-01-15] Q&A Copilot has been deployed on Data-Juicer's official Doc Site, DingTalk, and Discord. Feel free to ask Juicer anything related to the Data-Juicer ecosystem!
Deploy-ready code | More demos | Roadmap
Roadmap#
The long-term vision of DJ-Agents is to enable a development-free data processing lifecycle, allowing developers to focus on what to do rather than how to do it.
To achieve this vision, we are tackling two fundamental challenges:
Agents: How to design and build powerful agents specialized in data processing
Services & Tools: How to package these agents into ready-to-use, out-of-the-box products
We iterate continuously on both fronts, and the roadmap may evolve as our understanding and capabilities improve.
Agents#
Data-Juicer Data Processing Agent (DJ Process Agent) & Data-Juicer Code Development Agent (DJ Dev Agent)
We have stopped building scenario-specific data processing agents and are instead building data processing tools for general-purpose agents. From there, we:
Hard-orchestrate these tools into capabilities, exposed as the djx CLI
Soft-orchestrate them through prompts, packaged as skills
Rely on agent self-orchestration to support conversational data processing
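The difference between hard and soft orchestration can be illustrated with a minimal sketch. Everything below is hypothetical: the tool names (`drop_empty`, `dedup_exact`), the `TOOLS` registry, and both helper functions are invented for illustration and are not part of Data-Juicer, the djx CLI, or AgentScope. The sketch only shows the idea that a capability fixes the tool order in code, while a skill describes the tools in a prompt and leaves the ordering to a general-purpose agent.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Tool = Callable[[List[Record]], List[Record]]


def drop_empty(records: List[Record]) -> List[Record]:
    """Hypothetical tool: remove records whose text is blank."""
    return [r for r in records if r["text"].strip()]


def dedup_exact(records: List[Record]) -> List[Record]:
    """Hypothetical tool: keep the first occurrence of each exact text."""
    seen = set()
    kept = []
    for r in records:
        if r["text"] not in seen:
            seen.add(r["text"])
            kept.append(r)
    return kept


# Hypothetical registry of tools that an agent could be given.
TOOLS: Dict[str, Tool] = {"drop_empty": drop_empty, "dedup_exact": dedup_exact}


def clean_capability(records: List[Record]) -> List[Record]:
    """Hard orchestration: the tool order is fixed in code."""
    for name in ("drop_empty", "dedup_exact"):
        records = TOOLS[name](records)
    return records


def build_skill_prompt(goal: str) -> str:
    """Soft orchestration: describe the tools in a prompt and let a
    general-purpose agent decide which to call and in what order."""
    lines = [f"- {name}: {fn.__doc__}" for name, fn in TOOLS.items()]
    return f"Goal: {goal}\nAvailable tools:\n" + "\n".join(lines)


data = [{"text": "hello"}, {"text": "  "}, {"text": "hello"}, {"text": "world"}]
cleaned = clean_capability(data)
prompt = build_skill_prompt("clean a small text dataset")
print(cleaned)
print(prompt)
```

In this sketch, self-orchestration would correspond to handing the agent only the tool registry and a conversational request, with no fixed pipeline and no prepared skill prompt.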
Services & Tools#
Q&A Copilot: a Q&A assistant for the Data-Juicer ecosystem
[2026-01-15]: already deployed on Data-Juicer's official Doc Site, DingTalk, and Discord
InteRecipe: interactive data recipe construction through natural language
[2026-03-11]: the current ./interactive_recipe only shows workflow-based examples. The dj-agents CLI entry is already built and supports interactive data-recipe construction through natural language in the TUI. We are developing a frontend tool (studio) on top of this foundation as the next upgrade.
Priority Items#
DJ Skills: use prompt-based soft orchestration to package tools into skills for general-purpose agents.
InteRecipe Studio: support interactive data recipe construction through natural language, with multi-dimensional data and result views.
Plan Tool: extend support for fuller Data-Juicer capability coverage, DJ Hub recipe matching, and more.
Dev Tool: stabilization testing and optimization
Long-term Directions#
Continue building tools and skills for broader data-processing scenarios, enabling wider and more flexible applications.
RAG
Embodied Intelligence
Data Lakehouse architectures
Common Issues#
Q: How do I get a DashScope API key?
A: Visit the official DashScope website, register an account, and apply for an API key.
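Once a key has been issued, it is usually supplied to the agent through an environment variable. The sketch below assumes the variable is named `DASHSCOPE_API_KEY`; that name and the helper `get_dashscope_api_key` are assumptions made for illustration, so check the Quick Start doc for the name this project actually reads.

```python
import os
from typing import Mapping, Optional


def get_dashscope_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from DASHSCOPE_API_KEY (assumed variable name).

    Raises RuntimeError if the variable is missing or blank, so a
    misconfigured environment fails early with a clear message.
    """
    source = os.environ if env is None else env
    key = source.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it before starting the agent."
        )
    return key


# Demo with an explicit mapping so the example runs without a real key.
demo_key = get_dashscope_api_key({"DASHSCOPE_API_KEY": " sk-placeholder "})
print(demo_key)
```

Calling `get_dashscope_api_key()` with no argument reads from the real process environment instead of the demo mapping.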