data_juicer_agents.tools.plan.build_system_spec

build_system_spec tool package.

class data_juicer_agents.tools.plan.build_system_spec.BuildSystemSpecInput(*, np: int | None = None, executor_type: str | None = None, custom_operator_paths: List[str] = <factory>, **extra_data: Any)

Bases: BaseModel

Input for building system spec.

Core parameters are exposed directly for common use cases. All other system parameters can be passed as additional keyword arguments. Use the list_system_config tool to discover all available options.

model_config = {'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

np: int | None
executor_type: str | None
custom_operator_paths: List[str]
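
To illustrate how `'extra': 'allow'` behaves, here is a minimal sketch using a stand-in model with the same three documented fields. It assumes Pydantic v2. `SystemSpecInputSketch` and `some_extra_option` are hypothetical names used only for illustration, and the `executor_type` value is a placeholder. The real class is BuildSystemSpecInput, and valid extra options should be looked up with the list_system_config tool.

```python
from typing import List, Optional

from pydantic import BaseModel, ConfigDict, Field


class SystemSpecInputSketch(BaseModel):
    """Hypothetical stand-in that mirrors the documented fields."""

    # extra="allow" keeps undeclared keyword arguments instead of rejecting them.
    model_config = ConfigDict(extra="allow")

    np: Optional[int] = None
    executor_type: Optional[str] = None
    custom_operator_paths: List[str] = Field(default_factory=list)


# "some_extra_option" is a made-up key and "default" is an illustrative value.
spec_input = SystemSpecInputSketch(
    np=4, executor_type="default", some_extra_option=True
)

# Declared fields are read by name; undeclared kwargs land in model_extra.
declared_fields = {
    name: getattr(spec_input, name) for name in SystemSpecInputSketch.model_fields
}
extra_fields = spec_input.model_extra
```

In this sketch the declared fields and the extra keyword arguments stay separate, which is how a caller could pass less common system options without the model declaring each one.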
data_juicer_agents.tools.plan.build_system_spec.build_system_spec(*, custom_operator_paths: Iterable[Any] | None = None, np: int | None = None, executor_type: str | None = None, **kwargs: Any) -> Dict[str, Any]

Build a system spec with the complete config loaded dynamically from Data-Juicer.

This function loads all system configuration fields from Data-Juicer, so it stays in sync with upstream changes automatically.

Parameters:
  • custom_operator_paths – Optional iterable of custom operator paths

  • np – Optional number of processes

  • executor_type – Optional executor type

  • **kwargs – Any additional system config options (must be valid Data-Juicer system config fields; unknown keys raise ValueError)

Returns:

Dict containing the built system spec and validation results
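
The keyword-only calling convention and the rejection of unknown keys can be sketched as follows. This is a simplified stand-in, not the actual implementation: `build_system_spec_sketch`, the `KNOWN_SYSTEM_FIELDS` allow-list, and the `example_flag` key are hypothetical, and the real function loads its field names from Data-Juicer. The shape of the returned dict is also an assumption, and the validation results mentioned above are left out.

```python
from typing import Any, Dict, Iterable, Optional

# Hypothetical allow-list; the real tool derives valid fields from Data-Juicer.
KNOWN_SYSTEM_FIELDS = {"np", "executor_type", "custom_operator_paths", "example_flag"}


def build_system_spec_sketch(
    *,
    custom_operator_paths: Optional[Iterable[Any]] = None,
    np: Optional[int] = None,
    executor_type: Optional[str] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Illustrative stand-in: collect options and reject unknown keys."""
    unknown = sorted(set(kwargs) - KNOWN_SYSTEM_FIELDS)
    if unknown:
        raise ValueError(f"Unknown system config fields: {unknown}")

    # Normalize the iterable of paths into a list of strings.
    spec: Dict[str, Any] = {
        "custom_operator_paths": [str(p) for p in (custom_operator_paths or [])]
    }
    # Only include the core options the caller actually set.
    if np is not None:
        spec["np"] = np
    if executor_type is not None:
        spec["executor_type"] = executor_type
    spec.update(kwargs)
    return spec


spec = build_system_spec_sketch(np=2, example_flag=True)

try:
    build_system_spec_sketch(not_a_field=1)
except ValueError as exc:
    error_message = str(exc)
```

Because all parameters are keyword-only, a misspelled option name surfaces immediately as a ValueError instead of being silently ignored.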