data_juicer_agents.tools.context.list_dataset_formatters

list_dataset_formatters tool package.

class data_juicer_agents.tools.context.list_dataset_formatters.ListDatasetFormattersInput(*, include_ray: bool = True)

Bases: BaseModel

Input for list_dataset_formatters.

Discovers which dataset formatters (dynamic data generators) are available in the current Data-Juicer installation. Use this BEFORE build_dataset_spec when you need to configure the dataset_source.generated field for dynamic dataset generation (e.g., EmptyFormatter for creating empty datasets).

include_ray: bool
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
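As a minimal sketch of how the input model might be constructed, the snippet below builds it once with the default and once with include_ray disabled. The import path follows the module name documented above. The fallback class is a hypothetical stand-in, not part of the package; it exists only so the snippet runs where data_juicer_agents is not installed.

```python
try:
    from data_juicer_agents.tools.context.list_dataset_formatters import (
        ListDatasetFormattersInput,
    )
except ImportError:
    # Hypothetical stand-in mirroring the documented signature; used only
    # when data_juicer_agents is not importable in this environment.
    class ListDatasetFormattersInput:
        def __init__(self, *, include_ray: bool = True):
            self.include_ray = include_ray


# include_ray is keyword-only and defaults to True per the signature above.
default_input = ListDatasetFormattersInput()
no_ray_input = ListDatasetFormattersInput(include_ray=False)

print(default_input.include_ray)
print(no_ray_input.include_ray)
```

Because the field is keyword-only, passing the flag positionally is expected to fail; spelling it out keeps calls self-describing.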

data_juicer_agents.tools.context.list_dataset_formatters.list_dataset_formatters(*, include_ray: bool = True) -> Dict[str, Any]

List available dataset formatters from Data-Juicer.

Discovers which dataset formatters (dynamic data generators) are registered in the current Data-Juicer installation by comparing OPSearcher results with and without formatter inclusion.

Parameters:

include_ray – Whether to include Ray-specific formatters.

Returns:

Dict with ‘formatters’ list and metadata.
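The "with and without formatter inclusion" comparison described above can be illustrated with a set difference. This is a sketch of the idea only, not the package's implementation: the registry, search_names, list_formatters_sketch, the example_* entries, and the "count" metadata key are all hypothetical, and the real OPSearcher API is not shown here.

```python
from typing import Any, Dict, Set

# Hypothetical registry standing in for whatever OPSearcher would scan.
_HYPOTHETICAL_REGISTRY = [
    {"name": "example_filter", "kind": "op", "ray": False},
    {"name": "EmptyFormatter", "kind": "formatter", "ray": False},
    {"name": "example_ray_formatter", "kind": "formatter", "ray": True},
]


def search_names(*, include_formatter: bool) -> Set[str]:
    """Hypothetical search: return entry names, optionally with formatters."""
    return {
        entry["name"]
        for entry in _HYPOTHETICAL_REGISTRY
        if include_formatter or entry["kind"] != "formatter"
    }


def list_formatters_sketch(*, include_ray: bool = True) -> Dict[str, Any]:
    """Derive formatter names as the difference between the two searches."""
    with_formatters = search_names(include_formatter=True)
    without_formatters = search_names(include_formatter=False)
    names = with_formatters - without_formatters

    if not include_ray:
        # Drop entries flagged as Ray-specific in the hypothetical registry.
        is_ray = {e["name"]: e["ray"] for e in _HYPOTHETICAL_REGISTRY}
        names = {name for name in names if not is_ray[name]}

    ordered = sorted(names)
    # "count" is an assumed metadata key, chosen for illustration.
    return {"formatters": ordered, "count": len(ordered)}


print(list_formatters_sketch())
print(list_formatters_sketch(include_ray=False))
```

Anything that appears only when formatters are included must itself be a formatter, which is why a plain set difference is enough here.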