data_juicer_agents.tools.context.inspect_dataset

inspect_dataset tool package.

class data_juicer_agents.tools.context.inspect_dataset.GenericOutput(*, ok: bool = True)

Bases: BaseModel

ok: bool
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class data_juicer_agents.tools.context.inspect_dataset.InspectDatasetInput(*, dataset_path: str, sample_size: Annotated[int, Ge(ge=1)] = 20)

Bases: BaseModel

dataset_path: str
sample_size: int
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
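The signature above makes both fields keyword-only, defaults sample_size to 20, and constrains it with Ge(ge=1). The following is a minimal sketch of how that constraint behaves, assuming pydantic is installed. InspectDatasetInputSketch is a hypothetical stand-in that mirrors the documented fields, not the package's own class, and the path "data/demo.jsonl" is a placeholder.

```python
from pydantic import BaseModel, Field, ValidationError


class InspectDatasetInputSketch(BaseModel):
    """Hypothetical stand-in mirroring the documented fields."""

    dataset_path: str
    # Mirrors Annotated[int, Ge(ge=1)] = 20 from the documented signature.
    sample_size: int = Field(default=20, ge=1)


# Omitting sample_size falls back to the documented default.
default_args = InspectDatasetInputSketch(dataset_path="data/demo.jsonl")

# A value below 1 violates the ge=1 constraint and is rejected at construction.
try:
    InspectDatasetInputSketch(dataset_path="data/demo.jsonl", sample_size=0)
    rejected = False
except ValidationError:
    rejected = True
```

Validating at the model boundary means a caller passing a zero or negative sample size gets an error before any dataset is opened.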

data_juicer_agents.tools.context.inspect_dataset.inspect_dataset_schema(dataset_path: str, sample_size: int = 20) -> Dict[str, Any]

Inspect a small sample of a dataset and infer its keys and modality for planning.
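This page does not document the structure of the returned dictionary or the rules used to infer modality. To illustrate the general idea of sampling a few records and collecting their keys, here is a standard-library sketch. Everything in it is an assumption made for illustration: the function name sketch_inspect_jsonl, the JSONL input format, the key-name hints, and the result keys "sampled", "keys", and "modality". It is not the package's implementation.

```python
import json
import os
import tempfile
from itertools import islice
from typing import Any, Dict, List

# Hypothetical key-name hints; the real tool's modality rules are not documented here.
_NON_TEXT_HINTS = ("images", "audios", "videos")


def sketch_inspect_jsonl(dataset_path: str, sample_size: int = 20) -> Dict[str, Any]:
    """Read up to sample_size JSON-object lines and summarize their keys."""
    if sample_size < 1:
        raise ValueError("sample_size must be >= 1")
    keys: List[str] = []
    sampled = 0
    with open(dataset_path, encoding="utf-8") as handle:
        non_empty = (line for line in handle if line.strip())
        for line in islice(non_empty, sample_size):
            record = json.loads(line)  # assumes each line is a JSON object
            sampled += 1
            for key in record:
                if key not in keys:  # keep first-seen order
                    keys.append(key)
    if any(key in _NON_TEXT_HINTS for key in keys):
        modality = "multimodal"
    elif "text" in keys:
        modality = "text"
    else:
        modality = "unknown"
    return {"sampled": sampled, "keys": keys, "modality": modality}


# Usage: write three records to a temporary file, then sample only the first two.
records = [
    {"text": "a", "images": []},
    {"text": "b", "meta": {}},
    {"text": "c", "extra": 1},
]
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    for rec in records:
        tmp.write(json.dumps(rec) + "\n")
    tmp_path = tmp.name

result = sketch_inspect_jsonl(tmp_path, sample_size=2)
os.remove(tmp_path)
```

Because only the first two records are read, the "extra" key from the third record never appears in the summary, which is the trade-off of a small sample_size: inspection stays cheap, but rare keys can be missed.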