data_juicer_agents.tools.context.inspect_dataset
inspect_dataset tool package.
- class data_juicer_agents.tools.context.inspect_dataset.GenericOutput(*, ok: bool = True)[source]
Bases: BaseModel
- ok: bool
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class data_juicer_agents.tools.context.inspect_dataset.InspectDatasetInput(*, dataset_source: DatasetSource, sample_size: Annotated[int, Ge(ge=1)] = 20)[source]
Bases: BaseModel
- dataset_source: DatasetSource
- sample_size: int
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- data_juicer_agents.tools.context.inspect_dataset.inspect_dataset_schema(dataset_source=None, sample_size: int = 20) -> Dict[str, Any][source]
Inspect a small sample of a dataset and infer keys/modality for planning.
Accepts a DatasetSource object that encapsulates the dataset path and config. When dataset_source is None, returns a friendly error dict instead of raising.
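The behaviour described above (sample a few records, infer their keys, and return an error dict rather than raising when the source is missing) can be illustrated with a small standard-library sketch. This is not the library's implementation: inspect_records, its records parameter, and the "error", "sampled", and "keys" entries of the result are hypothetical names chosen for illustration, and a plain list of dicts stands in for a DatasetSource, whose constructor is not documented here. Returning an error dict for sample_size below 1 is likewise an assumption; in the documented InspectDatasetInput model that bound is enforced by the Ge(ge=1) constraint.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Optional


def inspect_records(
    records: Optional[Iterable[Dict[str, Any]]] = None,
    sample_size: int = 20,
) -> Dict[str, Any]:
    """Hypothetical stand-in for illustration; not the library function."""
    # Mirror the documented behaviour: a missing source yields an error
    # dict instead of an exception.
    if records is None:
        return {"ok": False, "error": "dataset_source is required"}
    # Assumption: treat an out-of-range sample_size the same way.
    if sample_size < 1:
        return {"ok": False, "error": "sample_size must be >= 1"}

    # Read at most sample_size records, then collect every key seen.
    sample = list(islice(records, sample_size))
    keys = sorted({key for record in sample for key in record})
    return {"ok": True, "sampled": len(sample), "keys": keys}


rows = [{"text": "hello"}, {"text": "world", "image": "a.png"}]
print(inspect_records(rows, sample_size=1))
print(inspect_records(None))
```

Because failures come back as a dict with ok set to False, a caller such as a planning agent can branch on the result without wrapping the call in try/except.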