data_juicer_agents.tools.context.inspect_dataset.logic module
Lightweight dataset probing utilities for planning-time schema inference.
- data_juicer_agents.tools.context.inspect_dataset.logic.inspect_dataset_schema(dataset_source=None, sample_size: int = 20) → Dict[str, Any]
Inspect a small sample of a dataset and infer keys/modality for planning.
Accepts a DatasetSource object that encapsulates the dataset path and config. When dataset_source is None, returns a dict describing the error instead of raising an exception.
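The behavior described above (sample a few records, collect their keys, guess a modality, and report problems as a dict rather than an exception) can be illustrated with a minimal standalone sketch. This is not the library's implementation: `sketch_inspect_schema` is a hypothetical stand-in that takes a plain JSONL path instead of a DatasetSource, and the result keys (`ok`, `error`, `sampled`, `keys`, `modality`) and the field-name heuristic for modality are assumptions made for illustration only.

```python
import json
from itertools import islice
from typing import Any, Dict, Optional


def sketch_inspect_schema(path: Optional[str] = None,
                          sample_size: int = 20) -> Dict[str, Any]:
    """Hypothetical stand-in: probe the first `sample_size` JSONL records."""
    if path is None:
        # Mirror the documented contract: describe the problem, do not raise.
        # The exact shape of this dict is an assumption.
        return {"ok": False, "error": "dataset_source is required"}

    key_counts: Dict[str, int] = {}
    sampled = 0
    with open(path, "r", encoding="utf-8") as fh:
        # Skip blank lines, then read at most `sample_size` records.
        non_empty = (line for line in fh if line.strip())
        for line in islice(non_empty, sample_size):
            record = json.loads(line)
            sampled += 1
            if isinstance(record, dict):
                for key in record:
                    key_counts[key] = key_counts.get(key, 0) + 1

    # Assumed heuristic: infer modality from which field names are present.
    has_text = "text" in key_counts
    has_image = any(k in key_counts for k in ("image", "images"))
    if has_text and has_image:
        modality = "multimodal"
    elif has_image:
        modality = "image"
    elif has_text:
        modality = "text"
    else:
        modality = "unknown"

    return {
        "ok": True,
        "sampled": sampled,
        "keys": sorted(key_counts),
        "modality": modality,
    }
```

Reading only a bounded sample keeps the probe cheap enough to run at planning time, and returning an error dict lets a calling agent inspect the failure without wrapping every call in a try/except.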