data_juicer_agents.tools.context.inspect_dataset.logic module

Lightweight dataset probing utilities for planning-time schema inference.

data_juicer_agents.tools.context.inspect_dataset.logic.inspect_dataset_schema(dataset_path: str, sample_size: int = 20) → Dict[str, Any]

Inspect a small sample of a dataset (up to sample_size records, 20 by default) and infer its keys and modality for planning.
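
The general idea of sample-based schema probing can be sketched as follows. This is an illustration only, not the module's implementation: the helper name `probe_jsonl_schema`, the JSONL input format, the key-name hints in `_MODALITY_HINTS`, and the shape of the returned dictionary are all assumptions made for this example.

```python
import json
from collections import Counter
from itertools import islice
from typing import Any, Dict

# Hypothetical key-name hints; the real module may use entirely different rules.
_MODALITY_HINTS = {
    "text": ("text", "content"),
    "image": ("image", "images"),
}


def probe_jsonl_schema(dataset_path: str, sample_size: int = 20) -> Dict[str, Any]:
    """Read at most `sample_size` lines of a JSONL file and summarize its keys.

    Hypothetical helper for illustration; it assumes one JSON object per line.
    """
    key_counts: Counter = Counter()
    sampled = 0
    with open(dataset_path, "r", encoding="utf-8") as fh:
        # islice stops after `sample_size` lines, so large files are never fully read.
        for line in islice(fh, sample_size):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if isinstance(record, dict):
                key_counts.update(record.keys())
                sampled += 1

    # A modality is reported when any of its hint keys appears in the sample.
    modalities = sorted(
        name
        for name, hints in _MODALITY_HINTS.items()
        if any(hint in key_counts for hint in hints)
    )
    return {
        "sampled": sampled,
        "keys": sorted(key_counts),
        "modalities": modalities,
    }
```

Reading only the first few records keeps the probe cheap enough to run at planning time, at the cost of missing keys that appear only later in the file. The documented function is called with the same two arguments, for example `inspect_dataset_schema("path/to/data.jsonl", sample_size=20)`, where the path is a placeholder; the contents of the dictionary it returns are not specified on this page.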