data_juicer.core.executor.dag_execution_mixin module#

DAG Execution Mixin for Data-Juicer Executors

This mixin provides DAG execution planning and monitoring that can be integrated into existing executors to provide intelligent pipeline analysis and execution tracking.

class data_juicer.core.executor.dag_execution_mixin.DAGExecutionMixin[source]#

Bases: object

Mixin that provides DAG-based execution planning and monitoring.

This mixin can be integrated into any executor to provide: - DAG execution planning - Execution monitoring tied to DAG nodes - Event logging with DAG context

__init__()[source]#

Initialize the DAG execution mixin.

get_dag_execution_status() Dict[str, Any][source]#

Get DAG execution status.

visualize_dag_execution_plan() str[source]#

Get visualization of the DAG execution plan.

get_dag_execution_plan_path() str[source]#

Get the path to the saved DAG execution plan.

reconstruct_dag_state_from_events(job_id: str) Dict[str, Any] | None[source]#

Reconstruct DAG execution state from event logs.

This method has been decomposed into smaller, focused methods for better maintainability and testability.

Parameters:

job_id – The job ID to analyze

Returns:

Dictionary containing reconstructed DAG state and resumption information

resume_dag_execution(job_id: str, dataset, ops: List) bool[source]#

Resume DAG execution from the last known state.

Parameters:
  • job_id – The job ID to resume

  • dataset – The dataset to process

  • ops – List of operations to execute

Returns:

True if resumption was successful, False otherwise