data_juicer.utils.job.stopper module

DataJuicer Job Stopper

A utility to stop DataJuicer jobs by reading event logs to find process and thread IDs, then terminating those specific processes and threads.
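
The discovery step described above can be illustrated with a minimal sketch. It is not the module's implementation: the helper name `discover_ids`, the JSON Lines layout, and the `process_id` / `thread_id` field names are all assumptions made for illustration, and the real event log format may differ.

```python
import json
from pathlib import Path
from typing import Set, Tuple


def discover_ids(event_log: Path) -> Tuple[Set[int], Set[int]]:
    """Collect process and thread IDs from a JSON Lines event log.

    Hypothetical format: one JSON object per line, with optional integer
    "process_id" and "thread_id" fields. This is an assumption, not the
    documented DataJuicer log schema.
    """
    pids: Set[int] = set()
    tids: Set[int] = set()
    with event_log.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip truncated or corrupt lines
            if not isinstance(event, dict):
                continue  # only JSON objects can carry the fields we want
            if isinstance(event.get("process_id"), int):
                pids.add(event["process_id"])
            if isinstance(event.get("thread_id"), int):
                tids.add(event["thread_id"])
    return pids, tids
```

Returning sets keeps each ID once even when a worker logs many events, so every process would be targeted for termination only once.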

class data_juicer.utils.job.stopper.JobStopper(job_id: str, base_dir: str = 'outputs/partition-checkpoint-eventlog')

Bases: object

Stop DataJuicer jobs using event log-based process discovery.

__init__(job_id: str, base_dir: str = 'outputs/partition-checkpoint-eventlog')

terminate_process_gracefully(proc, timeout: int = 10) → bool

Terminate a process gracefully with timeout.
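
A common way to terminate gracefully with a timeout is to send a polite termination request, wait, and escalate to a forced kill only if the process is still alive. The sketch below shows that pattern with the standard library's `subprocess.Popen`. It is an assumption-laden illustration, not the method's source: the helper name `terminate_gracefully` is hypothetical, the type of the real `proc` argument is not stated here, and the meaning of the real `bool` return value may differ from the one chosen in this sketch.

```python
import subprocess
import sys


def terminate_gracefully(proc: subprocess.Popen, timeout: float = 10) -> bool:
    """Ask a child process to exit, killing it if it ignores the request.

    In this sketch, True means the process exited on its own after the
    polite request (or had already exited); False means it had to be killed.
    """
    if proc.poll() is not None:
        return True  # already exited, nothing to do
    proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # forced kill (SIGKILL on POSIX)
        proc.wait()  # reap the child so it does not linger as a zombie
        return False


# Usage: start a long-running child, then stop it.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
stopped_politely = terminate_gracefully(child, timeout=5)
print("exited without force:", stopped_politely)
```

Waiting after the forced kill matters on POSIX systems, where an unreaped child stays in the process table until its parent collects the exit status.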

cleanup_job_resources() → None

Clean up job resources and update job summary.

stop_job(force: bool = False, timeout: int = 30) → Dict[str, Any]

Stop the DataJuicer job using event log-based process discovery.

data_juicer.utils.job.stopper.stop_job(job_id: str, base_dir: str = 'outputs/partition-checkpoint-eventlog', force: bool = False, timeout: int = 30) → Dict[str, Any]

Stop a DataJuicer job using event log-based process discovery.
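
A call to this function might be wrapped as in the sketch below. The wrapper `stop_and_report` and its `stop_fn` parameter are hypothetical names introduced here so the example can be exercised without a running job. Only the `stop_job` signature comes from this page. The keys of the returned dictionary are not documented here, so the sketch prints whatever it receives instead of assuming specific fields.

```python
from typing import Any, Callable, Dict, Optional


def stop_and_report(
    job_id: str,
    base_dir: str = "outputs/partition-checkpoint-eventlog",
    force: bool = False,
    timeout: int = 30,
    stop_fn: Optional[Callable[..., Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Stop a job and print every field of the returned result dictionary."""
    if stop_fn is None:
        # Imported lazily so the wrapper can be tested with a stand-in.
        from data_juicer.utils.job.stopper import stop_job as stop_fn
    result = stop_fn(job_id, base_dir=base_dir, force=force, timeout=timeout)
    for key, value in result.items():
        print(f"{key}: {value}")
    return result
```

With Data-Juicer installed, `stop_and_report("my_job_id")` would call the real `stop_job`, where `"my_job_id"` is a placeholder for an actual job identifier.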

data_juicer.utils.job.stopper.main()

Main function for command-line usage.