data_juicer_sandbox.hooks module

class data_juicer_sandbox.hooks.BaseHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: object

__init__(job_cfg, watcher, *args, **kwargs)[source]
run(context_infos: ContextInfos)[source]
hook(**kwargs)[source]
specify_dj_and_extra_configs(allow_fail=False)[source]
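The signatures above suggest a template pattern in which each subclass supplies its own hook() and callers go through run(). The following is a minimal sketch of that pattern only. `SketchBaseHook` and `CountSamplesHook` are hypothetical stand-ins, and the dict-based config and context, the list used as a watcher, and the way run() forwards to hook() are all assumptions made for illustration, not the library's implementation.

```python
class SketchBaseHook:
    """Stand-in for BaseHook; mirrors only the documented constructor signature."""

    def __init__(self, job_cfg, watcher, *args, **kwargs):
        self.job_cfg = job_cfg  # job configuration (assumed dict-like here)
        self.watcher = watcher  # result recorder (a plain list in this sketch)

    def run(self, context_infos):
        # Assumption: run() delegates to hook() and hands the context back.
        self.hook(context_infos=context_infos)
        return context_infos

    def hook(self, **kwargs):
        # Subclasses are expected to override this method.
        raise NotImplementedError


class CountSamplesHook(SketchBaseHook):
    """Hypothetical subclass: counts samples and reports the count."""

    def hook(self, **kwargs):
        context_infos = kwargs["context_infos"]
        n = len(self.job_cfg["samples"])
        self.watcher.append(("num_samples", n))  # record the result
        context_infos["num_samples"] = n         # expose it to later jobs


watcher = []  # a plain list stands in for the real watcher object
hook = CountSamplesHook({"samples": ["a", "b", "c"]}, watcher)
result = hook.run({})
print(result)
```

Keeping the shared bookkeeping in run() and the job-specific work in hook() is one common way to let every concrete hook be driven through the same entry point.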
class data_juicer_sandbox.hooks.ProbeViaAnalyzerHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

The hook to probe a dataset via the Data-Juicer Analyzer.

Input:
  • A Data-Juicer config.

Output:
  • The path to which the analyzed dataset is exported.

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for probing the data via the Analyzer.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.ProbeViaModelInferHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for probing the data via model inference.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.GeneralProbeHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]
hook(**kwargs)[source]
class data_juicer_sandbox.hooks.RefineRecipeViaKSigmaHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for refining the recipe via K Sigma.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
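"K Sigma" commonly refers to keeping values that lie within k standard deviations of the mean. The sketch below illustrates that general idea on a list of per-sample statistics. The helper `k_sigma_bounds`, the sample values, and the choice of population standard deviation are assumptions for illustration; how this hook actually derives or applies its thresholds is not described on this page.

```python
import statistics


def k_sigma_bounds(values, k=3.0):
    """Return (low, high) = (mean - k * std, mean + k * std) for the values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return mean - k * std, mean + k * std


# Hypothetical per-sample statistic, e.g. a text-length score, with one outlier.
stats = [10.0, 12.0, 11.0, 13.0, 9.0, 50.0]
low, high = k_sigma_bounds(stats, k=1.0)

# Keep only the samples whose statistic falls inside the k-sigma band.
kept = [v for v in stats if low <= v <= high]
print(low, high, kept)
```

A smaller k gives a tighter band and filters more aggressively; a larger k keeps more of the distribution's tails.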
class data_juicer_sandbox.hooks.RefineRecipeViaModelFeedbackHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for refining the recipe via model feedback.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.ProcessDataHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for processing the data via Data-Juicer.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.DataPoolManipulationHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

Hook for data pool manipulation, including construction, combination, ranking, etc.

__init__(job_cfg, watcher, *args, **kwargs)[source]
hook(**kwargs)[source]
class data_juicer_sandbox.hooks.GeneralDataExecutorHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]
hook(**kwargs)[source]
class data_juicer_sandbox.hooks.TrainModelHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for model training.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.InferModelHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for model inference.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.EvaluateDataHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for data evaluation.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
class data_juicer_sandbox.hooks.EvaluateModelHook(job_cfg, watcher, *args, **kwargs)[source]

Bases: BaseHook

__init__(job_cfg, watcher, *args, **kwargs)[source]

Initialize the hook for model evaluation.

Parameters:
  • job_cfg -- the job configs

  • watcher -- for watching the result

hook(**kwargs)[source]
data_juicer_sandbox.hooks.register_hook(job_cfg, watcher)[source]
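This page gives only the signature of register_hook. A function with this shape is often a small factory that picks a hook class by name from the job config and instantiates it with the same (job_cfg, watcher) pair. The sketch below shows that pattern under those assumptions; `sketch_register_hook`, `SKETCH_HOOKS`, `EchoHook`, and the "hook" config key are hypothetical names, not the module's actual code.

```python
class EchoHook:
    """Hypothetical hook class used only to populate the sketch registry."""

    def __init__(self, job_cfg, watcher):
        self.job_cfg = job_cfg
        self.watcher = watcher


# Assumed name-to-class mapping; the real module may organize this differently.
SKETCH_HOOKS = {"EchoHook": EchoHook}


def sketch_register_hook(job_cfg, watcher):
    """Look up the hook class named in job_cfg and instantiate it."""
    name = job_cfg["hook"]  # assumed config key
    if name not in SKETCH_HOOKS:
        raise ValueError(f"Unknown hook: {name!r}")
    return SKETCH_HOOKS[name](job_cfg, watcher)


hook = sketch_register_hook({"hook": "EchoHook"}, watcher=None)
print(type(hook).__name__)
```

Failing loudly on an unknown name, rather than returning None, surfaces a misspelled hook in the config at registration time instead of at run time.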