data_juicer.utils.unittest_utils module

data_juicer.utils.unittest_utils.TEST_TAG(*tags)

Tags for test cases. Currently, "standalone" and "ray" are supported.
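
As a rough illustration of how a tagging decorator of this kind can be used to select tests, here is a minimal sketch. It does not import data_juicer: `tag_test`, the `example_tags` attribute, and `select_by_tag` are hypothetical stand-ins invented for this example and are not the library's implementation of TEST_TAG.

```python
import unittest


def tag_test(*tags):
    """Hypothetical stand-in for a tagging decorator (not the library's code)."""
    def decorator(func):
        # Record the tags on the function so a runner can filter on them.
        func.example_tags = tags
        return func
    return decorator


class ExampleTest(unittest.TestCase):

    @tag_test("standalone", "ray")
    def test_both_modes(self):
        self.assertEqual(1 + 1, 2)

    def test_untagged(self):
        self.assertTrue(True)


def select_by_tag(case_cls, tag):
    """Return the names of test methods in case_cls that carry the given tag."""
    names = unittest.TestLoader().getTestCaseNames(case_cls)
    return [
        name for name in names
        if tag in getattr(getattr(case_cls, name), "example_tags", ())
    ]
```

Calling `select_by_tag(ExampleTest, "ray")` picks out only the tagged method, which is the kind of filtering a per-executor test run would need.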

data_juicer.utils.unittest_utils.set_clear_model_flag(flag)
data_juicer.utils.unittest_utils.set_from_fork_flag(flag)
class data_juicer.utils.unittest_utils.DataJuicerTestCaseBase(methodName='runTest')

Bases: TestCase

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

classmethod tearDownClass(hf_model_name=None) → None

Hook method for deconstructing the class fixture after running all tests in the class.

setUp()

Hook method for setting up the test fixture before exercising it.

tearDown() → None

Hook method for deconstructing the test fixture after testing it.
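
The four hooks above follow the standard unittest fixture order. The sketch below records that order using plain `unittest.TestCase` rather than DataJuicerTestCaseBase, so it runs without data_juicer installed. `HookOrderTest`, `calls`, and `run_hook_order_demo` are names made up for this example.

```python
import io
import unittest

calls = []  # records the order in which hooks and tests run


class HookOrderTest(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        calls.append("setUpClass")

    @classmethod
    def tearDownClass(cls):
        calls.append("tearDownClass")

    def setUp(self):
        calls.append("setUp")

    def tearDown(self):
        calls.append("tearDown")

    def test_a(self):
        calls.append("test_a")

    def test_b(self):
        calls.append("test_b")


def run_hook_order_demo():
    """Run the case with a silent runner and return the recorded call order."""
    calls.clear()
    suite = unittest.TestLoader().loadTestsFromTestCase(HookOrderTest)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return list(calls)
```

The class-level hooks run once around the whole class, while `setUp` and `tearDown` wrap each individual test method.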

generate_dataset(data) → DJDataset

Generate dataset for a specific executor.

Parameters:
  • type (str, optional) – “standalone” or “ray”. Defaults to “standalone”.

run_single_op(dataset: DJDataset, op, column_names)

Run an operator in the specified executor.

assertDatasetEqual(first, second)
assertListOfDictEqual(first: List[Dict], second: List[Dict], ignore_order=True)

Assert that two lists of dicts are equal.
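
One way to compare lists of dicts without regard to order is to sort both lists by a canonical serialization before comparing element by element. The following is a sketch of that idea, not the library's implementation. `canonical` and `lists_of_dicts_equal` are hypothetical helpers, and the sketch assumes every dict has string keys and JSON-serializable values.

```python
import json
from typing import Dict, List


def canonical(d: Dict) -> str:
    """Serialize a dict with sorted keys so equal dicts give equal strings."""
    return json.dumps(d, sort_keys=True)


def lists_of_dicts_equal(first: List[Dict], second: List[Dict],
                         ignore_order: bool = True) -> bool:
    """Compare two lists of dicts, optionally ignoring element order."""
    if len(first) != len(second):
        return False
    if ignore_order:
        # Sorting by the canonical string puts equal dicts at equal positions.
        first = sorted(first, key=canonical)
        second = sorted(second, key=canonical)
    return all(a == b for a, b in zip(first, second))
```

With `ignore_order=False` the same two lists compare equal only if their elements also appear in the same positions.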

data_juicer.utils.unittest_utils.get_diff_files(prefix_filter=['data_juicer/', 'tests/'])

Get the files reported by git diff that lie in the target directories, excluding __init__.py files.
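
The filtering step described here can be sketched as a pure function over a list of changed paths. In practice such a list would come from a command like `git diff --name-only`; the sketch takes it as an argument so that it runs without a repository. `filter_changed_files` is a hypothetical name and this is not the library's implementation.

```python
import posixpath
from typing import Iterable, List, Sequence


def filter_changed_files(
    paths: Iterable[str],
    prefix_filter: Sequence[str] = ("data_juicer/", "tests/"),
) -> List[str]:
    """Keep paths under one of the prefixes, dropping __init__.py files."""
    prefixes = tuple(prefix_filter)
    kept = []
    for path in paths:
        if not path.startswith(prefixes):
            continue  # outside the target directories
        if posixpath.basename(path) == "__init__.py":
            continue  # package markers are skipped
        kept.append(path)
    return kept
```

For example, given a changed `docs/` file, a changed `__init__.py`, and a changed operator module, only the operator module and any changed test files remain.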

data_juicer.utils.unittest_utils.find_corresponding_test_file(file_path)
data_juicer.utils.unittest_utils.get_partial_test_cases()