data_juicer.utils.unittest_utils module

data_juicer.utils.unittest_utils.TEST_TAG(*tags)[source]

Tags for a test case. Currently, “standalone” and “ray” are supported.
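
As an illustration of tagging, the sketch below uses a hypothetical stand-in decorator, `tagged`, that stores its tags on the test method so a runner can pick tests per executor. It does not reproduce how TEST_TAG is implemented in Data-Juicer; the attribute name `_tags` and the helper `select_tests` are invented for this example.

```python
import unittest

SUPPORTED_TAGS = {"standalone", "ray"}


def tagged(*tags):
    """Hypothetical stand-in for TEST_TAG: record tags on the test method."""
    unknown = set(tags) - SUPPORTED_TAGS
    if unknown:
        raise ValueError(f"unsupported tags: {sorted(unknown)}")

    def decorator(func):
        func._tags = tags  # attribute name is an assumption of this sketch
        return func

    return decorator


class ExampleTest(unittest.TestCase):

    @tagged("standalone", "ray")
    def test_both(self):
        self.assertTrue(True)

    @tagged("standalone")
    def test_standalone_only(self):
        self.assertTrue(True)


def select_tests(case_cls, tag):
    """Return the names of test methods on case_cls that carry `tag`."""
    names = unittest.TestLoader().getTestCaseNames(case_cls)
    return [
        name for name in names
        if tag in getattr(getattr(case_cls, name), "_tags", ())
    ]
```

Under these assumptions, `select_tests(ExampleTest, "ray")` returns only `test_both`, while selecting “standalone” returns both test names.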

data_juicer.utils.unittest_utils.set_clear_model_flag(flag)[source]

data_juicer.utils.unittest_utils.set_from_fork_flag(flag)[source]

class data_juicer.utils.unittest_utils.DataJuicerTestCaseBase(methodName='runTest')[source]

Bases: TestCase

classmethod setUpClass()[source]

Hook method for setting up class fixture before running tests in the class.

classmethod tearDownClass(hf_model_name=None) → None[source]

Hook method for deconstructing the class fixture after running all tests in the class.

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown() → None[source]

Hook method for deconstructing the test fixture after testing it.
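
The order in which these four hooks fire can be checked with a small sketch. It uses a plain `unittest.TestCase` rather than DataJuicerTestCaseBase, so it shows only the standard unittest behavior; the class name `HookOrderTest`, the `calls` list, and `run_hook_order_demo` are illustrative names, not part of this module.

```python
import io
import unittest

calls = []


class HookOrderTest(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        calls.append("setUpClass")  # once, before any test in the class

    @classmethod
    def tearDownClass(cls):
        calls.append("tearDownClass")  # once, after all tests in the class

    def setUp(self):
        calls.append("setUp")  # before every test method

    def tearDown(self):
        calls.append("tearDown")  # after every test method

    def test_a(self):
        calls.append("test_a")

    def test_b(self):
        calls.append("test_b")


def run_hook_order_demo():
    """Run the test case quietly and return the recorded call order."""
    calls.clear()
    suite = unittest.TestLoader().loadTestsFromTestCase(HookOrderTest)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return list(calls)
```

The class-level hooks run once around the whole class, while `setUp` and `tearDown` wrap each individual test method.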

generate_dataset(data) → DJDataset[source]

Generate a dataset for a specific executor.

Parameters:
  • type (str, optional) – “standalone” or “ray”. Defaults to “standalone”.

run_single_op(dataset: DJDataset, op, column_names)[source]

Run an operator in the specific executor.

assertDatasetEqual(first, second)[source]

data_juicer.utils.unittest_utils.get_diff_files(prefix_filter=['data_juicer/', 'tests/'])[source]

Get the files reported by git diff within the target directories, excluding __init__.py files.
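
The filtering step described above can be sketched as a pure function. This is an assumption-laden illustration, not the module's implementation: it presumes the changed paths have already been collected (for example from `git diff --name-only`), and the name `filter_changed_files` is hypothetical.

```python
from pathlib import PurePosixPath


def filter_changed_files(paths, prefix_filter=("data_juicer/", "tests/")):
    """Keep paths under one of the prefixes, dropping __init__.py files."""
    prefixes = tuple(prefix_filter)
    return [
        p for p in paths
        if p.startswith(prefixes)
        and PurePosixPath(p).name != "__init__.py"
    ]


changed = [
    "data_juicer/ops/__init__.py",
    "data_juicer/utils/unittest_utils.py",
    "tests/ops/test_example.py",
    "docs/index.md",
]
kept = filter_changed_files(changed)
```

Here `kept` retains only the two non-`__init__.py` files that live under `data_juicer/` or `tests/`.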

data_juicer.utils.unittest_utils.find_corresponding_test_file(file_path)[source]
data_juicer.utils.unittest_utils.get_partial_test_cases()[source]