matchbench.dataset package

Submodules

matchbench.dataset.load module

matchbench.dataset.load.list_datasets()

TODO: List all the datasets.

matchbench.dataset.load.load_datasets(list_of_datasets, local_dir=None, has_pairs=True)

Load datasets from the online Hugging Face repository or from a local directory.

Parameters:
  • list_of_datasets (str or List) – The (list of) dataset(s) to be loaded.

  • local_dir (str or Dict, optional, defaults to None) – Path to the local directory where the dataset(s) are stored.

  • has_pairs (bool, optional, defaults to True) – Whether the dataset has ground-truth or training data pairs.

Returns:

If a single dataset is given, returns a (pairs, source, target) tuple. If a list of datasets is given, returns a list of such tuples.

Return type:

(List of) tuple(s)
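
Because the return value is either one tuple or a list of tuples, calling code may find it convenient to normalize both shapes before iterating. The following is a minimal sketch of one way to do that. The helper as_triple_list is hypothetical and not part of matchbench, the values are stand-ins for real datasets, and the sketch assumes the three-element (pairs, source, target) form described above.

```python
from typing import Any, List, Tuple, Union

# (pairs, source, target), as described in the Returns section above.
Triple = Tuple[Any, Any, Any]


def as_triple_list(result: Union[Triple, List[Triple]]) -> List[Triple]:
    """Normalize a load_datasets-style return value to a list of triples.

    Hypothetical helper for illustration; not part of matchbench.
    """
    if isinstance(result, tuple):
        # A single dataset was requested: wrap the one triple in a list.
        return [result]
    # A list of datasets was requested: copy it so the caller's list is untouched.
    return list(result)


# Stand-in values. A real call would look something like
#   result = load_datasets("some_dataset")  # placeholder dataset name
single = ("pairs", "source", "target")
many = [("p1", "s1", "t1"), ("p2", "s2", "t2")]

for pairs, source, target in as_triple_list(single) + as_triple_list(many):
    print(pairs, source, target)
```

With this wrapper, the same loop handles both a single dataset name and a list of names.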

Module contents