General utilities

mlcg.utils contains tools for common tasks across the mlcg ecosystem, such as reading and writing YAML files and converting tensors to tuples.

mlcg.utils.load_yaml(fn)[source]

Load a yaml file using ruamel.yaml

mlcg.utils.dump_yaml(fn, data)[source]

Dump a dictionary into a YAML file using ruamel.yaml.

mlcg.utils.tensor2tuple(x)[source]

Helper function that flattens tensors and returns them as tuples

Parameters:

x (Tensor) – Input tensor

Returns:

Output tuple

Return type:

tuple

mlcg.utils.make_splits(dataset_len, val_ratio, test_ratio, seed=None, filename=None, splits=None, order=None)[source]

Function for making train, validation, and test sets and then optionally saving them to disk using numpy.savez. Splits are returned as torch tensors.

Parameters:
  • dataset_len (int) – Dataset length

  • val_ratio (float) – Ratio of validation set size to dataset size

  • test_ratio (float) – Ratio of test set size to dataset size

  • seed (Optional[int]) – Seed for the random number generator used to shuffle the dataset indices

  • filename (Optional[str]) – Filename for the numpy zipped archive to save the splits, with the keys “idx_train”, “idx_val”, and “idx_test”. If None, the splits are not saved.

  • splits (Optional[str]) – Filename from which pre-specified splits may be loaded. Must be a valid numpy zipped archive file with the keys “idx_train”, “idx_val”, “idx_test”.

  • order (Optional[List[int]]) – If specified, the dataset is not shuffled; the sets are instead taken sequentially from the order list, in the order (train, validation, test).

Return type:

Tuple[Tensor, Tensor, Tensor]

Returns:

  • idx_train – The indices of training examples in the dataset

  • idx_val – The indices of validation examples in the dataset

  • idx_test – The indices of test examples in the dataset
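
The splitting behaviour described above can be illustrated with a minimal, standard-library sketch. It is not the mlcg implementation: the name sketch_make_splits is hypothetical, the rounding rule for the set sizes is an assumption, and plain lists are returned instead of torch tensors. The saving and loading of splits through filename and splits is left out.

```python
import random
from typing import List, Optional, Tuple


def sketch_make_splits(
    dataset_len: int,
    val_ratio: float,
    test_ratio: float,
    seed: Optional[int] = None,
    order: Optional[List[int]] = None,
) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical stand-in for the splitting step; not mlcg's code."""
    # Assumption: set sizes are the rounded ratios; training gets the rest.
    n_val = round(dataset_len * val_ratio)
    n_test = round(dataset_len * test_ratio)
    n_train = dataset_len - n_val - n_test
    if n_train < 0:
        raise ValueError("val_ratio + test_ratio must not exceed 1")

    if order is not None:
        # No shuffling: walk the given order as (train, validation, test).
        indices = list(order)
    else:
        # Shuffle all indices reproducibly with a seeded generator.
        indices = list(range(dataset_len))
        random.Random(seed).shuffle(indices)

    idx_train = indices[:n_train]
    idx_val = indices[n_train:n_train + n_val]
    idx_test = indices[n_train + n_val:n_train + n_val + n_test]
    return idx_train, idx_val, idx_test


train, val, test = sketch_make_splits(10, 0.2, 0.1, seed=0)
print(len(train), len(val), len(test))  # 7 2 1
```

Passing the same seed reproduces the same split, while passing order bypasses the shuffle entirely, which is useful when the dataset already has a meaningful ordering (for example, a time series).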

mlcg.utils.download_url(url, folder, log=True)[source]

Downloads the content of a URL to a specific folder.

Parameters:
  • url (string) – The URL to download from.

  • folder (string) – The destination folder.

  • log (bool, optional) – If False, will not print anything to the console. (default: True)

Adapted from torch_geometric.data.download.py
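
A download helper with this signature might look roughly like the following standard-library sketch. The name sketch_download_url is hypothetical, and three behaviours are assumptions rather than documented facts about mlcg: the file name is derived from the last component of the URL, an existing file is reused rather than downloaded again, and the local path is returned.

```python
import os
import urllib.request


def sketch_download_url(url: str, folder: str, log: bool = True) -> str:
    """Hypothetical stand-in for a URL-to-folder download; not mlcg's code."""
    # Assumption: the local file name is the last path component of the URL,
    # with any query string removed.
    filename = url.rpartition("/")[2].split("?")[0]
    path = os.path.join(folder, filename)

    # Assumption: an existing file is reused rather than downloaded again.
    if os.path.exists(path):
        if log:
            print(f"Using existing file {filename}")
        return path

    if log:
        print(f"Downloading {url}")
    # Create the destination folder if needed, then stream the body to disk.
    os.makedirs(folder, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
    return path
```

Setting log=False silences both messages, matching the documented meaning of the log parameter.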