Common Utilities

This module contains common utilities and functions used throughout the atomworks.ml package.

atomworks.ml.common.as_list(value: Any) → list

Convert a value to a list.

Handles various types using duck typing:
  • Iterable objects (lists, tuples, strings, etc.): converted to list

  • Single values: wrapped in a list
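The duck-typing behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `as_list_sketch` is a hypothetical name, and the try/except approach is an assumption about how "iterable or single value" is decided.

```python
from typing import Any


def as_list_sketch(value: Any) -> list:
    """Hypothetical stand-in for as_list; the real implementation may differ."""
    try:
        # Anything iterable (list, tuple, str, set, ...) is consumed into a list.
        return list(value)
    except TypeError:
        # Non-iterable values (int, None, ...) are wrapped in a one-element list.
        return [value]


from_tuple = as_list_sketch((1, 2, 3))
from_scalar = as_list_sketch(5)
from_string = as_list_sketch("ab")
```

Because strings are iterable, a string is split into its characters under this approach rather than wrapped whole, which is consistent with strings being listed among the iterable types above.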

atomworks.ml.common.generate_example_id(dataset_names: list[str], pdb_id: str, assembly_id: str, query_pn_unit_iids: list) → str

Generate a unique example ID from the identifying fields of an example (for instance, the corresponding columns of a DataFrame row).

This unique ID is helpful for debugging and to track performance on specific examples.

An example can be uniquely defined by, in order:
  1. a composed list of dataset names (e.g., [pdb, pn_unit] to indicate the pn_unit dataset nested within the PDB dataset)

  2. pdb_id (or any group-level identifier, if using a non-PDB dataset), within the dataset specified by (1)

  3. assembly_id

  4. query_pn_unit_iids
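To illustrate how the four components above can be composed, in order, into a single string, here is a minimal sketch. The function name `make_example_id`, the brace-delimited format, and the sample values are all assumptions made for illustration; the string actually produced by generate_example_id may look different.

```python
def make_example_id(
    dataset_names: list[str],
    pdb_id: str,
    assembly_id: str,
    query_pn_unit_iids: list[str],
) -> str:
    """Hypothetical stand-in for generate_example_id (format is assumed)."""
    fields = [
        ",".join(dataset_names),       # (1) nested dataset names, outermost first
        pdb_id,                        # (2) group-level identifier
        assembly_id,                   # (3) assembly within that entry
        ",".join(query_pn_unit_iids),  # (4) queried pn_unit instance IDs
    ]
    # Wrap each field in braces so the field boundaries stay unambiguous.
    return "".join("{" + field + "}" for field in fields)


# Sample values are made up for illustration.
example_id = make_example_id(["pdb", "pn_unit"], "6wjc", "1", ["A_1", "B_1"])
print(example_id)  # {pdb,pn_unit}{6wjc}{1}{A_1,B_1}
```

Keeping the components in a fixed order means two examples receive the same ID only if all four components match.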

atomworks.ml.common.parse_example_id(example_id: str) → dict

Parse the example ID into its components: dataset names, pdb_id, assembly_id, and query_pn_unit_iids.

Parameters:

example_id (str) – The example ID string generated by the generate_example_id function.

Returns:

A dictionary containing the parsed components.

Return type:

dict
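As a rough illustration of recovering the components from an ID string, the sketch below parses a hypothetical brace-delimited format such as `{pdb,pn_unit}{6wjc}{1}{A_1,B_1}`. The format, the function name `split_example_id`, and the dictionary keys are assumptions; consult the source of parse_example_id for the actual format and keys.

```python
import re


def split_example_id(example_id: str) -> dict:
    """Hypothetical parser for an assumed {names}{pdb_id}{assembly_id}{iids} format."""
    # Collect the text inside each pair of braces, in order of appearance.
    fields = re.findall(r"\{([^{}]*)\}", example_id)
    if len(fields) != 4:
        raise ValueError(
            f"Expected 4 brace-delimited fields, got {len(fields)}: {example_id!r}"
        )
    dataset_names, pdb_id, assembly_id, iids = fields
    return {
        "dataset_names": dataset_names.split(",") if dataset_names else [],
        "pdb_id": pdb_id,
        "assembly_id": assembly_id,
        "query_pn_unit_iids": iids.split(",") if iids else [],
    }


# The sample ID is made up for illustration.
parsed = split_example_id("{pdb,pn_unit}{6wjc}{1}{A_1,B_1}")
```

Raising on a malformed ID, rather than returning a partial dictionary, makes a corrupted or truncated ID visible immediately when tracking down a specific example.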