Common Utilities
This module contains common utilities and functions used throughout the atomworks.ml package.
- atomworks.ml.common.as_list(value: Any) → list
Convert a value to a list.
- Handles various types using duck typing:
  - Iterable objects (lists, tuples, strings, etc.): converted to a list
  - Single values: wrapped in a list
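The duck-typing behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `as_list_sketch` is a hypothetical name, and the try/except strategy is an assumption based only on the description (iterables are converted, everything else is wrapped). Note that under this reading a string, being iterable, is split into its characters.

```python
from typing import Any


def as_list_sketch(value: Any) -> list:
    """Hypothetical stand-in for as_list, assuming a try/except duck-typing check."""
    try:
        # Anything that supports iteration is converted element by element.
        return list(value)
    except TypeError:
        # Non-iterable values (ints, None, ...) are wrapped in a one-element list.
        return [value]


from_tuple = as_list_sketch((1, 2, 3))
from_string = as_list_sketch("ab")
from_scalar = as_list_sketch(5)
```

If strings should be treated as single values in your own code, a check such as `isinstance(value, str)` before the `try` block would be one way to do that.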
- atomworks.ml.common.generate_example_id(dataset_names: list[str], pdb_id: str, assembly_id: str, query_pn_unit_iids: list) → str
Generate a unique example ID from a DataFrame row.
The unique ID is helpful for debugging and for tracking performance on specific examples.
- An example can be uniquely defined by, in order:
  1. a composed list of dataset names (e.g., [pdb, pn_unit] to indicate the pn_unit dataset nested within the PDB dataset)
  2. pdb_id (or any group-level identifier, if using a non-PDB dataset), within the dataset specified by (1)
  3. assembly_id
  4. query_pn_unit_iids
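To make the ordering of the four components concrete, here is a small sketch that composes them into a single string. The function name `sketch_example_id` and the brace-delimited format are assumptions chosen for illustration; the string format actually produced by `generate_example_id` is not documented here and may differ.

```python
def sketch_example_id(
    dataset_names: list[str],
    pdb_id: str,
    assembly_id: str,
    query_pn_unit_iids: list,
) -> str:
    """Hypothetical stand-in for generate_example_id (the format is an assumption)."""
    # Keep the components in the documented order: datasets, group ID,
    # assembly, then the query pn_unit instance IDs.
    fields = [list(dataset_names), pdb_id, assembly_id, list(query_pn_unit_iids)]
    # Wrap each field in braces so the boundaries between fields stay visible.
    return "".join("{" + str(field) + "}" for field in fields)


example_id = sketch_example_id(["pdb", "pn_unit"], "1abc", "1", ["A_1", "B_1"])
other_assembly_id = sketch_example_id(["pdb", "pn_unit"], "1abc", "2", ["A_1", "B_1"])
```

Because every component is part of the string, two examples that differ only in `assembly_id` (as above) still receive distinct IDs.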
- atomworks.ml.common.parse_example_id(example_id: str) → dict
Parse the example ID into its components: dataset names, pdb_id, assembly_id, and query_pn_unit_iids.
- Parameters:
example_id (str) – The example ID string generated by the generate_example_id function.
- Returns:
A dictionary containing the parsed components.
- Return type:
dict
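Parsing reverses the composition step. The sketch below assumes a hypothetical ID format in which each of the four components is wrapped in braces and list components are written as Python list literals; both the format and the dictionary keys are assumptions for illustration, not the documented behavior of `parse_example_id`.

```python
import ast
import re


def sketch_parse_example_id(example_id: str) -> dict:
    """Hypothetical stand-in for parse_example_id (format and keys are assumptions)."""
    # Pull out the text between each pair of braces, in order.
    parts = re.findall(r"\{(.*?)\}", example_id)
    if len(parts) != 4:
        raise ValueError(f"Expected 4 components, found {len(parts)}: {example_id!r}")
    dataset_names, pdb_id, assembly_id, query_pn_unit_iids = parts
    return {
        # List components are stored as Python literals, so evaluate them safely.
        "dataset_names": ast.literal_eval(dataset_names),
        "pdb_id": pdb_id,
        "assembly_id": assembly_id,
        "query_pn_unit_iids": ast.literal_eval(query_pn_unit_iids),
    }


parsed = sketch_parse_example_id("{['pdb', 'pn_unit']}{1abc}{1}{['A_1', 'B_1']}")
```

`ast.literal_eval` is used rather than `eval` so that only plain literals (lists, strings, numbers) are accepted when reading the list components back.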