I/O Utilities

General utility functions for working with CIF files in Biotite.

atomworks.io.utils.io_utils.get_structure(file_obj: CIFFile | PDBFile | BinaryCIFFile | CIFBlock, *, extra_fields: list[str] | Literal['all'] = [], include_bonds: bool = True, model: int | None = None, altloc: Literal['first', 'occupancy', 'all'] | str = 'first', add_bond_types_from_struct_conn: list[str] = ['covale'], fix_bond_types: bool = True) → AtomArrayStack | AtomArray

Load a structure into a Biotite AtomArray or AtomArrayStack using the specified fields and assumptions.

Parameters:

file_obj (CIFFile | PDBFile | BinaryCIFFile | CIFBlock) – The file object to load with Biotite.
extra_fields (list[str] | Literal["all"]) – List of extra fields to include as AtomArray annotations. If "all", all fields in the atom_site category of the file are included.
include_bonds (bool) – Whether to include bonds in the structure. Bonds loaded this way are not affected by the issue in which spurious bonds are added because of uninformative label_seq_id values.
model (int | None) – The model number to use for loading the structure.
altloc (Literal["first", "occupancy", "all"] | str) – The altloc strategy or altloc ID to use for loading the structure. If a string other than "first", "occupancy", or "all" is provided, it is used as the altloc ID to filter the structure by; that ID must be present in the file, otherwise an error is raised.
add_bond_types_from_struct_conn (list[str]) – A list of bond types to add to the structure from the struct_conn category. Defaults to ["covale"], which means that only covalent bonds are added to the structure (metal coordination and disulfide bonds are excluded).
fix_bond_types (bool) – Whether to correct for nucleophilic additions on atoms involved in inter-residue bonds.

Returns:

The loaded structure with the specified fields and assumptions.

Return type:

AtomArray | AtomArrayStack

Reference: Biotite documentation (https://www.biotite-python.org/apidoc/biotite.structure.io.pdbx.get_structure.html#biotite.structure.io.pdbx.get_structure)
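
To make the altloc options concrete, the following sketch mimics the selection on toy data. It is a simplified, pure-Python illustration, assuming that "first" keeps the first altloc ID seen in each residue and that "occupancy" keeps the altloc ID with the highest summed occupancy in each residue. The Atom record and the select_altloc helper are hypothetical and are not part of atomworks or Biotite.

```python
from collections import defaultdict
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical toy atom record (not a Biotite type)."""
    res_id: int
    name: str
    altloc: str  # "." means the atom has no alternate location
    occupancy: float


def select_altloc(atoms: list, altloc: str = "first") -> list:
    """Filter atoms per residue according to the altloc strategy (sketch)."""
    if altloc == "all":
        return list(atoms)

    # Summed occupancy per altloc ID, grouped by residue, in order of appearance.
    occupancy_by_res = defaultdict(dict)
    for atom in atoms:
        if atom.altloc != ".":
            per_res = occupancy_by_res[atom.res_id]
            per_res[atom.altloc] = per_res.get(atom.altloc, 0.0) + atom.occupancy

    if altloc not in ("first", "occupancy"):
        # A specific altloc ID was requested; it must exist in the data.
        present = {a for per_res in occupancy_by_res.values() for a in per_res}
        if altloc not in present:
            raise ValueError(f"altloc ID {altloc!r} not found")

    kept = []
    for atom in atoms:
        if atom.altloc == ".":
            kept.append(atom)  # atoms without an altloc ID are always kept
            continue
        per_res = occupancy_by_res[atom.res_id]
        if altloc == "first":
            chosen = next(iter(per_res))  # dicts preserve insertion order
        elif altloc == "occupancy":
            chosen = max(per_res, key=per_res.get)
        else:
            chosen = altloc
        if atom.altloc == chosen:
            kept.append(atom)
    return kept


atoms = [
    Atom(1, "N", ".", 1.0),
    Atom(1, "CA", "A", 0.3),
    Atom(1, "CA", "B", 0.7),
]
first = select_altloc(atoms, "first")
by_occupancy = select_altloc(atoms, "occupancy")
```

With this toy input, "first" keeps the "A" conformer while "occupancy" keeps the "B" conformer, which is why the choice of default matters when comparing loaders.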

atomworks.io.utils.io_utils.load_any(file_or_buffer: PathLike | StringIO | BytesIO, file_type: Literal['cif', 'mmcif', 'pdbx', 'pdb', 'pdb1', 'bcif'] | None = None, *, extra_fields: list[str] | Literal['all'] = [], include_bonds: bool = True, model: int | None = None, altloc: Literal['first', 'occupancy', 'all'] = 'occupancy') → AtomArrayStack | AtomArray

Convenience function for loading a structure from a file or buffer.

Parameters:

file_or_buffer (PathLike | StringIO | BytesIO) – Path to the file or buffer to load the structure from.
file_type (Literal["cif", "mmcif", "pdbx", "pdb", "pdb1", "bcif"] | None) – Type of the file to load. If None, it will be inferred.
extra_fields (list[str] | Literal["all"]) – List of extra fields to include as AtomArray annotations. If "all", all fields in the atom_site category of the file are included.
include_bonds (bool) – Whether to include bonds in the structure.
model (int | None) – The model number to use for loading the structure. If None, all models will be loaded.
altloc (Literal["first", "occupancy", "all"]) – The altloc selection strategy to use for loading the structure.

Returns:

The loaded structure with the specified fields and assumptions.

Return type:

AtomArrayStack | AtomArray

Reference: Biotite documentation (https://www.biotite-python.org/apidoc/biotite.structure.io.pdbx.get_structure.html#biotite.structure.io.pdbx.get_structure)
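
The effect of model on the return type can be sketched without any structure library. This toy helper assumes Biotite's convention that model numbers start at 1 and that negative values count from the last model. The select_model function and the list-of-lists data are hypothetical stand-ins for an AtomArrayStack, not atomworks code.

```python
from typing import Optional


def select_model(models: list, model: Optional[int] = None):
    """Return all models (stack-like) or a single model (array-like)."""
    if model is None:
        return models  # analogous to an AtomArrayStack holding every model
    if model == 0 or abs(model) > len(models):
        raise ValueError(f"model {model} is out of range for {len(models)} models")
    # Positive numbers are 1-based; negative numbers index from the end.
    index = model - 1 if model > 0 else model
    return models[index]  # analogous to a single AtomArray


# Two toy "models", each a list of (x, y, z) coordinates.
toy_models = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0)],
]
all_models = select_model(toy_models)
first_model = select_model(toy_models, model=1)
last_model = select_model(toy_models, model=-1)
```

Callers that always need a single set of coordinates can therefore pass an explicit model number rather than indexing into the stack afterwards.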

Reads any of the allowed file types into the appropriate Biotite file object.

Parameters:

path_or_buffer (PathLike | io.StringIO | io.BytesIO) – The path to the file or a buffer to read from. If a buffer, it’s highly recommended to specify the file_type.
file_type (Literal["cif", "pdb", "bcif"], optional) – Type of the file. If None, it will be inferred from the file extension. When using a buffer, the file type must be specified.

Returns:

The loaded file object.

Return type:

pdbx.CIFFile | biotite_pdb.PDBFile | pdbx.BinaryCIFFile

Raises:

ValueError – If the file type is unsupported or cannot be determined.
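
Inferring the file type from a path, as described above, might look like the following sketch. The infer_file_type helper is hypothetical. The extension list comes from the file_type options of load_any, while the grouping of "mmcif" and "pdbx" under "cif", the grouping of "pdb1" under "pdb", and the handling of a trailing ".gz" are assumptions made for illustration. A buffer has no file name, which is why the file type has to be passed explicitly in that case.

```python
from pathlib import Path

# Assumed mapping from file extension to the three reader types.
_EXTENSION_TO_TYPE = {
    ".cif": "cif",
    ".mmcif": "cif",
    ".pdbx": "cif",
    ".pdb": "pdb",
    ".pdb1": "pdb",
    ".bcif": "bcif",
}


def infer_file_type(path) -> str:
    """Infer 'cif', 'pdb', or 'bcif' from a file name (hypothetical helper)."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Ignore a trailing compression suffix such as ".gz" (assumption).
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    if not suffixes or suffixes[-1] not in _EXTENSION_TO_TYPE:
        raise ValueError(f"Cannot determine file type of {str(path)!r}")
    return _EXTENSION_TO_TYPE[suffixes[-1]]


inferred = [infer_file_type(p) for p in ("6lyz.cif.gz", "1abc.pdb1", "model.bcif")]
```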

atomworks.io.utils.io_utils.to_cif_buffer(structure: AtomArray, *, id: str = 'unknown_id', author: str = 'unknown_user', date: str | None = None, time: str | None = None, include_entity_poly: bool = False, include_nan_coords: bool = True, include_bonds: bool = True, extra_fields: list[str] | Literal['all'] = [], extra_categories: dict[str, dict[str, float | int | str | list | ndarray]] | None = None, _allow_ambiguous_bond_annotations: bool = False, as_bcif: bool = False) → StringIO | BytesIO

Convert an AtomArray structure to a CIF-formatted StringIO buffer, or to a BinaryCIF-formatted BytesIO buffer when as_bcif is True.

Parameters:

structure (AtomArray) – The atomic structure to be converted.
id (str) – The ID of the entry. This will be used as the data block name.
author (str) – The author of the entry.
date (str | None) – The date of the entry.
time (str | None) – The time of the entry.
include_entity_poly (bool) – Whether to write entity_poly category in the CIF file.
include_nan_coords (bool) – Whether to write NaN coordinates in the CIF file.
include_bonds (bool) – Whether to write bonds in the CIF file.
extra_fields (list[str] | Literal["all"]) – Additional atom_array annotations to include in the CIF file.
extra_categories (dict[str, dict[str, float | int | str | list | np.ndarray]] | None, optional) – Additional CIF categories to include in data block. These must be a dict of form {category_name: {column_name: value}}. Example: {"reflns": {"pdbx_reflns_number_d_mean": 1.0}, "my_metadata": {"hi": np.arange(10)}}
_allow_ambiguous_bond_annotations (bool, optional) – Private argument, not meant for public use. If True, allows ambiguous bond annotations.

Returns:

A buffer containing the CIF/BCIF formatted string/bytes representation of the structure.

Return type:

StringIO | BytesIO
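
The two possible return types follow from the payload: CIF is text and BinaryCIF is binary. A minimal sketch of that distinction, using only the standard library, is shown below. The to_buffer helper is hypothetical, and the byte string is a placeholder rather than real BinaryCIF data.

```python
import io
from typing import Union


def to_buffer(payload: Union[str, bytes]) -> Union[io.StringIO, io.BytesIO]:
    """Wrap text in a StringIO and binary data in a BytesIO."""
    if isinstance(payload, bytes):
        return io.BytesIO(payload)
    return io.StringIO(payload)


# A CIF data block starts with "data_" followed by the block name (the entry ID).
text_buffer = to_buffer("data_unknown_id\n#\n")
binary_buffer = to_buffer(b"\x00\x01")  # placeholder bytes, not real BinaryCIF
first_line = text_buffer.readline()
```

Code that consumes the buffer needs to open or parse it in the matching mode, so checking the buffer type before reading avoids mixing str and bytes.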

atomworks.io.utils.io_utils.to_cif_file(structure: AtomArray, path: PathLike, *, file_type: Literal['cif', 'bcif', 'cif.gz'] | None = None, id: str | None = None, author: str = 'unknown_user', date: str | None = None, time: str | None = None, include_entity_poly: bool = True, include_nan_coords: bool = True, include_bonds: bool = True, extra_fields: list[str] | Literal['all'] = [], extra_categories: dict[str, dict[str, float | int | str | list | ndarray]] | None = None, _allow_ambiguous_bond_annotations: bool = False) → PathLike

Convert an AtomArray structure to a CIF/BCIF formatted file.

Parameters:

structure (AtomArray) – The atomic structure to be converted.
path (os.PathLike) – The file path where the CIF formatted structure will be saved.
file_type (Literal["cif", "bcif", "cif.gz"] | None) – The file type to save the structure as. If None, the file type will be inferred from the path.
id (str | None) – The ID of the entry. This will be used as the data block name. If None, the data block name will be inferred from the path.
author (str) – The author of the entry.
date (str | None) – The date of the entry.
time (str | None) – The time of the entry.
include_entity_poly (bool) – Whether to write entity_poly category in the CIF file.
include_nan_coords (bool) – Whether to write NaN coordinates in the CIF file.
include_bonds (bool) – Whether to write bonds in the CIF file.
extra_fields (list[str] | Literal["all"]) – Additional atom_array annotations to include in the CIF file.
extra_categories (dict[str, dict[str, float | int | str | list | np.ndarray]] | None, optional) – Additional CIF categories to include in data block. These must be a dict of form {category_name: {column_name: value}}. Example: {"reflns": {"pdbx_reflns_number_d_mean": 1.0}, "my_metadata": {"hi": np.arange(10)}}

Returns:

The file path where the CIF formatted structure was saved.

Return type:

PathLike

Raises:

IOError – If there’s an issue writing to the specified file path.
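
When file_type and id are None, both are derived from the path. The sketch below shows one way such inference and the optional gzip compression could work. All three helpers are hypothetical, and the rule of taking the file name up to the first dot as the data block name is an assumption, not the documented behavior of to_cif_file.

```python
import gzip
import tempfile
from pathlib import Path


def infer_output_type(path) -> str:
    """Infer 'cif', 'bcif', or 'cif.gz' from the file name (hypothetical)."""
    name = Path(path).name.lower()
    for file_type in ("cif.gz", "bcif", "cif"):
        if name.endswith("." + file_type):
            return file_type
    raise ValueError(f"Cannot infer file type from {name!r}")


def infer_block_name(path) -> str:
    """Use the file name up to the first dot as the data block name (assumption)."""
    return Path(path).name.split(".")[0]


def write_cif_text(text: str, path) -> Path:
    """Write CIF text, gzip-compressing it when the type is 'cif.gz'."""
    path = Path(path)
    file_type = infer_output_type(path)
    if file_type == "bcif":
        raise NotImplementedError("BinaryCIF needs a binary serializer")
    if file_type == "cif.gz":
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write(text)
    else:
        path.write_text(text, encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "1abc.cif.gz"
    block = f"data_{infer_block_name(target)}\n#\n"
    written = write_cif_text(block, target)
    # Read the file back to confirm that the compressed text round-trips.
    with gzip.open(written, "rt", encoding="utf-8") as handle:
        round_trip = handle.read()
```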

atomworks.io.utils.io_utils.to_cif_string(structure: AtomArray, *, id: str = 'unknown_id', author: str = 'unknown_user', date: str | None = None, time: str | None = None, include_entity_poly: bool = False, include_nan_coords: bool = True, include_bonds: bool = True, extra_fields: list[str] | Literal['all'] = [], extra_categories: dict[str, dict[str, float | int | str | list | ndarray]] | None = None, as_bcif: bool = False, _allow_ambiguous_bond_annotations: bool = False) → str | bytes

Convert an AtomArray structure to a CIF-formatted string, or to BinaryCIF bytes when as_bcif is True.

Parameters:

structure (AtomArray) – The atomic structure to be converted.
id (str) – The ID of the entry. This will be used as the data block name.
author (str) – The author of the entry.
date (str | None) – The date of the entry.
time (str | None) – The time of the entry.
include_entity_poly (bool) – Whether to write entity_poly category in the CIF file.
include_nan_coords (bool) – Whether to write NaN coordinates in the CIF file.
include_bonds (bool) – Whether to write bonds in the CIF file.
extra_fields (list[str] | Literal["all"]) – Additional atom_array annotations to include in the CIF file.
extra_categories (dict[str, dict[str, float | int | str | list | np.ndarray]] | None, optional) – Additional CIF categories to include in data block. These must be a dict of form {category_name: {column_name: value}}. Example: {"reflns": {"pdbx_reflns_number_d_mean": 1.0}, "my_metadata": {"hi": np.arange(10)}}

Returns:

The CIF/BCIF formatted string/bytes representation of the structure.

Return type:

str | bytes
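
To show how an extra_categories mapping of the form {category_name: {column_name: value}} relates to CIF text, here is a deliberately simplified renderer. It is an illustrative sketch and not the writer used by atomworks or Biotite: the render_category helper is hypothetical, values are not quoted or escaped, and a plain list stands in for the NumPy array of the documented example.

```python
def render_category(name: str, columns: dict) -> str:
    """Render one {column_name: value} mapping as simplified CIF text."""
    values = list(columns.values())
    if not any(isinstance(v, (list, tuple)) for v in values):
        # Single-row category: one "_category.column value" line per column.
        return "".join(f"_{name}.{col} {val}\n" for col, val in columns.items())
    if not all(isinstance(v, (list, tuple)) for v in values):
        raise ValueError("mixing scalars and lists is not supported in this sketch")
    if len({len(v) for v in values}) != 1:
        raise ValueError(f"columns of {name!r} differ in length")
    # Multi-row category: a loop_ header followed by one line per row.
    header = "loop_\n" + "".join(f"_{name}.{col}\n" for col in columns)
    rows = "".join(" ".join(str(x) for x in row) + "\n" for row in zip(*values))
    return header + rows


extra_categories = {
    "reflns": {"pdbx_reflns_number_d_mean": 1.0},
    "my_metadata": {"hi": [0, 1, 2]},
}
# Categories are separated by a "#" line, a common CIF formatting convention.
rendered = "#\n".join(render_category(n, c) for n, c in extra_categories.items())
```

Scalar values become single key-value lines, while list-like values become a loop_ table with one row per element, so all columns of one category should have the same length.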
