constants

Constants used in the atomworks.io package.

atomworks.io.constants.AA_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE LINKING', 'D-PEPTIDE NH3 AMINO TERMINUS', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE LINKING', 'L-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING', 'PEPTIDE-LIKE'})#

Set of amino acid-like chemical component types. All uppercase.

atomworks.io.constants.AA_OR_NA_CHEM_COMP_TYPES: Final[frozenset[str]] = frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE LINKING', 'D-PEPTIDE NH3 AMINO TERMINUS', 'DNA LINKING', 'DNA OH 3 PRIME TERMINUS', 'DNA OH 5 PRIME TERMINUS', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'L-DNA LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE LINKING', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-RNA LINKING', 'PEPTIDE LINKING', 'PEPTIDE-LIKE', 'RNA LINKING', 'RNA OH 3 PRIME TERMINUS', 'RNA OH 5 PRIME TERMINUS'})#

Amino acid or DNA/RNA-like chemical component types.

atomworks.io.constants.AF3_EXCLUDED_LIGANDS: Final[list[str]] = ['144', '15P', '1PE', '2F2', '2JC', '3HR', '3SY', '7N5', '7PE', '9JE', 'AAE', 'ABA', 'ACE', 'ACN', 'ACT', 'ACY', 'AZI', 'BAM', 'BCN', 'BCT', 'BDN', 'BEN', 'BME', 'BO3', 'BTB', 'BTC', 'BU1', 'C8E', 'CAD', 'CAQ', 'CBM', 'CCN', 'CIT', 'CL', 'CLR', 'CM', 'CMO', 'CO3', 'CPT', 'CXS', 'D10', 'DEP', 'DIO', 'DMS', 'DN', 'DOD', 'DOX', 'EDO', 'EEE', 'EGL', 'EOH', 'EOX', 'EPE', 'ETF', 'FCY', 'FJO', 'FLC', 'FMT', 'FW5', 'GOL', 'GSH', 'GTT', 'GYF', 'HED', 'IHP', 'IHS', 'IMD', 'IOD', 'IPA', 'IPH', 'LDA', 'MB3', 'MEG', 'MES', 'MLA', 'MLI', 'MOH', 'MPD', 'MRD', 'MSE', 'MYR', 'N', 'NA', 'NH2', 'NH4', 'NHE', 'NO3', 'O4B', 'OHE', 'OLA', 'OLC', 'OMB', 'OME', 'OXA', 'P6G', 'PE3', 'PE4', 'PEG', 'PEO', 'PEP', 'PG0', 'PG4', 'PGE', 'PGR', 'PLM', 'PO4', 'POL', 'POP', 'PVO', 'SAR', 'SCN', 'SEO', 'SIN', 'SO4', 'SPD', 'SPM', 'SR', 'STE', 'STO', 'STU', 'TAR', 'TBU', 'TME', 'TRS', 'UNK', 'UNL', 'UNX', 'UPL', 'URE']#

A list of CCD codes of ligands that were excluded in AlphaFold 3 (AF3).

atomworks.io.constants.AF3_EXCLUDED_LIGANDS_REGEX: Final[str] = '(?:^|,)\\s*(?:144|15P|1PE|2F2|2JC|3HR|3SY|7N5|7PE|9JE|AAE|ABA|ACE|ACN|ACT|ACY|AZI|BAM|BCN|BCT|BDN|BEN|BME|BO3|BTB|BTC|BU1|C8E|CAD|CAQ|CBM|CCN|CIT|CL|CLR|CM|CMO|CO3|CPT|CXS|D10|DEP|DIO|DMS|DN|DOD|DOX|EDO|EEE|EGL|EOH|EOX|EPE|ETF|FCY|FJO|FLC|FMT|FW5|GOL|GSH|GTT|GYF|HED|IHP|IHS|IMD|IOD|IPA|IPH|LDA|MB3|MEG|MES|MLA|MLI|MOH|MPD|MRD|MSE|MYR|N|NA|NH2|NH4|NHE|NO3|O4B|OHE|OLA|OLC|OMB|OME|OXA|P6G|PE3|PE4|PEG|PEO|PEP|PG0|PG4|PGE|PGR|PLM|PO4|POL|POP|PVO|SAR|SCN|SEO|SIN|SO4|SPD|SPM|SR|STE|STO|STU|TAR|TBU|TME|TRS|UNK|UNL|UNX|UPL|URE)\\s*(?:,|$)'#

A regex pattern that matches any of the ligands in AF3_EXCLUDED_LIGANDS. Used for filtering out ligands from the assembled dataframes.
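The pattern anchors each code between a string boundary or a comma, which suggests it is meant for comma-separated lists of CCD codes. The following is a minimal sketch that rebuilds a pattern of the same shape from a three-code subset. The helper name `build_excluded_regex` and the sample strings are assumptions for illustration and are not part of atomworks.

```python
import re

# Illustrative subset of AF3_EXCLUDED_LIGANDS; the real list is much longer.
excluded_subset = ["GOL", "SO4", "NA"]


def build_excluded_regex(codes):
    """Hypothetical helper: build a pattern shaped like AF3_EXCLUDED_LIGANDS_REGEX."""
    alternation = "|".join(re.escape(code) for code in codes)
    return r"(?:^|,)\s*(?:" + alternation + r")\s*(?:,|$)"


pattern = re.compile(build_excluded_regex(excluded_subset))

# A code matches only as a whole comma-separated entry, never as a substring.
has_excluded = bool(pattern.search("ATP, GOL"))  # GOL is a full entry
no_substring = bool(pattern.search("NAG, ATP"))  # "NA" is only a prefix of NAG
```

Because of the comma and boundary anchors, `NA` (sodium) does not accidentally exclude `NAG`.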

atomworks.io.constants.ATOMIC_NUMBER_TO_ELEMENT: Final[mappingproxy[int | str, str]] = mappingproxy({1: 'H', 2: 'HE', 3: 'LI', 4: 'BE', 5: 'B', 6: 'C', 7: 'N', 8: 'O', 9: 'F', 10: 'NE', 11: 'NA', 12: 'MG', 13: 'AL', 14: 'SI', 15: 'P', 16: 'S', 17: 'CL', 18: 'AR', 19: 'K', 20: 'CA', 21: 'SC', 22: 'TI', 23: 'V', 24: 'CR', 25: 'MN', 26: 'FE', 27: 'CO', 28: 'NI', 29: 'CU', 30: 'ZN', 31: 'GA', 32: 'GE', 33: 'AS', 34: 'SE', 35: 'BR', 36: 'KR', 37: 'RB', 38: 'SR', 39: 'Y', 40: 'ZR', 41: 'NB', 42: 'MO', 43: 'TC', 44: 'RU', 45: 'RH', 46: 'PD', 47: 'AG', 48: 'CD', 49: 'IN', 50: 'SN', 51: 'SB', 52: 'TE', 53: 'I', 54: 'XE', 55: 'CS', 56: 'BA', 57: 'LA', 58: 'CE', 59: 'PR', 60: 'ND', 61: 'PM', 62: 'SM', 63: 'EU', 64: 'GD', 65: 'TB', 66: 'DY', 67: 'HO', 68: 'ER', 69: 'TM', 70: 'YB', 71: 'LU', 72: 'HF', 73: 'TA', 74: 'W', 75: 'RE', 76: 'OS', 77: 'IR', 78: 'PT', 79: 'AU', 80: 'HG', 81: 'TL', 82: 'PB', 83: 'BI', 84: 'PO', 85: 'AT', 86: 'RN', 87: 'FR', 88: 'RA', 89: 'AC', 90: 'TH', 91: 'PA', 92: 'U', 93: 'NP', 94: 'PU', 95: 'AM', 96: 'CM', 97: 'BK', 98: 'CF', 99: 'ES', 100: 'FM', 101: 'MD', 102: 'NO', 103: 'LR', 104: 'RF', 105: 'DB', 106: 'SG', 107: 'BH', 108: 'HS', 109: 'MT', 110: 'DS', 111: 'RG', 112: 'CN', 113: 'NH', 114: 'FL', 115: 'MC', 116: 'LV', 117: 'TS', 118: 'OG', 0: 'X', '1': 'H', '2': 'HE', '3': 'LI', '4': 'BE', '5': 'B', '6': 'C', '7': 'N', '8': 'O', '9': 'F', '10': 'NE', '11': 'NA', '12': 'MG', '13': 'AL', '14': 'SI', '15': 'P', '16': 'S', '17': 'CL', '18': 'AR', '19': 'K', '20': 'CA', '21': 'SC', '22': 'TI', '23': 'V', '24': 'CR', '25': 'MN', '26': 'FE', '27': 'CO', '28': 'NI', '29': 'CU', '30': 'ZN', '31': 'GA', '32': 'GE', '33': 'AS', '34': 'SE', '35': 'BR', '36': 'KR', '37': 'RB', '38': 'SR', '39': 'Y', '40': 'ZR', '41': 'NB', '42': 'MO', '43': 'TC', '44': 'RU', '45': 'RH', '46': 'PD', '47': 'AG', '48': 'CD', '49': 'IN', '50': 'SN', '51': 'SB', '52': 'TE', '53': 'I', '54': 'XE', '55': 'CS', '56': 'BA', '57': 'LA', '58': 'CE', '59': 'PR', '60': 'ND', '61': 'PM', 
'62': 'SM', '63': 'EU', '64': 'GD', '65': 'TB', '66': 'DY', '67': 'HO', '68': 'ER', '69': 'TM', '70': 'YB', '71': 'LU', '72': 'HF', '73': 'TA', '74': 'W', '75': 'RE', '76': 'OS', '77': 'IR', '78': 'PT', '79': 'AU', '80': 'HG', '81': 'TL', '82': 'PB', '83': 'BI', '84': 'PO', '85': 'AT', '86': 'RN', '87': 'FR', '88': 'RA', '89': 'AC', '90': 'TH', '91': 'PA', '92': 'U', '93': 'NP', '94': 'PU', '95': 'AM', '96': 'CM', '97': 'BK', '98': 'CF', '99': 'ES', '100': 'FM', '101': 'MD', '102': 'NO', '103': 'LR', '104': 'RF', '105': 'DB', '106': 'SG', '107': 'BH', '108': 'HS', '109': 'MT', '110': 'DS', '111': 'RG', '112': 'CN', '113': 'NH', '114': 'FL', '115': 'MC', '116': 'LV', '117': 'TS', '118': 'OG', '0': 'X'})#

Maps atomic numbers (int or str) to their canonical UPPERCASE element symbols (one or two letters).

WARNING: Case-sensitive.
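As the listing shows, every atomic number appears twice as a key, once as an `int` and once as a `str`, and the mapping is wrapped in a read-only `mappingproxy`. A minimal sketch of that layout on a three-element subset (the variable names here are illustrative, not the package's own):

```python
from types import MappingProxyType

# Illustrative three-element subset; the real mapping covers 0 through 118.
_elements = {1: "H", 6: "C", 17: "CL"}
atomic_number_to_element = MappingProxyType(
    {**_elements, **{str(number): symbol for number, symbol in _elements.items()}}
)

from_int = atomic_number_to_element[17]    # integer key
from_str = atomic_number_to_element["17"]  # string key, same value

# The proxy is read-only: item assignment raises TypeError.
try:
    atomic_number_to_element[1] = "D"
    is_read_only = False
except TypeError:
    is_read_only = True
```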

atomworks.io.constants.BIOTITE_BOND_TYPE_TO_BOND_ORDER: Final[mappingproxy[BondType, int]] = mappingproxy({<BondType.ANY: 0>: 1, <BondType.SINGLE: 1>: 1, <BondType.DOUBLE: 2>: 2, <BondType.TRIPLE: 3>: 3, <BondType.QUADRUPLE: 4>: 4, <BondType.AROMATIC_SINGLE: 5>: 1, <BondType.AROMATIC_DOUBLE: 6>: 2, <BondType.AROMATIC_TRIPLE: 7>: 3})#

Mapping from Biotite bond types to bond orders.

atomworks.io.constants.BIOTITE_DEFAULT_ANNOTATIONS: Final[tuple[str, ...]] = ('chain_id', 'res_id', 'res_name', 'atom_name', 'hetero', 'element')#

The default mandatory annotations for Biotite AtomArrays.

atomworks.io.constants.CARBOHYDRATE_D_CHEM_TYPES: Final[frozenset[str]] = frozenset({'D-SACCHARIDE', 'D-SACCHARIDE, ALPHA LINKING', 'D-SACCHARIDE, BETA LINKING'})#

Set of carbohydrate-D (right-handed saccharides) chemical component types. All uppercase.

atomworks.io.constants.CARBOHYDRATE_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'D-SACCHARIDE', 'D-SACCHARIDE, ALPHA LINKING', 'D-SACCHARIDE, BETA LINKING', 'L-SACCHARIDE', 'L-SACCHARIDE, ALPHA LINKING', 'L-SACCHARIDE, BETA LINKING', 'SACCHARIDE'})#

Set of carbohydrate-like chemical component types. All uppercase.

atomworks.io.constants.CARBOHYDRATE_L_CHEM_TYPES: Final[frozenset[str]] = frozenset({'L-SACCHARIDE', 'L-SACCHARIDE, ALPHA LINKING', 'L-SACCHARIDE, BETA LINKING'})#

Set of carbohydrate-L (left-handed saccharides) chemical component types. All uppercase.

atomworks.io.constants.CCD_MIRROR_PATH: Final[str] = None#

A path to a carbon-copy mirror of the CCD ligands in the RCSB CCD.

atomworks.io.constants.CHEM_COMP_TYPES: Final[tuple[str, ...]] = ('D-BETA-PEPTIDE, C-GAMMA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE NH3 AMINO TERMINUS', 'D-PEPTIDE LINKING', 'D-SACCHARIDE', 'D-SACCHARIDE, ALPHA LINKING', 'D-SACCHARIDE, BETA LINKING', 'DNA OH 3 PRIME TERMINUS', 'DNA OH 5 PRIME TERMINUS', 'DNA LINKING', 'L-DNA LINKING', 'L-RNA LINKING', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'L-SACCHARIDE', 'L-SACCHARIDE, ALPHA LINKING', 'L-SACCHARIDE, BETA LINKING', 'RNA OH 3 PRIME TERMINUS', 'RNA OH 5 PRIME TERMINUS', 'RNA LINKING', 'NON-POLYMER', 'OTHER', 'PEPTIDE LINKING', 'PEPTIDE-LIKE', 'SACCHARIDE')#

Allowed Chemical Component Types for residues in the PDB + mask. All uppercase.

atomworks.io.constants.CHEM_TYPE_POLYMERIZATION_ATOMS: Final[mappingproxy[str, tuple[str, str]]] = mappingproxy({'PEPTIDE LINKING': ('C', 'N'), 'L-PEPTIDE LINKING': ('C', 'N'), 'D-PEPTIDE LINKING': ('C', 'N'), 'L-BETA-PEPTIDE, C-GAMMA LINKING': ('CG', 'N'), 'D-BETA-PEPTIDE, C-GAMMA LINKING': ('CG', 'N'), 'L-GAMMA-PEPTIDE, C-DELTA LINKING': ('CD', 'N'), 'D-GAMMA-PEPTIDE, C-DELTA LINKING': ('CD', 'N'), 'DNA LINKING': ("O3'", 'P'), 'L-DNA LINKING': ("O3'", 'P'), 'RNA LINKING': ("O3'", 'P'), 'L-RNA LINKING': ("O3'", 'P')})#

A mapping of chemical component types to the atoms that they link when part of a polymer.
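A minimal lookup sketch on a two-entry subset of this mapping. It assumes the tuple is ordered as (atom in residue i, atom in residue i+1), which matches peptide bonds (C to N) and phosphodiester links (O3' to P) but is not stated explicitly in the description. The helper `linking_atoms` is hypothetical.

```python
# Illustrative subset of CHEM_TYPE_POLYMERIZATION_ATOMS.
polymerization_atoms = {
    "L-PEPTIDE LINKING": ("C", "N"),
    "RNA LINKING": ("O3'", "P"),
}


def linking_atoms(chem_type):
    """Hypothetical helper: return the linking atom pair, or None for non-polymer types."""
    # Keys are uppercase, so normalize the query before the lookup.
    return polymerization_atoms.get(chem_type.upper())


peptide_pair = linking_atoms("l-peptide linking")
ligand_pair = linking_atoms("NON-POLYMER")
```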

atomworks.io.constants.CRYSTALLIZATION_AIDS: Final[list[str]] = ['SO4', 'GOL', 'EDO', 'PO4', 'ACT', 'PEG', 'DMS', 'TRS', 'PGE', 'PG4', 'FMT', 'EPE', 'MPD', 'MES', 'CD', 'IOD']#

A list of CCD codes of aids commonly used in protein crystallization.

atomworks.io.constants.DEFAULT_VALENCE = {'B': 3, 'Br': 1, 'C': 4, 'Cl': 1, 'F': 1, 'H': 1, 'N': 3, 'O': 2}#

Default valences of common elements in organic compounds. Only elements that have unambiguous valences are included.
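Note that the keys in the listing use title case (`'Br'`, `'Cl'`), whereas the element maps in this module use UPPERCASE symbols. A minimal sketch of a lookup that bridges the two conventions follows; the helper `valence_for` is hypothetical, and returning `None` for elements without an unambiguous valence is an assumption.

```python
# Copied from the DEFAULT_VALENCE listing.
default_valence = {"B": 3, "Br": 1, "C": 4, "Cl": 1, "F": 1, "H": 1, "N": 3, "O": 2}


def valence_for(element):
    """Hypothetical helper: accept any casing, return None if no default valence exists."""
    # str.capitalize turns "BR" into "Br" and leaves "C" unchanged.
    return default_valence.get(element.capitalize())


bromine = valence_for("BR")
sulfur = valence_for("S")  # sulfur is not in the table
```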

atomworks.io.constants.DICT_THREE_TO_ONE: Final[dict[str, str]] = {' * ': '*', 'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'ASX': 'B', 'CYS': 'C', 'GLN': 'Q', 'GLU': 'E', 'GLX': 'Z', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I', 'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P', 'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'UNK': 'X', 'VAL': 'V'}#

A dictionary that maps three-letter amino acid codes to one-letter codes.
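A minimal sketch of converting a residue list to a one-letter sequence with a subset of this dictionary. Mapping unlisted residues to the `UNK` symbol `'X'` is an assumption made for this example, and `to_one_letter` is a hypothetical helper.

```python
# Illustrative subset of DICT_THREE_TO_ONE.
three_to_one = {"ALA": "A", "GLY": "G", "SER": "S", "UNK": "X"}


def to_one_letter(residues):
    """Hypothetical helper: join one-letter codes, using 'X' for unlisted residues."""
    fallback = three_to_one["UNK"]
    return "".join(three_to_one.get(name, fallback) for name in residues)


# MSE (selenomethionine) is not a key, so it falls back to "X".
sequence = to_one_letter(["ALA", "GLY", "MSE", "SER"])
```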

atomworks.io.constants.DNA_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'DNA LINKING', 'DNA OH 3 PRIME TERMINUS', 'DNA OH 5 PRIME TERMINUS', 'L-DNA LINKING'})#

Set of DNA-like chemical component types. All uppercase.

atomworks.io.constants.DO_NOT_MATCH_CCD: Final[frozenset[str]] = frozenset({'DOD', 'HOH', 'UNL'})#

CCD codes that should not be matched to a template for the purpose of adding missing atoms.

atomworks.io.constants.ELEMENT_NAME_TO_ATOMIC_NUMBER: Final[mappingproxy[str, int]] = mappingproxy({'H': 1, 'HE': 2, 'LI': 3, 'BE': 4, 'B': 5, 'C': 6, 'N': 7, 'O': 8, 'F': 9, 'NE': 10, 'NA': 11, 'MG': 12, 'AL': 13, 'SI': 14, 'P': 15, 'S': 16, 'CL': 17, 'AR': 18, 'K': 19, 'CA': 20, 'SC': 21, 'TI': 22, 'V': 23, 'CR': 24, 'MN': 25, 'FE': 26, 'CO': 27, 'NI': 28, 'CU': 29, 'ZN': 30, 'GA': 31, 'GE': 32, 'AS': 33, 'SE': 34, 'BR': 35, 'KR': 36, 'RB': 37, 'SR': 38, 'Y': 39, 'ZR': 40, 'NB': 41, 'MO': 42, 'TC': 43, 'RU': 44, 'RH': 45, 'PD': 46, 'AG': 47, 'CD': 48, 'IN': 49, 'SN': 50, 'SB': 51, 'TE': 52, 'I': 53, 'XE': 54, 'CS': 55, 'BA': 56, 'LA': 57, 'CE': 58, 'PR': 59, 'ND': 60, 'PM': 61, 'SM': 62, 'EU': 63, 'GD': 64, 'TB': 65, 'DY': 66, 'HO': 67, 'ER': 68, 'TM': 69, 'YB': 70, 'LU': 71, 'HF': 72, 'TA': 73, 'W': 74, 'RE': 75, 'OS': 76, 'IR': 77, 'PT': 78, 'AU': 79, 'HG': 80, 'TL': 81, 'PB': 82, 'BI': 83, 'PO': 84, 'AT': 85, 'RN': 86, 'FR': 87, 'RA': 88, 'AC': 89, 'TH': 90, 'PA': 91, 'U': 92, 'NP': 93, 'PU': 94, 'AM': 95, 'CM': 96, 'BK': 97, 'CF': 98, 'ES': 99, 'FM': 100, 'MD': 101, 'NO': 102, 'LR': 103, 'RF': 104, 'DB': 105, 'SG': 106, 'BH': 107, 'HS': 108, 'MT': 109, 'DS': 110, 'RG': 111, 'CN': 112, 'NH': 113, 'FL': 114, 'MC': 115, 'LV': 116, 'TS': 117, 'OG': 118, 'X': 0})#

Maps canonical UPPERCASE element symbols (one or two letters) to their atomic numbers.

WARNING: Case-sensitive.
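Since the keys are uppercase only, a title-case symbol such as `'Cl'` is not found directly. A minimal sketch of a normalizing lookup on a small subset follows; the helper `atomic_number` and the fallback to 0 (the value listed for `'X'`) are assumptions for illustration.

```python
# Illustrative subset of ELEMENT_NAME_TO_ATOMIC_NUMBER.
element_name_to_atomic_number = {"H": 1, "C": 6, "CL": 17, "X": 0}
UNKNOWN_ATOMIC_NUMBER = 0


def atomic_number(symbol):
    """Hypothetical helper: uppercase the symbol, fall back to 0 if it is unknown."""
    key = symbol.strip().upper()
    return element_name_to_atomic_number.get(key, UNKNOWN_ATOMIC_NUMBER)


direct_hit = "Cl" in element_name_to_atomic_number  # title case is not a key
normalized = atomic_number("Cl")
unknown = atomic_number("Zz")
```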

atomworks.io.constants.GAP: Final[str] = '<G>'#

The (non-standard) code for a gap token.

atomworks.io.constants.GAP_ONE_LETTER: Final[str] = '-'#

The one-letter code for a gap token.

atomworks.io.constants.HYDROGEN_LIKE_SYMBOLS: Final[tuple[str, ...]] = ('H', 'H2', 'D', 'T')#

A tuple of symbols for (isotopes of) hydrogen.

WARNING: It is important that this remains a tuple, as it is used by np.isin downstream, which does not play well with sets.

atomworks.io.constants.LIGAND_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'NON-POLYMER', 'OTHER'})#

Set of ligand-like chemical component types. All uppercase.

atomworks.io.constants.MASK_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'MASK'})#

Set of mask-like chemical component types. All uppercase.

atomworks.io.constants.METAL_ELEMENTS: Final[frozenset[str]] = frozenset({'AG', 'AL', 'AU', 'BA', 'BE', 'BI', 'CA', 'CD', 'CO', 'CR', 'CS', 'CU', 'FE', 'GA', 'HF', 'HG', 'IN', 'IR', 'K', 'LA', 'LI', 'MG', 'MN', 'MO', 'NA', 'NB', 'NI', 'OS', 'PB', 'PD', 'PT', 'RB', 'RE', 'RH', 'RU', 'SC', 'SN', 'SR', 'TA', 'TC', 'TI', 'TL', 'V', 'W', 'Y', 'ZN', 'ZR'})#

A set of metal element symbols, all UPPERCASE.

WARNING: Case-sensitive.

atomworks.io.constants.NA_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'DNA LINKING', 'DNA OH 3 PRIME TERMINUS', 'DNA OH 5 PRIME TERMINUS', 'L-DNA LINKING', 'L-RNA LINKING', 'RNA LINKING', 'RNA OH 3 PRIME TERMINUS', 'RNA OH 5 PRIME TERMINUS'})#

DNA or RNA-like chemical component types.
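Judging from the listings, this set is the union of `DNA_LIKE_CHEM_TYPES` and `RNA_LIKE_CHEM_TYPES`. A minimal sketch that rebuilds it from copies of those two sets and classifies a type string follows. The helper `nucleic_acid_kind` is hypothetical, and whether the package itself builds the set this way is an assumption.

```python
# Copied from the DNA_LIKE_CHEM_TYPES and RNA_LIKE_CHEM_TYPES listings.
dna_like = frozenset({
    "DNA LINKING", "DNA OH 3 PRIME TERMINUS",
    "DNA OH 5 PRIME TERMINUS", "L-DNA LINKING",
})
rna_like = frozenset({
    "L-RNA LINKING", "RNA LINKING",
    "RNA OH 3 PRIME TERMINUS", "RNA OH 5 PRIME TERMINUS",
})
na_like = dna_like | rna_like  # same eight members as NA_LIKE_CHEM_TYPES


def nucleic_acid_kind(chem_type):
    """Hypothetical helper: return 'dna', 'rna', or None for other types."""
    normalized = chem_type.upper()  # the sets contain uppercase strings only
    if normalized in dna_like:
        return "dna"
    if normalized in rna_like:
        return "rna"
    return None


kind = nucleic_acid_kind("rna linking")
```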

atomworks.io.constants.NUCLEIC_ACID_FRAME_ATOM_NAMES: Final[tuple[str, ...]] = ("C1'", "C3'", "C4'")#

A tuple of the names of the frame atoms (backbone) for nucleic acids.

atomworks.io.constants.PDB_ISOTOPE_SYMBOL_TO_ELEMENT_SYMBOL: Final[dict[str, str]] = {'D': 'H', 'T': 'H'}#

Maps isotope symbols used in the PDB to their element symbols.

NOTE: Other isotopes like 14C do not have a special symbol in the PDB.
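A minimal sketch of how such a mapping can normalize element annotations: symbols that have an entry are replaced, and everything else passes through unchanged. The helper `normalize_element` is hypothetical.

```python
# Copied from the PDB_ISOTOPE_SYMBOL_TO_ELEMENT_SYMBOL listing.
pdb_isotope_to_element = {"D": "H", "T": "H"}


def normalize_element(symbol):
    """Hypothetical helper: map D and T to H, leave other symbols untouched."""
    return pdb_isotope_to_element.get(symbol, symbol)


normalized = [normalize_element(symbol) for symbol in ["D", "O", "T", "C"]]
```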

atomworks.io.constants.PDB_MIRROR_PATH: Final[str] = None#

A path to a mirror of the PDB.

atomworks.io.constants.PEPTIDE_MAX_RESIDUES: Final[int] = 20#

The maximum number of residues up to which a protein-like sequence is considered a peptide.

atomworks.io.constants.POLYPEPTIDE_D_CHEM_TYPES: Final[frozenset[str]] = frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE LINKING', 'D-PEPTIDE NH3 AMINO TERMINUS'})#

Set of polypeptide-D (right-handed amino acids) chemical component types. All uppercase.

atomworks.io.constants.POLYPEPTIDE_L_CHEM_TYPES: Final[frozenset[str]] = frozenset({'L-BETA-PEPTIDE, C-GAMMA LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE LINKING', 'L-PEPTIDE NH3 AMINO TERMINUS'})#

Set of polypeptide-L (left-handed amino acids) chemical component types. All uppercase.

atomworks.io.constants.PROTEIN_FRAME_ATOM_NAMES: Final[tuple[str, ...]] = ('N', 'CA', 'C')#

A tuple of the names of the frame atoms (backbone) for proteins.

atomworks.io.constants.RNA_LIKE_CHEM_TYPES: Final[frozenset[str]] = frozenset({'L-RNA LINKING', 'RNA LINKING', 'RNA OH 3 PRIME TERMINUS', 'RNA OH 5 PRIME TERMINUS'})#

Set of RNA-like chemical component types. All uppercase.

atomworks.io.constants.STANDARD_AA: Final[tuple[str, ...]] = ('ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP', 'TYR', 'VAL')#

Tuple of the CCD codes for the standard 20 amino acids, alphabetically sorted by their three-letter CCD codes.

atomworks.io.constants.STANDARD_AA_ONE_LETTER: Final[tuple[str, ...]] = ('A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V')#

Tuple of the one-letter symbols for the standard 20 amino acids, alphabetically sorted by their three-letter CCD codes.

atomworks.io.constants.STANDARD_AA_TIP_ATOM_NAMES: Final[dict[str, list[str]]] = {'ALA': ['CB'], 'ARG': ['NH1', 'NH2'], 'ASN': ['OD1', 'ND2'], 'ASP': ['OD1', 'OD2'], 'CYS': ['SG'], 'GLN': ['OE1', 'NE2'], 'GLU': ['OE1', 'OE2'], 'GLY': ['CA'], 'HIS': ['CE1', 'NE2'], 'ILE': ['CD1'], 'LEU': ['CD1', 'CD2'], 'LYS': ['NZ'], 'MET': ['CE'], 'PHE': ['CZ'], 'PRO': ['CD', 'CG'], 'SER': ['OG'], 'THR': ['OG1', 'CG2'], 'TRP': ['CH2'], 'TYR': ['OH'], 'VAL': ['CG1', 'CG2']}#

A dictionary that maps the standard 20 amino acids to their tip atoms.

Tip atoms are defined as the side-chain heavy atoms that are furthest away from the backbone oxygen atom in the residue’s bond graph. The exception is GLY, which has no side-chain heavy atoms; its CA atom is therefore used as the tip atom.
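A minimal sketch of selecting tip atoms from a residue's atom names, using a three-residue subset of the dictionary. The helper `tip_atom_indices` and the plain-list representation of a residue are assumptions for illustration; the package itself operates on Biotite atom arrays.

```python
# Illustrative subset of STANDARD_AA_TIP_ATOM_NAMES.
tip_atoms = {"GLY": ["CA"], "LYS": ["NZ"], "ARG": ["NH1", "NH2"]}


def tip_atom_indices(res_name, atom_names):
    """Hypothetical helper: positions of the tip atoms within one residue."""
    wanted = tip_atoms.get(res_name, [])
    return [index for index, name in enumerate(atom_names) if name in wanted]


lys_atoms = ["N", "CA", "C", "O", "CB", "CG", "CD", "CE", "NZ"]
lys_tip = tip_atom_indices("LYS", lys_atoms)
gly_tip = tip_atom_indices("GLY", ["N", "CA", "C", "O"])
```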

atomworks.io.constants.STANDARD_DNA: Final[tuple[str, ...]] = ('DA', 'DC', 'DG', 'DT')#

Tuple of the CCD codes for the standard 4 DNA nucleotides.

atomworks.io.constants.STANDARD_DNA_ONE_LETTER: Final[tuple[str, ...]] = ('A', 'C', 'G', 'T')#

Tuple of the one-letter symbols for the standard 4 DNA nucleotides.

atomworks.io.constants.STANDARD_NA: Final[tuple[str, ...]] = ('A', 'C', 'G', 'U', 'DA', 'DC', 'DG', 'DT')#

Tuple of the CCD codes for the standard 8 nucleotides (4 RNA + 4 DNA).

atomworks.io.constants.STANDARD_PURINE_RESIDUES: Final[tuple[str, ...]] = ('A', 'G', 'DA', 'DG')#

Tuple of the CCD codes for the 4 standard purine nucleotides.

atomworks.io.constants.STANDARD_PYRIMIDINE_RESIDUES: Final[tuple[str, ...]] = ('C', 'U', 'DC', 'DT')#

Tuple of the CCD codes for the 4 standard pyrimidine nucleotides.

atomworks.io.constants.STANDARD_RNA: Final[tuple[str, ...]] = ('A', 'C', 'G', 'U')#

Tuple of the CCD codes for the standard 4 RNA nucleotides. These happen to be the same as the one-letter symbols.

atomworks.io.constants.STRUCT_CONN_BOND_ORDER_TO_INT: Final[mappingproxy[str, int]] = mappingproxy({'sing': 1, 'doub': 2, 'trip': 3, 'quad': 4})#

Mapping from struct_conn.pdbx_value_order to integer bond orders.
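A minimal sketch of translating `pdbx_value_order` strings into integers. mmCIF files use `?` or `.` for missing values, so the lookup needs a fallback; defaulting to a single bond here is an assumption of this example, not documented package behaviour, and `bond_order` is a hypothetical helper.

```python
# Copied from the STRUCT_CONN_BOND_ORDER_TO_INT listing.
struct_conn_bond_order_to_int = {"sing": 1, "doub": 2, "trip": 3, "quad": 4}


def bond_order(value_order, default=1):
    """Hypothetical helper: integer bond order, or `default` if the value is missing."""
    return struct_conn_bond_order_to_int.get(value_order.lower(), default)


orders = [bond_order(value) for value in ["sing", "doub", "?", "."]]
```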

atomworks.io.constants.STRUCT_CONN_BOND_TYPES: Final[frozenset[str]] = frozenset({'covale', 'disulf', 'metalc'})#

A set of bond types that are considered when adding bonds to the atom array.

atomworks.io.constants.UNKNOWN_AA: Final[str] = 'UNK'#

The CCD code for unknown amino acids (UNK).

atomworks.io.constants.UNKNOWN_ATOM: Final[str] = 'UNX'#

The CCD code for unknown atoms (UNX).

atomworks.io.constants.UNKNOWN_ATOMIC_NUMBER: Final[int] = 0#

The atomic number for an unknown element.

atomworks.io.constants.UNKNOWN_DNA: Final[str] = 'DN'#

The CCD code for unknown DNA nucleotides (DN).

atomworks.io.constants.UNKNOWN_ELEMENT: Final[str] = 'X'#

The element name for an unknown element.

atomworks.io.constants.UNKNOWN_LIGAND: Final[str] = 'UNL'#

The CCD code for unknown ligands (UNL).

atomworks.io.constants.UNKNOWN_RNA: Final[str] = 'N'#

The CCD code for unknown RNA nucleotides (N).

atomworks.io.constants.WATER_LIKE_CCDS: Final[tuple[str, ...]] = ('HOH', 'DOD')#

A tuple of CCD codes for water-like molecules.

WARNING: It is important that this remains a tuple, as it is used by np.isin downstream, which does not play well with sets.