Sequence Utilities

Utility functions for working with monomer sequences.

atomworks.io.utils.sequence.get_1_from_3_letter_code(res_name: str, chain_type: ChainType, use_closest_canonical: bool = False, gap_three_letter: str = '<G>', gap_one_letter: str = '-') -> str

Converts a 3-letter residue name to its 1-letter code based on the chain type.

If use_closest_canonical is True, the closest canonical mapping (from BioPython) is used.

Parameters:
  • res_name (str) – The 3-letter residue name.

  • chain_type (ChainType) – The type of chain, using the ChainType enum.

  • use_closest_canonical (bool) – Whether to use the closest canonical mapping (from BioPython). Defaults to False.

  • gap_three_letter (str) – The three-letter code for a gap. Defaults to “<G>”.

  • gap_one_letter (str) – The one-letter code for a gap. Defaults to “-” (as is standard within MSAs).

Returns:

The corresponding 1-letter code. Returns “X” if the residue name or chain type is not supported.

Return type:

str
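
The lookup described above can be illustrated with a small, self-contained sketch. It does not call atomworks: `ToyChainType`, the lookup tables, and `toy_get_1_from_3` are hypothetical stand-ins, and the tables hold only a handful of residues. The sketch assumes that the gap code is translated to the one-letter gap, and that the closest-canonical table is consulted only when the flag is set; the real function may order these checks differently.

```python
from enum import Enum


class ToyChainType(Enum):
    """Hypothetical stand-in for the ChainType enum."""
    PROTEIN = "protein"
    DNA = "dna"


# Illustrative tables only; a real mapping covers many more residues.
TOY_THREE_TO_ONE = {
    ToyChainType.PROTEIN: {"ALA": "A", "GLY": "G", "MET": "M"},
    ToyChainType.DNA: {"DA": "A", "DC": "C", "DG": "G", "DT": "T"},
}
# Selenomethionine (MSE) is a modified methionine.
TOY_CLOSEST_CANONICAL = {"MSE": "M"}


def toy_get_1_from_3(
    res_name: str,
    chain_type: ToyChainType,
    use_closest_canonical: bool = False,
    gap_three_letter: str = "<G>",
    gap_one_letter: str = "-",
) -> str:
    """Toy 3-letter to 1-letter conversion (assumed logic, not the library's)."""
    if res_name == gap_three_letter:
        return gap_one_letter
    table = TOY_THREE_TO_ONE.get(chain_type, {})
    if res_name in table:
        return table[res_name]
    if (
        use_closest_canonical
        and chain_type is ToyChainType.PROTEIN
        and res_name in TOY_CLOSEST_CANONICAL
    ):
        return TOY_CLOSEST_CANONICAL[res_name]
    return "X"  # unsupported residue name or chain type


print(toy_get_1_from_3("ALA", ToyChainType.PROTEIN))  # A
print(toy_get_1_from_3("MSE", ToyChainType.PROTEIN))  # X
print(toy_get_1_from_3("MSE", ToyChainType.PROTEIN, use_closest_canonical=True))  # M
print(toy_get_1_from_3("<G>", ToyChainType.DNA))  # -
```

In this sketch a residue name is only meaningful relative to its chain type, so "ALA" looked up against the DNA table falls through to "X".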

atomworks.io.utils.sequence.get_3_from_1_letter_code(letter: str, chain_type: ChainType, gap_one_letter: str = '-', gap_three_letter: str = '<G>') -> str

Converts a 1-letter residue name to its 3-letter code based on the chain type.

Note

The mapping between three-letter and one-letter codes is not one-to-one, so converting a three-letter code to a one-letter code and back is not guaranteed to return the original: the round trip may produce a different three-letter sequence.

Parameters:
  • letter (str) – The 1-letter residue name.

  • chain_type (ChainType) – The type of chain, using the ChainType enum.

  • gap_one_letter (str) – The one-letter code for a gap. Defaults to “-” (as is standard within MSAs).

  • gap_three_letter (str) – The three-letter code for a gap. Defaults to “<G>”.

Returns:

The corresponding 3-letter code.

Return type:

str
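
The round-trip caveat in the note above can be seen in a minimal sketch. It is self-contained and does not call atomworks: the two tables and both helper functions are hypothetical, protein-only, and cover just three residues. It assumes that two three-letter codes (here MET and MSE) can share one letter, which is what makes the reverse lookup lossy.

```python
# Illustrative tables only; MET and MSE both collapse to "M" in this sketch.
TOY_3_TO_1 = {"ALA": "A", "MET": "M", "MSE": "M"}
TOY_1_TO_3 = {"A": "ALA", "M": "MET"}

GAP_3 = "<G>"
GAP_1 = "-"


def toy_three_to_one(res_name: str) -> str:
    """Toy forward mapping; unknown names become "X"."""
    if res_name == GAP_3:
        return GAP_1
    return TOY_3_TO_1.get(res_name, "X")


def toy_one_to_three(letter: str) -> str:
    """Toy reverse mapping; unknown letters raise KeyError in this sketch."""
    if letter == GAP_1:
        return GAP_3
    return TOY_1_TO_3[letter]


original = ["MET", "MSE", "<G>", "ALA"]
letters = "".join(toy_three_to_one(name) for name in original)
round_trip = [toy_one_to_three(letter) for letter in letters]

print(letters)                 # MM-A
print(round_trip)              # ['MET', 'MET', '<G>', 'ALA']
print(round_trip == original)  # False
```

If the original residue names matter downstream, keeping the three-letter sequence alongside the one-letter string avoids relying on the reverse lookup.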