PSFContainer

A class for reading protein structure (.psf) files.

Index

PSFContainer([filename, title, atoms, ...])

A container for managing protein structure files.

API

class FOX.PSFContainer(filename=None, title=None, atoms=None, bonds=None, angles=None, dihedrals=None, impropers=None, donors=None, acceptors=None, no_nonbonded=None)[source]

A container for managing protein structure files.

The PSFContainer class has access to three general sets of methods.

Methods for reading & constructing .psf files:

  • PSFContainer.read()

  • PSFContainer.write()

Methods for updating atom types:

  • PSFContainer.update_atom_charge()

  • PSFContainer.update_atom_type()

Methods for extracting bond, angle and dihedral-pairs from plams.Molecule instances:

  • PSFContainer.generate_bonds()

  • PSFContainer.generate_angles()

  • PSFContainer.generate_dihedrals()

  • PSFContainer.generate_impropers()

  • PSFContainer.generate_atoms()

filename

A 1D array with a single string as filename.

Type

\(1\) numpy.ndarray [str]

title

A 1D array of strings holding the title block.

Type

\(n\) numpy.ndarray [str]

atoms

A Pandas DataFrame holding the atoms block. The DataFrame should possess the following collumn keys:

  • "segment name"

  • "residue ID"

  • "residue name"

  • "atom name"

  • "atom type"

  • "charge"

  • "mass"

  • "0"

Type

\(n*8\) pandas.DataFrame

bonds

A 2D array holding the indices of all atom-pairs defining bonds. Indices are expected to be 1-based.

Type

\(n*2\) numpy.ndarray [int]

angles

A 2D array holding the indices of all atom-triplets defining angles. Indices are expected to be 1-based.

Type

\(n*3\) numpy.ndarray [int]

dihedrals

A 2D array holding the indices of all atom-quartets defining proper dihedral angles. Indices are expected to be 1-based.

Type

\(n*4\) numpy.ndarray [int]

impropers

A 2D array holding the indices of all atom-quartets defining improper dihedral angles. Indices are expected to be 1-based.

Type

\(n*4\) numpy.ndarray [int]

donors

A 2D array holding the atomic indices of all hydrogen-bond donors. Indices are expected to be 1-based.

Type

\(n*1\) numpy.ndarray [int]

acceptors

A 2D array holding the atomic indices of all hydrogen-bond acceptors. Indices are expected to be 1-based.

Type

\(n*1\) numpy.ndarray [int]

no_nonbonded

A 2D array holding the indices of all atom-pairs whose nonbonded interactions should be ignored. Indices are expected to be 1-based.

Type

\(n*2\) numpy.ndarray [int]

np_printoptions

A mapping with Numpy print options. See np.set_printoptions.

Type

Mapping [str, object]

pd_printoptions

A mapping with Pandas print options. See Options and settings.

Type

Mapping [str, object]

as_dict(return_private=False)[source]

Construct a dictionary from this instance with all non-private instance variables.

The returned dictionary values are shallow copies.

Parameters

return_private (bool) – If True, return both public and private instance variables. Private instance variables are defined in PSFContainer._PRIVATE_ATTR.

Returns

A dictionary with keyword arguments for initializing a new instance of this class.

Return type

dict [str, Any]

See also

PSFContainer.from_dict()

Construct a instance of this objects’ class from a dictionary with keyword arguments.

PSFContainer._PRIVATE_ATTR

A set with the names of private instance variables.

copy(deep=True)[source]

Return a shallow or deep copy of this instance.

Parameters

deep (bool) – Whether or not to return a deep or shallow copy.

Returns

A new instance constructed from this instance.

Return type

PSFContainer

property filename

Get PSFContainer.filename as string or assign an array-like object as a 1D array.

property title

Get PSFContainer.title or assign an array-like object as a 1D array.

property atoms

Get PSFContainer.atoms or assign an a DataFrame.

property bonds

Get PSFContainer.bonds or assign an array-like object as a 2D array.

property angles

Get PSFContainer.angles or assign an array-like object as a 2D array.

property dihedrals

Get PSFContainer.dihedrals or assign an array-like object as a 2D array.

property impropers

Get PSFPSFContainerimpropers or assign an array-like object as a 2D array.

property donors

Get PSFContainer.donors or assign an array-like object as a 2D array.

property acceptors

Get PSFContainer.acceptors or assign an array-like object as a 2D array.

property no_nonbonded

Get PSFContainer.no_nonbonded or assign an array-like object as a 2D array.

property segment_name

Get or set the "segment name" column in PSFContainer.atoms.

property residue_id

Get or set the "residue ID" column in PSFContainer.atoms.

property residue_name

Get or set the "residue name" column in PSFContainer.atoms.

property atom_name

Get or set the "atom name" column in PSFContainer.atoms.

property atom_type

Get or set the "atom type" column in PSFContainer.atoms.

property charge

Get or set the "charge" column in PSFContainer.atoms.

property mass

Get or set the "mass" column in PSFContainer.atoms.

update_atom_charge(atom_type, charge)[source]

Change the charge of atom_type to charge.

Parameters
  • atom_type (str) – An atom type in PSFContainer.atoms ["atom type"].

  • charge (float) – The new atomic charge to-be assigned to atom_type. See PSFContainer.atoms ["charge"].

Raises

ValueError – Raised if charge cannot be converted into a float.

update_atom_type(atom_type_old, atom_type_new)[source]

Change the atom type of a atom_type_old to atom_type_new.

Parameters
  • atom_type_old (str) – An atom type in PSFContainer.atoms ["atom type"].

  • atom_type_new (str) – The new atom type to-be assigned to atom_type. See PSFContainer.atoms ["atom type"].

generate_bonds(mol=None, *, segment_dict=None)[source]

Update PSFContainer.bonds with the indices of all bond-forming atoms from mol.

Notes

The mol and segment_dict parameters are mutually exclusive.

Examples

>>> from FOX import PSFContainer
>>> from scm.plams import Molecule

>>> psf = PSFContainer(...)
>>> segment_dict = {"MOL3": Molecule(...)}

>>> psf.generate_bonds(segment_dict=segment_dict)
Parameters
  • mol (plams.Molecule) – A PLAMS Molecule.

  • segment_dict (Mapping[str, plams.Molecule]) – A dictionary mapping segment names to individual ligands. This can result in dramatic speed ups for systems wherein each segment contains a large number of residues.

generate_angles(mol=None, *, segment_dict=None)[source]

Update PSFContainer.angles with the indices of all angle-defining atoms from mol.

Notes

The mol and segment_dict parameters are mutually exclusive.

Parameters
  • mol (plams.Molecule) – A PLAMS Molecule.

  • segment_dict (Mapping[str, plams.Molecule]) – A dictionary mapping segment names to individual ligands. This can result in dramatic speed ups for systems wherein each segment contains a large number of residues.

generate_dihedrals(mol=None, *, segment_dict=None)[source]

Update PSFContainer.dihedrals with the indices of all proper dihedral angle-defining atoms from mol.

Notes

The mol and segment_dict parameters are mutually exclusive.

Parameters
  • mol (plams.Molecule) – A PLAMS Molecule.

  • segment_dict (Mapping[str, plams.Molecule]) – A dictionary mapping segment names to individual ligands. This can result in dramatic speed ups for systems wherein each segment contains a large number of residues.

generate_impropers(mol=None, *, segment_dict=None)[source]

Update PSFContainer.impropers with the indices of all improper dihedral angle-defining atoms from mol.

Notes

The mol and segment_dict parameters are mutually exclusive.

Parameters
  • mol (plams.Molecule) – A PLAMS Molecule.

  • segment_dict (Mapping[str, plams.Molecule]) – A dictionary mapping segment names to individual ligands. This can result in dramatic speed ups for systems wherein each segment contains a large number of residues.

generate_atoms(mol, id_map=None)[source]

Update PSFContainer.atoms with the all properties from mol.

DataFrame keys in PSFContainer.atoms are set based on the following values in mol:

DataFrame column

Value

Backup value(s)

"segment name"

"MOL{:d}"; See "atom type" and "residue name"

"residue ID"

Atom.properties ["pdb_info"]["ResidueNumber"]

1

"residue name"

Atom.properties ["pdb_info"]["ResidueName"]

"COR"

"atom name"

Atom.symbol

"atom type"

Atom.properties ["symbol"]

Atom.symbol

"charge"

Atom.properties ["charge_float"]

Atom.properties ["charge"] & 0.0

"mass"

Atom.mass

"0"

0

If a value is not available in a particular Atom.properties instance then a backup value will be set.

Parameters
  • mol (plams.Molecule) – A PLAMS Molecule.

  • id_map (Mapping[int, Any], optional) – A mapping of ligand residue ID’s to a custom (Hashable) descriptor. Can be used for generating residue names for quantum dots with multiple different ligands.

to_atom_dict()[source]

Create a dictionary of atom types and lists with their respective indices.

Returns

A dictionary with atom types as keys and lists of matching atomic indices as values. The indices are 0-based.

Return type

dict[str, list[int]]

to_atom_alias_dict()[source]

Create a with atom aliases.

write_pdb(mol, pdb_file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, copy_mol=True)[source]

Construct a .pdb file from this instance and mol.

Note

Requires the optional RDKit package.

Parameters
  • mol (plams.Molecule) – A PLAMS Molecule.

  • copy_mol (bool) – If True, create a copy of mol instead of modifying it inplace.

  • pdb_file (str ot IO[str]) – A filename or a file-like object.

sort_values(by, *, return_argsort=False, inplace=False, **kwargs)[source]

Sort the atoms values by the specified columns.

Examples

>>> from FOX import PSFContainer

>>> psf: PSFContainer = ...
>>> print(psf)
PSFContainer(
    acceptors    = array([], shape=(0, 1), dtype=int64),
    angles       = array([], shape=(0, 3), dtype=int64),
    atoms        =     segment name  residue ID residue name atom name atom type  charge       mass  0
                   1           MOL1           1          COR        Cd        Cd     0.0  112.41400  0
                   2           MOL1           1          COR        Cd        Cd     0.0  112.41400  0
                   3           MOL1           1          COR        Cd        Cd     0.0  112.41400  0
                   4           MOL1           1          COR        Cd        Cd     0.0  112.41400  0
                   5           MOL1           1          COR        Cd        Cd     0.0  112.41400  0
                   ..           ...         ...          ...       ...       ...     ...        ... ..
                   223         MOL3          26          LIG         O         O     0.0   15.99940  0
                   224         MOL3          27          LIG         C         C     0.0   12.01060  0
                   225         MOL3          27          LIG         H         H     0.0    1.00798  0
                   226         MOL3          27          LIG         O         O     0.0   15.99940  0
                   227         MOL3          27          LIG         O         O     0.0   15.99940  0

                   [227 rows x 8 columns],
    bonds        = array([[124, 125],
                          [126, 124],
                          [127, 124],
                          [128, 129],
                          [130, 128],
                          ...,
                          [222, 220],
                          [223, 220],
                          [224, 225],
                          [226, 224],
                          [227, 224]], dtype=int64),
    dihedrals    = array([], shape=(0, 4), dtype=int64),
    donors       = array([], shape=(0, 1), dtype=int64),
    filename     = array([], dtype='<U1'),
    impropers    = array([], shape=(0, 4), dtype=int64),
    no_nonbonded = array([], shape=(0, 2), dtype=int64),
    title        = array(['PSF file generated with Auto-FOX',
                          'https://github.com/nlesc-nano/Auto-FOX'], dtype='<U38')
)

>>> psf.sort_values(["residue ID", "mass"])
PSFContainer(
    acceptors    = array([], shape=(0, 1), dtype=int64),
    angles       = array([], shape=(0, 3), dtype=int64),
    atoms        =     segment name  residue ID residue name atom name atom type  charge      mass  0
                   1           MOL2           1          COR        Se        Se     0.0  78.97100  0
                   2           MOL2           1          COR        Se        Se     0.0  78.97100  0
                   3           MOL2           1          COR        Se        Se     0.0  78.97100  0
                   4           MOL2           1          COR        Se        Se     0.0  78.97100  0
                   5           MOL2           1          COR        Se        Se     0.0  78.97100  0
                   ..           ...         ...          ...       ...       ...     ...       ... ..
                   223         MOL3          26          LIG         O         O     0.0  15.99940  0
                   224         MOL3          27          LIG         H         H     0.0   1.00798  0
                   225         MOL3          27          LIG         C         C     0.0  12.01060  0
                   226         MOL3          27          LIG         O         O     0.0  15.99940  0
                   227         MOL3          27          LIG         O         O     0.0  15.99940  0

                   [227 rows x 8 columns],
    bonds        = array([[141, 139],
                          [138, 141],
                          [136, 141],
                          [137, 135],
                          [134, 137],
                          ...,
                          [164, 167],
                          [165, 167],
                          [163, 162],
                          [160, 163],
                          [161, 163]], dtype=int64),
    dihedrals    = array([], shape=(0, 4), dtype=int64),
    donors       = array([], shape=(0, 1), dtype=int64),
    filename     = array([], dtype='<U1'),
    impropers    = array([], shape=(0, 4), dtype=int64),
    no_nonbonded = array([], shape=(0, 2), dtype=int64),
    title        = array(['PSF file generated with Auto-FOX',
                          'https://github.com/nlesc-nano/Auto-FOX'], dtype='<U38')
)

>>> from scm.plams import Molecule

# Sort the molecule in the same order as `psf`
>>> mol: Molecule = ....
>>> psf_new, argsort = psf.sort_values(["residue ID", "mass"], return_argsort=True)
>>> mol.atoms = [mol.atoms[i] for i in argsort]
Parameters
  • by (str or Sequence[str]) – One or more strings with the names of columns.

  • return_argsort (bool) – If True, also return the array of indices that sorts the dataframe.

  • **kwargs (Any) – Further keyword arguments for . Note that axis and ignore_index are not supported. Secondly, inplace=True will always return self.

See also

pd.DataFrame.sort_values

Sort by the values along either axis.