Non-standard aminoacids in OpenMM, with OpenMMForceFields and OpenFF
This is a brief tutorial to prepare a protein with non-standard aminoacids for MD modeling with OpenMM. It illustrates the process for an inhibitor covalently attached to a cysteine. The general idea is to:
create a fragment containing the custom sidechain and part of the protein backbone
parameterize the fragment using OpenFF and parse the resulting parameters and assigned atom types of the fragment to create those of the residue attached to the protein.
load the protein in OpenMM with the new parameters
The procedure requires openmm, openmmforcefields, and rdkit. They can all be installed using conda/mambda.
- Protein and fragment preparation
The procedure below assumes that you are working with Maestro from Schrodinger, Inc. Any molecular modeling editor could be used to do the same. Adjust as needed.
Download 2XU1.PDB from rcsb.org. Residue 424 is the inhibitor covalently bound to CYS 25.
Prepare the protein structure to pick the chain, add hydrogen atoms, pick alternate conformations, add capping groups, missing sidechains, etc. using the protein preparation wizard. Ensure that the ligand has the correct bond orders and formal charges before doing that.
Edit the residue information so that the modified cysteine and the attached ligand are one residue. This residue will retain the backbone atoms of the CYS residue. Call the residue "424" or whatever you want. Ensure that the residue number is the same as that of the original modified cysteine.
Confirm that residue 424 has standard PDB atom names up to CB, including hydrogen atoms attached to it.
Make a copy of the entry and retain only residue 424. Cap the backbone fragment of 424 with acetyl (ACE) and (N-methyl acetamide) NMA capping groups. Make sure that the capping groups are different residues than 424. Name these residues "NCP" and "CCP", respectively. The script we use below assumes these names. Assign to NCP a residue number one less than non-standard aminoacid residue, and CCP one more.
Assign standard PDB atom names to the atoms of the capping groups. "CA" for carbon atoms, "N" for nitrogen, "O" for oxygen, "HA1", "HA2", and "HA3" for the hydrogen atoms. Again, the script below assumes so.
Export the protein structure to a pdb file (such as 2xu1p.pdb). Export the fragment to sdf and pdb files (for example 424p.sdf and 424p.pdb). It is important that the sequence of atoms in 424p.sdf is the same as in 424p.pdb. In Maestro, this can be achieved by exporting the fragment in pdb format and reading it back in before exporting it to sdf.
Edit 2xu1p.pdb to place the atoms of residue 424 at their correct position based on the residue number (OpenMM's Modeller got confused in my hands because 424 was listed at the end).
At this point, you should have three files similar to these: 2xu1p.pdb, 424p.sdf, 424p.pdb.
2. OpenFF parameterization and processing
Download and run openff-lig.py
python openff-lig.py --LIGSDFinFile 424p.sdf --LIGXMLoutFile 424p.xml
which creates a ForceField xml file for the ligand fragment
Next, download and run replace-types.py
python replace-types.py --LIGXMLinFile 424p.xml --LIGPDBinFile 424p.pdb --LIGXMLoutFile 2xu1p.xml
which creates the ForceField xml file 2xu1p.xml which contains the force field information for both the ligand fragment and for the ligand covalently attached to the protein. This script scans the force field information for the ligand fragment and creates force field interaction terms between the ligand and the protein. It creates a residue template for the covalently attached ligand after deleting the caps.
3. Use custom force field
Finally, use the custom 2xu1p.xml file when creating the OpenMM System with the covalently modified protein. For example:
pdbrcpt = PDBFile("2xu1p.pdb")
forcefield = ForceField("amber14-all.xml", "amber14/tip3p.xml", "2xu1p.xml")
system=forcefield.createSystem(pdbrcpt.topology)