Working With Metalloproteins in Rosetta

By Vikram K. Mulligan, Baker Laboratory (vmullig@uw.edu). Documentation created 13 March 2014; changes checked into master branch 14 March 2014.

AT THIS TIME, AUTOMATIC METAL SETUP IS NOT CONSISTENTLY WORKING WELL WITHIN ROSETTASCRIPTS. USE AT YOUR OWN RISK UNTIL THIS IS RESOLVED.

Short summary

The -in:auto_setup_metals flag has been added to make it easy to import a PDB file containing a metalloprotein and to have Rosetta automatically detect coordinate covalent bonds to metal ions and create appropriate constraints.

Usage cases

In general, there are three broad types of task that a user might want to do with a metalloprotein:

What is Rosetta's default behavior, and why is this a problem?

By default, Rosetta is able to recognize most of the metal ions commonly found in proteins (Zn, Cu, Fe, Mg, Na, K, Ca, etc.). On import, however, Rosetta knows nothing about the covalent connectivity between the metal ion and surrounding amino acid residues. As the foldtree is set up, the metal is added by a jump, rooted to the first residue. This isn't ideal, since it means that a small change in the backbone conformation anywhere will not pull the metal along with the surrounding residues.

Because Rosetta knows nothing about metal ion covalent geometry by default, the scoring function will consider van der Waals interactions between the metal ion and surrounding side-chains, meaning that Rosetta will think that it is clashing horribly, yielding huge energy values. Relaxing the structure will push all surrounding side-chains away from the metal, and could result in the metal moving away from the rest of the structure. We probably don't want that.

Even worse, the protonation state of histidine residues (which commonly coordinate metal ions) is, by default, set without any knowledge of the metal. This often places hydrogen atoms inside the van der Waals radius of the metal ion.

How have metalloproteins been handled in the past?

Some people have created Enzdes-style geometric constraints files for each metalloprotein on which they want to work, but this is time-consuming and tedious, and requires user input that shouldn't be necessary. It also requires that each protocol know how to use an Enzdes-style constraints file, and many do not. Others have used virtual metal atoms, but this means that the scoring function ignores the charge on a metal ion, and creates other problems. Some people have even created new metal-binding residue types that have included the metal, but this creates its own set of problems. Finally, some have just worked with apo-metalloproteins (the metal-free versions of the proteins), which is valid in some cases, but is not a generally-applicable strategy.

What's the better solution?

The -in:auto_setup_metals flag has been added to handle import of metalloproteins from PDB files automatically. This flag ensures the following:

How do I control the default behavior of the -in:auto_setup_metals flag?

Three additional flags control the behavior:

How does Rosetta know which residue types can bind metals?

Metal-binding residues have the METALBINDING property in their properties list (in the params file). Additionally, metal-binding atoms are specified with the METAL_BINDING_ATOMS ... line (also in the params file). Meanwhile, metal ions have the METAL property in their properties list. Note: metal ions must have the metal atom as atom 1.

Does this work for noncanonical amino acid residues, too?

Absolutely. The noncanonical amino acid (2,2'-bipyridin-5yl)alanine (BPY) has been added to the database, and can serve as a demonstration case.

How can I confirm that Rosetta is setting the atomic connectivity properly?

Explicit CONECT records can be written on PDB export by using the -inout:connect_info_cutoff 0.0 and -inout:dump_connect_info true flags. Bonds to the metal ion will be visible when the PDB output from Rosetta is loaded in PyMOL if you use these flags.

I'd like to do other things with metals in my own protocols.

Can you ask that in the form of a question?

Where can I find the code for the functionality described here?

Good question. Utility functions are located in src/core/util/metalloproteins_util.cc. (Functions in this file duplicate some of the functionality in src/protocols/toolbox/match_enzdez_util/ with some modifications; in the future, this should probably be consolidated to avoid code duplication.) The is_metal() and is_metalbinding() methods have both been added to both the core::chemical::ResidueType and core::conformation::Residue classes, letting you query whether a residue is a metal ion or a metal-binding residue. For metal-binding residues, the get_metal_binding_atoms() method in ResidueType and Residue provides a list of indices of metal-binding atoms.

Caveats