Membrane Abinitio
Metadata
Author: Vladimir Yarov-Yarovoy
This document was last updated on July 17, 2012 by Vladimir Yarov-Yarovoy (yarovoy@ucdavis.edu). The membrane ab initio application was developed in David Baker's (dabaker@uw.edu) group by:
- Jack Schonbrun
- Vladimir Yarov-Yarovoy
- Patrick Barth
- Bjorn Wallner
Code and Demo
The membrane ab initio executable is in src/apps/public/membrane_abinitio/membrane_abinitio2.cc. See the /rosetta/rosetta_tests/integration/tests/membrane_abinitio directory for an example membrane ab initio run and input files.
References
- Barth P, Wallner B, Baker D. (2009 Feb 3) Prediction of membrane protein structures with complex topologies using limited constraints. Proc Natl Acad Sci U S A. 106(5):1409-14.
- Barth P, Schonbrun J, Baker D. (2007 Oct 2) Toward high-resolution prediction and design of transmembrane helical protein structures. Proc Natl Acad Sci U S A. 104(40):15682-7.
- Yarov-Yarovoy V, Schonbrun J, Baker D. (2006 Mar 1) Multipass membrane protein structure prediction using Rosetta. Proteins. 62(4):1010-25.
Application purpose
This protocol was developed to predict helical membrane protein structures.
Algorithm
This protocol generates only low-resolution centroid models. It uses transmembrane region predictions from the OCTOPUS server (http://octopus.cbr.su.se/) to set the initial membrane normal and membrane center vectors and to define the membrane-specific environment (hydrophobic core, interface, polar, and water layers). For multispan helical membrane proteins, the protocol starts by embedding only two randomly selected adjacent transmembrane helical regions and then continues folding by inserting one adjacent helix at a time until all transmembrane helices are embedded in the membrane.
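The helix insertion order described above can be sketched in a few lines of Python. This is only an illustration of the ordering logic, not the Rosetta implementation: the function name `embedding_order` is hypothetical, helices are numbered from 0, and choosing uniformly between the two neighboring sides is an assumption.

```python
import random

def embedding_order(n_helices, rng=None):
    """Return a possible order in which helices (0-based) get embedded.

    Starts from two randomly chosen adjacent helices, then repeatedly adds
    a helix adjacent to the already embedded block until none are left.
    """
    if n_helices < 2:
        return list(range(n_helices))
    rng = rng or random.Random()
    first = rng.randrange(n_helices - 1)   # left member of the starting pair
    lo, hi = first, first + 1
    order = [lo, hi]
    while hi - lo + 1 < n_helices:
        candidates = []
        if lo > 0:
            candidates.append(lo - 1)      # neighbor on the N-terminal side
        if hi < n_helices - 1:
            candidates.append(hi + 1)      # neighbor on the C-terminal side
        nxt = rng.choice(candidates)       # assumption: uniform choice
        order.append(nxt)
        lo, hi = min(lo, nxt), max(hi, nxt)
    return order

print(embedding_order(4, random.Random(0)))
```

At every step the embedded helices form one contiguous block, which is the property the description above implies.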
Input Files
- Generate structure fragments:
Fragment generation requires additional programs; see the fragment file documentation for details.
Note: use only the SAM secondary structure prediction file (*.rdb); jufo and psipred predict transmembrane helical regions poorly.
Example command:
make_fragments.pl -verbose -id BRD4 BRD4_.fasta -nojufo -nopsipred
- Generate transmembrane regions (OCTOPUS) file:
The input OCTOPUS topology file is generated at http://octopus.cbr.su.se/ using the protein sequence as input.
Sample OCTOPUS topology file:
##############################################################################
OCTOPUS result file
Generated from http://octopus.cbr.su.se/ at 2008-09-18 21:06:32
Total request time: 6.69 seconds.
##############################################################################
Sequence name: BRD4
Sequence length: 123 aa.
Sequence:
PIYWARYADWLFTTPLLLLDLALLVDADQGTILALVGADGIMIGTGLVGALTKVYSYRFV
WWAISTAAMLYILYVLFFGFTSKAESMRPEVASTFKVLRNVTVVLWSAYPVVWLIGSEGA
GIV
OCTOPUS predicted topology:
oooooMMMMMMMMMMMMMMMMMMMMMiiiiMMMMMMMMMMMMMMMMMMMMMooooooMMM
MMMMMMMMMMMMMMMMMMiiiiiiiiiiiiiiiiiiiiiMMMMMMMMMMMMMMMMMMMMM
ooo
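To see how a topology string maps to transmembrane residue ranges, here is a minimal Python sketch. It is not the octopus2span.pl script; the function name `tm_spans` and the short topology string are made up for illustration, and the sketch assumes that each run of `M` characters marks one transmembrane helix.

```python
import re

def tm_spans(topology):
    """Return 1-based (start, end) residue ranges for each run of 'M'."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"M+", topology)]

# Hypothetical topology built for illustration: 5 outside residues,
# a 21-residue helix, a 4-residue inside loop, another 21-residue helix.
topology = "o" * 5 + "M" * 21 + "i" * 4 + "M" * 21 + "o" * 3
print(tm_spans(topology))  # [(6, 26), (31, 51)]
```

For a real OCTOPUS result, the topology lines would first have to be joined into a single string.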
- Convert OCTOPUS file to .span file format:
BRD4.span - transmembrane topology prediction file generated using the octopus2span.pl script as follows:
octopus2span.pl <OCTOPUS topology file>
Example command:
<path to rosetta>/rosetta/main/source/src/apps/public/membrane_abinitio/octopus2span.pl BRD4.octopus
Sample .span file:
TM region prediction for BRD4 predicted using OCTOPUS
4 123
antiparallel
n2c
6 26 6 26
31 51 31 51
58 78 58 78
97 117 97 117
The 1st line is a comment line. The 2nd line shows the number of predicted transmembrane helices (4 in the sample above) and the total number of residues (123 in the sample above). The 3rd line shows the predicted topology of transmembrane helices in the membrane (currently only antiparallel topology is implemented). The 4th line holds the n2c keyword. The 5th line and all lines below show the start and end residue numbers of each predicted transmembrane helix (the current format repeats these numbers twice).
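A small Python reader for the .span layout can make the line-by-line description concrete. This is a sketch, not part of Rosetta: the name `parse_span` is hypothetical, and it assumes one record per line, with the n2c keyword on the 4th line and the helix ranges from the 5th line on.

```python
def parse_span(text):
    """Parse .span file text into a dict (sketch, assumed layout)."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    n_helices, n_residues = (int(x) for x in lines[1].split()[:2])
    topology = lines[2]
    # Each helix line repeats start/end twice; keep the first pair only.
    helices = [tuple(int(x) for x in ln.split()[:2]) for ln in lines[4:] if ln]
    if len(helices) != n_helices:
        raise ValueError("helix count does not match the header line")
    return {"n_residues": n_residues, "topology": topology, "helices": helices}

sample = """TM region prediction for BRD4 predicted using OCTOPUS
4 123
antiparallel
n2c
6 26 6 26
31 51 31 51
58 78 58 78
97 117 97 117
"""
span = parse_span(sample)
print(span["helices"])  # [(6, 26), (31, 51), (58, 78), (97, 117)]
```

Checking the helix count against the header line is a cheap way to catch a truncated or hand-edited file before a long run.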
- Generate .lips4 file:
BRD4.lips4 - lipophilicity prediction file created using the run_lips.pl script as follows (note that blastpgp and the nr database are necessary to run the run_lips.pl script):
run_lips.pl <fasta file> <span file> <path to blastpgp> <path to nr database> <path to alignblast.pl script>
Example command:
<path to mini>/mini/src/apps/public/membrane_abinitio/run_lips.pl BRD4.fasta BRD4.span /work/bjornw/Apps/blast/bin/blastpgp /scratch/shared/genomes/nr ~bjornw/mini/src/apps/public/membrane_abinitio/alignblast.pl
Sample lips4 file:
Lipid exposed data: resnum mean-lipo lipophil entropy
6 -1.000 3.004 1.211
9 -1.000 2.268 2.137
10 -1.000 4.862 1.095
13 -1.000 1.304 1.552
16 -1.000 3.328 2.025
...
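To inspect the per-residue values in a .lips4 file, the rows can be read with a short Python sketch. The function name `parse_lips4` is hypothetical, and the sketch assumes only the layout shown in the sample: a header line followed by rows of residue number, mean-lipo, lipophil, and entropy. Any other line is skipped.

```python
def parse_lips4(text):
    """Map residue number -> (mean_lipo, lipophil, entropy)."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # header, "..." or blank line
        try:
            resnum = int(parts[0])
            values = tuple(float(x) for x in parts[1:])
        except ValueError:
            continue  # non-numeric line with four tokens
        rows[resnum] = values
    return rows

sample = """Lipid exposed data: resnum mean-lipo lipophil entropy
6 -1.000 3.004 1.211
9 -1.000 2.268 2.137
10 -1.000 4.862 1.095
"""
lips = parse_lips4(sample)
print(lips[10])  # (-1.0, 4.862, 1.095)
```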
Options
Run membrane ab initio application with the following flags:
./bin/membrane_abinitio2.linuxgccrelease
-in:file:native BRD4.pdb Native structure (optional)
(or -in:file:fasta BRD4_.fasta) Protein sequence in fasta format (required if native structure is not provided)
-in:file:spanfile BRD4.span Octopus transmembrane prediction (see above)
-in:file:lipofile BRD4.lips4 Lipophilicity prediction (see above)
-in:file:frag3 aaBRD4_03_05.200_v1_3 3-residue fragments
-in:file:frag9 aaBRD4_09_05.200_v1_3 9-residue fragments
-in:path:database ~rosetta_database Path to rosetta database
-abinitio:membrane Membrane ab initio application
-score:find_neighbors_3dgrid Use a 3D lookup table for residue neighbors calculations
-membrane:no_interpolate_Mpair Switch off the interpolation between the two layers for the Mpair term
-membrane:Menv_penalties Switch on the following penalties:
* no non-helical secondary structure fragments in predicted transmembrane helical regions in the hydrophobic layer of the membrane.
* no N- and C- termini residues of predicted transmembrane helical regions in the hydrophobic layer of the membrane.
* no transmembrane helices with orientation >45 degrees relative to the membrane normal vector.
-nstruct 1 Number of output structures
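When launching many runs, it can help to assemble the flags above programmatically. The following Python sketch only builds the argument list. The helper name `build_command` is hypothetical, and deriving the input file names from a single tag (here BRD4) is an assumption that mirrors the example names above.

```python
def build_command(executable, tag, database, nstruct=1, native=None):
    """Return the argument list for one membrane ab initio run (sketch)."""
    args = [executable]
    if native:
        args += ["-in:file:native", native]          # optional native structure
    else:
        args += ["-in:file:fasta", f"{tag}_.fasta"]  # required without a native
    args += [
        "-in:file:spanfile", f"{tag}.span",
        "-in:file:lipofile", f"{tag}.lips4",
        "-in:file:frag3", f"aa{tag}_03_05.200_v1_3",
        "-in:file:frag9", f"aa{tag}_09_05.200_v1_3",
        "-in:path:database", database,
        "-abinitio:membrane",
        "-score:find_neighbors_3dgrid",
        "-membrane:no_interpolate_Mpair",
        "-membrane:Menv_penalties",
        "-nstruct", str(nstruct),
    ]
    return args

cmd = build_command("./bin/membrane_abinitio2.linuxgccrelease", "BRD4",
                    "~rosetta_database", native="BRD4.pdb")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.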
You can use either a Monte Carlo (default) or a discrete search of the membrane normal and membrane center.
Optional settings used for Monte Carlo based membrane normal and center search protocol:
-membrane:normal_cycles (default=100) Total number of membrane normal cycles
-membrane:normal_mag (default=5) Magnitude of membrane normal angle search step size (degrees)
-membrane:center_mag (default=1) Magnitude of membrane center search step size (Angstroms)
Tip - to speed up the Monte Carlo based membrane normal and center search, use the following settings:
-membrane:normal_cycles 40
-membrane:normal_mag 15
-membrane:center_mag 2
Optional settings for the alternative, a discrete search of membrane normal and membrane center:
-membrane:center_search (default= false) - perform membrane center search within "center_max_delta" deviation (see below).
-membrane:normal_search (default= false) - perform membrane normal search with normal_start_angle, normal_delta_angle, and normal_max_angle values (see below).
-membrane:center_max_delta (default= 5 A) - magnitude of maximum membrane width deviation during membrane center search (Angstroms).
-membrane:normal_start_angle (default= 10 degrees) - magnitude of starting angle during membrane normal search (degrees).
-membrane:normal_delta_angle (default= 10 degrees) - magnitude of angle deviation during membrane normal search (degrees).
-membrane:normal_max_angle (default= 40 degrees) - magnitude of maximum angle deviation during membrane normal search (degrees).
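One way to read the three normal search angle options is as a start, a step, and an upper bound. The Python sketch below enumerates angles under that reading. It is an assumption about how the values combine, not a description of the Rosetta code, and the function name `normal_search_angles` is hypothetical.

```python
def normal_search_angles(start=10.0, delta=10.0, max_angle=40.0):
    """List angles (degrees) from start to max_angle in steps of delta."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    angles = []
    angle = start
    while angle <= max_angle + 1e-9:  # small tolerance for float steps
        angles.append(angle)
        angle += delta
    return angles

print(normal_search_angles())  # [10.0, 20.0, 30.0, 40.0]
```

With the defaults listed above, this reading gives four candidate deviations from the starting membrane normal.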
Expected Outputs
Convert the silent file output into a pdb file using the score application as follows:
~bin/score.macosgccrelease \
-in:file:native BRD4.pdb \
-in:file:centroid_rosetta_inputs/ \
-in:file:silent BRD4_silent.out \
-in:file:silent_struct_type binary \
-in:file:spanfile BRD4.span \
-in:file:lipofile BRD4.lips4 \
-membrane:no_interpolate_Mpair \
-membrane:Menv_penalties \
-score:find_neighbors_3dgrid \
-score:weights score_membrane.wts \
-out:nstruct 1 \
-database ~rosetta_database \
-out:output \
-nstruct 1
Membrane ab initio application specific score outputs in the output score file are:
Mpair membrane pairwise residue interaction energy
Menv membrane residue environment energy
Mcbeta membrane residue centroid density energy
Mlipo membrane residue lipophilicity energy
Menv_hbond membrane non-helical secondary structure in the hydrophobic layer penalty
Menv_termini membrane N- and C-termini residue in the hydrophobic layer penalty
Menv_tm_proj transmembrane helix projection penalty
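To pull these terms out of a score file, a short Python sketch is shown below. It assumes whitespace-separated lines that start with `SCORE:`, with the first such line acting as the column header; check this against your own output. The function name `membrane_scores` is hypothetical, and the numbers in the sample are invented for illustration only.

```python
MEMBRANE_TERMS = ["Mpair", "Menv", "Mcbeta", "Mlipo",
                  "Menv_hbond", "Menv_termini", "Menv_tm_proj"]

def membrane_scores(text):
    """Return one dict of membrane score terms per model (sketch)."""
    header = None
    records = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields            # first SCORE: line names the columns
            continue
        row = dict(zip(header, fields))
        records.append({t: float(row[t]) for t in MEMBRANE_TERMS if t in row})
    return records

# Illustrative values only, not real results.
sample = """SCORE: score Mpair Menv Mcbeta Mlipo Menv_hbond Menv_termini Menv_tm_proj description
SCORE: -50.0 -10.5 -20.25 3.0 -1.5 0.0 2.0 0.0 S_00000001
"""
scores = membrane_scores(sample)
print(scores[0]["Menv"])  # -20.25
```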
Post Processing
Generate at least 10,000 models and then use the Rosetta Cluster application to identify the most frequently sampled conformations. In general, at least one of the top 5-10 clusters will contain models with the lowest RMSD to the native structure.
Good luck :)!