Running Rosetta with membrane proteins requires additional data from 3rd party apps for describing the topology and embedding of each membrane chain. The following inputs are required for running framework protocols:
- Membrane Spanning Topology Data
- Membrane lipophobicity data (recommended, but not required)
- Membrane embedding data - initial parameters
All required parsing scripts discussed on this page are located in tools/membrane_tools
and detailed on the RosettaMembrane: Scripts and Tools page.
Membrane Spanning Topology Data
Spanning topology data describes the position of individual residues in a membrane protein with respect to different membrane regions (intracellular, membrane spanning, and extracellular region). This data is traditionally generated with OCTOPUS and then converted to the Rosetta span file format.
To generate a spanfile, go to OCTOPUS and generate a topology file from the FASTA sequence. Below is an example format of the resulting file:
#####################################################################
OCTOPUS result file
Generated from http://octopus.cbr.su.se/ at 2008-09-18 21:06:32
Total request time: 6.69 seconds.
#####################################################################
Sequence name: BRD4
Sequence length: 123 aa.
Sequence:
PIYWARYADWLFTTPLLLLDLALLVDADQGTILALVGADGIMIGTGLVGALTKVYSYRFV
WWAISTAAMLYILYVLFFGFTSKAESMRPEVASTFKVLRNVTVVLWSAYPVVWLIGSEGA
GIV
OCTOPUS predicted topology:
oooooMMMMMMMMMMMMMMMMMMMMMiiiiMMMMMMMMMMMMMMMMMMMMMooooooMMM
MMMMMMMMMMMMMMMMMMiiiiiiiiiiiiiiiiiiiiiMMMMMMMMMMMMMMMMMMMMM
ooo
tools/membrane_tools
Example usage of this script is below:
./octopus2span.pl BRD4.oct > BRD4.span
Membrane Lipophobicity Data
Lipophobicity data for membrane proteins describes the lipid accessibility of particular positions with respect to their position in the membrane. This data is used as part of the current membrane scoring function.
To generate a lipid accessibility data file (.lips or .lips4), you will need to have already generated a .span file above. Running the script requires the following dependencies:
- blast (not blast+)
- nr blast databases
- alignblast.pl script (also in
tools/membrane_tools
) - fasta sequence
- .span file
Below is example usage of the script:
./run_lips.pl <myseq.fasta> <mytopo.span> /path/to/blastpgp /path/to/nr alignblast.pl
Embedding Definition
In the membrane protein framework, each chain of a pose also requires a specific membrane protein embedding definition. This definition defines a normal vector and center position of the chain with respect to Rosetta's defined implicit membrane (normal vec , center position (0, 0, 0), thickness = 30A).
For each chain, you will need to generate an embedding definition file. This file will either (a) specify the final embedding parameters used or (b) specify initial parameters for a more detailed calculation of the embedding. How these initial parameters are used is controlled by the following method tags in the normal and center fields.
- default - Uses a default setting of (0,0,0) for center and (0,0,1) for normal as final starting parameters
- from_topology - Calculates normal and center from membrane spanning topology using specified xyz as final from whatever coordinates are specified in the .embed file
- user_defined - Accepts specified coordinates as final coordinates
- from_search - Search for parameter using membrane MCM search method and coordinates specified in the .embed file as starting coordinates for this calculation. Note: In previous, this behavior was controlled by the command line and is is no longer supported
Note - you do not have to specify the same tag for both normal and center. The code is flexible enough to use different calculation methods for different parameters. Good examples of this can be found in the unit test for Embedding Factory.
The depth parameter (depth of chain with respect to membrane) is completely user controlled.
Below is an example .embed file format.
POSE <Description>
normal x y z <tag>
center x y z <tag>
depth <value>
Putting it All Together (Setting up a Job)
The RosettaMembrane framework requires that one of these data files be generated for each chain specified. Therefore, the resource definition file (as part of the ResourceManager) is used as a means of organizing these inputs. For more about setting up resource description files, please refer to the ResourceManager documentation (note - still internal dox)
Below is an example Resource Definition file for a pose with 2 chains.
<JD2ResourceManagerJobInputter>
<! Locators for Membrane Resources >
<ResourceLocators>
<FileSystemResourceLocator tag="spanning_locator" base_path="/path/to/input/spanning_locator"/>
<FileSystemResourceLocator tag="embedding_locator" base_path="/path/to/input/embedding_defs"/>
<FileSystemResourceLocator tag="startstruct_locator base_path="/path/to/input/pdbs"/>
<FileSystemResourceLocator tag="lipsexp_locator" base_path="/path/to/input/lips_exp"/>
</ResourceLocators>
<! Options for Resources >
<ResourceOptions>
<EmbedSearchParamOptions tag="embedB_options"
normal_search="true"
normal_start_angle="10"
normal_max_angle="10"
normal_delta_angle="10"
center_search="true"
center_max_delta="10"
/>
<PoseFromPDBOptions
tag="pdb1"
ignore_unrecognized_res=1
/>
</ResourceOptions>
<! Specifying the Actual Resources with their Options >
<Resources>
<Group 1 - Chain A (from topology)>
<PoseFromPDB tag="BRD4_A" locator_tag="startstruct_locator" locator_id="BRD4_A.pdb" resource_options="pdb1">
<EmbedConfig tag="BRD4_embedA" locator_tag="embedding_locator" locator_id="BRD4_A.embed">
<SpanFile tag="BRD4_spanA" locator_tag="spanning_locator" locator_id="BRD4_A.span">
<LipoFile tag="BRD4_lipsA" locator_tag="lipsexp_locator" locator_id="BRD4_A.lips">
<Group 2 - Chain B (from search)>
<PoseFromPDB tag="BRD4_B" locator_tag="startstruct_locator" locator_id="BRD4_B.pdb"
<EmbedConfig tag="BRD4_embedB" locator_tag="embedding_locator" locator_id="BRD4_B.embed" resource_options="embedB_options">
<SpanFile tag="BRD4_spanB" locator_tag="spanning_locator" locator_id="BRD4_B.span">
<LipoFile tag="BRD4_lipsB" locator_tag="lipsexp_locator" locator_id="BRD4_B.lips">
</Resources>
<! Map resource tags to resource descriptions >
<Jobs>
<Job name="BRD4">
<Data desc="BRD4_A" resource_tag="BRD4_A"/>
<Data desc="BRD4_embedA" resource_tag="BRD4_embedA" />
<Data desc="BRD4_spanA" resource_tag="BRD4_spanA" />
<Data desc="BRD4_lipsA" resource_tag="BRD4_lipsA" />
<Data desc="BRD4_B" resource_tag="BRD4_B"/>
<Data desc="BRD4_embedB" resource_tag="BRD4_embedB" />
<Data desc="BRD4_spanB" resource_tag="BRD4_spanB" />
<Data desc="BRD4_lipsB" resource_tag="BRD4_lipsB" />
</Job>
</Jobs>
</JD2ResourceManagerJobInputter>