RNA 3D design: rna_design
Metadata
Author: Rhiju Das
This document updates documentation written in 2008 by Rhiju Das (rhiju [at] stanford.edu) into the latest documentation format. Last update: April 2011.
Code and Demo
The central code for the rna_design application is in apps/public/rna/rna_design.cc
with core routines run through the amazing Rosetta packer.
For a minimal demo of RNA design, see:
demos/public/RNA_Design
[in the release version]
References
Das, R., Karanicolas, J., and Baker, D. (2010), "Atomic accuracy in predicting and designing noncanonical RNA structure". Nature Methods 7:291-294. [for high resolution refinement]
(Reprint available at http://daslab.stanford.edu/pubs.html)
Purpose
This code is intended to carry out fixed backbone design of RNA sequences given an input backbone.
Algorithm
This application carries out combinatorial optimization of nucleobase type and conformation, along with 2'-OH torsions, in the context of a pre-specified RNA backbone. It is very similar to the Rosetta fixed-backbone protein design algorithm and has been used to test the new Rosetta RNA potential. It is not presently optimized for speed: the precalculation of rotamer energies is slow, and runs on RNA backbones longer than about ten nucleotides take many minutes to hours. Algorithm improvements planned for future releases are expected to speed this up considerably.
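The combinatorial search described above can be illustrated with a toy example. The sketch below is not the Rosetta packer: it assumes a simulated-annealing search over discrete states, and the rotamer set, the random energy tables, and all function names (`build_rotamers`, `precompute_energies`, `anneal`) are hypothetical placeholders. It only shows the general shape of the problem: energies are tabulated once up front, then base identity and conformer are optimized together at each position on a fixed backbone.

```python
import math
import random

BASES = "acgu"

def build_rotamers(n_positions, n_conformers=3):
    """Give every position each base type in a few hypothetical conformers."""
    return [[(base, conf) for base in BASES for conf in range(n_conformers)]
            for _ in range(n_positions)]

def precompute_energies(rotamers, rng):
    """Tabulate toy one-body and two-body energies (random placeholders)."""
    one_body = [[rng.uniform(-1.0, 1.0) for _ in pos] for pos in rotamers]
    two_body = {}
    for i in range(len(rotamers)):
        for j in range(i + 1, len(rotamers)):
            two_body[i, j] = [[rng.uniform(-1.0, 1.0) for _ in rotamers[j]]
                              for _ in rotamers[i]]
    return one_body, two_body

def total_energy(state, one_body, two_body):
    """Sum the tabulated energies for one assignment of rotamers."""
    energy = sum(one_body[i][r] for i, r in enumerate(state))
    energy += sum(table[state[i]][state[j]]
                  for (i, j), table in two_body.items())
    return energy

def anneal(rotamers, one_body, two_body, rng,
           n_steps=5000, t_start=2.0, t_end=0.05):
    """Metropolis search with a geometric cooling schedule."""
    state = [rng.randrange(len(pos)) for pos in rotamers]
    energy = total_energy(state, one_body, two_body)
    best_state, best_energy = list(state), energy
    for step in range(n_steps):
        temp = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        i = rng.randrange(len(rotamers))
        old = state[i]
        state[i] = rng.randrange(len(rotamers[i]))  # new base and/or conformer
        new_energy = total_energy(state, one_body, two_body)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            energy = new_energy
            if energy < best_energy:
                best_state, best_energy = list(state), energy
        else:
            state[i] = old  # reject the move
    return best_state, best_energy

rng = random.Random(0)
rotamers = build_rotamers(6)  # six positions, like the demo's test case
one_body, two_body = precompute_energies(rotamers, rng)
best_state, best_energy = anneal(rotamers, one_body, two_body, rng)
sequence = "".join(rotamers[i][r][0] for i, r in enumerate(best_state))
print(sequence, round(best_energy, 3))
```

Even in this toy version the two-body table grows with the square of the number of positions times the square of the rotamers per position, which gives some intuition for why energy precalculation dominates the cost on longer backbones.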
Limitations
- This method does not currently include any optimization of the backbone positions.
- This method does not yet support the design of a subset of nucleotide positions.
Input Files
Required file
A single PDB file containing the desired backbone.
How to run with this file.
rna_design.<exe> -s chunk001_uucg_RNA.pdb -nstruct 3 -ex1:level 4 -dump -score:weights rna_hires.wts -database <path to database>
This demo redesigns a 'UUCG' tetraloop closed by a single-base-pair RNA 'helix', as a small 6-nucleotide test case. For illustration, only 3 designs are output. The run takes about 15 seconds. The typical output sequence is cuuggg (the native sequence is cuucgg).
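To make the comparison between the typical design and the native sequence concrete, here is a small sketch that computes per-position agreement for the two sequences quoted above. The function name `sequence_recovery` is hypothetical, and this is not necessarily how the application computes the contents of its own sequence recovery report.

```python
def sequence_recovery(designed, native):
    """Return the fraction of positions where the design matches the native."""
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    matches = sum(d == n for d, n in zip(designed.lower(), native.lower()))
    return matches / len(native)

native = "cuucgg"
designed = "cuuggg"

recovery = sequence_recovery(designed, native)
# 1-based positions where the design differs, as (position, native, designed)
mismatches = [(i + 1, n, d)
              for i, (n, d) in enumerate(zip(native, designed)) if n != d]

print(f"recovery = {recovery:.3f}")  # 5 of 6 positions match
print(mismatches)                    # [(4, 'c', 'g')]
```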
Options
-database Path to Rosetta databases. [PathVector]
-in:file:s Name(s) of single PDB file(s) to process. [FileVector]
-nstruct Number of times to process each input PDB. [Integer]
-ex1:level <n> Use extra chi1 sub-rotamers for all residues that pass
the extrachi_cutoff. [Integer]
The integer that follows the flag specifies the pattern
for chi dihedral angle sampling.
There are currently 8 options; they all include the original
chi dihedral angle. Level 4 means EX_TWO_HALF_STEP_STDDEVS
[-1, -1/2, 0, 1/2, 1 standard deviations].
-dump Generate PDB output. Default: false. [Boolean]
-score:weights rna_hires.wts Name of weights file, default is standard. [String]
-sample_chi Sample chi (glycosidic torsion angle).
-disable_o2star_rotamers Turn off sampling of 2'-OH proton position.
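As an illustration of the level 4 sampling pattern listed for -ex1:level, the sketch below expands one rotamer's chi angle into the five sub-rotamer values at -1, -1/2, 0, 1/2 and 1 standard deviations. The mean and standard deviation used here are invented for the example and do not come from the Rosetta rotamer library; the function name is also hypothetical.

```python
def ex_two_half_step_samples(chi_mean, chi_sd):
    """Expand one chi angle into the level 4 pattern (two half-step stddevs)."""
    offsets = (-1.0, -0.5, 0.0, 0.5, 1.0)  # in units of standard deviations
    return [chi_mean + k * chi_sd for k in offsets]

# Hypothetical rotamer: chi centred at -160 degrees with a 10 degree spread.
samples = ex_two_half_step_samples(chi_mean=-160.0, chi_sd=10.0)
print(samples)  # [-170.0, -165.0, -160.0, -155.0, -150.0]
```

The original angle is always among the samples (offset 0), consistent with the note that every level includes the original chi dihedral angle.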
Tips
What do the scores mean?
The most common question we get is what the terms in the 'SCORE lines' of silent files mean. Here is a brief rundown, with more explanation in the paper cited above.
Energy interpreter for full-atom silent output:
score                    Final total score
fa_atr                   Lennard-Jones attractive
fa_rep                   Lennard-Jones repulsive
fa_intra_rep             Lennard-Jones repulsive between atoms in the same residue
lk_nonpolar              Lazaridis-Karplus non-polar solvation energy
hack_elec_rna_phos_phos  Simple electrostatic repulsion term between phosphates
hbond_sr_bb_sc           Backbone-sidechain hydrogen bonds close in primary sequence
hbond_lr_bb_sc           Backbone-sidechain hydrogen bonds distant in primary sequence
hbond_sc                 Sidechain-sidechain and sidechain-backbone hydrogen bond energy
ch_bond                  Carbon hydrogen bonds
geom_sol                 Geometric solvation energy for polar atoms
rna_torsion              RNA torsional potential
atom_pair_constraint     Atom pair distance constraint score
angle_constraint         (not in use)
rms                      RMSD
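For readers who want to inspect these terms programmatically, here is a minimal parsing sketch. It assumes the common silent-file layout in which the first line starting with `SCORE:` lists the column names and each later `SCORE:` line holds the values for one model, ending in a `description` tag. The sample lines and numbers are invented for illustration and are not real output from this application.

```python
def parse_score_lines(lines):
    """Turn 'SCORE:' lines into a list of dicts keyed by column name."""
    header = None
    records = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields          # first SCORE line names the columns
            continue
        if fields == header:
            continue                 # skip a repeated header line
        record = {}
        for name, value in zip(header, fields):
            try:
                record[name] = float(value)
            except ValueError:
                record[name] = value  # e.g. the description tag
        records.append(record)
    return records

# Invented example data, not real rna_design output.
sample = """\
SEQUENCE: cuuggg
SCORE: score fa_atr fa_rep rms description
SCORE: -12.5 -20.0 3.5 0.8 S_0001
SCORE: -11.0 -18.5 3.0 1.1 S_0002
""".splitlines()

records = parse_score_lines(sample)
best = min(records, key=lambda r: r["score"])
print(best["description"], best["score"])
```

Keeping each model as a dictionary makes it straightforward to sort by total score or to compare a single term, such as fa_rep, across designs.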
Expected Outputs
Running with the sample flags generates the following output files:
start.pdb: Idealized structure
S_000*.pdb: Output models from the RNA design run.
chunk001_uucg_RNA.sequence_recovery.txt: Simple report of the designed identity at each RNA residue position for each structure.
chunk001_uucg_RNA.pack.txt: Total score and sequence for each output model
chunk001_uucg_RNA.pack.out: Scores (with breakdown by score component) for each re-designed sequence