MultiBodyFeaturesReporters

StructureFeatures

A structure is a group of spatially organized residues, corresponding to a Pose in Rosetta. Unfortunately, Rosetta has no well-defined way to identify a Pose. For the purposes of the features database, each structure is assigned a unique struct_id. To facilitate connecting structures in the database with structures in Rosetta, the tag field is unique within a protocol (the table enforces UNIQUE (protocol_id, tag)).

    CREATE TABLE IF NOT EXISTS structures (
        struct_id INTEGER PRIMARY KEY AUTOINCREMENT,
        protocol_id INTEGER,
        tag TEXT,
        UNIQUE (protocol_id, tag),
        FOREIGN KEY (protocol_id) REFERENCES protocols (protocol_id) DEFERRABLE INITIALLY DEFERRED);
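
As an illustration of how the tag connects a Rosetta structure to its struct_id, here is a minimal sketch using Python's sqlite3 module. The reduced protocols table, the sample tags, and the helper names are hypothetical and only serve the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced stand-in for the real protocols table (assumption).
    CREATE TABLE protocols (protocol_id INTEGER PRIMARY KEY);
    CREATE TABLE IF NOT EXISTS structures (
        struct_id INTEGER PRIMARY KEY AUTOINCREMENT,
        protocol_id INTEGER,
        tag TEXT,
        UNIQUE (protocol_id, tag),
        FOREIGN KEY (protocol_id) REFERENCES protocols (protocol_id) DEFERRABLE INITIALLY DEFERRED);
""")
conn.execute("INSERT INTO protocols (protocol_id) VALUES (1)")
# Made-up tags; struct_id is assigned by the database.
conn.executemany(
    "INSERT INTO structures (protocol_id, tag) VALUES (?, ?)",
    [(1, "example_0001"), (1, "example_0002")],
)


def struct_id_for_tag(conn, protocol_id, tag):
    """Return the struct_id stored for (protocol_id, tag), or None if absent."""
    row = conn.execute(
        "SELECT struct_id FROM structures WHERE protocol_id = ? AND tag = ?",
        (protocol_id, tag),
    ).fetchone()
    return None if row is None else row[0]


def duplicate_tag_is_rejected(conn):
    """Try to reuse an existing (protocol_id, tag) pair; True if the database refuses."""
    try:
        conn.execute(
            "INSERT INTO structures (protocol_id, tag) VALUES (1, 'example_0001')"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Because uniqueness is enforced on the pair (protocol_id, tag), the same tag may appear again under a different protocol_id, so lookups should supply both values.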

PoseConformationFeatures

PoseConformationFeatures stores the conformation-level information in a Pose. Together with ProteinResidueConformationFeatures, it allows the atomic coordinates to be reconstructed. To facilitate creating poses from conformation data stored in the features database, PoseConformationFeatures has a load_into_pose method.

    CREATE TABLE IF NOT EXISTS pose_conformations (
        struct_id INTEGER PRIMARY KEY,
        annotated_sequence TEXT,
        total_residue INTEGER,
        fullatom BOOLEAN,
        FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
    CREATE TABLE IF NOT EXISTS fold_trees (
        struct_id INTEGER,
        start_res INTEGER,
        start_atom TEXT,
        stop_res INTEGER,
        stop_atom TEXT,
        label INTEGER,
        keep_stub_in_residue BOOLEAN,
        FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
    CREATE TABLE IF NOT EXISTS jumps (
        struct_id INTEGER,
        jump_id INTEGER,
        xx REAL,
        xy REAL,
        xz REAL,
        yx REAL,
        yy REAL,
        yz REAL,
        zx REAL,
        zy REAL,
        zz REAL,
        x REAL,
        y REAL,
        z REAL,
        FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
    CREATE TABLE IF NOT EXISTS chain_endings (
        struct_id INTEGER,
        end_pos INTEGER,
        FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
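
The chain_endings rows can be combined with total_residue to recover per-chain residue ranges. The sketch below assumes (this is not stated above) that end_pos holds the last residue number of every chain except the final one, and that residues are numbered from 1. The helper names and sample rows are hypothetical, and the foreign keys are omitted to keep the example self-contained.

```python
import sqlite3


def chain_ranges(total_residue, end_positions):
    """Turn chain end positions into (first_residue, last_residue) ranges.

    Assumes 1-based residue numbering and that the final chain ends at
    total_residue even when that position is not listed.
    """
    ranges = []
    start = 1
    for end in sorted(end_positions):
        ranges.append((start, end))
        start = end + 1
    if start <= total_residue:
        ranges.append((start, total_residue))
    return ranges


def chain_ranges_for_struct(conn, struct_id):
    """Read total_residue and end_pos rows for one structure and build ranges."""
    total = conn.execute(
        "SELECT total_residue FROM pose_conformations WHERE struct_id = ?",
        (struct_id,),
    ).fetchone()[0]
    ends = [
        row[0]
        for row in conn.execute(
            "SELECT end_pos FROM chain_endings WHERE struct_id = ?", (struct_id,)
        )
    ]
    return chain_ranges(total, ends)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pose_conformations (
        struct_id INTEGER PRIMARY KEY,
        annotated_sequence TEXT,
        total_residue INTEGER,
        fullatom BOOLEAN);
    CREATE TABLE chain_endings (
        struct_id INTEGER,
        end_pos INTEGER);
""")
# Made-up structure: 10 residues split into three chains.
conn.execute("INSERT INTO pose_conformations VALUES (1, 'ACDEFGHIKL', 10, 1)")
conn.executemany("INSERT INTO chain_endings VALUES (?, ?)", [(1, 4), (1, 7)])
```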

GeometricSolvationFeatures

    CREATE TABLE IF NOT EXISTS geometric_solvation (
        struct_id INTEGER,
        hbond_site_id TEXT,
        geometric_solvation_exact REAL,
        FOREIGN KEY (struct_id, hbond_site_id) REFERENCES hbond_sites(struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY(struct_id, hbond_site_id));

RadiusOfGyrationFeatures

Measure the radius of gyration for each structure. The radius of gyration is a measure of how compact a structure is, and it can be computed in O(n) time. It is the root-mean-square distance of the mass from the center of mass. The Wikipedia page on the radius of gyration has more information. See also Lobanov MY, Bogatyreva NS, Galzitskaya OV. Radius of gyration as an indicator of protein structure compactness. Molecular Biology. 2008;42(4):623-628.

    CREATE TABLE IF NOT EXISTS radius_of_gyration (
        struct_id INTEGER,
        radius_of_gyration REAL,
        FOREIGN KEY(struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY(struct_id));
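
To make the definition concrete, here is a minimal sketch of the calculation in plain Python. It assumes every point carries equal mass (for example, one representative point per residue). Which atoms the reporter actually uses is not specified here, and the function name is hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of equally weighted points from their centroid.

    coords is a sequence of (x, y, z) tuples. The centroid and the sum of
    squared distances each take one pass, so the cost is O(n).
    """
    n = len(coords)
    if n == 0:
        raise ValueError("at least one coordinate is required")
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    squared = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    )
    return math.sqrt(squared / n)
```

For two points at (1, 0, 0) and (-1, 0, 0), the centroid is the origin and each point lies at distance 1, so the function returns 1.0.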

SandwichFeatures

Function summary: Extract and analyze beta-sandwiches.

Function detail: Extract beta-sandwiches conservatively, so that alpha-helices that SCOP identifies as beta-sandwiches and beta-barrels that CATH identifies as beta-sandwiches are correctly excluded. To dump the results into PDB files, use Matt's format_converter.

Analyze properties of the beta-sandwiches, such as the phi and psi angles in core and edge strands. One of the two beta-sheets that constitute a beta-sandwich is assigned as an additional chain so that InterfaceAnalyzer can be used.

    CREATE TABLE sw_can_by_components(
        struct_id INTEGER NOT NULL,
        sw_can_by_components_PK_id INTEGER NOT NULL,
        tag TEXT NOT NULL,
        sw_can_by_sh_id INTEGER NOT NULL,
        sheet_id INTEGER,
        sheet_antiparallel INTEGER,
        sw_can_by_components_bs_id INTEGER,
        sw_can_by_components_bs_edge INTEGER,
        intra_sheet_con_id INTEGER,
        inter_sheet_con_id INTEGER,
        residue_begin INTEGER NOT NULL,
        residue_end INTEGER NOT NULL,
        FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, residue_begin) REFERENCES residues(struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, residue_end) REFERENCES residues(struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY (struct_id, sw_can_by_components_PK_id));

SecondaryStructureSegmentFeatures

Report continuous segments of secondary structure. DSSP is used to define secondary structure, but the codes are reduced to H, E, and L (all DSSP codes other than H and E). Because of this simplification, the dssp column is NOT a foreign key to the dssp_codes table.

    CREATE TABLE IF NOT EXISTS secondary_structure_segments (
        struct_id INTEGER NOT NULL,
        segment_id INTEGER NOT NULL,
        residue_begin INTEGER,
        residue_end INTEGER,
        dssp TEXT NOT NULL,
        FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, residue_begin) REFERENCES residues(struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, residue_end) REFERENCES residues(struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY (struct_id, segment_id));
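
The reduction of DSSP codes to H, E, and L and the grouping into continuous segments can be sketched as follows. This is an illustration under stated assumptions, not the reporter's implementation: residue numbers and segment ids are assumed to start at 1, and the function names are hypothetical.

```python
from itertools import groupby


def simplify_dssp(code):
    """Keep H and E as they are; map every other DSSP code to L."""
    return code if code in ("H", "E") else "L"


def secondary_structure_segments(dssp_string):
    """Return (segment_id, residue_begin, residue_end, dssp) tuples.

    Consecutive residues with the same simplified code form one segment.
    """
    segments = []
    position = 1  # assumed 1-based residue numbering
    simplified = (simplify_dssp(code) for code in dssp_string)
    for code, run in groupby(simplified):
        length = len(list(run))
        segments.append((len(segments) + 1, position, position + length - 1, code))
        position += length
    return segments
```

For the DSSP string "SSHHHHTTEEEG", the codes S, T, and G all collapse to L, giving five segments: L (1-2), H (3-6), L (7-8), E (9-11), and L (12-12).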

SmotifFeatures

Record a set of geometric parameters defined by two pieces of adjacent secondary structure. More information can be found here: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000750

    CREATE TABLE smotifs(
        struct_id INTEGER NOT NULL,
        smotif_id INTEGER NOT NULL,
        secondary_struct_segment_id_1 INTEGER NOT NULL,
        secondary_struct_segment_id_2 INTEGER NOT NULL,
        loop_segment_id INTEGER NOT NULL,
        distance REAL NOT NULL,
        hoist REAL NOT NULL,
        packing REAL NOT NULL,
        meridian REAL NOT NULL,
        FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, secondary_struct_segment_id_1) REFERENCES secondary_structure_segments(struct_id, segment_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, secondary_struct_segment_id_2) REFERENCES secondary_structure_segments(struct_id, segment_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, loop_segment_id) REFERENCES secondary_structure_segments(struct_id, segment_id) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY (struct_id, smotif_id));

StrandBundleFeatures

Function summary: Find all strands -> keep all pairs of strands -> keep all pairs of sheets

Function detail:

It generates the smallest units of beta-sandwiches, which are the input files for Tim's SEWING protocol.

After finding all beta strands in the PDB files, keep all pairs of beta strands (either parallel or anti-parallel) among them. Then keep all pairs of beta sheets (each made up of 4 beta strands). Only strands and sheets that meet the criteria specified in the options are kept. The strand_pairs and sandwich tables are created in the same schema.

    CREATE TABLE strand_pairs(
        struct_id INTEGER NOT NULL,
        strand_pairs_id INTEGER NOT NULL,
        bool_parallel INTEGER NOT NULL,
        beta_select_id_i INTEGER NOT NULL,
        beta_select_id_j INTEGER NOT NULL,
        FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, beta_select_id_i) REFERENCES beta_selected_segments(struct_id, beta_selected_segments_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, beta_select_id_j) REFERENCES beta_selected_segments(struct_id, beta_selected_segments_id) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY (struct_id, strand_pairs_id));
    CREATE TABLE sandwich(
        struct_id INTEGER NOT NULL,
        sandwich_id INTEGER NOT NULL,
        sp_id_1 INTEGER NOT NULL,
        sp_id_2 INTEGER NOT NULL,
        shortest_sc_dis REAL NOT NULL,
        FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, sp_id_1) REFERENCES strand_pairs(struct_id, strand_pairs_id) DEFERRABLE INITIALLY DEFERRED,
        FOREIGN KEY (struct_id, sp_id_2) REFERENCES strand_pairs(struct_id, strand_pairs_id) DEFERRABLE INITIALLY DEFERRED,
        PRIMARY KEY (struct_id, sandwich_id));
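
Since each sandwich row points at two strand_pairs rows, the four strand segments of a sandwich can be collected with a double join. The sketch below uses Python's sqlite3 module with made-up rows. The foreign keys to structures and beta_selected_segments are left out so the example stays self-contained, and the helper name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strand_pairs(
        struct_id INTEGER NOT NULL,
        strand_pairs_id INTEGER NOT NULL,
        bool_parallel INTEGER NOT NULL,
        beta_select_id_i INTEGER NOT NULL,
        beta_select_id_j INTEGER NOT NULL,
        PRIMARY KEY (struct_id, strand_pairs_id));
    CREATE TABLE sandwich(
        struct_id INTEGER NOT NULL,
        sandwich_id INTEGER NOT NULL,
        sp_id_1 INTEGER NOT NULL,
        sp_id_2 INTEGER NOT NULL,
        shortest_sc_dis REAL NOT NULL,
        PRIMARY KEY (struct_id, sandwich_id));
""")
# Made-up data: three strand pairs, one sandwich built from pairs 1 and 2.
conn.executemany(
    "INSERT INTO strand_pairs VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 0, 1, 2), (1, 2, 0, 3, 4), (1, 3, 1, 5, 6)],
)
conn.execute("INSERT INTO sandwich VALUES (1, 1, 1, 2, 4.5)")


def sandwich_strands(conn, struct_id, sandwich_id):
    """Return the four beta_select ids of a sandwich, or None if it does not exist."""
    row = conn.execute(
        """
        SELECT p1.beta_select_id_i, p1.beta_select_id_j,
               p2.beta_select_id_i, p2.beta_select_id_j
        FROM sandwich AS s
        JOIN strand_pairs AS p1
          ON p1.struct_id = s.struct_id AND p1.strand_pairs_id = s.sp_id_1
        JOIN strand_pairs AS p2
          ON p2.struct_id = s.struct_id AND p2.strand_pairs_id = s.sp_id_2
        WHERE s.struct_id = ? AND s.sandwich_id = ?
        """,
        (struct_id, sandwich_id),
    ).fetchone()
    return None if row is None else tuple(row)
```

Joining on struct_id as well as the pair id matters because strand_pairs_id is only unique within a structure.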