<!-- --- title: Database Io -->
Database Support in Rosetta

Database Drivers

Rosetta supports SQLite3, MySQL, and PostgreSQL database backends. This page describes the backends, how to get started using them, and what has already been done. The SQLite3 backend is tested extensively in the integration_tests with every commit, and PostgreSQL and MySQL support is tested through the BuildBot framework (PostgreSQL, MySQL).

Database connection information can be specified with these RosettaScripts and/or command-line options.

FeaturesReporter Framework

For many applications, one would like to store and retrieve information about a set of structures. In a protein interface design project, for example, the set might consist of all the structures from various rounds of prediction, and it may be relevant to store the atomic coordinates, the similarity of each structure to the native, and the predicted binding energy. We have developed a modular database schema in which each FeaturesReporter is responsible for a set of tables in the database. Using a particular schema, the features for a set of structures are stored as a batch in the database.
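The modular idea above can be illustrated with a minimal Python sketch. This is not Rosetta code: the class names, the `demo_scores` and `demo_sizes` tables, and the dictionary-based "structure" are all hypothetical stand-ins, chosen only to show how each reporter can own its tables while a batch of structures is written in one pass.

```python
import sqlite3


class Reporter:
    """Hypothetical base class: each reporter owns the tables in `schema`."""

    schema = ""

    def write_schema(self, conn):
        # Create this reporter's tables if they do not exist yet.
        conn.executescript(self.schema)

    def report(self, conn, batch_id, struct_id, structure):
        raise NotImplementedError


class ScoreReporter(Reporter):
    # Hypothetical table; not part of the actual Rosetta schema.
    schema = (
        "CREATE TABLE IF NOT EXISTS demo_scores "
        "(batch_id INTEGER, struct_id INTEGER, score REAL);"
    )

    def report(self, conn, batch_id, struct_id, structure):
        conn.execute(
            "INSERT INTO demo_scores VALUES (?, ?, ?)",
            (batch_id, struct_id, structure["score"]),
        )


class SizeReporter(Reporter):
    # Hypothetical table; not part of the actual Rosetta schema.
    schema = (
        "CREATE TABLE IF NOT EXISTS demo_sizes "
        "(batch_id INTEGER, struct_id INTEGER, n_residues INTEGER);"
    )

    def report(self, conn, batch_id, struct_id, structure):
        conn.execute(
            "INSERT INTO demo_sizes VALUES (?, ?, ?)",
            (batch_id, struct_id, structure["n_residues"]),
        )


def report_batch(conn, reporters, batch_id, structures):
    """Write features for every structure in the batch using every reporter."""
    for reporter in reporters:
        reporter.write_schema(conn)
    for struct_id, structure in enumerate(structures, start=1):
        for reporter in reporters:
            reporter.report(conn, batch_id, struct_id, structure)
    conn.commit()
```

Adding a new kind of feature in this sketch means adding one reporter class with its own tables; the existing reporters and the batch loop are left unchanged.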

Job Distribution and Database IO

Structures can be read from, and written to, a relational database when using the JD2 job distributor. Advantages over PDB files or silent files include:

Database IO is implemented simply as a fixed set of FeaturesReporters:

Possible issues for cluster based jobs:

Pose IO

Rosetta can read poses from a database and write poses to a database. This behavior is supported in any application that uses the JD2 job distributor. The DatabaseJobOutputter is compatible with both serial and parallel jobs; it automatically detects non-ideal poses and handles their output correctly.

Multiple executions of Rosetta can be stored in the same database. Each execution will have a separate protocol_id. If -out:database_protocol_id is not specified, the protocol_id field auto-increments. The Rosetta SVN version, command line, XML script (if available) and flags are stored in the database.
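The auto-increment behaviour can be pictured with a small SQLite sketch in Python. The `demo_protocols` table and the `record_execution` helper are hypothetical and much simpler than whatever Rosetta actually stores; the sketch only shows how an explicitly supplied identifier and an automatically assigned one can coexist in the same table.

```python
import sqlite3


def record_execution(conn, command_line, protocol_id=None):
    """Store one execution; return the protocol_id that was used.

    Hypothetical toy schema. When protocol_id is None, SQLite assigns the
    next value itself, mimicking the auto-increment described above.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS demo_protocols ("
        "protocol_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "command_line TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO demo_protocols (protocol_id, command_line) VALUES (?, ?)",
        (protocol_id, command_line),
    )
    conn.commit()
    # For an INTEGER PRIMARY KEY column, lastrowid is the key of the new row.
    return cur.lastrowid
```

Calling `record_execution(conn, "run A")` twice on a fresh database yields two distinct identifiers, while passing `protocol_id=10` stores the row under that identifier instead.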

Extracting Poses

Poses can be extracted from the database into PDB or silent files using the score_jd2 application. MySQL and SQLite3 interfaces are also available for Perl, Python, R, and other scripting languages, making it possible to parse and analyze the data directly without extracting it. In code, poses can be extracted from a database using the protocols::features::ProteinSilentReport::load_pose() function.
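As a rough illustration of analyzing the data directly from a scripting language, the Python sketch below uses the standard `sqlite3` module to list the tables in a SQLite3 database and count their rows. It makes no assumption about which tables Rosetta creates; it reads the table names from SQLite's own `sqlite_master` catalog. The helper name `table_row_counts` is introduced here for illustration.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Table names cannot be bound as SQL parameters, so they are quoted
    # as identifiers; they come from the catalog, not from user input.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }
```

To try it on a real output file, open the database with `sqlite3.connect(path)`, where `path` is the location of your own database file, and pass the connection to `table_row_counts`.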

Database Filters

Database filters restrict output to poses that meet some criteria, evaluated against the poses already in the database. Database filters are invoked from the command line with the following syntax:

-out:database_filter <database filter name> <list of database filter options>

At present, four database filters are implemented: