====================================================================================================
                     Protein Relative Solvent Accessibility Prediction (ACCpro)
                                  25% Accessibility Threshold Only
                             Method Description & Project Documentation
====================================================================================================

Author(s)  :  Christophe Magnan (cmagnan@ics.uci.edu)
Copyright  :  Institute for Genomics and Bioinformatics
              University of California, Irvine
Modified   :  2015/07/02

====================================================================================================
                                         Method Description
====================================================================================================

ACCpro is a widely used protein relative solvent accessibility predictor. From an input protein
amino-acid sequence, ACCpro predicts the solvent accessibility state (exposed or buried) of each
amino acid in the sequence using a 25% accessibility cut-off value: a residue is predicted as
exposed when at least 25% of the residue is exposed, and as buried otherwise.

ACCpro is a 3-step predictor in which the output of each step becomes the input of the next step.
Each step is performed by an independent program delivered with ACCpro. ACCpro can thus be
considered a wrapper tool enriched with its own set of prediction models (recurrent neural
networks). A brief description of the three main components of ACCpro is provided below:

  - a sequence profile generator (PROFILpro software) to extract the protein evolutionary
    information
  - a set of 100 BRNNs (neural networks trained with 1D-BRNN software) performing the first
    ab-initio prediction directly from the sequence profiles generated by PROFILpro
  - a homology-based solvent accessibility predictor (HOMOLpro software) improving the initial
    ab-initio predictions with homology-based ones when homologs can be found in the Protein
    Data Bank

ACCpro is now delivered as part of the SCRATCH-1D suite of predictors together with SSpro, SSpro8,
ACCpro20, PROFILpro, HOMOLpro, and 1D-BRNN. ACCpro is no longer made available as a standalone
tool. SCRATCH-1D can run the SSpro, SSpro8, ACCpro, and ACCpro20 predictors on multiple sequences
using multiple cores in a single run, which significantly reduces the computation time needed to
obtain the predictions compared with running each predictor separately. Scripts to run ACCpro
separately are nevertheless provided (documentation provided below) in the 'bin' folder of the
ACCpro release included in the SCRATCH-1D package, but we highly recommend using the SCRATCH-1D
'bin' scripts directly to run all the predictors at once and save a significant amount of time.

====================================================================================================
                                       Project Documentation
====================================================================================================

This section describes the project folder and explains how to use ACCpro.

========================================= Project Folder =========================================

A brief description of the project folders is given below.

  - bin     Main scripts to run ACCpro
  - data    ACCpro prediction models (BRNNs)
  - doc     Documentation of the software
  - env     Bash profile for running ACCpro
  - lib     ACCpro scripts to predict the relative solvent accessibility
  - tmp     Temporary work folder for the software

========================================= Software Usage =========================================

ACCpro comes with only one script to run the predictor : bin/sequence_to_acc.sh

Usage : ./sequence_to_acc.sh input_fasta output_predictions [num_threads]

With:

  - input_fasta           Input protein sequences in FASTA file format
  - output_predictions    Output file for the predicted relative solvent accessibility
  - num_threads           Number of cores to use to process the dataset (default=1)

Three additional scripts are provided for specific cases only:

  - sequence_to_acc_ab.sh : returns the ACCpro ab-initio predictions only. The homology analysis
    is not performed, so the predictions are not improved by this second-stage prediction. This
    script is provided for evaluation purposes only. Usage is identical to 'sequence_to_acc.sh'.

  - profiles_to_acc.sh & profiles_to_acc_ab.sh : scripts to run ACCpro directly from the sequence
    profiles instead of the protein amino-acid sequences. These scripts are used by SCRATCH-1D to
    optimize computation time and are not expected to be used directly. Usage is similar to the
    other scripts, but the input FASTA file must be replaced by the profiles generated by
    PROFILpro.

======================================= Input Files Format =======================================

Input files must be in the standard FASTA file format. There is no limit on the number of input
sequences to process besides the amount of RAM available on the machine running the program. When
profiles are provided as input instead of protein sequences, the input file format is the same as
that of the PROFILpro output files; please refer to the PROFILpro documentation for more details.
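
As an illustration of the usage and input format described above, the sketch below writes a small
two-sequence FASTA file and passes it to 'sequence_to_acc.sh'. It is an example under stated
assumptions, not part of the ACCpro release: the file names, the two sequences, the choice of 4
threads, and the helper function 'count_fasta_records' are all hypothetical, and the script is
assumed to be run from the 'bin' folder.

```shell
#!/bin/bash
# Sketch only: file names, sequences, and the thread count are illustrative assumptions.

# Hypothetical helper: count FASTA records (each header line starts with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Write a tiny two-sequence FASTA file to a temporary folder as example input.
workdir=$(mktemp -d)
input_fasta="$workdir/example.fasta"
output_predictions="$workdir/example.acc"
cat > "$input_fasta" <<'EOF'
>example_protein_1
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
>example_protein_2
GSHMVLSPADKTNVKAAWGKVGAHAGEYGAEALERMF
EOF

echo "records: $(count_fasta_records "$input_fasta")"

# Run the predictor only if the script is present in the current folder.
if [ -x ./sequence_to_acc.sh ]; then
    ./sequence_to_acc.sh "$input_fasta" "$output_predictions" 4
else
    echo "sequence_to_acc.sh not found in the current folder; skipping the run"
fi
```

Counting the records first is only a convenience for checking that the input file contains the
expected number of sequences before a potentially long run.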
==================================== Output Files Description ====================================

Output files are in the same file format as the input files, with the protein amino-acid sequence
replaced by the predicted solvent accessibility. Headers are reported as provided in the input.

====================================================================================================
                                           Release Notes
====================================================================================================

Version 5.2 (2015)

  Author(s)    :  Christophe Magnan
  Description  :  Minor revision
  Comments     :  Repackaged for SCRATCH-1D release 1.1

Version 5.1 (2013)

  Author(s)    :  Christophe Magnan
  Description  :  Retrained predictor and updated datasets together with new code and packaging
  Comments     :  Profiles and homology analysis now performed by the PROFILpro and HOMOLpro
                  software

Versions 4.0 & 4.1 (2005)

  Author(s)    :  Jianlin Cheng
  Description  :  Retrained predictor with updated datasets
  Comments     :  No longer available

Versions < 4.0

  Author(s)    :  Gianluca Pollastri
  Description  :  Initial versions of the predictor
  Comments     :  No longer available

====================================================================================================