The ab initio prediction of the structure of RNA loops is an excellent stress test for RNA structure prediction algorithms. This benchmark was originally developed for the stepwise Monte Carlo (SWM) and stepwise assembly (SWA) algorithms, and since then it has developed to permit the generation of equivalent "fragment assembly of RNA" (with optional "full-atom refinement"; FARNA and FARFAR, respectively) jobs.
As a result, this benchmark may be used to gauge the performance, on common problems of broad interest, of new algorithms or new configurations thereof, or to help optimize new scoring terms or to re-weight existing scoring functions.
This repository includes python scripts for constructing benchmark sets of loops from text specifications, setting up runs (either locally or on a cluster) for a particular benchmark set, and analyzing the results (via plots or tables, both in the styles employed in a publication currently under review).
For the most part, the structure and contents of this repository is likely to remain fixed for some time: it is intended to serve as a reproducible starting point for the results obtained in the aforementioned paper. Afterwards, however, the SHA1 of that starting point will be memorialized here and further contributions will be encouraged.
What might be a valuable contribution? Maybe you've got a set of structures that you think stepwise ought to be able to recover well (or poorly); maybe you've got a fun (preferably: informative) visualization or ensemble analysis method you'd like to add to our analysis and plotting scripts.
Any new benchmarks that use the stepwise
applications -- motif rebuilding, all tetraloops, miniproteins, helix nearest neighbor rules, etc. -- should go in here. Some or all of this repo may eventually be used for Rosetta scientific tests.
ref/
holds reference benchmark data
new/
will hold new benchmark runs -- will not be checked in, however
scripts/
holds some scripts for visualizing runs
input_files/
holds input PDBs, and text files like favorites.txt
, which define motifs for benchmark, including sequence, secondary structure, reference PDBs, etc.
- Make sure you have Rosetta compiled. You also need the Rosetta tools repository, and to have set up the rna_tools subdirectory therein. Follow directions here.
- Run
source ./INSTALL
from inside thestepwise_benchmark/
directory.
(Look inside ref/
for checked in examples)
- Go inside
new/
- Create a new folder with a descriptive name like
rna_res_level_energy_rnatorsion1_synGbonus
- Create
extra_flags_benchmark.txt
that describes the interesting new flags in your run. - Create a
README_SETUP
with a command-line to setup the benchmark like:
setup_stepwise_benchmark.py ../../input_files/<favorites.txt>
and run it with source README_SETUP
. You should see subdirectories with all the target names and input files.
- If you are on Stanford's
biox3
, or another cluster running PBS, you can typesource qsubMINI
to queue up all the jobs. Ditto for if you are on Stanford'ssherlock
, or another cluster running slurm; you can then typesource sbatchMINI
. - While it is running or after it is done, run from the command-line:
easy_cat.py SWM
to concatenate models from various subdirectories for each target.
- Copy or rsync files to your local computer for visualization.
You should run regression tests before making any merge to the master
branch. (Once we begin accepting contributions, we will be the ones making merges to master
-- but you should explicitly list any regression test output in your PR.) Since these regression tests are VERY FAST (~15s), ideally you should run these regression tests more often than that.
- Go to the
test/
directory. - Run
test.sh
on master to generate reference results (in `ref``). - Check out your new branch, commit of interest, or whatever.
- Re-run
test.sh
to generate new results (innew/
). The script will report on the diff automatically.
Plotting with Python
- In the
new/
directory, you can compare two runs by running a command like:
make_plots.py rna_res_level_energy4 rna_res_level_energy_rnatorsion1
Plotting with MATLAB
- In the
scripts/matlab/
directory, you can compare two runs by running a command like:
make_plots( {'../new/rna_res_level_energy_rnatorsion1_novirtualo2prime_synGbonus_suitenessbonus','../new/rna_res_level_energy_rnatorsion1_novirtualo2prime_synGbonus_suitenessbonus_varypolarHgeom'} );