Bystroff Lab Downloads
I-sites Library of sequence-structure motifs
HMMSTR-SS secondary structure prediction
HMMSTR-CM contact map prediction
HMMSTR utilities
HMMSUM pairwise alignment using a local structure-based model
MASKER molecular surface area calculator
PROTEAN torsion space molecular simulations
SCALI non-sequential structure-based alignment tool
Backbone ensemble generation for protein design in PyRosetta
InteractiveROSETTA
Click here for help compiling Fortran 90
MASKER molecular surface package
This package contains a Fortran 90 module for calculating the solvent excluded surface (SES) and the
solvation free energy of any molecule in PDB format. Atomic radii and surface tensions may be set by the
user. The solvation free energy is assumed to be a weighted sum of the atomic solvent excluded surface areas,
where each atom has a weight determined by the solvation free energy of that atom type.
New atom types may be defined by the user.
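The weighted-sum model described above can be illustrated in a few lines of Python. This is only a sketch, not part of the MASKER package: it assumes the per-atom SES areas have already been computed (for example by pdbmask), and the atom type names and surface tension values are hypothetical placeholders rather than MASKER's parameters.

```python
# Hypothetical surface tensions per atom type (illustrative units: kcal/mol/A^2).
# These values are placeholders, NOT the parameters shipped with MASKER.
SURFACE_TENSION = {"C": 0.012, "N": -0.060, "O": -0.060, "S": 0.010}


def solvation_free_energy(atoms, tensions=SURFACE_TENSION):
    """Sum per-atom SES areas, each weighted by the tension of its atom type.

    atoms: iterable of (atom_type, ses_area) pairs.
    tensions: mapping from atom type to weight; pass your own to add types.
    """
    total = 0.0
    for atom_type, area in atoms:
        if atom_type not in tensions:
            raise KeyError(f"no surface tension defined for atom type {atom_type!r}")
        total += tensions[atom_type] * area
    return total


if __name__ == "__main__":
    example = [("C", 10.0), ("N", 5.0), ("O", 2.5)]
    print(solvation_free_energy(example))
```

Passing a different `tensions` mapping mirrors the ability to define new atom types.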
The downloadable package includes a program (pdbmask) for calculating the SES and the solvation energy. It can also optionally output
a Raster3D file.
Servers are available for calculating the buried surface between all pairs of amino acids in a structure (MASKER-CM), and for calculating the locations of
buried void spaces within a structure (VOIDMASK).
The MASKER module compiles using gfortran.
Please cite:
Bystroff C. (2002). MASKER: Improved solvent excluded molecular surface area estimations using Boolean masks. Protein Eng 15, 959-965. abstract PDF
HMMSTR secondary structure prediction
This package contains the programs needed to predict secondary structure starting with a sequence profile.
The sequence profile (a vector of 20 probabilities for each residue in the sequence) can be the output
of a profile HMM such as HMMER. It may also be the output of PSI-BLAST, which uses profiles
internally, or may be generated from a multiple sequence alignment. The programs in this package, HMMSTR
and associated format converters,
will give you a probabilistic prediction of each of the six DSSP symbols: H, E, G, S, T and _.
For now, this is a bare-bones package. Note that this is a small part of the script that runs
on the HMMSTR/Rosetta server. You will need a Unix system and C++ and Fortran 90 compilers to run the package.
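To make the idea of a sequence profile concrete, here is a minimal Python sketch that builds one from a multiple sequence alignment by counting residues in each column. It is an illustration only: the function name, the uniform pseudocount, and the residue ordering are assumptions, and the sketch does not produce the file format that the HMMSTR programs expect. Real profile builders such as PSI-BLAST also apply sequence weighting, which is omitted here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (ordering is arbitrary)


def profile_from_alignment(sequences, pseudocount=1.0):
    """Return one 20-element probability vector per alignment column.

    Gaps and non-standard letters are ignored. A uniform pseudocount
    keeps every probability non-zero.
    """
    if not sequences:
        raise ValueError("alignment is empty")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")

    profile = []
    for col in range(length):
        counts = Counter(s[col].upper() for s in sequences)
        weights = [counts.get(aa, 0) + pseudocount for aa in AMINO_ACIDS]
        total = sum(weights)
        profile.append([w / total for w in weights])
    return profile


if __name__ == "__main__":
    for row in profile_from_alignment(["ACD", "ACE", "A-E"]):
        print(" ".join(f"{p:.2f}" for p in row))
```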
Please cite:
Bystroff C, Thorsson V & Baker D. (2000). HMMSTR: A hidden Markov model for local sequence-structure correlations in proteins. Journal of Molecular Biology 301, 173-90. abstract PDF
Download Compressed TAR file
HMMSTR models
HMMSTR-CM contact map prediction
This package contains the programs needed to predict the contact potential map for a protein.
As above, a sequence profile is the input.
HMMSTR-CM gives you a JPEG image and a text file showing the residues that are
most likely to come into contact (distance < 8.0 Å) in the folded structure.
This too is a bare-bones package. This script runs as part of
the HMMSTR/Rosetta server. You need a Linux/Pentium system and the gcc and pgf90 compilers to install the package.
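The 8.0 Å contact definition can be illustrated with a short Python sketch. Note that HMMSTR-CM predicts contacts from a sequence profile; the code below does not do that. It only applies the distance cutoff to known coordinates, which is how a predicted map might be compared with a solved structure. The use of one representative point per residue is an assumption, since the text above does not say which atoms define the distance.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Return a symmetric boolean matrix of residue-residue contacts.

    coords: one (x, y, z) point per residue, in angstroms.
    Entry [i][j] is True when residues i and j lie closer than `cutoff`.
    The diagonal is left False.
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts[i][j] = contacts[j][i] = True
    return contacts


if __name__ == "__main__":
    # Four points spaced 3.8 A apart along a line (made-up coordinates).
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    for row in contact_map(chain):
        print("".join("#" if c else "." for c in row))
```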
Please cite:
Shao Y & Bystroff C. (2003a). Predicting inter-residue contacts using templates and pathways. Proteins, Structure, Function and Genetics 53 Suppl 6:497-502. abstract PDF
Download Compressed TAR file
PROTEAN molecular simulations
PROTEAN is a set of Fortran subroutines for calculating the equations of motion in
torsion space for polypeptides. Torsion space is the space of all rotatable bonds.
Bond lengths and bond angles remain fixed at their ideal values in a torsion space simulation.
Simulations in torsion space are at least ten times more efficient than simulations in Cartesian
space. PROTEAN is lightly documented, but the adventurous student of molecular simulations
may discover the hidden script language by trial and error and the help command.
Requires a Fortran compiler such as PGF90.
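For readers new to torsion space, the coordinate in question is the dihedral angle about a rotatable bond. The Python sketch below computes that angle from four atom positions. It is a generic textbook calculation written for illustration, not code taken from PROTEAN, and the function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # A planar zig-zag (trans) arrangement of four made-up points.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))
```

In a torsion space simulation only angles of this kind change from step to step, which is why there are far fewer degrees of freedom than in Cartesian space.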
Download Compressed TAR file
If you find this useful, please cite:
Bystroff, C. (2001) An alternative derivation of the equations of motion in torsion
space for a branched linear chain. Protein Engineering 14, 825-828. abstract
SCALI: Non-sequential structure-based alignments.
Proteins of the same class often share a secondary structure packing arrangement but differ in how the secondary structure units are ordered in the sequence. We find that proteins that share a common core also share local sequence-structure similarities, and these can be exploited to align structures with different topologies. In this study, segments from a library of local sequence-structure alignments were assembled hierarchically, enforcing the compactness and conserved inter-residue contacts but not sequential ordering. Previous structure-based alignment methods often ignore sequence similarity, local structural equivalence, and compactness.
The new program, SCALI (Structural Core ALIgnment), can efficiently find conserved packing arrangements, even if they are non-sequentially ordered in space. SCALI alignments conserve remote sequence similarity and contain fewer alignment errors. Clustering of our pairwise non-sequential alignments shows that recurrent packing arrangements exist in topologically different structures. For example, the 3-layer sandwich domain architecture may be divided into four structural subclasses based on internal packing arrangements. These subclasses represent an intermediate level of structure classification, more general than topology but more specific than architecture as defined in CATH. A strategy is presented for developing a set of predictive hidden Markov models based on multiple SCALI alignments.
An online SCALI structure comparison server is available
here.
Download compressed TAR file
If you find this useful, please cite:
Yuan X & Bystroff, C. (2005) Non-sequential Structure-based Alignments Reveal Topology-independent
Core Packing Arrangements in Proteins. Bioinformatics 21(7):1010-1019.
abstract
PDF
HMMSUM structure-based substitution matrices
HMMSUM (HMMSTR-based SUbstitution matrices) is a new model for structural context-based amino acid substitution probabilities consisting of a set of 281 matrices, each for a different sequence-structure context. HMMSUM does not require the structure of the protein to be known. Instead, predictions of local structure are made using HMMSTR, a hidden Markov model for local structure. Alignments using the HMMSUM matrices compare favorably to alignments carried out using the BLOSUM50 matrix when validated against curated remote homolog alignments from BAliBASE. HMMSUM has been implemented using local Dynamic Programming and with the Bayesian Adaptive alignment method.
The download package contains the essential programs from HMMSTR (see above) and the HMMSTR model itself, along with Smith-Waterman local dynamic programming and Bayesian Adaptive alignment programs, modified to use the HMMSUM matrices. A server for HMMSUM alignment is under construction.
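The idea of Smith-Waterman alignment with context-dependent substitution matrices can be sketched in Python as follows. This is a simplified illustration, not the HMMSUM implementation: it assumes a single context label per query residue, a linear gap penalty, and tiny made-up matrices. How HMMSUM actually derives and combines its 281 matrices from HMMSTR predictions is not reproduced here, and all names and scores below are hypothetical.

```python
def smith_waterman_score(query, target, contexts, matrices, gap=-4):
    """Best local alignment score with a linear gap penalty.

    contexts[i] names the structural context of query[i].
    matrices[context] maps a residue pair (a, b) to a substitution score.
    Pairs missing from a matrix score -1 (an arbitrary default).
    """
    if len(contexts) != len(query):
        raise ValueError("need one context label per query residue")
    prev = [0] * (len(target) + 1)
    best = 0
    for i, a in enumerate(query, start=1):
        matrix = matrices[contexts[i - 1]]  # matrix chosen by local context
        curr = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            match = prev[j - 1] + matrix.get((a, b), -1)
            # Local alignment: never let a cell drop below zero.
            curr[j] = max(0, match, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Made-up scores for two made-up contexts.
    toy = {
        "helix": {("A", "A"): 4, ("L", "L"): 5},
        "strand": {("V", "V"): 6, ("A", "A"): 2},
    }
    print(smith_waterman_score("ALV", "ALV", ["helix", "helix", "strand"], toy))
```

With a single context and a BLOSUM-style matrix this reduces to ordinary Smith-Waterman scoring, which makes the two approaches easy to compare on the same pair of sequences.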
Download compressed TAR file
If you find this useful, please cite:
Huang, Y-M, & Bystroff, C. (2006) Improved pairwise alignment of proteins in the Twilight Zone using local structure predictions.
Bioinformatics 22(4):413-422
PDF
Ensemble generation scripts for PyRosetta
Motivation: Mutations in homologous proteins affect changes in the backbone conformation that involve a complex interplay of forces and are hard to predict. Protein design algorithms need to anticipate these backbone changes in order to accurately calculate the energy of the structure given an amino acid sequence, and they must do so without the knowledge of the final, designed sequence.
Results: We explored the ability of the Rosetta suite of protein design tools to move the backbone from its position in one structure (template) to its position in a homolog structure (target) as a function of the diversity of the backbone ensemble, the percent sequence identity, and the size of the local zone being modeled. We describe a Pareto front in the likelihood of moving the backbone toward the target as a function of ensemble diversity and zone size. The numbers presented here will be useful for homology modeling and for protein design using the piecemeal approach.
ensemblegen.py
superimp.py
NOTE: Requires PyRosetta and mpi4py
If you find this useful, please cite:
Schenkelberg, C. D., & Bystroff, C. (2016). Protein backbone ensemble generation explores the local structural space of unseen natural homologs. Bioinformatics, 32(10), 1454-1461.
InteractiveROSETTA
Summary: Modern biotechnical research is becoming increasingly reliant on computational structural modeling programs to develop novel solutions to pressing scientific questions. Rosetta is one such protein modeling suite that has already demonstrated wide applicability to a number of diverse research projects. Unfortunately, Rosetta is largely a command-line driven software package, which restricts its use among non-computational researchers. Some graphical interfaces for Rosetta exist, but typically are not as sophisticated as commercial software. Here we present InteractiveROSETTA, a graphical interface for the PyRosetta framework that presents easy-to-use controls for several of the most widely-used Rosetta protocols alongside a sophisticated selection system utilizing PyMOL as a visualizer. InteractiveROSETTA is also capable of interacting with remote servers running a standalone Rosetta install, rendering it easy to incorporate more sophisticated protocols that are not accessible in PyRosetta and/or require significant computational resources.
Availability: InteractiveROSETTA is freely available at
github.com/schenc3/InteractiveROSETTA.
InteractiveROSETTA requires Python and a separate download of PyRosetta, which is available at http://www.pyrosetta.org after obtaining a license (free for academic use).
Contact: schenc3@rpi.edu, bystrc@rpi.edu
If you find this useful, please cite:
Schenkelberg, CD & Bystroff, C. (2015) InteractiveROSETTA: A client graphical user interface for the PyRosetta and Rosetta protein modeling suite.
Bioinformatics btv492
Last updated: Tue Jan 3 16:39:41 EST 2017