ABOUT CYRUS BENCH
Cyrus Bench ™
Cyrus Bench is an easy-to-use version of the Rosetta molecular modeling and protein design software package. Rosetta is the leading protein structure prediction tool, with top performance in the CASP and CAMEO competitions. Rosetta is the first software experimentally proven to design new proteins computationally, including the first designed protein binder with antibody-like affinity.
Cyrus Bench delivers Rosetta together with the full set of associated biomolecular computation tools (e.g. BLAST) required to use Rosetta to its full potential. This scientific backbone is delivered through a new, custom-designed graphical interface for Rosetta that runs in a modern web browser, is paired with automation of a growing set of standard user procedures, and is deployed on the best cloud compute resources.
Rosetta is the leading protein modeling tool, with proven performance in protein structure prediction, protein-ligand docking, protein-protein docking, antibody modeling, and structure modeling with experimental information (X-ray, NMR, Cryo-EM). Rosetta algorithms are tested and refined on real experimental data, and have consistently delivered actionable and verifiable wet-lab results.
Protein Structure Prediction
Rosetta consistently outperforms the competition in protein homology modeling at the biennial CASP competition and the weekly CAMEO contest (Song, Structure, 2013). Rosetta was the first software package to consistently predict small protein structures “ab initio”, with no homology (Kim, Proteins, 2014).
Rosetta can be combined with experimental structural data to produce better structures, or new atomic-resolution structures that would otherwise be impossible to obtain. Low-quality X-ray data can be used to produce high-resolution structures (Adams, Ann. Rev. Biophys. 2013), and sparse NMR data can be transformed into useful structures (Lange, PNAS, 2012).
Rosetta is the world leader in computer design of proteins, and has achieved a number of “firsts”, including the first fully computationally designed and experimentally verified protein and the first protein-binding protein designed in a computer.
Protein-ligand Interaction Design
Rosetta has proven the ability to redesign natural enzymes to act on novel substrates, even in cases where traditional in vitro evolution methods have failed (Liu & Nivon, PNAS, 2014). Rosetta has also shown the ability to design nanomolar-affinity small-molecule binders “de novo” into previously inactive scaffolds (Tinberg, Nature, 2013).
Protein-protein Interaction Design
Rosetta is able to design novel target-protein-binding activity into a huge number of inactive protein scaffolds. The hemagglutinin (influenza virus) binder, HB36, was the first-in-class computationally designed nanomolar protein/protein binder (Fleishman, Science, 2011), and more recently a BHRF1-protein binder was demonstrated using Rosetta (Procko, Cell, 2014).
CORE TECHNOLOGY: ROSETTA
Rosetta has been developed over the past 16 years, beginning at the lab of Prof. David Baker at the University of Washington, and at over 30 other labs around the world. Rosetta began as a tool for protein modeling, combining knowledge-based and physical modeling approaches with a consistent focus on actionable experimental results. Over the last 10 years Rosetta has evolved into the world-leading tool for computational protein design.
Rosetta primarily uses Monte Carlo-based sampling, informed by knowledge of protein structure from the Protein Data Bank (PDB). Protein backbones are primarily modeled using “fragments” derived from the PDB with an array of powerful bioinformatics tools, for example BLAST, all of which are built in to Cyrus Bench behind the scenes. Protein sidechains are modeled using the now 30-year-old “rotamer” concept, which has been constantly refined over the years.
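The sampling idea described above can be illustrated with a minimal sketch of Metropolis Monte Carlo over discrete rotamer choices. This is a toy, not Rosetta's implementation or API: the function names, the `toy_energy` scoring callback, and the parameters (`steps`, `kT`, `seed`) are all assumptions made for illustration.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng):
    """Metropolis criterion: accept every downhill move, and accept an
    uphill move with probability exp(-delta_e / kT)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def pack_rotamers(n_positions, n_rotamers, energy_fn, steps=5000, kT=1.0, seed=0):
    """Toy Monte Carlo search over one rotamer index per position.

    energy_fn(state) -> float is a hypothetical scoring callback.
    Returns the lowest-energy state seen and its energy.
    """
    rng = random.Random(seed)
    state = [rng.randrange(n_rotamers) for _ in range(n_positions)]
    energy = energy_fn(state)
    best_state, best_energy = list(state), energy

    for _ in range(steps):
        # Propose a move: swap the rotamer at one randomly chosen position.
        pos = rng.randrange(n_positions)
        old_rotamer = state[pos]
        state[pos] = rng.randrange(n_rotamers)
        new_energy = energy_fn(state)

        if metropolis_accept(new_energy - energy, kT, rng):
            energy = new_energy
            if energy < best_energy:
                best_state, best_energy = list(state), energy
        else:
            # Rejected: restore the previous rotamer.
            state[pos] = old_rotamer

    return best_state, best_energy


def toy_energy(state):
    """Made-up energy in which rotamer 0 is the best choice at every position."""
    return float(sum(state))


best_state, best_energy = pack_rotamers(6, 4, toy_energy, steps=5000, kT=0.5)
print(best_state, best_energy)
```

A production packer would typically score only the interactions affected by each move rather than rescoring the whole state; the full rescore here keeps the sketch short.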
Rosetta uses a combination of physical and knowledge-derived potentials to score proteins during modeling and design. For example, a physics-based Coulombic potential is employed alongside a statistically derived hydrogen-bonding potential. Each protocol is highly tuned on large amounts of experimental data, often with custom scoring methods, and has been tested multiple times in real experimental contexts. For example, a method to predict mutational free energies (Kellogg et al., Proteins, 2011) is tuned on large experimental datasets before deployment in Bench.
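As a rough illustration of how a physical term and a knowledge-derived term can be combined into one score, the sketch below pairs a plain Coulomb energy with an inverse-Boltzmann statistical energy and sums them with weights. The functional forms, the weights, and every name here are assumptions chosen for illustration; they are not Rosetta's actual energy terms or parameters.

```python
import math

# Approximate conversion factor so that charges in units of e and
# distances in angstroms give energies in kcal/mol.
COULOMB_CONSTANT = 332.06

# Approximate thermal energy at room temperature, in kcal/mol.
KT_ROOM = 0.593


def coulomb_energy(q1, q2, distance, dielectric=1.0):
    """Physics-based term: Coulomb interaction between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)


def statistical_energy(p_observed, p_reference, kT=KT_ROOM):
    """Knowledge-based term via the inverse-Boltzmann relation:
    geometries seen more often than the reference get a favorable
    (negative) energy."""
    return -kT * math.log(p_observed / p_reference)


def hybrid_score(terms, weights):
    """Weighted sum of named energy terms."""
    return sum(weights[name] * value for name, value in terms.items())


# Hypothetical example: an opposite-charge pair 4 angstroms apart and a
# hydrogen-bond geometry observed three times more often than the reference.
terms = {
    "coulomb": coulomb_energy(+1.0, -1.0, 4.0, dielectric=10.0),
    "hbond_stat": statistical_energy(0.3, 0.1),
}
weights = {"coulomb": 1.0, "hbond_stat": 1.0}  # illustrative weights only
print(hybrid_score(terms, weights))
```

In practice the weights are what gets calibrated against experimental data, which is the kind of tuning the paragraph above describes.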
THE ROSETTA DIFFERENCE
In the half century since 1963, when Ramachandran published his seminal study of conformational preferences in the protein backbone, massive improvements in the ability to predict small molecule and protein conformations have been made. Initially, the bulk of work in the field of computational modeling and design was focused on developing a reliably predictive energy function that is derived from the fundamental physics of molecular interaction. The energy function is the calculation at the core of any simulation of a molecule, so a predictive function forms the basis of any useful simulation. This function was combined with approaches such as energy minimization, molecular dynamics and Monte Carlo, to 1) reproduce the rapidly expanding pools of publicly available experimental X-ray crystal structures and of increasingly-precise quantum mechanical predictions and 2) make prospective predictions (binding energies to screen lead compounds, structures to generate lead compounds or understand SAR, dynamics to understand target molecules in physical detail thus generating better leads).
As drug discovery has historically been focused on small molecules, most efforts up through the 1990s were focused on small molecule elaboration. A satisfactory energy function for the protein receptor was also necessary, but there was only a modest amount of work directed toward actual redesign of the protein receptor itself.
The relative inattention to potentials and methods that could allow redesign of proteins themselves started to change rapidly in the late 1990s. Astounding advancements in our understanding of biologics (proteins as drugs), including the identification and optimization of monoclonal antibodies, opened new paths to drug discovery. Subsequently, an increasing number of biologic drugs were identified and brought to market, many of them major blockbusters. Biologic drugs now account for roughly a third of all commercial pharmaceutical revenues (including seven of the top ten best-selling drugs in 2017) and about half of commercial research budgets, and those fractions are expected to continue to increase.
With the heightened focus on redesigned proteins as drugs, diagnostics and enzymes, there is a concomitant increased need for computational tools to aid in protein design. While the tools developed over the decades for small molecule design provide a starting point, protein design presents its own challenges that require new scoring functions and approaches. In particular, protein design requires that we be able to sample very large numbers of possible changes in both sequence and conformation (typically hundreds of thousands, and frequently millions) in order to reliably characterize potential designs.
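A quick calculation shows why the number of candidates grows so fast. The sketch below counts sequences only, assuming each designable position may take any of the 20 standard amino acids; the function name and the choice to ignore conformational states are simplifications made for illustration.

```python
def sequence_space_size(n_positions, n_amino_acids=20):
    """Number of distinct sequences when every designable position
    can independently take any of n_amino_acids residue types."""
    return n_amino_acids ** n_positions


for n in (4, 5, 10):
    print(n, sequence_space_size(n))
```

Under these assumptions, four designable positions already give 160,000 sequences and five give 3,200,000, before any backbone or sidechain conformations are considered.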
Standard approaches developed for small molecule design (e.g. minimization, molecular dynamics, free energy calculations), which are rooted in a high-resolution physics-based energy function and (typically) an all-atom, explicit-solvent representation of the system, are simply not practical for reliable protein design. They are computationally too costly to examine the number of changes needed for satisfactory search convergence. Most widely used molecular design platforms, including all of the popular commercial platforms outside of Rosetta/Bench, therefore have limited ability to carry out robust protein redesign.
Rosetta/Bench is different. The scoring function reflects not only the core physics of the traditional energy function, but also incorporates terms that have been inferred from careful statistical analysis of the Protein Data Bank (Alford, JCTC, 2017). The resulting function is thus a hybrid physical/statistical scoring function which, through careful calibration, obviates the need for explicit solvent and is predictive even for reduced atomic representations. The scoring function in Rosetta/Bench is coupled with sophisticated algorithmic methods to further improve calculation efficiency. The scoring function and algorithmic methods are integrated in a Monte Carlo approach, which allows additional improvements in efficiency through the use of finely tuned rotamer libraries and clever mutation moves. The implementation of Monte Carlo in Rosetta/Bench has been carefully coded to allow optimal parallelism.
The net result of these optimizations (scoring function, algorithmic methods, specialized Monte Carlo moves) is the ability to sample orders of magnitude more changes in sequence/conformational space than is possible using the approaches in most software packages or even laboratory screening/display technologies.
The overall approach in Rosetta/Bench has been widely validated in hundreds of publications from dozens of laboratories across many facets of protein design. Rosetta has repeatedly performed best in the biennial CASP competition, where participants are asked to predict protein structure from sequence (Song, Structure, 2013). Work performed using Rosetta has also been responsible for an impressive number of firsts in the field of protein design, including: the first design of a novel protein with a structure and sequence never observed in nature (Kuhlman, Science, 2003); the first design of a novel protein-protein interaction (Fleishman, Science, 2011); the first design of a novel small-molecule binding protein (Tinberg, Nature, 2013); the first design of a protein nanostructure (King, Nature, 2014); and the first design of a pH-dependent antibody binder (Strauch, PNAS, 2014).