Scientific Summary: Cyrus NextGen Antibody structure prediction beats Schrödinger and others in a large independent benchmark

Over the last few years Cyrus has worked on a range of proteins, from enzymes to non-antibody biologics to vaccines, along with some antibody work. However, most biologic drugs are monoclonal antibodies or variants thereof. Historically, antibodies have been a small share of our work, and the field has many very strong commercial offerings. Our antibody structure prediction tool, based on previous methods in Rosetta, has been a good competitor in that space but has faced stiff competition.

In other areas of protein structure prediction using structural homology, Rosetta and Cyrus have led many independent benchmarks by academics and blind tests by industry users. In antibody structure prediction that has not been the case, given very good algorithms from Schrödinger and CCG.

Over the last two years scientists at Cyrus, led by our CSO Dr. Yifan Song, have built Cyrus NextGen Antibody, a new method for antibody structure prediction based on Rosetta algorithms historically used for general protein homology modeling but not for antibodies. Because these algorithms have performed so well for general protein structure prediction in the seven years since their introduction in 2013, many of us expected that they would perform well for antibodies once properly adapted and tuned.

In the fall of 2019 we completed this work, and in our internal testing NextGen produced more accurate structures than any other method. The gold standard, though, would be a test by a third party, judged by their own quantitative criteria, across a relatively large number of antibodies.

Now, in July 2020, we’ve completed such a test on 26 antibodies, comparing NextGen against the latest Schrödinger software and two other top-performing software packages. We were very pleased to find that NextGen outperformed all of the other methods in this rigorous test. We are publicizing these results for the first time in this scientific blog post, before making a more extensive manuscript available.

This is an important step forward for Cyrus, but more importantly it promises more accurate results in antibody efficacy and safety predictions, and ultimately a variety of better and more effective antibody drugs produced by Cyrus algorithms for a wide range of diseases. For example, better models from NextGen could enable faster development of an antibody drug, or make certain diseases susceptible to antibody drugs for the first time. Better models could also enable the invention of second-generation versions of existing drugs, such as the popular “TNF-alpha inhibitor” arthritis drugs, with fewer immunogenic side effects or less frequent injections. 



New Cyrus “NextGen” antibody software outperforms the competition in third-party test with BIOCAD

Summary

Cyrus has developed a “NextGen” antibody structure prediction tool (NextGen) based on the RosettaCM “hybridize” algorithm (1). Cyrus customized RosettaCM for antibody structures by adding an antibody-specific database and sequence parsing, and by taking antibody-specific heavy-light chain orientation into account.

NextGen was developed and benchmarked on the test set of antibody structures from the Second Antibody Modeling Assessment, AMA-II (2). In those tests, NextGen produced more accurate models of the AMA-II antibodies (measured by RMSD metrics) than all other entrants in AMA-II, including Schrödinger, CCG and older Rosetta algorithms.

To independently validate these results we worked with BIOCAD (https://biocadglobal.com) on a 26-protein test set to compare Cyrus NextGen antibody structure prediction with Schrödinger and another major vendor's software. Tests were performed as follows:

  • BIOCAD scientists ran predictions using all non-Cyrus software
  • Cyrus scientists produced models by running NextGen
  • BIOCAD scientists calculated the metrics described below on the models produced by Cyrus and other software

Cyrus NextGen was the most accurate antibody modeling tool across all of the tested methods in this independent test (Figure 1). 

A sample structure overlay of a crystal structure and a NextGen antibody model is shown in Figure 2.

Cyrus is releasing these results here directly for rapid dissemination and will release a more detailed white paper describing the methods and results once that manuscript is ready.


Figure 1. Antibody model accuracy over the BIOCAD set of 26 antibody-Fv structures using the sum of metrics described here in “BIOCAD structure similarity metrics” — lower is more accurate. 

Figure 2. Example structure prediction using NextGen antibody (PDB 4M61). Cyan = crystal structure; light brown = NextGen predicted structure.

BIOCAD Dataset

The BIOCAD antibody data set consists of 26 recently released structures of bound and unbound antibodies (Fv domains consisting of heavy (VH) and light (VL) chains), none of which Cyrus used as templates for NextGen structure prediction.

BIOCAD structure similarity metrics

BIOCAD calculated the structural variation of the predicted models for each VH-VL antibody complex relative to the crystal structures in order to compare the model quality of the top-performing algorithms. In total, there are 50 parameters for each structure, which fall into 10 categories.

  1. RMSD of all Ca per chain when aligned by chain

The first type of metric calculates the Root Mean Square Deviation (RMSD) between alpha Carbons (Ca) of the experimental and predicted structures when aligned by chain. (2 parameters per structure).

  2. RMSD of CDR N-Ca-C when aligned by chain

The second metric calculates the RMSD between the backbone atoms (Nitrogen, Ca, and Carbonyl Carbon aka N-Ca-C) in the CDR residues of the experimental and predicted structures when aligned by chain. (6 parameters per structure).

  3. RMSD of CDR Heavy Atoms when aligned by chain

The third metric calculates the RMSD between the Heavy Atoms in the CDR residues of the experimental and predicted structures when aligned by chain. (6 parameters per structure).

  4. RMSD of CDR N-Ca-C when aligned by CDR

The fourth metric calculates the RMSD between the N-Ca-C backbone atoms in the CDR residues of the experimental and predicted structures when aligned by each CDR. (6 parameters per structure).

  5. RMSD of CDR Heavy Atoms when aligned by CDR

The fifth metric calculates the RMSD between the Heavy Atoms in the CDR residues of the experimental and predicted structures when aligned by each CDR. (6 parameters per structure).
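As a rough illustration of the RMSD metrics above, the sketch below computes the RMSD between two equal-length lists of atom coordinates. It assumes the predicted atoms have already been superimposed on the experimental ones (by chain or by CDR, depending on the metric) and are listed in matching order. The superposition step is not shown, and the function name `rmsd` and the toy coordinates are illustrative only, not BIOCAD's code.

```python
import math

def rmsd(experimental, predicted):
    """RMSD between two pre-aligned, equally ordered lists of (x, y, z) points."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between matching atoms.
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(experimental, predicted))
    return math.sqrt(squared / len(experimental))

# Toy example: two Ca atoms, the second displaced by 2 Angstroms along x.
crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (5.8, 0.0, 0.0)]
print(round(rmsd(crystal, model), 3))
```

The same function would serve for Ca-only, N-Ca-C, or heavy-atom selections; only the atom lists passed in change.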

  6. Difference between the Stem Length for each loop

For each loop, the stem length is the distance between the Ca atoms of the two Framework (Fr) residues immediately before and after the loop; this length is compared between the experimental and predicted structures. (6 parameters per structure).

  7. Alpha and Tau Angles per CDR

For each loop, the Alpha Angle is calculated by measuring the flat angle created by the last 3 Ca in the CDR. The Tau Angle is calculated by measuring the dihedral angle created by the last 3 Ca in the CDR and the next Fr Ca. (12 parameters per structure).
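The geometric quantities behind the stem length and the Alpha and Tau angles can be sketched as follows. This is a minimal illustration under assumptions: the helper names are hypothetical, the flat angle is measured at the middle of the three points, and the sign convention of the dihedral is the common atan2 one; the conventions BIOCAD actually used are not stated here.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def stem_length(ca_before, ca_after):
    """Distance between the framework Ca atoms flanking a loop."""
    return math.dist(ca_before, ca_after)

def flat_angle(p0, p1, p2):
    """Angle in degrees at p1 formed by the points p0, p1, p2."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_t = _dot(u, v) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral in degrees defined by four points."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.hypot(*b1)
    b1n = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1n) * x for x in b1n))
    w = _sub(b2, tuple(_dot(b2, b1n) * x for x in b1n))
    return math.degrees(math.atan2(_dot(_cross(b1n, v), w), _dot(v, w)))

# Alpha would use the last three CDR Ca atoms; Tau adds the next Fr Ca.
print(flat_angle((1, 0, 0), (0, 0, 0), (0, 1, 0)))
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
```

Each quantity would be computed on both the crystal structure and the model, and the two values compared.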

  8. Distance between the pivot points for VH and VL

Marze et al (3) define four non-atomic positions that characterize the orientation between the Heavy and Light chains. Positions 2 and 3 are the pivot points for the Heavy and Light chains. The distance between them is calculated once per structure (c in Figure 3). (1 parameter per structure).

  9. Two flat and one dihedral Angle for the Marze positions

For the four positions defined by Marze et al, there are two flat angles (between positions 1, 2, and 3 and between positions 2, 3, and 4) and one dihedral angle across all 4 positions. These were calculated once per structure (b, d, and e in Figure 3). (3 parameters per structure).

Figure 3. From Marze et al (3). a) The orientation between the heavy and light chains is calculated by establishing 4 positions at conserved spots with respect to 4 framework sheets. b) The Packing Angle, c) Interdomain Distance, d) light Opening Angle, and e) heavy Opening Angle are calculated as shown based on these 4 points.

  10. Principal Component Analysis (PCA) Angles

Dunbar et al (4) described a PCA protocol for calculating the orientation between Heavy and Light chains. Two of those flat angles, Tilt and Twist, are calculated following that method. (2 parameters per structure). 

Scoring and Ranking

Starting from these 50 parameters, BIOCAD ran PCA to determine the correlation among parameters. They found that 99% of the variance is retained by the first 16 principal components, so 34 of the 50 dimensions can be dropped. BIOCAD therefore defined 16 new parameters, each a linear combination of the original 50. The weights of these linear combinations are the first 16 eigenvectors of the covariance matrix.

The resulting difference score is the Euclidean norm of these 16 components. If the compared structures are identical, the score is zero; the more the structures differ, the larger the score becomes. A perfect prediction algorithm would score 0, but even crystal structures of the same protein under different conditions differ slightly, so a score of exactly 0 is not achievable in practice.
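One way to write this score out, using notation introduced here purely for illustration: let x be the vector of the 50 parameters for one model compared with its crystal structure, and let v_1, ..., v_16 be the eigenvectors of the parameter covariance matrix with the 16 largest eigenvalues. Whether the parameters are centered or rescaled before PCA is not specified above, so that step is left out.

```latex
z_k = \mathbf{v}_k^{\top}\mathbf{x}, \quad k = 1,\dots,16,
\qquad
S = \lVert \mathbf{z} \rVert_2 = \sqrt{\sum_{k=1}^{16} z_k^{2}}
```

With this form, S = 0 exactly when all 16 components vanish, which is the case when the model and the crystal structure coincide.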

For each antibody, the scores are then used to rank the groups/algorithms/methods. Summing these ranks over all 26 antibodies in the BIOCAD dataset gives an overall performance ranking per group. A lower overall sum indicates better predictive performance for that group/algorithm.
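The rank-sum step can be sketched as below. This is an illustration under assumptions rather than BIOCAD's procedure: the function name `rank_sum` is hypothetical, tied scores share the average of their ranks (tie handling is not described above), and the method names and scores are made-up toy values, not benchmark results.

```python
def rank_sum(scores):
    """Sum per-antibody ranks (1 = lowest score) for each method.

    scores maps antibody id -> {method name -> difference score}.
    """
    totals = {}
    for per_method in scores.values():
        ordered = sorted(per_method.items(), key=lambda item: item[1])
        i = 0
        while i < len(ordered):
            # Find the run of methods tied with the one at position i.
            j = i
            while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
                j += 1
            average_rank = (i + j) / 2 + 1  # assumption: ties share the mean rank
            for method, _ in ordered[i:j + 1]:
                totals[method] = totals.get(method, 0.0) + average_rank
            i = j + 1
    return totals

# Toy data for two antibodies and three hypothetical methods.
toy_scores = {
    "ab1": {"method_a": 1.0, "method_b": 2.0, "method_c": 3.0},
    "ab2": {"method_a": 2.5, "method_b": 2.5, "method_c": 1.0},
}
totals = rank_sum(toy_scores)
for method in sorted(totals, key=totals.get):
    print(method, totals[method])
```

Ranking within each antibody before summing keeps one very hard target, where every method scores poorly, from dominating the overall comparison.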

References

  1. High-Resolution Comparative Modeling with RosettaCM. Song Y, DiMaio F, Yu-Ruei Wang R, Kim D, Miles C, Brunette TJ, Thompson J, Baker D. Structure. 2013 Oct;21(10):1735-1742
    https://www.sciencedirect.com/science/article/pii/S0969212613002979
  2. Second Antibody Modeling Assessment (AMA-II). Almagro JC, Teplyakov A, Luo J, Sweet RW, Kodangattil S, Hernandez-Guzman F, Gilliland G. Proteins. 2014 Aug;82(8):1552-1562.
    https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.24567
  3. Improved prediction of antibody VL-VH orientation. Marze NA, Lyskov S, Gray JJ. Protein Eng Des Sel. 2016 Oct;29(10):409-418.
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5036862/
  4. ABangle: characterising the VH-VL orientation in antibodies. Dunbar J, Fuchs A, Shi J, Deane CM. Protein Eng Des Sel. 2013 Oct;26(10):611-620.
    https://academic.oup.com/peds/article/26/10/611/1509255

Attachments

2020.07.16.-Cyrus-Antibody-Nextgen-BIOCAD.pdf

Use of Cyrus Bench cited in review article of human diseases caused by formin INF2 mutations

SEATTLE, WA July 13, 2020 — Cyrus Bench®, Cyrus Biotechnology, Inc.’s SaaS platform for protein engineering, has been used in a recent review article in Cellular and Molecular Life Sciences to correlate computationally predicted alterations in protein stability due to mutation with disease severity (Labat-de-Hoz L, et al. Cell Mol Life Sci. 2020).

The formin INF2 protein has emerged as an important target of mutations responsible for the appearance of focal segmental glomerulosclerosis (FSGS), which often leads to end-stage renal disease (ESRD), and for the concurrence of FSGS with Charcot–Marie–Tooth disease (CMT), a degenerative neurological disorder affecting peripheral nerves.

As part of a systematic and comprehensive analysis of the pathogenic INF2 missense mutations in patients, Cyrus Bench® ΔΔG was used to predict the impact on protein stability and structure of 54 mutations relative to the structure of the wild-type protein. 

The mutations causing FSGS + CMT were generally predicted to have a more destabilizing effect than those producing only FSGS. This is consistent with the fact that FSGS + CMT mutations are more harmful than those producing isolated FSGS, because they produce earlier ESRD.

About Cyrus Biotechnology

Cyrus Biotechnology, Inc. is a privately held Seattle-based biotechnology software company offering software and partnerships for protein engineering to accelerate discovery of biologics and small molecules for the Biotechnology, Pharmaceutical, Chemical, Consumer Products and Synthetic Biology industries. Cyrus methods are based on the Rosetta software from Prof. David Baker’s laboratory at the University of Washington and HHMI, the most powerful protein engineering software available. Cyrus customers include 13 of the top 20 Global Pharmaceutical firms, and the company is financed by leading investors in both Technology and Biotechnology, including Trinity Ventures, Orbimed, Springrock Ventures, Alexandria Venture Investments, and W Fund.

https://www.cyrusbio.com

Contacts
Cyrus Biotechnology, Inc.
Lucas Nivon, 206-258-6561
lucas@cyrusbio.com