AMP Designer - Artificial intelligence for fusion AMP design and optimization
To optimize antimicrobial peptides (AMPs) for the StarCore design, we developed AMP Designer, an open-source, artificial-intelligence-based software package for AMP design and optimization. From a library of ~0.3 million randomly synthesized peptides, we built a machine-learning regression model, AMP Forest, which serves as a highly accurate fitness landscape of peptide killing efficiency in a protein fusion (ρ = 0.987). We then used a genetic algorithm, AMP Evolver, to optimize peptides by in silico evolution on the AMP Forest landscape. Together, these components allow AMP Designer to generate better AMPs with not only high efficacy but also high expression. In addition, the open-source Evolver platform allowed us to find a better engineering method than random mutagenesis. Using the knowledge gained from AMP Designer, we created a library of ~12,000 variants from 4 seed sequences, screened them experimentally, and selected highly efficient new AMPs, whose experimental measurements can in turn be used to update AMP Forest. To our knowledge, this is the first closed loop of the build-test-design cycle, driven by big data and artificial intelligence, in iGEM history and for AMP optimization.
Overview and motivation
From our prototypes, we have already confirmed that an AMP behaves differently in StarCore than it does as a free, chemically synthesized peptide. Following the modular principle, two important features determine the antimicrobial effect of StarCore: the efficiency of the AMP itself, and how the AMP influences the expression and folding of the scaffold protein. A good method for improving AMP efficiency in the context of a fusion is therefore crucial for our study.
Antimicrobial peptides (AMPs) have very diverse mechanisms and complex sequence features, but rational design strategies for AMPs remain rather limited [1-3]. Directed evolution, a Nobel Prize-winning method for peptide and protein engineering, has also been used to engineer a few classes of peptides [4,5]. However, the technique is largely limited by the depth of the constructed mutant library and by the complexity of the fitness landscape itself. In addition, the difficulty of quantifying antimicrobial efficiency restricts the efficiency of directed evolution.
An alternative is to harness the power of computer-aided design (CAD) in synthetic biology. This semi-rational approach allows us to test peptide designs that are far from natural AMPs, that may be difficult to reach by purely experimental screening, and that are unlikely to be produced by purely rational methods.
Machine-learning classification models have been shown to succeed in AMP identification, using random forests [6,8], artificial neural networks [7], support vector machines [8], and other methods. However, no regression model has ever been built to predict the efficacy of antimicrobial peptides. One major reason is the lack of data: chemically synthesizing AMPs and acquiring experimental results is extremely expensive to scale up. In addition, no statistical model has ever described protein-fused AMPs, which is crucial for our StarCore design.
In our study, we first solve the data acquisition problem by testing AMP-scaffold fusion proteins, and then build a regression model, AMP Forest, which outputs an arbitrary-unit score of bacterial survival for a given peptide sequence. The model thus helps us select peptide sequences that are easy for E. coli to produce and that also kill bacteria efficiently. On top of the regression model, we further create an automated design package, AMP Designer, by applying a genetic algorithm.
Software architecture of AMP Designer
The final AMP Designer package contains a MATLAB file, “AMP_Designer.m”, which sets up all the parameters and starts the in silico evolution. It can be downloaded from our GitHub repository:
There are also two AMP Forest models (a faster one with 50 trees and another with 100 trees) saved as MAT-files, as well as the reference MAT-file data needed to generate the input for the Forest.
To run the software, the following environment is required. An older version of MATLAB might work but has not been tested:
AMP Forest, as its name suggests, is a random forest model. It is trained on experimental data in which E. coli cells display a random peptide library on their surface, which affects their survival rate through self-targeting. AMP Designer contains a mutation generator and a selection-iteration interface. Given user-defined parameters such as the selection strength, mutation rate, and number of generations, it can design a novel AMP from a random seed sequence or engineer a natural AMP to improve it.
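The mutation, selection, and iteration loop described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the MATLAB code shipped in the package. The function `toy_survival_score` is a hypothetical stand-in for AMP Forest (it simply rewards K and R residues), and the parameter names `keep_fraction`, `mutation_rate`, and `generations` are assumptions that loosely mirror the selection strength, mutation rate, and generation settings mentioned in the text. A lower score is treated as better because the model scores bacterial survival.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_survival_score(peptide):
    """Hypothetical stand-in for AMP Forest (NOT the real model).

    Returns a value in [0, 1]; lower means fewer surviving bacteria.
    Purely for illustration, cationic residues (K, R) lower the score.
    """
    return 1.0 - sum(aa in "KR" for aa in peptide) / len(peptide)


def mutate(peptide, mutation_rate, rng):
    """Substitute each position independently with probability mutation_rate."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < mutation_rate else aa
        for aa in peptide
    )


def evolve(seed, score_fn, generations=30, population_size=200,
           keep_fraction=0.1, mutation_rate=0.05, rng_seed=0):
    """Run a simple in silico evolution and return the best peptide found."""
    rng = random.Random(rng_seed)
    n_keep = max(1, int(keep_fraction * population_size))
    parents = [seed]
    for _ in range(generations):
        # Mutation step: each offspring is a mutated copy of a random parent.
        offspring = [mutate(rng.choice(parents), mutation_rate, rng)
                     for _ in range(population_size)]
        # Selection step: parents stay in the pool (elitism), so the best
        # score never gets worse. Ties are broken by sequence so that the
        # run is reproducible.
        pool = sorted(set(parents + offspring),
                      key=lambda p: (score_fn(p), p))
        parents = pool[:n_keep]
    return parents[0]


seed_peptide = "ACDEFGHIKL"  # arbitrary example seed, not from our library
best_peptide = evolve(seed_peptide, toy_survival_score)
print(seed_peptide, toy_survival_score(seed_peptide))
print(best_peptide, toy_survival_score(best_peptide))
```

In this sketch, a smaller `keep_fraction` corresponds to stronger selection, and replacing `toy_survival_score` with a trained regression model is the only change needed to evolve against a real fitness landscape.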
In addition, users can treat AMP Designer as a gamified platform for discovering better rational design principles for AMPs. Users can write their own mutation generator, which might be guided by a rational design rule. Within AMP Designer, different strategies for mutating AMPs can then be compared.
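As a rough illustration of how a user-written mutation generator could be plugged in and compared against random mutagenesis, consider the Python sketch below. It does not reproduce the MATLAB interface of the package. Every name in it is hypothetical, including `toy_survival_score` (a stand-in for AMP Forest that rewards K and R residues) and the "cationic-biased" rule, which is just one example of a rule a user might want to test. Because the toy score is built to favor cationic residues, the outcome shows only how the comparison works, not a real finding.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_survival_score(peptide):
    """Hypothetical stand-in for AMP Forest; lower means fewer survivors."""
    return 1.0 - sum(aa in "KR" for aa in peptide) / len(peptide)


def random_point_mutation(peptide, rng):
    """Baseline generator: replace one random position with any residue."""
    pos = rng.randrange(len(peptide))
    return peptide[:pos] + rng.choice(AMINO_ACIDS) + peptide[pos + 1:]


def cationic_biased_mutation(peptide, rng):
    """Example user rule: half of the substitutions are forced to K or R."""
    pos = rng.randrange(len(peptide))
    alphabet = "KR" if rng.random() < 0.5 else AMINO_ACIDS
    return peptide[:pos] + rng.choice(alphabet) + peptide[pos + 1:]


def run_strategy(seed, mutation_generator, score_fn, generations=20,
                 population_size=50, n_keep=5, rng_seed=0):
    """Evolve a seed peptide using the supplied mutation generator."""
    rng = random.Random(rng_seed)
    parents = [seed]
    for _ in range(generations):
        offspring = [mutation_generator(rng.choice(parents), rng)
                     for _ in range(population_size)]
        # Parents remain in the pool, so the best score never gets worse.
        pool = sorted(set(parents + offspring),
                      key=lambda p: (score_fn(p), p))
        parents = pool[:n_keep]
    return parents[0]


def compare_strategies(seed, strategies, score_fn):
    """Run every strategy with the same budget; return best peptide and score."""
    results = {}
    for name, generator in strategies.items():
        best = run_strategy(seed, generator, score_fn)
        results[name] = (best, score_fn(best))
    return results


results = compare_strategies(
    "ACDEFGHIKL",  # arbitrary example seed
    {"random": random_point_mutation,
     "cationic_biased": cationic_biased_mutation},
    toy_survival_score,
)
for name, (peptide, score) in results.items():
    print(name, peptide, score)
```

Keeping the mutation generator as a plain function argument means every strategy is evaluated with the same scoring model, population size, and number of generations, so any difference in the final score comes from the mutation rule alone.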
References
- 1. Uggerhøj, Lars Erik, et al. "Rational Design of Alpha‐Helical Antimicrobial Peptides: Do's and Don'ts." ChemBioChem 16.2 (2015): 242-253.
- 2. Deslouches, Berthony, et al. "Rational design of engineered cationic antimicrobial peptides consisting exclusively of arginine and tryptophan: WR eCAP activity against multidrug-resistant pathogens." Antimicrobial Agents and Chemotherapy (2013): AAC-02218.
- 3. Kumar, Prashant, Jayachandran N. Kizhakkedathu, and Suzana K. Straus. "Antimicrobial Peptides: Diversity, Mechanism of Action and Strategies to Improve the Activity and Biocompatibility In Vivo." Biomolecules 8.1 (2018): 4.
- 4. Peschel, Andreas, and Hans-Georg Sahl. "The co-evolution of host cationic antimicrobial peptides and microbial resistance." Nature Reviews Microbiology 4.7 (2006): 529.
- 5. Perron, Gabriel G., Michael Zasloff, and Graham Bell. "Experimental evolution of resistance to an antimicrobial peptide." Proceedings of the Royal Society of London B: Biological Sciences 273.1583 (2006): 251-256.
- 6. Maccari, Giuseppe, et al. "Antimicrobial peptides design by evolutionary multiobjective optimization." PLoS Computational Biology 9.9 (2013): e1003212.
- 7. Fjell, Christopher D., et al. "Identification of novel antibacterial peptides by chemoinformatics and machine learning." Journal of Medicinal Chemistry 52.7 (2009): 2006-2015.
- 8. Thomas, Shaini, et al. "CAMP: a useful resource for research on antimicrobial peptides." Nucleic Acids Research 38.suppl_1 (2009): D774-D780.