Revision as of 06:43, 17 October 2018
Overview
The aim of our project is to develop a better post-transcriptional regulation strategy and to apply it to monocistronic and polycistronic systems. We build models to guide the design and to predict how it behaves.
miniToe: a better post-transcriptional regulation strategy
To achieve a better post-transcriptional regulation strategy, we designed a system composed of an RNA endoribonuclease (Csy4) and an RNA module named miniToe. We model the dynamics of the miniToe system and show how different regulation levels can be reached. ODEs and molecular dynamics are our two main tools: the ODE model describes the reaction curve, and molecular dynamics helps explain the experimental data.
The questions below walk through the model and the miniToe system. We discuss the structure of Csy4 at different stages (Q1), the structure of the miniToe system at different stages (Q2), the reaction order and key points of the miniToe system (Q3), the simulation of the ODE model (Q4), the key quantities from molecular dynamics (Q5), and the way to reach different regulation levels (Q6).
Q1: What is the structure of Csy4?
Fig.1 The structure of Csy4 without hairpin bound (PDB ID: 4AL5, resolution 2.0 Å)
Fig.2 The structure of Csy4 with hairpin bound (PDB ID: 4AL5, resolution 2.0 Å)
Q2: What is the structure of miniToe?
1. A cis-repressive RNA (crRNA), the critical part of the miniToe structure, which suppresses translation by pairing with the RBS.
2. A Csy4 site that links the cis-repressive RNA to the RBS and is specifically cleaved by Csy4.
3. A CRISPR endoribonuclease, Csy4.
Fig.3 shows the secondary structure of miniToe.
Fig.3 The structure of miniToe.
Fig.4 The precursor complex of wild-type Csy4
Fig.5 The product complex of wild-type Csy4
Q3: What are the reaction order and key points of the miniToe system?
Fig.6 The working process of miniToe system
(1) The miniToe structure is produced and accumulates.
(2) Csy4 is produced upon IPTG induction.
(3) Csy4 binds to the miniToe structure and forms the Csy4-miniToe complex.
(4) Csy4 cleaves the specific site and divides the miniToe structure into two parts: the Csy4-crRNA complex and the mRNA of sfGFP.
(5) sfGFP is produced.
From the description above, we identify four key questions that determine whether our system works:
(1) Does Csy4 dock correctly with the miniToe structure (hairpin)?
(2) How well does Csy4 bind the miniToe structure (hairpin)?
(3) How well does Csy4 cleave the miniToe structure (hairpin)?
(4) Does the cis-repressive RNA release from the RBS?
Q4: What does the ODE model simulation show?
According to this working process, we built an ODE model and simulated the miniToe system for 30 h. The result can be seen in Fig.7.
Fig.7 The dynamics of sfGFP by model prediction
Fig.8 The comparison between experimental data and simulation data
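The five reaction steps in Q3 can be written as mass-action ODEs. The block below is only a minimal sketch of that idea: the species names, the rate constants in `PARAMS`, the zero initial conditions and the fixed-step Runge-Kutta integrator are all illustrative assumptions, not the equations or fitted parameters behind Fig.7 and Fig.8.

```python
# Hypothetical mass-action sketch of the miniToe reaction order (steps 1-5 in Q3).
# State order: R = miniToe RNA, C = Csy4, X = Csy4-miniToe complex,
#              M = released sfGFP mRNA, G = sfGFP protein.

PARAMS = {  # all values are placeholders in arbitrary units per hour
    "a_R": 1.0,       # production of the miniToe RNA
    "a_C": 1.0,       # production of Csy4 (IPTG assumed present from t = 0)
    "k_bind": 1.0,    # Csy4 + miniToe -> complex
    "k_cleave": 1.0,  # complex -> Csy4-crRNA + sfGFP mRNA
    "k_tl": 2.0,      # translation of the released mRNA
    "d_R": 0.5, "d_C": 0.5, "d_M": 0.5, "d_G": 0.1,  # degradation/dilution
}


def minitoe_rhs(state, p):
    """Right-hand side of the sketch model."""
    R, C, X, M, G = state
    bind = p["k_bind"] * R * C
    cleave = p["k_cleave"] * X
    return [
        p["a_R"] - bind - p["d_R"] * R,   # step 1: miniToe accumulates
        p["a_C"] - bind - p["d_C"] * C,   # step 2: Csy4 is produced
        bind - cleave,                    # step 3: complex formation
        cleave - p["d_M"] * M,            # step 4: cleavage frees the mRNA
        p["k_tl"] * M - p["d_G"] * G,     # step 5: sfGFP is produced
    ]


def rk4_step(f, state, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state, p)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)], p)
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)], p)
    k4 = f([s + dt * k for s, k in zip(state, k3)], p)
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(p, hours=30.0, dt=0.01):
    """Integrate from an empty cell and return a list of (time, state) pairs."""
    state = [0.0] * 5
    trajectory = [(0.0, state)]
    for i in range(1, int(round(hours / dt)) + 1):
        state = rk4_step(minitoe_rhs, state, dt, p)
        trajectory.append((i * dt, state))
    return trajectory


if __name__ == "__main__":
    t_end, final = simulate(PARAMS)[-1]
    print(f"sfGFP at t = {t_end:.0f} h: {final[4]:.3f} (arbitrary units)")
```

In this sketch Csy4 stays bound to the crRNA after cleavage, so it is not recycled. Re-running `simulate` with a different `k_cleave` is one way to explore how the cleavage rate shapes the sfGFP curve.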
Q5: What do the molecular dynamics simulations show?
For the first key point, we use an interaction matrix to describe the molecular docking; the heatmap of the matrix can be seen in Fig.9.
Fig.9 The heatmap of interaction matrix for wild-type Csy4.
For the second key point, we calculated the binding free energy of the Csy4/RNA complex. The binding free energy for wild-type Csy4 is .
For the third key point, we track the Ser151(OG)-G20(N2’) distance, a key interaction in the active site of Csy4, to describe the cleavage ability. The distance curve for wild-type Csy4 can be seen in Fig.10.
Fig.10. The distance of Ser151(OG)-G20(N2’) in wild-type Csy4
For the last key point, we use the RMSD of the product to describe the release of crRNA. The result can be seen in Fig.11.
Fig.11. The RMSD of the product complex in wild-type Csy4
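The distance curve and the RMSD curve used for the third and fourth key points are both simple per-frame quantities. The following is a minimal sketch of how they can be computed from trajectory coordinates. It assumes each frame is a list of (x, y, z) tuples that have already been aligned to the reference; the function names and the toy frames are hypothetical, and no particular molecular dynamics package is implied.

```python
import math


def distance_trace(frames, i, j):
    """Distance between atoms i and j in every frame (e.g. an OG-N2' pair)."""
    return [math.dist(frame[i], frame[j]) for frame in frames]


def rmsd(frame, reference):
    """Root-mean-square deviation between two coordinate sets.

    No superposition is performed here: the frames are assumed to be
    pre-aligned to the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(frame, reference))
    return math.sqrt(squared / len(frame))


def rmsd_trace(frames, reference=None):
    """RMSD of every frame against a reference (the first frame by default)."""
    ref = frames[0] if reference is None else reference
    return [rmsd(frame, ref) for frame in frames]


if __name__ == "__main__":
    # Two toy frames with two atoms each; the coordinates are made up.
    toy = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
           [(0.0, 0.0, 0.0), (6.0, 8.0, 0.0)]]
    print(distance_trace(toy, 0, 1))
    print(rmsd_trace(toy))
```

Both functions return one value per frame, which is the discrete curve that is later compared between mutant and wild type.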
Q6: How can different regulation levels be achieved?
Fig.12 The curve of sfGFP with the changing cleavage rate
miniToe Family: the way to different regulation levels
In the miniToe family, both the protein and the hairpin are mutated to reach different regulation levels, and the model helps to design the mutants. Because Csy4 and the hairpin have different characteristics, we use different design strategies: molecular dynamics plays the main role in designing protein mutants, while bioinformatics and machine learning help us choose hairpin mutants.
We discuss the method for designing Csy4 mutants (Q7), how the method works and what it yields (Q8), how hairpin design differs from Csy4 design and how we handle it (Q9), and the results of the mutant design (Q10).
Q7: How to design the Csy4 mutants?
The wet lab members gave us four important sites in Csy4, Gln104, Tyr176, Phe155 and His29, which play important roles in binding and cleavage (Fig.13). With 20 amino acids to choose from at each site, mutating one site at a time gives 80 candidates to explore, 76 of which differ from the wild type.
Fig.13 The four important sites in Csy4
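The size of the single-site search space can be checked by enumerating it. This is a small illustrative sketch: the one-letter codes for the four sites follow the mutant names used later (Q104A, Y176F, F155A, H29A), while the function name and the option to drop the wild-type residue are our own assumptions.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# The four sites named by the wet lab: position -> wild-type residue.
SITES = {104: "Q", 176: "Y", 155: "F", 29: "H"}


def single_site_mutants(sites, exclude_wild_type=True):
    """List single-site substitutions such as 'Q104A'.

    With exclude_wild_type=False every site gets all 20 residues (4 x 20 = 80);
    with exclude_wild_type=True the unchanged residue is skipped (4 x 19 = 76).
    """
    names = []
    for position, wild_type in sites.items():
        for residue in AMINO_ACIDS:
            if exclude_wild_type and residue == wild_type:
                continue
            names.append(f"{wild_type}{position}{residue}")
    return names


if __name__ == "__main__":
    print(len(single_site_mutants(SITES, exclude_wild_type=False)))
    print(len(single_site_mutants(SITES)))
```

Either way the space is small enough that each candidate can be compared with the wild type individually.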
In Q3, we pointed out four key points that directly influence how the miniToe system works. In Q5, molecular dynamics gave us four quantities that describe these key points.
We now lay out the logic for using these three pieces of information to design the Csy4 mutants:
Experiments show that the wild-type Csy4 works well with the miniToe system, which means that none of the four key problems arises in the wild type: it docks correctly with the miniToe structure, it binds and cleaves the structure well, and the crRNA is finally released from the RBS. We therefore take the wild-type Csy4 as the standard and check each Csy4 mutant on the four key points by comparing it with the wild type.
Each of the four key points in Q3 now has a mathematical form from Q5. What matters most is how to compare a mutant with the wild-type Csy4.
Q8: How does the design method work?
In Q7, we laid out the full logic for designing the Csy4 mutants. Here we give the method for comparing mutants with the wild-type Csy4 on the four key points of the miniToe system.
We now have four mathematical forms: two curves, a numerical value, and a matrix. They fall into two kinds of data, matrices and numerical values. The interaction matrix and the two curves can all be treated as matrices, because each curve is a discrete series, while the binding free energy is a single numerical value.
For matrices, we use the Euclidean distance to describe the difference between two matrices $A$ and $B$ of the same shape:
$$d(A, B) = \sqrt{\sum_{i,j} \left(A_{ij} - B_{ij}\right)^{2}}$$
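As an illustration of the matrix comparison, the sketch below computes the Euclidean distance between two equally shaped matrices given as lists of rows, and treats a discrete curve as a one-row matrix. The function names are hypothetical and the numbers are toy values, not simulation output.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared element-wise differences."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b)
                         for x, y in zip(ra, rb)))


def curve_distance(curve_a, curve_b):
    """A discrete curve is compared as a matrix with a single row."""
    return euclidean_distance([curve_a], [curve_b])


if __name__ == "__main__":
    wild_type = [[0.0, 0.0], [0.0, 0.0]]   # toy interaction matrix
    mutant = [[3.0, 0.0], [0.0, 4.0]]      # toy interaction matrix
    print(euclidean_distance(wild_type, mutant))
    print(curve_distance([0.0, 0.0], [3.0, 4.0]))
```

A distance of zero means the mutant reproduces the wild-type data exactly; larger values mean a larger deviation from the standard.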
For the binding free energy, we used the formula below to calculate the difference between the wild type and the mutants:
According to the description above, we define four values, D1 to D4, one for each key point, to compare a mutant with the wild type.
Using the four values, we designed four Csy4 mutants, listed in the following table together with the wild type.
Csy4 | D1 | D2 | D3 | D4 |
---|---|---|---|---|
WT | 0 | 0 | 0 | 0 |
Q104A | 0.483 | 2483 | 9.48 | 30.82 |
Y176F | 0.592 | -382 | 11.61 | 40.62 |
F155A | 0.233 | -1627 | 13.41 | 35.71 |
H29A | 0.173 | 833 | 15.29 | 316.22 |
Q9: How to design the hairpin mutant?
The design of hairpin mutants is quite different from that of Csy4 mutants because the library is much larger. Excluding the two cleavage-site nucleotides, G20 and C21, there are 4^20 (about 1.1 × 10^12) possible mutants.
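To make the size of the hairpin library concrete, the sketch below counts the variants and walks through them lazily instead of storing them. It only enumerates the 20 free positions; the fixed G20 and C21 are left out, and the base alphabet `ACGU` and the function names are illustrative assumptions.

```python
from itertools import islice, product

BASES = "ACGU"          # RNA alphabet
FREE_POSITIONS = 20     # every hairpin position except G20 and C21


def library_size(n_free=FREE_POSITIONS):
    """Number of sequences when each free position can be any of four bases."""
    return len(BASES) ** n_free


def iter_variants(n_free=FREE_POSITIONS):
    """Yield the free-position sequences one at a time, without storing them."""
    for combo in product(BASES, repeat=n_free):
        yield "".join(combo)


if __name__ == "__main__":
    print(f"{library_size():,} possible hairpin variants")
    for sequence in islice(iter_variants(), 3):
        print(sequence)
```

A library of this size cannot be screened one variant at a time with molecular dynamics, which is why a cheaper pre-processing step is needed before the four key points are checked.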
Combining bioinformatics and machine learning, we present an algorithm to pre-process our large mutation library. Fig.14 is the flow chart of the pre-processing algorithm.
Fig.14 The flow chart of the pre-processing algorithm
The SVM model trains well, and the result can be seen in Fig.15.
Fig.15 The training result
After training the SVM model, we use it to evaluate the hairpin mutants. We take the highest-ranked hairpin mutants and check them against the four key points, and finally choose five hairpin mutants. The following table shows their DR-Score, the evaluation result of the SVM model.
Hairpin-Mutant | DR-Score |
---|---|
miniToe1 | 76.6306 |
miniToe2 | 65.6278 |
miniToe3 | 66.7160 |
miniToe4 | 62.5537 |
miniToe5 | 52.9794 |
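The DR-Scores above are not in strictly descending order by mutant number, so it can help to sort them before comparing them with experiments. A minimal sketch, using the five values from the table; the dictionary and function names are hypothetical.

```python
# DR-Scores of the five selected hairpin mutants, copied from the table.
DR_SCORES = {
    "miniToe1": 76.6306,
    "miniToe2": 65.6278,
    "miniToe3": 66.7160,
    "miniToe4": 62.5537,
    "miniToe5": 52.9794,
}


def rank_by_score(scores):
    """Return (name, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for rank, (name, score) in enumerate(rank_by_score(DR_SCORES), start=1):
        print(f"{rank}. {name}: {score:.4f}")
```

Sorted this way, miniToe3 ranks just above miniToe2, while the other mutants keep their numbered order.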
Q10: What are the results of the mutant design?
After the protein and hairpin mutants were designed, the wet lab members tested all the Csy4 mutants and hairpin mutants. The result can be seen in Fig.3-1.
Fig.3-1 The experimental result of mutants
To check our model, we compare the values used earlier to evaluate the mutants with the experimental results.
For the protein mutants, we compare D3 with the experimental result. Fig.3-2 shows the comparison.
Fig.3-2 The comparison between model and experiment for protein mutant
Fig.3-2 shows a relationship between D3 and the experimental result: D3 describes the difference in cleavage ability between the wild type and a mutant, so a higher D3 value means that the mutant cleaves much more weakly than the wild-type Csy4.
For the hairpin mutants, we compare the DR-Score with the experimental result. Fig.3-3 shows the comparison.
Fig.3-3 The comparison between model and experiment for hairpin mutant
Fig.3-3 shows a similar relationship between the DR-Score and the experimental result, except for miniToe 1. This exception is reasonable, because machine learning is quite sensitive to the amount of data and the R² of our SVM training result is not 1.
Finally, our wet lab members tested 30 combinations of Csy4 and hairpin variants. Fig.3-4 is the heatmap of the result.
Fig.3-4 The heatmap result of the 30 combinations