Split Luciferase

Model.

In silico Protein-DNA Docking

Aim

In our approach B, DNA serves as a scaffold that brings two fusion proteins together. Each protein carries one half of the luciferase protein (either N-luc or C-luc), and when both fusion proteins bind the DNA, the two halves can interact and reconstitute the complete, functional luciferase. We simulated DNA-protein docking in silico to investigate three questions: how fusing N-luc and C-luc to OmpR and TetR, respectively, affects their structure and DNA binding; which distances between the two binding sites bring the two proteins close together; and which types of linker facilitate luciferase complementation.
The graphs shown here were obtained with a distance of 10 bp between the two binding sites and a linker containing 3 repeats of the flexible linker element.
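
To get a feeling for what a given spacing means geometrically, the separation along the helix axis and the rotation around it can be estimated from idealised B-DNA parameters. This is a minimal sketch: the rise of 3.4 Å per base pair and 10.5 bp per helical turn are textbook values for ideal B-DNA, not values taken from our docking runs, and the function name `site_offset` is purely illustrative. The spacer is counted as the number of base pairs between the two binding sites, so the centre-to-centre distance additionally depends on the lengths of the sites themselves.

```python
# Idealised B-DNA geometry (assumed textbook values, not fitted to our models).
RISE_PER_BP = 3.4    # axial rise per base pair, in angstrom
BP_PER_TURN = 10.5   # base pairs per full helical turn


def site_offset(spacer_bp):
    """Return (axial distance in angstrom, rotation in degrees) for a spacer."""
    axial = spacer_bp * RISE_PER_BP
    angle = (spacer_bp * 360.0 / BP_PER_TURN) % 360.0
    return axial, angle


# Compare a few hypothetical spacer lengths, including the 10 bp used here.
for n in (5, 10, 16, 21):
    axial, angle = site_offset(n)
    print(f"{n:2d} bp: {axial:5.1f} A along the axis, {angle:5.1f} deg around it")
```

Under these assumptions a 10 bp spacer places the second site about 34 Å further along the axis and almost one full turn around it, so both proteins sit on roughly the same face of the helix.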

Protein structure prediction of C-luc-FFF-TetR and OmpR-FFF-N-luc

To obtain 3D structures of our fusion proteins, we used the protein structure prediction server I-TASSER [1,2,3]. This program combines threading, which finds template proteins with similar folds in the Protein Data Bank (PDB), with Monte Carlo-based simulations, which assemble the final protein structure. The program requires the amino acid sequence of the fusion protein as input and allows a template PDB file to be specified. The predicted structures are displayed below. When C-luc-TetR is compared with the PDB structures of luciferase and TetR alone, the corresponding parts of the fusion protein resemble the structures of the separate proteins. The same applies to the OmpR-NLuc fusion protein.

Evaluation and further improvement of C-luc-FFF-TetR and OmpR-FFF-N-luc 3D protein structures

We then analysed both structures with the programs ProSA [4, 5] and Procheck [6]. ProSA assigns each protein structure a Z-score, an overall quality score in the context of known protein structures. The score is plotted against the range of Z-scores found for proteins of similar size. A Z-score outside this range points towards an erroneous structure, while a Z-score within this range indicates that the model is consistent with correctly folded proteins. The results are displayed in figure 1; both models have a Z-score that lies within the expected range.
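
For orientation, a Z-score of this kind can be written as the deviation of the model's energy from the mean of a reference energy distribution, expressed in units of that distribution's standard deviation. The symbols $E_{model}$, $\bar{E}_{ref}$ and $\sigma_{ref}$ are illustrative shorthand introduced here, not notation from the ProSA documentation:

\begin{equation}
Z = \frac{E_{model} - \bar{E}_{ref}}{\sigma_{ref}}
\end{equation}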

As a next step, the protein structures were analysed with Procheck, which produces a Ramachandran plot for each fusion protein. This plot shows the backbone dihedral angle psi against the backbone dihedral angle phi for every amino acid residue, together with the energetically allowed regions for these angle pairs. As our models contained multiple residues in 'disallowed' regions, we used the program ModRefiner [7] to perform energy minimization and thereby improve the quality of our models. The Ramachandran plots of the improved structures are displayed below.

The plots show that 88% and 82% of the residues lie in the most favoured regions. Visualizing the structures in PyMOL did not reveal drastic changes in 3D structure. Moreover, our scores come close to the 90% that the program requires for a good-quality model. We therefore proceeded to protein-DNA docking with these fusion proteins.
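
The angles behind a Ramachandran plot are ordinary dihedral angles between four consecutive backbone atoms: phi is defined by C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). The sketch below shows how such an angle can be computed from atomic coordinates. It is an illustration only and not the Procheck implementation; the helper names and the test coordinates are made up for this example.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the fourth point is rotated by a quarter turn.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a real model, the four points would be the coordinates of the backbone atoms listed above, read from the PDB file of the refined structure.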

Protein – DNA docking

With the improved structures, we performed protein-DNA docking separately for each fusion protein. A DNA strand containing the binding sites was created using C-DART and served, together with the refined structures, as input for the docking program HADDOCK [8]. HADDOCK takes a data-driven approach, in which information from NMR experiments, mutagenesis experiments and bioinformatic predictions is combined to find the optimal docking structures. The top 10 structures were checked, and the most promising ones were merged to obtain an image of both proteins docked to the same DNA strand. The best two options are shown below:

The first result shows a reconstitution of luciferase. However, compared with existing PDB files of TetR bound to DNA (https://www.rcsb.org/structure/1qpi) and of PhoB (https://www.rcsb.org/structure/1gxp), an OmpR homologue used here because no such file exists for OmpR, it is questionable whether OmpR is likely to dock in this way. The second result has a docking profile that resembles both the TetR and the PhoB docking and is therefore more likely to be correct. This means that our screening process should include linkers longer than the one considered here, to increase the chance of luciferase reconstitution.

Conclusion/Recommendations

Altogether, the in silico docking of our fusion proteins to the DNA showed that reconstitution of the split luciferase is possible within our set-up. However, we recommend increasing the number of repeats within the linkers, as this is expected to facilitate luciferase reconstitution.

Kinetic model of split-luciferase

Goals

Besides Protein-DNA docking, we also constructed an ODE model to guide
the experimental design of our biosensor based on luciferase
complementation. Simulating our system for different expression levels
of transcription factor and receptor can help us to maximize the binding
of our transcription factor to the DNA.

Model

Reactions

The reactions below represent the split-protein complementation system.
As our reactions are happening on a post-translational level, we can
assume that we don't have to take any protein production and degradation
into account.

$Taz + L \leftrightarrow L-Taz$

$L-Taz \rightarrow L-Taz_{p}$

$L-Taz_{p} + OmpR \leftrightarrow L-Taz_{p}-OmpR$

$L-Taz_{p}-OmpR \rightarrow L-Taz + OmpR_{p}$

$OmpR_{p} + Taz \leftrightarrow Taz-OmpR_{p}$

$Taz-OmpR_{p} \rightarrow OmpR + Taz$

$OmpR + OmpR \leftrightarrow OmpR_{2}$

$OmpR_{p} + OmpR_{p} \leftrightarrow OmpR_{p2}$

$OmpR_{p2} + PompC \leftrightarrow OmpR_{p2}*$

$TetR + TetR \leftrightarrow TetR-2$

$TetR-2 + PTetR \leftrightarrow TetR*$

$TetR* + OmpR_{p2}* \leftrightarrow Luminescence$

From the reactions it follows that TetR binding to the DNA can be tuned
independently of the EnvZ/OmpR pathway. A high concentration of TetR bound
to the DNA is important for a strong luminescent signal, but it does not
depend on the signalling cascade. Therefore, the focus of this ODE model is
the maximization of OmpR$_{p2}$ binding to the DNA.

Equations

\begin{equation}
\frac{dL}{dt} = k_{-1} \cdot [L-Taz] - k_1 \cdot [L] \cdot [Taz]
\end{equation}

\begin{equation}
\frac{dTaz}{dt} = k_{-1} \cdot [L-Taz] - k_1 \cdot [L] \cdot [Taz] + k_{-5} \cdot [Taz-OmpR_p] - k_5 \cdot [OmpR] \cdot [Taz] + k_6 \cdot [Taz-OmpR_p]
\end{equation}

\begin{equation}
\frac{dL-Taz}{dt} = k_1 \cdot [L] \cdot [Taz] - k_{-1} \cdot [L-Taz] - k_2 \cdot [L-Taz] + k_4 \cdot [L-Taz_p-OmpR]
\end{equation}

\begin{equation}
\frac{dL-Taz_p}{dt} = k_2 \cdot [L-Taz] - k_3 \cdot [L-Taz_p] \cdot [OmpR] + k_{-3}*[L-Taz_p-OmpR]
\end{equation}

\begin{equation}
\frac{dOmpR}{dt} = - k_3 \cdot [L-Taz_p] \cdot [OmpR] + k_{-3}*[L-Taz_p-OmpR] + k_6 \cdot [Taz-OmpR_p] - 2 \cdot k_{dim1} \cdot [OmpR]^2 + 2 \cdot k_{-dim1}*[OmpR_2]
\end{equation}

\begin{equation}
\frac{dL-Taz_p-OmpR}{dt} = k_3 \cdot [L-Taz_p] \cdot [OmpR] - k_{-3}*[L-Taz_p-OmpR]
\end{equation}

\begin{equation}
\frac{dOmpR_p}{dt} = k_4 \cdot [L-Taz_p-OmpR] - 2 \cdot k_{dim2} \cdot [OmpR_p]^2 + 2 \cdot k_{-dim2}*[OmpR_{p2}] + k_{-5} \cdot [Taz-OmpR_p] - k_5 \cdot [OmpR] \cdot [Taz]
\end{equation}

\begin{equation}
\frac{dTaz-OmpR_p}{dt} = - k_{-5} \cdot [Taz-OmpR_p] + k_5 \cdot [OmpR] \cdot [Taz] - k_6 \cdot [Taz-OmpR_p]
\end{equation}

\begin{equation}
\frac{dPompC}{dt} = -k_8 \cdot [OmpR_{p2}] \cdot [PompC] + k_{-8} \cdot [OmpR_{p2}*]
\end{equation}

\begin{equation}
\frac{dOmpR_2}{dt} = k_{dim1} \cdot [OmpR]^2 - k_{-dim1}*[OmpR_2]
\end{equation}

\begin{equation}
\frac{dOmpR_{p2}*}{dt} = k_8 \cdot [OmpR_{p2}] \cdot [PompC] -k_{-8} \cdot [OmpR_{p2}*]
\end{equation}

\begin{equation}
\frac{dOmpR_{p2}}{dt} = k_{dim2} \cdot [OmpR_p]^2 - k_{-dim2}*[OmpR_{p2}] - k_8 \cdot [OmpR_{p2}] \cdot [PompC] + k_{-8} \cdot [OmpR_{p2}*]
\end{equation}
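
A system of this kind can be integrated numerically. The sketch below writes one net mass-action rate per reaction in the reaction list (from ligand binding to promoter binding) and combines them into the time derivatives. It is not our original simulation code: all rate constants and initial concentrations are hypothetical placeholder values in arbitrary units, the variable names are chosen for this example, and a simple fixed-step Runge-Kutta scheme stands in for a dedicated ODE solver.

```python
# Placeholder rate constants (hypothetical, arbitrary units); the actual
# values belong to the parameter page and are not reproduced here.
K = {"k1": 1.0, "km1": 0.5, "k2": 1.0, "k3": 1.0, "km3": 0.5, "k4": 1.0,
     "k5": 1.0, "km5": 0.5, "k6": 0.2, "kdim1": 1.0, "kmdim1": 1.0,
     "kdim2": 1.0, "kmdim2": 0.5, "k8": 1.0, "km8": 0.1}

SPECIES = ["L", "Taz", "L-Taz", "L-Taz_p", "OmpR", "L-Taz_p-OmpR", "OmpR_p",
           "Taz-OmpR_p", "PompC", "OmpR_2", "OmpR_p2", "OmpR_p2*"]


def rhs(y, k=K):
    """Mass-action right-hand side; y follows the order of SPECIES."""
    lig, taz, ltaz, ltazp, ompr, c3, omprp, c5, pompc, d1, d2, d2b = y
    r1 = k["k1"] * lig * taz - k["km1"] * ltaz       # Taz + L <-> L-Taz
    r2 = k["k2"] * ltaz                              # L-Taz -> L-Taz_p
    r3 = k["k3"] * ltazp * ompr - k["km3"] * c3      # L-Taz_p + OmpR <-> complex
    r4 = k["k4"] * c3                                # complex -> L-Taz + OmpR_p
    r5 = k["k5"] * omprp * taz - k["km5"] * c5       # OmpR_p + Taz <-> Taz-OmpR_p
    r6 = k["k6"] * c5                                # Taz-OmpR_p -> OmpR + Taz
    r7 = k["kdim1"] * ompr ** 2 - k["kmdim1"] * d1   # 2 OmpR <-> OmpR_2
    r8 = k["kdim2"] * omprp ** 2 - k["kmdim2"] * d2  # 2 OmpR_p <-> OmpR_p2
    r9 = k["k8"] * d2 * pompc - k["km8"] * d2b       # OmpR_p2 + PompC <-> OmpR_p2*
    return [-r1, -r1 - r5 + r6, r1 - r2 + r4, r2 - r3, -r3 + r6 - 2 * r7,
            r3 - r4, r4 - r5 - 2 * r8, r5 - r6, -r9, r7, r8 - r9, r9]


def integrate(y0, t_end, dt=0.01):
    """Fixed-step classical Runge-Kutta (RK4) integration of rhs."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        s1 = rhs(y)
        s2 = rhs([a + 0.5 * dt * b for a, b in zip(y, s1)])
        s3 = rhs([a + 0.5 * dt * b for a, b in zip(y, s2)])
        s4 = rhs([a + dt * b for a, b in zip(y, s3)])
        y = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, s1, s2, s3, s4)]
    return y


# Hypothetical initial state: only ligand, Taz, OmpR and free promoter present.
y0 = [1.0, 0.3, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.05, 0.0, 0.0, 0.0]
y_end = integrate(y0, t_end=50.0)
print("promoter-bound OmpR_p2:", y_end[SPECIES.index("OmpR_p2*")])
```

Because no protein is produced or degraded, the totals of Taz, OmpR, ligand and promoter must stay constant during the simulation, which is a convenient sanity check on any implementation of the model.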

Parameters

The parameters used within this model can be found on the parameter
page. As this ODE model only provided qualitative insights
for guiding the experimental set-up, no experimental data were
incorporated.

Results

With the equations presented above, we examined the influence of different
Taz and OmpR protein levels on the binding of OmpR$_{p2}$ to the DNA.
The figure below shows the results for induction of the pathway with
aspartate concentrations of 10 µM, 0.1 mM and 1 mM, respectively. As Taz is
a membrane receptor that is expressed in low amounts in the cell, we
simulated a concentration range from 0 to 0.33 µM, which corresponds to a
maximum of 200 molecules of Taz. For OmpR, we took a range from 0 to 4000
molecules, as the normal expression level of OmpR lies around 3000
molecules.

All three figures show that the level of Taz is critical for the amount of
OmpR$_{p2}$ bound to the DNA: the higher the expression level of Taz,
the more OmpR$_{p2}$ localizes to the DNA. However, as overexpression of
Taz leads to cell death, it is best to put Taz under an inducible
promoter, which allows the expression to be tuned precisely.

For OmpR a different situation applies. Even at a relatively low
expression level of OmpR, the maximum amount of OmpR$_{p2}$ bound to the
DNA for a given ligand concentration is reached.

Last, the amount of OmpR$_{p2}$ bound to the DNA increases with
increasing ligand concentration, confirming the possibility of turning
this pathway into a biosensor.
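
The correspondence between a concentration of 0.33 µM and roughly 200 molecules per cell can be checked with a short conversion. This sketch assumes a cell volume of 1 fL, a commonly used round figure for a bacterial cell; the volume is an assumption made for this example and is not stated in the model description, and the function names are illustrative.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
CELL_VOLUME_L = 1e-15      # assumed cell volume of 1 fL, in litres


def molecules_to_micromolar(n, volume_l=CELL_VOLUME_L):
    """Convert a molecule count per cell into a concentration in µM."""
    return n / (AVOGADRO * volume_l) * 1e6


def micromolar_to_molecules(c_um, volume_l=CELL_VOLUME_L):
    """Convert a concentration in µM into a molecule count per cell."""
    return c_um * 1e-6 * AVOGADRO * volume_l


for n in (200, 3000, 4000):
    print(f"{n:5d} molecules = {molecules_to_micromolar(n):.2f} µM")
```

Under this assumption, 200 molecules of Taz correspond to about 0.33 µM, and the simulated OmpR range of 0 to 4000 molecules spans a few micromolar.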

Conclusions

Our split-protein ODE model suggests putting the Taz receptor under an
inducible promoter. This way the expression level can be tuned to
balance high protein expression against keeping the cells
proliferating.

References

- Zhang Y (2008). I-TASSER server for protein 3D structure prediction. BMC Bioinformatics, 9: 40. doi: 10.1186/1471-2105-9-40
- Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y (2015). The I-TASSER Suite: protein structure and function prediction. Nature Methods, 12: 7-8. doi: 10.1038/nmeth.3213
- Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y (2015). The I-TASSER Suite: protein structure and function prediction. Nature Methods, 12(1): 7-8. doi: 10.1038/nmeth.3213
- Wiederstein M, Sippl MJ (2007). ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Research, 35: W407-W410
- Sippl MJ (1993). Recognition of errors in three-dimensional structures of proteins. Proteins, 17: 355-362
- Laskowski RA, MacArthur MW, Moss DS, Thornton JM (1993). PROCHECK - a program to check the stereochemical quality of protein structures. J. Appl. Cryst., 26: 283-291
- Xu D, Zhang Y (2011). Improving the physical realism and structural accuracy of protein models by a two-step atomic-level energy minimization. Biophysical Journal, 101: 2525-2534
- Van Zundert GCP, et al. (2016). The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. Journal of Molecular Biology, 428(4): 720-725