Team:Marburg/Results

Results

Foundational experiments with V. natriegens

Flow cytometry

On many occasions, at meetups and conferences, we showed the growth curve of V. natriegens compared to E. coli (Link). Other scientists were impressed by the extremely fast growth, but even more by the high OD that we could show. We were asked many times whether the high OD is really due to a high cell density or whether it is caused by other components, such as substances secreted into the medium, which contribute to the absorbance.
We decided to acquire a growth curve of V. natriegens in the most direct way: by counting cells in a flow cytometer. We inoculated three baffled flasks from stationary precultures and took samples at 15 minute intervals while the bacteria were incubated at 37 °C and 220 rpm. The OD600 of these samples was measured in a standard photometer and the samples were immediately analyzed by flow cytometry. The flow cytometer directs the sample through a thin capillary so that single cells can be counted and analyzed independently. A constant flow rate and a fixed data acquisition time were set, which results in measuring a defined sample volume. Together with the number of counted events, this allows the number of cells per volume of culture to be calculated.
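The conversion from counted events to a concentration can be sketched as follows. This is a minimal illustration, not the analysis we actually ran: the function name, the optional dilution factor and all example numbers are hypothetical.

```python
def events_per_microliter(event_count, flow_rate_ul_per_min, acquisition_s,
                          dilution_factor=1.0):
    """Convert a raw event count into events per µL of the original culture.

    The analysed volume follows from the constant flow rate and the fixed
    acquisition time. `dilution_factor` is an assumption for the case that
    a sample is diluted before measurement (1.0 means undiluted).
    """
    if flow_rate_ul_per_min <= 0 or acquisition_s <= 0:
        raise ValueError("flow rate and acquisition time must be positive")
    analysed_volume_ul = flow_rate_ul_per_min * acquisition_s / 60.0
    return event_count * dilution_factor / analysed_volume_ul


# Hypothetical numbers, not measured data: 3000 events in 60 s at 30 µL/min
print(events_per_microliter(3000, 30.0, 60.0))
```

Because flow rate and acquisition time are held constant, the analysed volume is the same for every sample, so the event counts of different time points can be compared directly.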

**Figure 1: Comparison of forward scatter versus events/µL**
The measured events/µL are shown in blue on the left Y-axis. The forward scatter is displayed in green on the right Y-axis.

A comparison of the OD600 and events/µL values is shown in figure xxxx. When we planned this experiment, we were most curious about the composition of the culture in the stationary phase, to answer the question of whether the high OD that V. natriegens can reach is the result of a high cell density or whether it can be traced back to other substances. Interestingly, both values, OD600 and events/µL, start to stagnate at a similar time point (165 min). We interpret this result as confirmation that the high OD is indeed caused by bacterial cells.

By carefully comparing the shape of both growth curves, we realized that, in fact, the most striking data in this plot can be found at the beginning of the experiment. While exponentially increasing values can be seen right from the start for the curve created from the OD600 data, a short lag phase is apparent when events/µL are plotted (figure xxxx). We tried to find an explanation for this observation and realized that the absorbance of a culture does not necessarily correlate with the concentration of cells but rather with the biomass inside the flask.

**Figure 1: Scatter plot acquired by flow cytometry**
A single sample after 45 minutes is shown as an example. Each dot represents one event.

Fortunately, additional data can be obtained from the forward and side scatter in a flow cytometer, which provide information about the size and inner complexity of the analyzed cells, respectively. Figure xxx shows one sample at t = 45 min as an example. The side scatter (SSC-A) is plotted on the Y-axis versus the forward scatter (FSC-A) on the X-axis. Each dot in this scatter plot represents one detected event, and a heatmap can be used to visualize many events with the same properties. The population in the top right corner represents roughly 98 % of all events and can be regarded as fully viable cells, while the population in the bottom left corner most likely consists of sick or dormant cells and cell debris.

We plotted the mean forward scatter values of all cells in a figure together with the events/µL. The mean forward scatter value increases dramatically during the first data points, with a peak after 45 minutes. This is the same time point that we identified as the beginning of the exponential phase. During the subsequent course of the experiment, the forward scatter values decrease, falling even below the initial values when the culture reaches stationary phase.
Considering the course of all three data sets, we suggest that the cells start to expand upon provision of fresh medium without undergoing cell division. This results in an increase in cell volume, and thus in the measured OD600, without an increase in the cell concentration of the culture. After 45 minutes, when the forward scatter peaks, we assume that the majority of cells have reached their maximum cell volume and enter the exponential phase thereafter. During the following time points, exponential growth can be observed, and the decrease of the forward scatter could be seen as a hint for a reduction in mean cell size.

**Figure 1: Histograms of forward scatter during the course of the experiment**
The histograms are plotted from top to bottom following the time course of the experiment.

To additionally visualize the composition of the measured cells with regard to the forward scatter, we created figure xxx.
It shows histograms of the forward scatter. The population is apparently heterogeneous at the beginning of the experiment and at the end, when the culture again reaches stationary phase. During the period of exponential growth, the sample is more homogeneous. The trend in the forward scatter curve discussed above can also be observed in these histograms, which show a shift to the right when the forward scatter peaks and a shift to the left for the following time points.

We want to thank Dr. Max Mundt who carried out the experiments with us and who helped with analyzing the data.

Results Part Collection
We created a simple and reliable workflow for the characterization of parts from the Marburg Collection in V. natriegens.
Experimental data were obtained for constitutive and inducible promoters, RBS strength, terminator readthrough, ori-dependent plasmid copy number and the behavior of our newly designed connectors.

Motivation

After creating the Marburg Collection, we wanted to characterize the parts in V. natriegens. When we started our project, we had no information about the behavior of the genetic parts that were integrated into our toolbox. Previous research mainly focused on microbiological description rather than on the characterization of synthetic constructs, as we already discussed in our V. natriegens review (Link).
We therefore decided to characterize the parts in our Marburg Collection, doing pioneering work to provide the scientific community with the data that enable rational use of V. natriegens for various applications in synthetic biology.

Establishing a workflow for platereader experiments

Most of our data were obtained by measuring the expression of reporters in platereader experiments. In our first attempts, we failed to obtain reproducible data by applying workflows that are commonly used for E. coli. We realized that a new chassis requires rethinking existing procedures, and we decided to establish a workflow for platereader experiments tailored to V. natriegens that respects its species-specific properties, primarily its unbeaten doubling time.

Many platereader workflows for E. coli start with growing overnight cultures of the samples, measuring the OD600 and diluting all samples in a 96 well plate to a defined OD600 (e.g. 0.05). Depending on the number of samples, pipetting the test plate can easily take up to 45 minutes.
When first following this approach with V. natriegens, we realized that a different workflow is needed for the world's fastest growing organism. We tested how much V. natriegens grows during the preparation of a full 96 well plate. We set up an experiment using a stationary culture of V. natriegens with a plasmid weakly expressing lux - to achieve realistic conditions - and diluted this culture 1:100 in a 96 well plate, with one pipetting step taking 30 seconds. The result of this experiment can be seen in figure 1.
The data show an obvious trend towards higher OD for wells that were pipetted first. Seemingly, V. natriegens is able to recover from stationary phase and undergo almost one cell division in 45 minutes at room temperature. Please note that performing this experiment with a culture in the exponential phase, as described in some E. coli protocols, would most likely result in an even stronger trend.
With this in mind, we tried to establish a workflow that omits measuring and independently diluting individual wells. We tested cultivating our precultures - inoculated from glycerol stocks - directly in 96 well plates and incubated this plate for five to six hours in a platereader or shaking incubator. For V. natriegens, this time frame is sufficient for all cultures to reliably reach stationary phase, which is equivalent to an overnight culture for E. coli. The cultures are then diluted in two steps, 1:50 and 1:40, resulting in a final 1:2000 dilution of the preculture. Compared to the commonly used workflows for E. coli, our approach does not consider the OD600 of individual wells but instead dilutes all cultures by the same factor.
The OD600 after the 1:50 dilution of the 96 well plate is shown in figure 2. Despite not calculating dilutions for individual wells, the range of values is considerably smaller and no positional bias can be observed.

**Figure 1: Heterogeneity of 96 well plate after preparation**
A) One pipetting step every 30 s, starting with A1
B) Intermediate dilution (1:50) of a preculture

Moreover, we realized that inoculating with a low cell number is advantageous when working with V. natriegens. This prolongs the exponential phase, the period in which most relevant data are acquired. A 1:2000 dilution results in a cell density much lower than the inoculum used in most E. coli experiments. The subsequent kinetic measurement can be completed in as little as five to six hours.

Besides the experimental workflow, we established an approach for fast and accurate data analysis. From a mathematical point of view, constant expression and constant degradation result in a steady state level of a molecule. In the context of bacterial reporter experiments, the molecule is a reporter protein or enzyme, and the steady state level corresponds to a constant signal to OD600 ratio, meaning a constant reporter concentration in the growing cells. For a stable reporter, the apparent degradation is actually due to dilution by cell division. Thus, the degradation rate mainly depends on the growth rate.
To conclude, we assumed that a constant signal to OD600 ratio is achieved throughout the exponential phase, since the growth rate as well as the expression from constitutive or constantly induced promoters remain stable.
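
The steady state argument can be written as a short derivation. The symbols are our own notation for this sketch and are not taken from the analysis itself: R is the reporter level per biomass (signal/OD600), α the constant synthesis rate, δ the degradation rate of the reporter and μ the growth rate.

```latex
\begin{align}
\frac{dR}{dt} &= \alpha - (\delta + \mu)\,R \\
\frac{dR}{dt} = 0 \;&\Rightarrow\; R^{*} = \frac{\alpha}{\delta + \mu} \\
\delta \ll \mu \;&\Rightarrow\; R^{*} \approx \frac{\alpha}{\mu}
\end{align}
```

Under these assumptions, a stable reporter (δ much smaller than μ) reaches a signal to OD600 ratio that stays constant as long as both α and μ are constant, which is the case we assume for the exponential phase.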
With this in mind, we aimed to create an algorithm that identifies the exponential growth phase and calculates the signal to OD600 ratio. In our first experiments, we tried to identify a single point in time at which all cultures grow exponentially, and failed. We were not able to obtain reproducible data for two reasons. Firstly, cultures grow differently, depending on the starting concentration and on fitness differences caused by varying test constructs and expression strengths. Secondly, individual measurements are highly variable (we saw this especially for OD600 measurements). Therefore, performing calculations with a single value is highly susceptible to outliers.
We solved both issues by developing a Matlab script that identifies the exponential phase and includes a range of seven measuring points for each individual well. The culture is assumed to be in the mid exponential phase at an OD600 of 0.2. The time point at which the culture first exceeds 0.2 is taken together with the three time points (5 min intervals) before and the three after it. Then the mean of these seven signal to OD600 ratios is calculated. Through this calculation, a single value is assigned to each well, representing the strength of reporter expression in the exponential phase. All samples were measured in four technical replicates in three subsequent independent experiments.
Consequently, every result (e.g. promoter strength) is, in total, the mean of 84 measurements (7 time points × 4 technical replicates × 3 experiments). This high number of raw data points leads to a high degree of reproducibility and, as we believe, to highly accurate characterization data.
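
The windowing step for a single well can be illustrated with a compact Python sketch. It follows the description above but is not our Matlab script: the function name, the return value for cultures that never cross the threshold and the example data are assumptions made for illustration.

```python
from statistics import mean


def exponential_phase_ratio(od, signal, threshold=0.2, half_window=3):
    """Mean signal/OD over a window centred on the first sustained threshold crossing.

    A crossing counts only if the following time point is also above the
    threshold, which guards against single noisy OD readings. Returns None
    if no crossing with a complete window exists (assumed convention).
    """
    if len(od) != len(signal):
        raise ValueError("od and signal must have the same length")
    if half_window < 1:
        raise ValueError("half_window must be at least 1")
    for p in range(half_window, len(od) - half_window):
        if od[p] > threshold and od[p + 1] > threshold:
            window = range(p - half_window, p + half_window + 1)
            return mean(signal[i] / od[i] for i in window)
    return None


# Hypothetical blank-corrected readings of one well at 5 min intervals
od = [0.05, 0.07, 0.10, 0.14, 0.21, 0.30, 0.42, 0.60, 0.80]
lux = [1000 * value for value in od]  # constant ratio of 1000 by construction
print(exponential_phase_ratio(od, lux))
```

Averaging seven ratios instead of taking a single value is what makes the result robust against individual outliers in the OD600 readings.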

Example Matlab code

clear all
close all
%% Import Platereader raw data
Lux_Data_raw_Day1 = xlsread('Connector_Lux_071018.xlsx','All Cycles');
OD_Data_raw_Day1 = xlsread('Connector_OD_071018.xlsx','All Cycles');

Lux_Data_raw_Day2 = xlsread('Connector_Lux_071018_2.xlsx','All Cycles');
OD_Data_raw_Day2 = xlsread('Connector_OD_071018_2.xlsx','All Cycles');

Lux_Data_raw_Day3 = xlsread('Connector_Lux_081018.xlsx','All Cycles');
OD_Data_raw_Day3 = xlsread('Connector_OD_081018.xlsx','All Cycles');

%% Merge all data in two matrices
OD_Data(:,:,1) = OD_Data_raw_Day1;
OD_Data(:,:,2) = OD_Data_raw_Day2;
OD_Data(:,:,3) = OD_Data_raw_Day3;

Lux_Data(:,:,1) = Lux_Data_raw_Day1;
Lux_Data(:,:,2) = Lux_Data_raw_Day2;
Lux_Data(:,:,3) = Lux_Data_raw_Day3;

%% Blank subtraction and cut off
% Rows 60, 72, 84 and 96 are used as blank wells; their mean is
% subtracted separately for every time point and every day
for h = 1:size(Lux_Data,3) % looping over days
for k = 1:size(Lux_Data,2) % looping over time points
        Blank = mean(Lux_Data([60,72,84,96],k,h));
    for j = 1:size(Lux_Data,1) % looping over wells
        Lux_Data(j,k,h) = Lux_Data(j,k,h)-Blank;
        if Lux_Data(j,k,h) < 50
           Lux_Data(j,k,h) = 50; % Cut off for Lux signal
        end
    end
end
end

for h = 1:size(OD_Data,3) % looping over days
for k = 1:size(OD_Data,2) % looping over time points
        Blank = mean(OD_Data([60,72,84,96],k,h));
    for j = 1:size(OD_Data,1) % looping over wells
        OD_Data(j,k,h) = OD_Data(j,k,h)-Blank;
        if OD_Data(j,k,h) < 0.01
           OD_Data(j,k,h) = 0.01; % Cut off for OD
        end
    end
end
end

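%% Calculating ratio around OD 0.2

% One ratio per well (rows) and day (columns); 0 marks "not yet assigned"
Ratios = zeros(size(OD_Data,1),size(OD_Data,3));
for q = 1:size(OD_Data,3) % looping over days
for o = 1:size(OD_Data,1) % looping over samples
    % start at time point 4 so that the window p-3:p+3 never leaves the matrix
    for p = 4:size(OD_Data,2)-3 % looping through time points
        if OD_Data(o,p,q) > 0.2 && OD_Data(o,p+1,q) > 0.2 && Ratios(o,q) == 0
           % mean of seven signal/OD ratios centred on the first point above 0.2
           Ratios(o,q) = mean(Lux_Data(o,p-3:p+3,q)./OD_Data(o,p-3:p+3,q));
        end
    end
end
end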

%% Calculating mean and std of individual samples

for k = 1:12
Samples(k,1,1:3) = mean(Ratios([k,k+12,k+24,k+36],:)); % mean of the four technical replicates
Samples(k,2,1:3) = std(Ratios([k,k+12,k+24,k+36],:)); % std of the four technical replicates
end

for k = 13:24
Samples(k,1,1:3) = mean(Ratios([k+36,k+48,k+60,k+72],:)); % mean of the four technical replicates
Samples(k,2,1:3) = std(Ratios([k+36,k+48,k+60,k+72],:)); % std of the four technical replicates
end

%% Sort samples for top and bottom row

Names = {'J23100', '5Con1 long','5Con2 long','5Con3 long','5Con4 long',...
        '5Con5 long','5Con1 short','5Con2 short','5Con3 short',...
        '5Con4 short','5Con5 short','5Con1 long','5Con2 long','5Con3 long',...
        '5Con4 long','5Con5 long','5Con1 short','5Con2 short','5Con3 short',...
        '5Con4 short','5Con5 short', 'Promoter Dummy'};
Top_row = Samples(1:11,:,:); 
Bot_row = Samples(13:23,:,:);

SortedValues = [];
for k = 1:size(Top_row,1)
SortedValues = [SortedValues; Top_row(k,:,:); Bot_row(k,:,:)];
end
SortedValues = [SortedValues;Samples(12,:,:)];
SortedValues(22,:,:) = []; 

%% Calculate relative strength

Relative_strength = SortedValues;
for h = 1:size(SortedValues,3)
for k = 1:size(SortedValues,1)
    Relative_strength(size(SortedValues,1)-k+1,2,h) = ...
        Relative_strength(size(SortedValues,1)-k+1,2,h)/Relative_strength(1,1,h);
    Relative_strength(size(SortedValues,1)-k+1,1,h) = ...
        Relative_strength(size(SortedValues,1)-k+1,1,h)/Relative_strength(1,1,h);
end
end

%% Plot relative strengths

figure(1) % constructs with J23100 
hold on
bar(mean(Relative_strength([1:11,22],1,1:3),3),'facecolor',[125/255,202/255,97/255])
errorbar((mean(Relative_strength([1:11,22],1,1:3),3)),...
    std(Relative_strength([1:11,22],1,1:3),1,3),'linestyle','none','color','k')
set(gca, 'YScale', 'log')
ylabel('normalized Luminescence/OD_6_0_0')
set(gca,'Color','w')
xticks([1:12])
xticklabels(Names([1:11,22]))
xtickangle(45)
ylim([0.001 3])
yticks([0.001,0.01,0.05, 0.1, 0.5, 1.0, 2.0 ])
yticklabels([0.001,0.01,0.05, 0.1, 0.5, 1.0, 2.0])

figure(2) % constructs with promoter dummy 
hold on
bar(mean(Relative_strength([1,12:22],1,1:3),3),'facecolor',[125/255,202/255,97/255])
errorbar((mean(Relative_strength([1,12:22],1,1:3),3)),...
    std(Relative_strength([1,12:22],1,1:3),1,3),'linestyle','none','color','k')
set(gca, 'YScale', 'log')
ylabel('normalized Luminescence/OD_6_0_0')
set(gca,'Color','w')
xticks([1:12])
xticklabels(Names([1,12:22]))
xtickangle(45)
ylim([0.001 3])
yticks([0.001,0.01,0.05, 0.1, 0.5, 1.0, 2.0 ])
yticklabels([0.001,0.01,0.05, 0.1, 0.5, 1.0, 2.0])

Finding the best reporter

**Figure 4: Mean ratio of reporter signal over medium blank during the course of the experiment.**

After having established a reliable workflow for V. natriegens, we investigated four different reporters and measured the signal to blank ratio. Test constructs (shown in figure xxxxx) were built using the same set of parts except for the coding sequence. sfGFP, RFP, YFP and the lux operon were analyzed for their performance in V. natriegens. The best signal to blank ratio by far was achieved for the lux operon (2000), followed by sfGFP (3), RFP (1) and YFP (no detectable signal). The main explanation for the superior performance of the lux operon is the almost complete absence of background signal without reporter expression. This makes the lux operon a well suited reporter that can even be used to analyze extremely low levels of expression caused by very weak promoters or terminator readthrough. Based on this finding, we decided to use the lux operon as our reporter for all subsequent experiments.

**Figure 2: Test constructs for reporter experiment**
Plasmids were built with four different reporters.
A) Lux B) RFP C) sfGFP D) YFP

In contrast to fluorescence reporters, the enzymes expressed from the lux operon lead to continuous emission of light. This can result in enhanced crosstalk between neighboring wells. The extent of crosstalk depends strongly on the type of 96 well plate that is used in the experiment. We analyzed the crosstalk in clear and black 96 well plates by placing a single lux expressing sample in well C3 and filling all remaining wells with medium. As can be seen in figure xxxx, the signal from a single well is sufficient to illuminate a large portion of the clear plate (~ 1 % signal overflow to neighboring wells), while the crosstalk is reduced tenfold when using a black plate (~ 0.1 % signal overflow to neighboring wells).
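
The signal overflow to neighboring wells can be expressed as a fraction of the source signal. The following sketch shows one way to compute it for a plate with a single luminescent well. The helper names and the example readings are hypothetical and only illustrate the calculation.

```python
ROWS = "ABCDEFGH"   # rows of a standard 96 well plate
N_COLS = 12         # columns 1 to 12


def neighbours(well):
    """Return the names of all wells directly adjacent to `well` (up to eight)."""
    row, col = ROWS.index(well[0]), int(well[1:])
    result = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the source well itself
            r, c = row + d_row, col + d_col
            if 0 <= r < len(ROWS) and 1 <= c <= N_COLS:
                result.append(f"{ROWS[r]}{c}")
    return result


def crosstalk(plate, source):
    """Signal of each neighbouring well as a fraction of the source well.

    `plate` maps well names to blank-corrected luminescence values;
    wells missing from the mapping are treated as zero signal.
    """
    source_signal = plate[source]
    return {well: plate.get(well, 0.0) / source_signal for well in neighbours(source)}


# Hypothetical readings: one bright well in C3, weak light detected in C4
plate = {"C3": 100000.0, "C4": 1000.0}
print(crosstalk(plate, "C3")["C4"])
```

A fraction of 0.01 for a direct neighbor would correspond to the ~ 1 % overflow observed for the clear plate.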

**Figure 3: Luminescence pattern in clear (A) and black (B) 96 well plate**
200 µL of one Lux expressing sample was placed in C3 while all other wells were filled with medium

Thus, we used black plates and paid attention not to place the brightest cultures in direct proximity to the darkest cultures. Therefore, we do not see crosstalk as a decisive argument against the lux operon. However, algorithms are under development that will allow for a mathematical correction to further improve the performance of the lux operon as a reporter (Georg and his group, unpublished).

Finding the best ori

**Figure 3: Testing the Lux expression from plasmids with different oris**
Data were normalized over the strongest construct ColE1. Error bars represent the standard deviation of the measurements of three independent experiments

The dynamic range of a reporter experiment depends not only on the reporter used but also on the copy number of the tested plasmids, which is determined by the origin of replication. We wanted to identify the ori which yields the highest dynamic range when expressing the lux operon. To do that, we constructed three plasmids expressing lux, in which all parts except for the ori were identical, and tested them for signal strength. We obtained the highest expression from the construct harbouring the ColE1 ori, followed by p15A and pMB1 (figure xxxx). We suggest that ColE1 yields plasmids with the highest copy number. We performed qPCR experiments that support this hypothesis, and we observed a qualitative correlation between copy number and expression strength. As a high dynamic range is essential for analyzing weak expression levels, we chose ColE1 as our default ori for all subsequent experiments.

Promoter Characterization

After having established an experimental and data analysis workflow and after determining the optimal plasmid context for reporter experiments, we started to apply our knowledge to characterize the parts in our Marburg Collection.

**Figure 4: Relative promoter strength of Anderson promoters**
Data were normalized over the strongest construct J23100. Error bars represent the standard deviation of the measurements of three independent experiments

We started by measuring the promoter strength of the Anderson promoter library in V. natriegens. We assembled 19 test plasmids by Golden Gate assembly and measured their expression strength following the workflow described above. The results are shown in figure xxxx. We observed an even distribution of the tested promoters throughout the dynamic range. The strongest promoter (J23100) yielded a 40 fold stronger signal than the promoter dummy and was used as a reference to calculate relative promoter strengths. The test constructs were built with dummy connectors which do not possess insulator elements. We assume that this resulted in additional expression caused by transcription running through from the rest of the plasmid, e.g. the ori and the antibiotic resistance. This is thought to add the same amount of signal to all measured promoters, thus reducing the overall dynamic range. To further evaluate this assumption, we could repeat this experiment with one of our insulators instead of the dummy connector.

In addition to constitutive promoters, the Marburg Collection contains two inducible promoters, pTet and pTrc. For all experiments with inducible promoters, we added the respective inducer concentration to the preculture as well as to the main culture to ensure constant expression. The first experiments were performed with the pTet promoter, which can be induced by the tetracycline derivative anhydrotetracycline (ATc). ATc is much less cytotoxic than tetracycline but still capable of binding the repressor TetR and altering its structure, leading to release of the promoter and enabling transcription. To measure the dose response behavior of pTet, we made a dilution series of ATc. Following the recommendation of our advisors (Stefano Vecchione), we started with the concentration commonly used in E. coli (100 ng/mL). The starting concentration was diluted twofold in 20 subsequent steps. Our results are shown in figure xxx. The absence of bars for the four highest concentrations means that the cultures did not reach an OD of 0.2 within the seven hours of the measurement. Remarkably, we observed reasonable growth of those same cultures in the preculture, which was already induced with the identical amount of ATc. Knowing that luminescence is produced at the end of an enzymatic cascade, starting with intermediates of the phospholipid metabolism (Meighen 1991), we reckon that very strong induction could decrease the fitness of cells and that, after dilution in room temperature medium, strained cells are not able to recover from the stationary phase. However, we only observed this phenomenon in experiments with pTet, although we obtained higher signals for the strongest constitutive promoters as well as for the highly induced pTrc. We checked for toxicity of ATc but could not see a measurable effect (figure xxxxx). Another possibility is that TetR interacts with components inside the cell and that a high ATc concentration increases these interactions.
BLAST searches of TetR against the genome of V. natriegens identified one protein that shares some homology with the N-terminal part of TetR, which could result in crosstalk between the host and the inducible promoter.

**Figure 4: Dose response of pTet with ATc.**
J23100 was used as positive control and for normalization. Error bars represent the standard deviation of the measurements of three independent experiments

All measured data were normalized to the strongest constitutive promoter J23100. Saturation occurred at a dilution of 2^6 (~ 1.6 ng/mL), and an exponential reduction of the luminescence signal can be observed for higher dilutions. In the absence of ATc, the signal is twelvefold lower compared to saturation.
pTet allows relatively tight control of gene expression and is therefore well suited for driving the expression of potentially toxic proteins. On the other hand, we were not able to induce expression strong enough to compete with strong constitutive promoters or the fully induced pTrc.

pTrc is the second inducible promoter we tested. It contains lac operator sites and is therefore regulated by the repressor LacI, which is constitutively expressed from a downstream gene. pTrc can be induced by isopropyl-β-D-thiogalactopyranoside (IPTG), a chemical derivative of lactose (Camsund et al. 2014). Similar to our experiments with pTet, we made a dilution series starting with the IPTG concentration commonly used for E. coli (0.5 mM). We observed a fivefold induction and a saturation that occurred at a dilution of 2^5 (~ 15 µM). The strongest expression is similar to the expression obtained from the strongest constitutive promoter J23100, while the expression in the absence of inducer equals that of medium strong promoters. As a consequence, we do not recommend using pTrc in constructs where tight control of gene expression is desired. However, pTrc is well suited when strong expression is required.

**Figure 4: Dose response of pTrc with IPTG.**
J23100 was used as positive control and for normalization. Error bars represent the standard deviation of the measurements of three independent experiments

Taking the results of both inducible promoters into account, we made two observations. In both cases, the dynamic range is smaller compared to E. coli, and the inducer concentration at which saturation occurs is 32 and 64 fold lower for pTrc and pTet, respectively, than the concentration that is typically used for E. coli. A possible explanation could be the fast growth of V. natriegens, which might result in a lower concentration of the repressor proteins in the cells, finally leading to a less tight control of the negatively regulated promoters. However, we do not have experimental support for this idea.
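
The saturation concentrations quoted above follow directly from the twofold dilution series. A short sketch of the arithmetic, where the function name is our own choice for illustration:

```python
def twofold_series(start, steps):
    """Concentrations of a twofold dilution series, starting value included."""
    return [start / 2 ** k for k in range(steps + 1)]


atc_ng_per_ml = twofold_series(100.0, 20)   # ATc: 100 ng/mL diluted in 20 steps
iptg_um = twofold_series(500.0, 5)          # IPTG: 0.5 mM = 500 µM, first five steps

print(atc_ng_per_ml[6])   # dilution 2^6 -> 1.5625 ng/mL
print(iptg_um[5])         # dilution 2^5 -> 15.625 µM
```

The saturating concentrations are therefore 64 fold (pTet) and 32 fold (pTrc) below the respective starting concentration.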

Characterization of Connectors

The connectors are a novel key feature of our toolbox. They were designed to function as insulators that prevent crosstalk between neighboring transcription units (link to Design). A perfectly insulating connector would therefore prevent the readthrough from backbone sequences that most probably caused the notably high expression measured for the dummy promoter in the promoter experiment (see the promoter experiment). In addition to blocking transcriptional readthrough, a good connector must not possess any cryptic promoter activity.
We focused on characterizing the 5’ Connectors because we expect them to have the stronger influence on signal strength. For characterizing our connector parts, we created 20 test plasmids with the lux operon as the reporter.
In our toolbox we provide five short connectors, which solely possess the fusion sites for LVL2 cloning, and five long connectors, which additionally harbor self-designed insulators. Each of these ten connectors was cloned with the constitutive promoter J23100, to check for effects on an active promoter, and with the Promoter Dummy, to quantify the extent of transcriptional activity that reaches the Promoter Dummy.

**Figure 5: Results of connector measurement**
A) Connector constructs built with J23100 as promoter part
B) Connector constructs built with the Promoter Dummy as promoter part

The acquired data are shown in figure xxxxx. The data were normalized to the J23100 test construct that was used in the promoter experiment and built with the connector dummies. For the five constructs with the active promoter and the long connectors, we observed strongly varying signals. We measured a range from 0.2 to 2 fold change compared to the reference construct. It has been shown that the sequence directly upstream of small synthetic promoters can greatly impact the transcription efficiency (Carr et al. 2017). In the case of the long connectors, the sequence upstream of the promoter forms the terminator and could affect the efficiency of RNA polymerase binding to the -35 and -10 regions. For the constructs built with short connectors, we also observed varying signals, but to a lesser extent than for the long connectors. Across all ten connectors that are provided in our toolbox, we show a tenfold range in the measured luminescence/OD600 signal. In conclusion, we recommend carefully considering the combination of promoter and 5’ Connector when rationally designing constructs.

Taking a look at the constructs that were built with the Promoter Dummy, we also see a large difference in the expression signals. For the long connectors we expected a negligibly low reporter expression, which we observed for two out of five long 5’ Connectors, resulting in a 14 fold signal reduction compared to the “Promoter Dummy” reference. The remarkably strong signal observed for the remaining three connectors could be due to inefficient terminators or cryptic promoters in the supposedly “neutral sequence”.

For the remaining five constructs possessing the five short 5’ Connectors, we observed a range from 0.3 to 5.5 fold compared to the “Promoter Dummy” reference. We are not able to give an experimental explanation for this observation, but we could imagine that the LVL2 fusion sites, the only four bases that differ in these constructs, could constitute a weak promoter together with surrounding sequences.
Summarizing the connector characterization, we found that sequences upstream of short synthetic promoters greatly affect reporter expression, which is in accordance with the literature (Carr et al. 2017). Moreover, we demonstrated that two of our five self-designed connectors efficiently reduce the signal resulting from sources other than the actual promoter. We additionally conclude that algorithms that predict the “neutrality” of sequences are not sufficient on their own to create well functioning insulators.

Characterization of origins of replication

Origins of replication (Oris) are genetic elements where DNA replication is initiated. In plasmids, the Ori sequence is responsible for the maintenance of the plasmid and for its copy number inside the cell (Selzer et al., 1983; Brantl, 2014).

The origins of replication ColE1, pMB1 and p15A belong to the same family. They do not code for any enzyme but rely on the host's RNA polymerase for replication (Cesareni et al., 1991; Brantl, 2014). The polymerase transcribes a region 508 bp upstream of the Ori sequence (Tomizawa & Itoh, 1981; Selzer et al., 1983), synthesizing a pre-primer RNA called RNA II. During transcription, RNA II undergoes conformational changes and forms secondary structures (Brantl, 2014). These structures contain typical loops (Cesareni et al., 1991) that bind to the plasmid's Ori sequence, building an RNA-DNA hybrid (Cesareni et al., 1991; Brantl, 2014). RNA II is then cleaved by the host's RNase H to become a mature primer (Cesareni et al., 1991; Brantl, 2014).

For our collection we characterized three Oris commonly used in molecular biology: ColE1, pMB1 and p15A. These Oris belong to the same family and differ by mutations in the RNA I region (Tomizawa & Itoh, 1981; Selzer et al., 1983). We measured two different plasmids per Ori, one with and one without a LUX cassette. Both plasmids contain a kanamycin resistance cassette and one of the three Oris described. The LUX expression plasmid contained a constitutively expressed LUX cassette of ~6 kb. The other one contained a connector sequence to build an ‘empty’ plasmid. When comparing these constructs, note that the copy number may be influenced not only by LUX expression but also by plasmid size.

We measured plasmid copy numbers by qPCR using the absolute quantification method.
A qPCR is set up like a conventional PCR, but with the addition of a DNA-binding fluorophore, in this case SYBR Green. SYBR Green bound to double-stranded DNA emits a strong signal, whereas unbound SYBR Green shows only low fluorescence (Zipper et al., 2004). In every PCR cycle the amount of double-stranded DNA ideally doubles, so the fluorescence signal increases. The qPCR machine records this signal after every cycle. After the run has finished, normally after ~40 cycles, a signal threshold is set, and the cycle at which each sample reached the threshold is saved for further analysis.
For the qPCR run, total DNA was first isolated from our host carrying the plasmids of interest in the exponential phase (OD₆₀₀ ~ 0.5) and purified using the innuPREP Bacteria DNA Kit from Analytik Jena. All samples were normalized to ~5 ng/µL using the Qubit fluorometer from Thermo Fisher Scientific. Subsequently, a dilution series was prepared in 1.5 mL tubes by diluting the DNA 1:2 seven times, giving eight steps ranging from 2⁰ to 2^-7. Two primer pairs were used for the analysis: one targeting the housekeeping gene dxs, which is present once on the genome, and one targeting the kanamycin resistance cassette on the plasmid. The kanamycin cassette was amplified from the same DNA samples that were used for the 2^-4 and 2^-5 dilutions. The threshold cycles (Ct) of the dxs sequence, acquired in triplicate, were used to build a standard curve. By comparing the Ct values of the resistance cassette with the corresponding standard curve, the plasmid copy number could be determined as a multiple of the dxs sequence. It should be noted that dxs is located on the first chromosome of V. natriegens at approximately one o’clock. The sequence is therefore probably present in more than one copy per cell because of multifork replication of the genome.
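The relationship between the 1:2 dilution series and the threshold cycle can be illustrated with a minimal sketch. It assumes an idealized reaction in which the template doubles perfectly in every cycle; the starting copy number and the threshold amount are hypothetical values chosen only for illustration and are not taken from our measurements.

```python
import math

def ideal_ct(template_copies, threshold_copies=1e10, efficiency=1.0):
    """Cycle at which an idealized reaction reaches the threshold amount.

    Model: amount after n cycles = template_copies * (1 + efficiency) ** n.
    efficiency = 1.0 means perfect doubling (an assumption, real runs vary).
    """
    return math.log(threshold_copies / template_copies, 1 + efficiency)

# Hypothetical number of template copies in the undiluted sample (2^0).
start_copies = 1e5

# Eight dilution steps, 2^0 down to 2^-7, as in the series described above.
dilution_exponents = list(range(0, 8))
cts = [ideal_ct(start_copies * 2.0 ** -k) for k in dilution_exponents]

for k, ct in zip(dilution_exponents, cts):
    print(f"dilution 2^-{k}: Ct = {ct:.2f}")
```

Under these assumptions every 1:2 dilution step delays the threshold cycle by one cycle, which is why Ct values plotted against the logarithm of the dilution are expected to fall on a straight line.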

To build the standard curve, the Ct values were plotted on the y-axis and the corresponding dilution steps on the x-axis. The x-axis was set to a logarithmic scale and the standard curve was fitted with Excel. The curve’s equation was then used to calculate the x-value corresponding to each Ct value of the resistance cassette. Because this x-value expresses the plasmid signal as a theoretical dilution of the dxs template, it was corrected for the dilution of the measured sample to obtain the final number of multiples relative to the genome. A separate standard curve was calculated for every Ori.
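The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Excel workflow we actually used: the function names `fit_line` and `relative_copies` are hypothetical, the fit is an ordinary least-squares line through Ct versus log2 of the dilution, and all Ct values in the example are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def relative_copies(ct_target, slope, intercept, sample_log2_dilution):
    """Copies of the target relative to dxs in the same DNA sample.

    The standard curve is inverted to find the dxs dilution (as log2) that
    would give the same Ct, then corrected for the dilution of the sample
    in which the target was actually measured.
    """
    log2_equivalent = (ct_target - intercept) / slope
    return 2.0 ** (log2_equivalent - sample_log2_dilution)

# Invented dxs standard curve: log2 dilution (0 ... -7) versus mean Ct.
log2_dilutions = [0, -1, -2, -3, -4, -5, -6, -7]
dxs_cts = [18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0]
slope, intercept = fit_line(log2_dilutions, dxs_cts)

# Invented Ct of the kanamycin cassette measured in the 2^-4 dilution.
kan_ct = 16.0
copies = relative_copies(kan_ct, slope, intercept, sample_log2_dilution=-4)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, multiples of dxs={copies:.1f}")
```

In this invented example the kanamycin cassette crosses the threshold six cycles earlier than dxs would at the same dilution, which corresponds to 2^6 multiples of the dxs signal. A slope close to -1 per log2 dilution step indicates near-perfect doubling, and deviations from it are absorbed by the fitted curve.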

Our experiments showed that the copy numbers of plasmids controlled by the three Oris differ considerably between V. natriegens and E. coli.
One possible explanation is a difference in the expression levels of RNA I and RNA II, which would change the rate of RNA I-RNA II binding (Cesareni et al., 1991), owing to the divergent metabolism of V. natriegens and E. coli. Another plausible explanation is the different methylation patterns of the two organisms, which probably affect the formation of the RNA II secondary structures and consequently the binding affinity of RNA II to the DNA (Russell & Zinder, 1987; Cesareni et al., 1991).

It has been shown that mutations, especially in the loop I structure, may be responsible for Ori compatibility and copy number control (Selzer et al., 1983; Cesareni et al., 1991). The copy number is mainly determined by two factors: the binding efficiency of RNA II to the DNA, which is controlled in particular by the stabilization of stem-loop IV (Cesareni et al., 1991), and the interference of the complementary RNA I with the RNA II pre-primer (Brantl, 2014). RNA I is transcribed constitutively from the strand complementary to the RNA II pre-primer (Brantl, 2014). Binding of RNA I to RNA II prevents correct folding of the pre-primer (Brantl, 2014). As a result, the RNA-DNA hybrid cannot form and primer maturation cannot take place (Brantl, 2014).

**Figure 1: Quantification of plasmid copy number for different Oris.**
The columns show the average of the calculated multiples for the different plasmids. The blue columns show the values for the plasmids containing the ~6 kb LUX cassette. The orange columns show the values for the ‘empty’ plasmids without a reporter. Each column is based on six measurements. For the ‘empty’ plasmids, colE1 and p15A remain high-copy plasmids as in E. coli, with ~200 copies per cell, whereas the copy number of pMB1 is reduced, making it a low-copy Ori in V. natriegens. For the LUX plasmids, colE1 remains at a high copy number, while pMB1 and p15A drop to a considerably lower level.