Overview of the Marburg Collection
A major goal of our project was to create a collection of characterized parts for use in V. natriegens, providing other iGEM teams and the whole scientific community with a toolbox for the rational design of metabolic pathways, genetic circuits or any other DNA construct.
Our toolbox was initially conceived for use in V. natriegens. However, while designing the Marburg Collection, we realized that a toolbox with maximum flexibility can easily be used in more than one bacterial species. Because alternative bacterial chassis, apart from E. coli, are gaining importance (Kim et al. 2016), a toolbox that is compatible with more than one bacterial species enables scientists to work with organisms that have the exact properties needed for specific applications. Our vision is to establish the Marburg Collection as the first broad-host-range golden-gate-based cloning toolbox.
We investigated existing bacterial toolboxes like EcoFlex (Moore et al. 2016), CIDAR (Iverson et al. 2016) and iGEM's PhytoBrick system, which is closely related to GoldenBraid cloning (Sarrion-Perdigones et al. 2011). All these toolboxes are based on predefined vectors that preset the origin of replication and the antibiotic resistance cassette, thus preventing their use in organisms that are not compatible with either feature. We wanted to give scientists the freedom to select the oris and resistance cassettes that suit their organisms and applications best. This flexibility is possible because our toolbox does not rely on classical “backbones” but instead functions by the complete de novo assembly of at least eight basic genetic parts.
All basic parts, such as promoters or resistance parts, are stored in LVL0 plasmids. A plasmid comprising a single transcription unit is built by assembling at least eight parts into one LVL1 plasmid. One to five LVL1 plasmids can then be used in a subsequent round of assembly to obtain a multigene LVL2 plasmid, which could already harbor a full synthetic metabolic pathway consisting of up to five enzymes. Our toolbox even allows further rounds of assembly, each combining up to five constructs of the previous level. In theory, this places no upper limit on the number of transcription units that can be combined, and each additional round of assembly multiplies the capacity by up to five.
An additional layer of flexibility in the construction of LVL2 plasmids is added by our 5’ and 3’ Connectors, which flank the transcription units in LVL1 plasmids. These connectors provide the fusion sites for the assembly of LVL2 plasmids. By selecting the correct connectors, the user can define the order as well as the orientation of each transcription unit. The ability to assign an orientation to a transcription unit is intended to reduce the influence of neighboring units on each other caused by transcriptional read-through and DNA supercoiling. Moreover, our connectors are designed to function as genetic insulators. They consist of 300 bp of “neutral DNA” flanked by strong transcriptional terminators to separate the transcription units in an LVL2 plasmid. We believe that this novel feature is an important step towards building genetic constructs in a rational and predictable manner.
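The growth in cloning capacity per assembly round can be made concrete with a small calculation. The sketch below assumes only what is stated above: an LVL1 plasmid holds one transcription unit and every further level combines up to five constructs of the previous level. The function name and the `fan_in` parameter are illustrative and not part of the toolbox.

```python
def max_transcription_units(level, fan_in=5):
    """Upper bound on the transcription units in a plasmid of the given level.

    Assumes LVL1 holds exactly one transcription unit and each higher level
    combines at most `fan_in` constructs of the previous level.
    """
    if level < 1:
        raise ValueError("LVL1 is the first level that holds a transcription unit")
    return fan_in ** (level - 1)


for level in range(1, 5):
    print(f"LVL{level}: up to {max_transcription_units(level)} transcription units")
```

Under these assumptions an LVL3 plasmid could carry up to 25 transcription units; practical limits such as plasmid size and assembly efficiency are not modelled here.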
Creation of new LVL0 entry vectors
All LVL0 parts have to be stored in plasmids to allow for amplification and long-term storage. To create new LVL0 parts, a PCR product or annealed oligos are cloned into a part entry vector. This vector harbours the resistance cassette and ori that are required for selection and propagation. Furthermore, part entry vectors can be designed to contain a dropout. This dropout can be a transcription unit for a marker that generates a visible output. The first golden-gate-based toolbox, MoClo (Weber et al. 2011), used a LacZ alpha transcription unit, which can be used for blue-white screening in many E. coli cloning strains. This concept was also adopted by iGEM's PhytoBrick system. During the cloning of LVL0 parts, this dropout is replaced by the desired part. When the cloning reaction is transformed into a suitable E. coli strain and the cells are plated on agar plates supplemented with IPTG and X-Gal, colonies transformed with the re-ligated entry plasmid appear blue, while white colonies most probably contain the correctly assembled plasmid. The LVL0 part entry vector in iGEM's PhytoBrick system (BBa_P10500) has been designed as described and can be used for blue-white screening.
We appreciate the approach of using part entry plasmids with dropouts but, for two reasons, we think that LacZ is not an optimal reporter. First, blue-white screening requires the two expensive chemicals IPTG and X-Gal, which have to be added to the agar plates. Second, blue-white screening is restricted to E. coli strains with an incomplete lac operon that is complemented by the LacZ alpha fragment expressed from the plasmid (Langley et al. 1975). Consequently, blue-white screening is not compatible with a V. natriegens wild-type strain (see the Improvement page).
We overcame both drawbacks by replacing the LacZ alpha dropout of BBa_P10500 with an RFP or an sfGFP dropout, thus creating the two parts BBa_K2560001 and BBa_K2560002, respectively. Like BBa_P10500, our two new dropout parts are located in a derivative of pSB1C3 that is extended with two BsaI recognition sites. The dropout parts are flanked by BsmBI recognition sites that are required for replacing the dropout with the desired LVL0 part. The resulting LVL0 parts are classified as PhytoBricks and are RFC10 and RFC25 compatible, provided that the part itself does not contain prohibited recognition sites. The resulting plasmids can be used to clone and store all LVL0 parts.
However, for resistance parts a different approach enables more convenient LVL1 cloning. In the LVL1 golden-gate reaction, at least eight parts are combined in a single tube together with the required enzymes. All parts, except the resistance part, are stored in the previously described pSB1C3 derivative, which confers chloramphenicol resistance. The LVL0 plasmid that provides the resistance cassette for the LVL1 plasmid (e.g. kanamycin resistance for a Cas9 plasmid) necessarily carries that same functional cassette (kanamycin in this example). Because cloning is never 100 % efficient, re-ligation of LVL0 plasmids is a common event. A re-ligated LVL0 resistance plasmid survives selection on the antibiotic used for the LVL1 plasmid and therefore yields false-positive colonies that do not contain the desired LVL1 plasmid.
We solved this problem by creating the novel resistance entry vectors BBa_K2560005 and BBa_K2560006. These plasmids contain an RFP and an sfGFP dropout, respectively, and a chloramphenicol resistance cassette that is flanked by BsaI and BsmBI recognition sites. When a new LVL0 resistance part is cloned, the chloramphenicol resistance cassette is replaced by the new antibiotic resistance marker, resulting in an RFP or sfGFP expressing plasmid with the respective resistance marker. When these LVL0 parts are used for LVL1 cloning, colonies carrying a re-ligated resistance part show a visually detectable phenotype. As a result, correct plasmids can be easily identified, even for inefficient LVL1 cloning reactions with < 10 % efficiency.
Choice of fusion sites for LVL1 cloning
Golden-gate-based cloning relies upon the use of type IIs restriction endonucleases like BsaI, BsmBI or BpiI. Like the commonly used type II restriction enzymes (e.g. EcoRI or PstI), they recognize a specific DNA sequence, but they cleave outside of their recognition sequence (Pingoud and Jeltsch 2001). The golden-gate cloning method takes advantage of this property. A single enzyme can be used to create various single-stranded overhangs that match in a predefined order and finally lead to the correctly assembled plasmid (Engler et al. 2008).
When we started to design the Marburg Collection, we carefully investigated which fusion sites should be used. The fusion sites not only set the order in which the single parts will be assembled but also affect the assembly efficiency and determine whether a newly designed toolbox is compatible with already existing collections, so that parts can be shared easily. For us, the decisive argument was that we wanted to be compatible with as many other toolboxes as possible. It is our strong belief that scientists all over the world should agree on one set of fusion sites to ensure complete interchangeability between different toolboxes. The toolboxes of MoClo (Weber et al. 2011), Loop Assembly (Pollak et al. 2018) and the PhytoBrick system already use a common set of fusion sites. We decided to adopt these fusion sites for all parts that build the transcription unit (Promoter, RBS, CDS, Terminator, Tags). Fusion sites for parts that are novel to our system (Connectors, Oris, Resistance cassettes) had to be newly designed by us because these parts did not exist in the other toolboxes.
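The cleavage outside of the recognition sequence can be illustrated with a short simulation. BsaI (GGTCTC) and BsmBI (CGTCTC) both cut one base downstream of their recognition sequence on the top strand and five bases downstream on the bottom strand, which leaves a four-base overhang. The sketch below is a simplified model under that assumption; the function name and the 20 bp placeholder part are hypothetical, while the two flanking sequences are the Promoter primer overhangs listed in this section.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def type_iis_overhangs(seq, site, spacer=1, overhang_len=4):
    """List (position, overhang) pairs created by a type IIs enzyme.

    Overhangs are reported as top-strand sequence. Sites on the top strand cut
    downstream of the site; sites on the bottom strand (seen as the reverse
    complement on the top strand) cut upstream of it.
    """
    overhangs = []
    rc_site = reverse_complement(site)
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        if window == site:
            start = i + len(site) + spacer
            if start + overhang_len <= len(seq):
                overhangs.append((start, seq[start:start + overhang_len]))
        if window == rc_site:
            end = i - spacer
            start = end - overhang_len
            if start >= 0:
                overhangs.append((start, seq[start:end]))
    return sorted(overhangs)


# Hypothetical 20 bp placeholder part flanked by the Promoter primer overhangs.
placeholder_part = "ATGCATGCATGCATGCATGC"
amplicon = "AACGTCTCGCTCGGGAG" + placeholder_part + "TACTTGAGGGAGACGAA"

for position, overhang in type_iis_overhangs(amplicon, "CGTCTC"):
    print(position, overhang)
```

In this model, BsmBI releases the part with the overhangs CTCG and TGAG, which are the same for every part category; the category-specific fusion sites (here GGAG and TACT) stay inside the released fragment.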
We applied the following design principles to obtain optimal fusion sites. Firstly, the newly designed fusion sites must neither be identical to already existing fusion sites nor be palindromic, to prevent assembly in a wrong order. Secondly, the fusion sites should not consist of bases that represent a portion of the recognition sequence of a restriction enzyme. If, for example, a fusion site with the sequence GGTC was used and the sequence of the downstream part started with TC, a BsaI recognition site (GGTCTC) would be reconstituted. Therefore, all fusion sites that would result in a partial recognition sequence of BsaI, BsmBI or any of the enzymes that are used in BioBrick cloning were excluded. Lastly, the remaining candidates were sorted according to their GC content and the fusion sites with the highest GC content were chosen.
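These design principles can be expressed as a simple filter over all 256 possible four-base sequences. The following is a minimal sketch, not the procedure we actually ran. It assumes that "a portion of the recognition sequence" means the candidate occurs as a substring of a recognition sequence on either strand, it takes the existing fusion sites GGAG, TACT, AATG, GCTT and CGCT from the primer overhangs listed in this section, and it uses EcoRI, XbaI, SpeI and PstI as the BioBrick enzymes. All function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of the type IIs enzymes and the BioBrick enzymes.
RECOGNITION_SITES = {
    "BsaI": "GGTCTC",
    "BsmBI": "CGTCTC",
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}

# Fusion sites already used between the transcription unit parts (assumption:
# read off the primer overhang table of this section).
EXISTING_SITES = {"GGAG", "TACT", "AATG", "GCTT", "CGCT"}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    return site == reverse_complement(site)


def is_part_of_recognition_site(site):
    """True if the candidate is a substring of any recognition site (either strand)."""
    return any(
        site in recog or site in reverse_complement(recog)
        for recog in RECOGNITION_SITES.values()
    )


def gc_content(site):
    return sum(base in "GC" for base in site) / len(site)


def candidate_fusion_sites():
    """Apply the three exclusion rules, then rank by descending GC content."""
    candidates = []
    for bases in product("ACGT", repeat=4):
        site = "".join(bases)
        if site in EXISTING_SITES:
            continue
        if is_palindromic(site) or is_part_of_recognition_site(site):
            continue
        candidates.append(site)
    return sorted(candidates, key=gc_content, reverse=True)


ranked = candidate_fusion_sites()
print(len(ranked), "candidates; top ten:", ranked[:10])
```

A stricter filter could also exclude the reverse complements of existing fusion sites or partial overlaps at the part boundaries; those rules are not stated above and are therefore left out of the sketch.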
To make the design of new parts as simple as possible, we created a collection of overhangs that can be copied from the table below and pasted onto the sequence-specific part of a primer to create new LVL0 parts. These primers contain the cut sites for integration into the part entry vector as well as the predefined fusion sites that are required for correctly assembling LVL1 plasmids. In some cases these overhangs contain additional bases that will be discussed in the following chapter.
No.  Part category    Fwd overhang                  Rev overhang
1    5’ Connector     AAGGTCTCGCTCGAACACGTCTCGNNNN  GGAGTGAGGGAGACCAA
2    Promoter         AACGTCTCGCTCGGGAG             TACTTGAGGGAGACGAA
3    RBS              AACGTCTCGCTCGTACTAGAG         TAATCAATGTGAGGGAGACGAA
4    CDS              AACGTCTCGCTCGAATG             GCTTTGAGGGAGACGAA
4x   N-Tag            AACGTCTCGCTCGAATG             GGGATGTGAGGGAGACGAA
4y   N-tagged CDS     AACGTCTCGCTCGGATG             GCTTTGAGGGAGACGAA
5a   C-Tag            AACGTCTCGCTCGGCTTTA           GGGTATGAGGGAGACGAA
5    Terminator       AACGTCTCGCTCGGCTTAA           CGCTTGAGGGAGACGAA
5b   Terminator       AACGTCTCGCTCGGGTAA            CGCTTGAGGGAGACGAA
6    3’ Connector     AAGGTCTCGCTCGCGCT             NNNNGGAGACGAGCTTGAGGGAGACCAA
7    Ori              AACGTCTCGCTCGAGCT             TGCTTGAGGGAGACGAA
8    Resistance       AACGTCTCGCTCGTGCT             AACATGAGGGAGACGAA
Primer overhangs for creating new LVL0 parts. Overhangs have to be added to the 5’ end of the primer. Reverse overhangs have to be added as reverse complement. Underlined bases represent BsmBI recognition sites (CGTCTC, or GAGACG on the reverse side). Bold bases indicate fusion sites for LVL1 assembly. Bold and underlined bases show BsaI recognition sites (GGTCTC, or GAGACC on the reverse side). N bases written in bold and italic show fusion sites for LVL1 assembly and have to be custom-designed.
Additional bases between parts
Most golden-gate-based cloning methods leave a four base pair scar between two joined parts, which is referred to as the fusion site. The fusion sites are the feature that makes toolboxes compatible with each other. Between some parts, additional bases are required for different reasons. These bases were chosen carefully to achieve the best performance of the respective part. Please note that these additional bases are not a strict requirement for using, or being compatible with, our toolbox, but we recommend them for the design of additional parts.
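To show how the overhangs from the primer table, including any additional bases, end up in a primer pair, here is a minimal sketch. It follows the rules from the table legend: the forward overhang is placed at the 5’ end of the forward primer, and the reverse overhang is added as reverse complement. The function name, the fixed 20-base binding region and the placeholder template are assumptions for illustration only; a real design would choose the binding length by melting temperature.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# (forward overhang, reverse overhang) copied from the primer overhang table.
OVERHANGS = {
    "Promoter": ("AACGTCTCGCTCGGGAG", "TACTTGAGGGAGACGAA"),
    "RBS": ("AACGTCTCGCTCGTACTAGAG", "TAATCAATGTGAGGGAGACGAA"),
    "Ori": ("AACGTCTCGCTCGAGCT", "TGCTTGAGGGAGACGAA"),
}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(part_seq, category, binding_len=20):
    """Return (forward, reverse) primers, both written 5' to 3'.

    binding_len is a hypothetical fixed length of the sequence-specific region.
    """
    fwd_overhang, rev_overhang = OVERHANGS[category]
    forward = fwd_overhang + part_seq[:binding_len]
    # The table lists reverse overhangs in top-strand orientation, so the
    # primer is the reverse complement of (end of part + reverse overhang).
    reverse = reverse_complement(part_seq[-binding_len:] + rev_overhang)
    return forward, reverse


# Hypothetical placeholder template, not a real promoter.
template = "ATGCATGCATGCATGCATGCATGCATGCAT"
fwd, rev = design_primers(template, "Promoter")
print("fwd:", fwd)
print("rev:", rev)
```

Both primers then carry a BsmBI recognition site (CGTCTC) near their 5’ end, so the PCR product can be cloned into the part entry vector.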
The first additional bases were incorporated between the Promoter and RBS parts. The fusion site is TACT, and AGAG was added downstream of it. The sequence between promoter and RBS that results from using our suggested overhangs forms the same scar that is created if the parts are assembled with 3A Assembly (Knight 2003). This means that the distance between promoter and RBS is not changed, and we therefore do not expect negative effects on transcription or translation. Moreover, we hope that creating the same scar with a different method will make our experimental data more comparable to the data acquired with plasmids assembled with 3A Assembly in previous iGEM projects.
The next bases were integrated between the RBS and CDS parts. The fusion site, which was adopted from the PhytoBrick system, is AATG, and TAATC was added upstream of it. Previous work has shown that the sequence between an RBS and the start codon dramatically affects the expression of the desired protein (Lentini et al. 2013). A spacer length of six base pairs was shown to result in the strongest expression. When different bases are compared in a six bp spacer, the experimental data indicate significant differences. The 3A Assembly scar results in 50 % of the expression strength obtained with the sequence TAATCT, which was used as the reference (Lentini et al. 2013). We chose the spacer sequence that is expected to result in the highest expression, because we think that a system designed for the strongest expression can easily be adapted for low expression by using weak RBSs or promoters, while going in the opposite direction might be more difficult. Unfortunately, we could not adopt the exact “reference sequence” because the first A of the fusion site AATG is already part of the spacer. Eventually, we used the first five bases of the strongest spacer (Lentini et al. 2013) upstream of the fusion site.
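The two junctions described above can be checked by simply concatenating the additional bases and the fusion sites. The short sketch below uses only sequences given in this section; the helper name is illustrative.

```python
# Promoter/RBS junction: fusion site followed by the additional bases.
PROMOTER_RBS_JUNCTION = "TACT" + "AGAG"

# RBS/CDS junction: additional bases followed by the fusion site.
RBS_CDS_JUNCTION = "TAATC" + "AATG"

# Spacer used as the reference by Lentini et al. 2013.
REFERENCE_SPACER = "TAATCT"


def spacer_before_start_codon(junction):
    """Bases between the end of the RBS part and the ATG start codon."""
    return junction[:junction.rindex("ATG")]


spacer = spacer_before_start_codon(RBS_CDS_JUNCTION)
mismatches = [
    position
    for position, (ours, reference) in enumerate(zip(spacer, REFERENCE_SPACER), start=1)
    if ours != reference
]
print("promoter/RBS scar:", PROMOTER_RBS_JUNCTION)
print("RBS/CDS spacer:", spacer, "length", len(spacer))
print("positions differing from the reference:", mismatches)
```

The resulting spacer has the desired length of six bases and differs from the reference only at its last position, where the first A of the fusion site replaces the T.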
Close attention has to be paid to fusion sites that connect two translated sequences, such as the CDS part and N- or C-terminal tags. A fusion site in our system consists of four bases, which on its own would disrupt the reading frame. To prevent mistranslated proteins, two additional bases have to be added to create a six bp scar that is translated into two amino acids. These can be seen as a linker between the joined CDS parts or tags. We decided to preferably use amino acids that are abundant in natural or synthetic flexible linkers, like glycine and serine (Chen et al. 2012). Flexible linkers have been shown to improve the performance of epitope tags in Saccharomyces cerevisiae (Sabourin et al. 2007). Therefore, we added the bases GG upstream of the fusion site between 4x (N-Tag) and 4y (CDS), which results in glycine and methionine (methionine is preset by the fusion site), and the bases TA downstream of the fusion site between 4 (CDS) and 5a (C-Tag), resulting in an alanine and leucine linker. To allow for the optional use of C-terminal tags, a CDS part must not possess a stop codon. Therefore, an additional linker has to be introduced between 4 (CDS) and 5 (Terminator) as well as between 5a (C-Tag) and 5b (Terminator), resulting in alanine-STOP and glycine-STOP, respectively.
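The reading-frame argument can be verified by translating the six bp scars. In the sketch below the junction sequences are assembled from the fusion sites and additional bases given above; the two terminator junctions are read off the primer overhang table (GCTT followed by AA, and G, GGTA, A). The codon table is a deliberately small subset of the standard genetic code that covers only the codons occurring here.

```python
# Subset of the standard genetic code, sufficient for the scars below.
CODON_TABLE = {
    "GGG": "Gly",
    "ATG": "Met",
    "GCT": "Ala",
    "TTA": "Leu",
    "TAA": "STOP",
}

# Six bp scars: additional bases joined with the four-base fusion site.
JUNCTIONS = {
    "4x N-Tag / 4y CDS": "GG" + "GATG",
    "4 CDS / 5a C-Tag": "GCTT" + "TA",
    "4 CDS / 5 Terminator": "GCTT" + "AA",
    "5a C-Tag / 5b Terminator": "G" + "GGTA" + "A",
}


def translate(seq):
    """Translate an in-frame sequence codon by codon."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)]


for name, scar in JUNCTIONS.items():
    print(f"{name}: {scar} -> {'-'.join(translate(scar))}")
```

A bare four-base fusion site fails the length check in `translate`, which mirrors the frame shift that the two additional bases are meant to prevent.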