PROTEASE CLEAVAGE SITE ASSAY
- I.
  Design of the CR2 Gag-Prot-RT Constructs
- II.
  CR2 Gag-Prot-RT DH5-a Transformation
- III.
  CR2 Gag-Prot-RT Induction Optimization
- IV.
  Tricine SDS PAGE purification of CR2 Gag-Prot-RT 28C and 35C.
- V.
  CR2 Gag-Prot-RT Purification
- VI.
  CR2 Gag-Prot-RT Degradation Fragment Extraction for Mass Spectrometry
DESIGNING CONSTRUCTS FOR MATURE VLP FORMATION
- I.
  Designing CR2 Gag Amplicons
- II.
  Transformation of Amplicons
VLP FORMATION
- I.
  Purification of the Various Amplicon Proteins
- II.
  Assembly of the Purified Proteins using Various Assembly Buffer Conditions
- III.
  Assembly of the Purified Proteins Constructs and Assembly Time Assays

INTRODUCTION

Centromere-specific retrotransposons (CRs) are a class of transposable elements belonging to the Ty1/Copia and Ty3/Gypsy superfamilies of long terminal repeat (LTR) retrotransposons found predominantly in the centromeres of eukaryotic genomes. CRs are characterized by a chromodomain (CHD) within the C-terminus of the integrase; this motif is believed to influence the selectivity of CR insertion at the centromere [5]. Ty3 Gag and Pol genes are arranged in the same order as in retroviruses: (1) Gag, (2) protease, (3) reverse transcriptase, and (4) integrase. Ty1 Gag and Pol genes, on the other hand, are ordered: (1) Gag, (2) protease, (3) integrase, and (4) reverse transcriptase.

Gag, which forms the structural unit of retroviral and retroviral-like elements, can be further divided into the matrix, capsid, and nucleocapsid domains. The make-up and function of Gag domains vary widely between retroviral and retroviral-like elements. Generally, the matrix facilitates nuclear localization, nucleic acid binding, membrane targeting, and protein-protein interactions (Butterfield-Gerson et al., 2006; Jewell and Mansky, 2008; Parent and Gudleski, 2012), the capsid mediates protein-protein interactions leading to Gag multimerization and particle assembly (Mortuza et al., 2004; Emerson and Thomas, 2011; Cerfoglio et al., 2014), and the nucleocapsid contains a CCHC zinc-finger motif which packages DNA/RNA and facilitates tRNA annealing (Syomin et al., 2012; Goh et al., 2016). Studies investigating the Ty3 Gag polyprotein have shown that Gag is proteolytically processed into a 26-kDa capsid and a 9-kDa nucleocapsid (Yurii et al., 2004). The Gag polyprotein of Ty3/Gypsy retroelements does not contain a matrix domain; however, these elements still have the ability to form virus-like particles (VLPs). VLPs represent a major intermediate step in retrotransposition reactions (Roth, 2000). The Saccharomyces cerevisiae retrotransposon, Ty1, provides a model for elucidating the assembly and structural properties of retro-elements (Roth, 2000). Genomic Ty elements are first transcribed in the nucleus and then transported into the cytoplasm, where the Ty RNA is translated into proteins. The Gag proteins then assemble into VLPs that encapsulate the Ty RNA, tRNA, and Ty-encoded enzymes (Roth, 2000).

Our project is based on the characterization of the CR2-Gag-Pro-RT VLP assembly and packaging capabilities. The first step in this characterization required us to find the specific cut sites of the CR2 protease in order to determine the substrate specificity of this protease. This determination is critical for giving us insight into the diverse and complex roles the Gag domain has in the formation of VLPs. Using this information, we designed CR2 Gag amplicons that translate into proteins which are critical for mature VLP formation. These amplicons were designed with the capsid and nucleocapsid domains in mind. After purifying the proteins created by these amplicons, we optimized VLP assembly conditions to determine the most efficient environment for VLP formation. We then assembled and purified VLPs in vitro using our optimized assembly conditions. The samples were then run on a native PAGE gel to determine the approximate VLP size and visualized using a transmission electron microscope.

I. Design of the CR2 Gag-Prot-RT Constructs

Purpose: To determine the cleavage sites of the aspartyl-like protease we designed a CR2 Gag-Prot-RT construct which is approximately 2 kb in size.

Attributions: Dan Laspisa gathered the sequences to produce the consensus for the sub-families of CRs in this experiment. Ryan Shontell annotated the consensus sequences, made the predictions, and designed the three constructs with insight from Dr. Gernot Presting.

Materials and Methods: Consensus sequences of seven sub-family members of the Zea mays centromere-specific (CR) retrotransposons were assembled by Dan Laspisa.

Sub-Family Member	Sequences in Consensus
CR1A	29
CR1B	81
CR2	171
CR3	13
CR4B	48
CR4R1	37
CR5	157

The consensus sequences were run through the Conserved Domain Basic Local Alignment Search Tool (CD BLAST) to identify conserved domains within the open reading frame of the Gag and Pol polyproteins. The consensus sequences were further analyzed using publicly accessible protein modification servers to identify conserved modifications of the polyproteins. Covalent protein modifications were scanned for using ProtParam (Gasteiger et al., 2005), MyHits Motif Scan (SIB, Switzerland), and PlantsP (Podell and Gibskov, 2004). Nuclear localization signals were scanned for using NLStradamus (Nguyen et al., 2009), NucPred (Stockholm Bioinformatics Center, Sweden), cNLS Mapper (Kosugi et al., 2009), WoLF PSORT (CBRC, Japan), and SeqNLS (Lin et al., 2012). These conserved domains were annotated on the consensus sequences using Geneious (Biomatters, Newark, NJ 07102, USA) and an alignment was generated using MUSCLE (EMBL-EBI, England).

Drawing on previous studies of the cleavage specificity of the aspartyl-like proteases of retroviruses, we predicted putative cleavage sites along the CR polyproteins. These studies found that in most cases, aspartyl-like proteases cut substrates at the bond between two adjacent hydrophobic amino acids (Toszer, 2010; Laco, 2015). We scanned through the consensus sequences and noted sites meeting the criterion of adjacent hydrophobic residues. We additionally considered the covalent modifications that have been reported in specific domains. We identified three putative cleavage sites between the Gag and Protease domains, eight between the Protease and Reverse Transcriptase domains, and four between the RNAseH and Integrase domains.
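The scan for adjacent hydrophobic residues described above can be illustrated with a minimal Python sketch. The residue set and the function name are assumptions made for illustration only; they are not the exact criteria or tooling used in this project, and the toy sequence is made up rather than taken from CR2.

```python
# Assumed set of hydrophobic residues (one-letter codes); adjust to the criteria in use.
HYDROPHOBIC = set("AVLIMFWY")


def adjacent_hydrophobic_sites(protein_seq):
    """List candidate bonds flanked by two hydrophobic residues.

    Returns (position, pair) tuples, where position is the 1-based index of
    the residue on the N-terminal side of the bond.
    """
    seq = protein_seq.upper()
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in HYDROPHOBIC and seq[i + 1] in HYDROPHOBIC:
            sites.append((i + 1, seq[i:i + 2]))
    return sites


# Toy sequence for illustration only (hypothetical, not a CR2 sequence).
toy_sequence = "MKTAYLGSPFVDE"
for position, pair in adjacent_hydrophobic_sites(toy_sequence):
    print(position, pair)
```

A scan like this only produces candidates; in the workflow above, the candidate list was narrowed further using reported covalent modifications and structural comparisons.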

After predicting these domains, the putative mature proteins were modelled separately using I-TASSER (Zhang, 2008), PHYRE2 (Kelley et al., 2015), and HHPred (Max Planck Institute, Germany). The accuracy of the models produced was assessed using ANOLEA (Melo et al., 1997), RAMPAGE (University of Cambridge, England), and Verify3D (Luthy et al., 1992). ANOLEA performs pairwise comparisons of the energy between the heavy atoms of each amino acid in a non-local environment, RAMPAGE performs Ramachandran analysis for the stereochemistry of the amino acid side chains around peptide bonds, and Verify3D compares the 3D model against its 1D amino acid sequence and scores the likelihood of a residue being in its structural class.

The top five models of the putative mature proteins were used to make consensus models using MODELLER (Sali Lab, San Francisco, California). Five consensus models were produced for each predicted mature protein. These were checked with ANOLEA, RAMPAGE, and Verify3D, and the top-scoring models were used for further analysis. X-ray crystallography models from the Protein Data Bank (Berman et al., 2000) of the Ty3/Gypsy or retroviral equivalents to our predicted mature proteins were downloaded and then compared to our consensus models using PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC).

Results and Discussion: The comparisons between our predicted 3D models and the X-ray crystallography data allowed us to further restrict the number of predicted putative cleavage sites based on the locations of terminal alpha helices and their associated covalent protein modifications. The CR2 Gag-Prot-RT construct encompasses the region from Gag to the 5’ tip of the reverse transcriptase domain. The construct was designed with an N-terminal and a C-terminal 8-histidine tag to allow for the purification of the regions 5’ and 3’ of the cut site using nickel resin. This construct was cloned into the pET-14b vector using a 5’ NcoI restriction site and a 3’ BamHI restriction site (GenScript).

CR2 Gag-Prot-RT
Amplicon Size (bp)	2342 bp
Protein Size (kDa)	90 kDa
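As a rough cross-check of the sizes in the table above, the protein mass can be estimated from the amplicon length. This is a minimal sketch that assumes the common rule-of-thumb average of about 110 Da per amino acid and treats the whole amplicon as coding sequence; both are simplifying assumptions, and the function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass; an assumption


def estimate_mass_kda(length_bp, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Estimate protein mass in kDa from a DNA length in base pairs."""
    codons = length_bp // 3  # whole codons only
    return codons * avg_residue_mass_da / 1000.0


# 2342 bp gives 780 codons, i.e. an estimate in the mid-80s kDa,
# in the same range as the ~90 kDa listed in the table.
print(round(estimate_mass_kda(2342), 1))
```

A sequence-based calculation (for example with ProtParam) would be more accurate than this average-mass estimate.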

II. CR2 Gag-Prot-RT DH5-a Transformation

Purpose: To ligate the CR2 Gag-Prot-RT insert into the pET-14b vector and transform the ligated plasmid into DH5-α for future experimentation and long-term storage of the plasmid.

Attributions: The initial colony PCRs for verification of inserts were performed by Ryan Shontell. Troubleshooting and colony PCR for verification of CR2 Gag-Prot-RT were performed by Gina Watanabe and Emily Yang. Initial sequencing using the T7 forward and reverse primers was performed by Ryan Shontell. Later sequencing using T7 and internal primers ordered from IDT was performed by Gina Watanabe. Analysis of all sequences was performed by John Banasihan, Ryan Shontell, Jonathan Tello, and Dr. Gernot Presting.

Materials and Methods: Our transformation protocol began with a double digestion of the pET-14b vector using BamHI and NcoI NEB-HF restriction enzymes in preparation for ligation. We then ligated our CR2 Gag-Prot-RT insert into the digested vector. DH5-α competent cells were transformed with the CR2 Gag-Prot-RT ligated vector (2-fold TE dilution), with a pUC19 vector as our control. The DH5-α cells that grew on the antibiotic agar were then used for a colony PCR.

BLASTN was used to determine whether internal priming would occur between the CR constructs and the T7 Forward and T7 Reverse primers. It was determined that no significant lengths of the primers would be able to anneal; however, there was the potential for extension of smaller PCR products due to partial annealing.

After running our colony PCR products on an agarose gel we saw that the CR2 Gag-Prot-RT bands were not very strong. We later optimized the PCR parameters by decreasing the annealing temperature from 65°C to 55°C and increasing the extension time. These changes are seen in Figure 3.
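To put the change in extension time in context, a commonly quoted rule of thumb is roughly 1 minute of extension per kilobase of amplicon. The sketch below applies that rule to the 2342 bp construct; the rate is an assumption that depends on the polymerase used, and the function name is hypothetical.

```python
def extension_time_s(amplicon_bp, seconds_per_kb=60.0):
    """Estimate extension time in seconds, assuming a fixed rate per kilobase."""
    return amplicon_bp / 1000.0 * seconds_per_kb


seconds = extension_time_s(2342)
print(f"{int(seconds // 60)} min {int(seconds % 60)} s")
```

Under this assumption the estimate comes out at about 2 minutes 20 seconds, in line with the longer extension time used in the optimized reactions. The product amplified with T7 primers also includes a short stretch of flanking vector sequence, so the true requirement is slightly longer than the insert alone suggests.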

After this optimization of the PCR parameters, the CR2 Gag-Prot-RT colonies from Figures 1, 2, and 3 were sent in for sequencing at the University of Hawaii at Manoa sequencing facility. Glycerol stocks were also prepared from these colonies.

Results and Discussion: The decrease in annealing temperature seems to have greatly improved the CR2 Gag-Prot-RT colony PCR; however, many smaller bands, averaging roughly 0.5 – 1 kb, still appear in the gel. These bands did not appear in the previous gels run at the higher annealing temperature, suggesting that they may be products of internal priming. Even though putative internal priming products are present, some bands that were present in the previous high annealing temperature PCRs are lost. Contamination of the PCR reagents was ruled out, as the colony PCR was performed in parallel with the other constructs using the same master mix. The products of the colony PCR were then sequence verified before further experimentation.

The CR2 Gag-Prot-RT gels from Figures 1-3 showed that the plasmid was indeed taken up by all colonies sent in. As the sequencing had been performed with T7 forward and T7 reverse primers, only the first 500-600 base pairs from the 5’ and 3’ ends of the construct could be verified. We ordered three internal primers for the CR2 Gag-Prot-RT construct to close the gaps for verification of the sequence.
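The size of the unverified middle region follows from simple arithmetic on the construct length and read length. The sketch below is illustrative only: it assumes reads of a fixed usable length that start exactly at each end of the 2342 bp construct and ignores the vector sequence between the T7 primer sites and the insert. The function names are hypothetical.

```python
def unverified_gap(construct_bp, read_bp):
    """Bases left uncovered when one read of read_bp starts from each end."""
    return max(0, construct_bp - 2 * read_bp)


def reads_needed(gap_bp, read_bp):
    """Minimum number of extra reads of read_bp needed to span gap_bp (no overlap)."""
    return -(-gap_bp // read_bp)  # ceiling division


for read_length in (500, 600):
    gap = unverified_gap(2342, read_length)
    print(read_length, gap, reads_needed(gap, read_length))
```

With 500-600 bp reads, roughly 1.1-1.3 kb in the middle of the construct remains unverified, which two to three additional reads can span before allowing for overlap.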

Figure 1. Colony PCR of CR2 Gag-Prot-RT with T7 forward and reverse primers. PCR conditions: (1) Initial Denaturation at 94℃ for 30 seconds (2) Denaturation at 94℃ for 20 seconds (3) Annealing at 65℃ for 20 seconds (4) Extension at 68℃ for 1 minute 30 seconds (5) Final Extension at 68℃ for 5 minutes; run for 35 cycles. Amplicon run on 1.5% agarose gel in TAE at 95V. Weak bands of the appropriate size can be seen in colonies 1, 3, and 4. These were used for sequencing.

Figure 2. Colony PCR of CR2 Gag-Prot-RT with T7 forward and reverse primers with longer extension times. PCR conditions: (1) Initial Denaturation at 94℃ for 30 seconds (2) Denaturation at 94℃ for 20 seconds (3) Annealing at 65℃ for 20 seconds (4) Extension at 68℃ for 2 minutes 20 seconds (5) Final Extension at 68℃ for 5 minutes; run for 35 cycles. Amplicon run on 1.5% agarose gel in TAE at 95V, for sequencing. Longer extension times did not improve the PCR significantly.

Figure 3. Colony PCR of CR2 Gag-Prot-RT with T7 forward and reverse primers with longer extension time and decreased annealing temperature. PCR conditions: (1) Initial Denaturation at 94℃ for 30 seconds (2) Denaturation at 94℃ for 20 seconds (3) Annealing at 55℃ for 20 seconds (4) Extension at 68℃ for 2 minutes 20 seconds (5) Final Extension at 68℃ for 5 minutes; run for 35 cycles. Amplicon run on 1.5% agarose gel in TAE at 95V. Lanes 1, 2, 3, 4, and 7 contain the full CR2 Gag-Prot-RT construct.

III. CR2 Gag-Prot-RT Induction Optimization

Purpose: To optimize the induction conditions for the CR2 Gag-Prot-RT construct. These conditions included time, temperature and IPTG concentration.

Attributions: The SDS PAGE runs, inductions, and all subsequent analysis were performed by Ryan Shontell and Gina Watanabe.

Materials and Methods: An overnight (ON) culture of the transformed CR2 Gag-Prot-RT BL21 cells was prepared (10 mL of this ON culture is used to perform a 200 mL induction). 50 μg/μL of chloramphenicol and 50 μg/μL of ampicillin were added to the ON culture. The flask was then left in the incubation shaker ON at 37°C at 220 rpm.

200 mL of media was then added to the induction container along with the 10 mL of the ON culture. The induction culture was then allowed to grow at 37°C. (Note: For 1-15 mL induction cultures it is recommended to allow growth for 1 hour at 220 rpm. For inductions that are 100+ mL it is recommended to allow growth for 2 hours at 175 rpm.)

Since the induction culture does not receive antibiotics, aseptic techniques are crucial for the following steps. 100 μL of each culture was pipetted into a clean 1.5 mL microcentrifuge tube and spun at 400 rpm for 5 minutes. The supernatant was discarded and the pellet was resuspended in 20 μL of 1X SDS-PAGE Loading Buffer + 5% β-ME. The tubes were then heated at 95°C for 5 minutes to help with the resuspension. IPTG was then added to the induction culture to a final concentration of 0.2-1 mM, and the culture was incubated at the optimized temperature for 2-10 hours. To determine the optimized conditions for induction, 100 μL of culture was collected every hour and then analyzed by SDS-PAGE.
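The volume of IPTG stock needed for a given final concentration follows from C1V1 = C2V2. The sketch below assumes a 1 M IPTG stock and a 210 mL induction culture (200 mL of media plus 10 mL of ON culture); the stock concentration is an assumption, not a value taken from our protocol, and the small volume added by the stock itself is ignored.

```python
def stock_volume_ul(final_mm, culture_ml, stock_mm=1000.0):
    """Volume of stock (in microlitres) needed to reach final_mm in culture_ml.

    Uses C1 * V1 = C2 * V2 and ignores the volume contributed by the stock.
    """
    return final_mm * culture_ml / stock_mm * 1000.0


# Range of final IPTG concentrations mentioned in the text (0.2-1 mM).
for final_mm in (0.2, 0.5, 1.0):
    print(final_mm, round(stock_volume_ul(final_mm, 210.0), 1))
```

Under these assumptions, the 0.2-1 mM range corresponds to roughly 42-210 μL of 1 M stock for a 210 mL culture.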

Results and Discussion: In this experiment we optimized the conditions for the induction of the CR2 Gag-Prot-RT construct. The full-size CR2 Gag-Prot-RT appears on a gel at ~90 kDa. Putative degradation products occur at ~75 kDa and ~50 kDa. We predict that these bands correspond to the Gag-Protease (54-75 kDa) and the full-size Gag (~52-54 kDa), respectively. Since the protease appears to be active at these temperatures, we will perform additional inductions at other temperatures (20°C - 28°C) to see if the number or intensity of putative degradation products changes. Under these induction parameters at 35°C, a significant quantity of the full-sized construct is expressed. We then used these parameters to induce and purify this CR2 Gag-Prot-RT fragment.

Figure 1. CR2 Gag-Prot-RT Induction at 32℃ and 35℃. Induction performed with 0.5 mM IPTG for 6 hours at a temperature of 32℃ or 35℃. (1) Marker [Precision Plus Protein Dual Color Standard] (2) 32℃ Uninduced at 0 hours (3) 32℃ Induced at 6 hours (4) 32℃ Induced at 2 hours (5) 32℃ Induced at 6 hours (6) 35℃ Uninduced at 0 hours (7) 35℃ Uninduced at 6 hours (8) 35℃ Induced for 2 hours (9) 35℃ Induced for 6 hours. Full size CR2 Gag-Prot-RT occurs at ~90 kDa. Putative degradation products occur at 75 kDa and 50 kDa. We predict that these bands correspond to the Gag-Protease (~54-75 kDa) and the full-size Gag (~52-54 kDa), respectively.

IV. Tricine SDS PAGE purification of CR2 Gag-Prot-RT 28℃ and 35℃

Purpose: The CR2-Gag-Pro-RT proteins induced at 28℃ and at 35℃ were run on a Tricine gel following the protocol of Haider et al. (2012). Tricine gels are used to separate proteins of 5-30 kDa or less. This should separate the RT protein from the full Gag protein and allow us to recover Gag via gel extraction.

Attributions: All experiments relating to the Tricine gel purification were performed by John Banasihan.

Materials and Methods: In this experiment the protocol of Haider et al. (2012) was followed with some minor adjustments due to availability of resources. 10% APS was used to make both the stacking and resolving gels. The stacking gel consisted of 0.76 ml of 2.5 M Tris (pH 8.8), 3.42 ml of DI H2O, 150 μl of APS, 0.66 ml of acrylamide/bis-acrylamide (AB; 29:1, 30%), and 5.0 μl of TEMED. The resolving gel consisted of 5.6 ml of 2.5 M Tris (pH 8.8), 0.90 ml of DI H2O, 3.33 ml of AB (29:1, 30%), 150 μl of APS, 6.0. A running buffer was made of 3.03 g of Tris base, 4.5 g of Tricine, and 0.5 g of SDS in 1 L of DI water.
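The running buffer recipe can be converted from grams per litre into molar and percent terms. This is a minimal sketch; the molecular weights used (Tris base 121.14 g/mol, Tricine 179.17 g/mol) are standard values, and the function names are hypothetical.

```python
TRIS_BASE_MW = 121.14  # g/mol
TRICINE_MW = 179.17    # g/mol


def millimolar(grams_per_litre, molecular_weight):
    """Convert a mass concentration (g/L) to a molar concentration (mM)."""
    return grams_per_litre / molecular_weight * 1000.0


def percent_w_v(grams_per_litre):
    """Convert g/L to percent weight/volume (g per 100 mL)."""
    return grams_per_litre / 10.0


print("Tris base (mM):", round(millimolar(3.03, TRIS_BASE_MW), 1))
print("Tricine (mM):", round(millimolar(4.5, TRICINE_MW), 1))
print("SDS (% w/v):", percent_w_v(0.5))
```

By this calculation the running buffer is approximately 25 mM Tris, 25 mM Tricine, and 0.05% SDS.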

Results and Discussion: The Tricine gel does not show a clear separation of proteins 5-30 kDa in size. While there are some bands that are indicative of the presence of Gag and the Gag-Pro polyprotein, there appear to be no signs of the RT protein. The Tricine gel experiment was unsuccessful in separating the protease and the smaller RT protein. This may have been due to the modifications made to the Haider et al. (2012) protocol, such as running the gel at 70V for stacking and 150V for resolving rather than at a continuous 125-150V; however, these modifications should not have significantly impacted the overall experiment. The protease cleavage sites will be determined via trypsin digest and mass spectrometry rather than by repeating the Tricine gel experiment, since mass spectrometry will offer more accurate results.

Figure 1. CR2-Gag-Prot-RT 35℃; inclusion body purification without lysozyme run on a Tricine gel (4% stacking, 10% resolving). The gel does not appear to successfully separate proteins of 5-30 kDa or less, as the lowest readable bands are around 50 kDa, indicative of the Gag protein.

Figure 2. CR2-Gag-Prot-RT 28℃; inclusion body purification without lysozyme run on a Tricine gel (4% stacking, 10% resolving). The gel does not appear to successfully separate proteins of 5-30 kDa or less, as the lowest readable bands are around 50 kDa, indicative of the Gag protein.

V. CR2 Gag-Prot-RT Purification

Purpose: To produce higher quality SDS-PAGE gel images of the soluble vs. insoluble fractions and to determine if enough expressed protein may be purified from the soluble fraction for future work.

Attributions: Purifications were performed by Ryan Shontell, Gina Watanabe, Jonathan Tello and John Banasihan.

Materials and Methods: A soluble protein purification of the CR2 Gag-Prot-RT was performed according to our standard protocol. The proteins collected for this experiment were induced at 28°C and 35°C. This was not done under denaturing conditions, as there was no urea present in the buffers used. Both the soluble and insoluble proteins were collected and analyzed by SDS-PAGE.

Results and Discussion: Gels comparing the soluble and insoluble fractions of CR2 Gag-Prot-RT were run to generate higher quality images, to determine in which fraction the expressed proteins occur, and to determine whether smaller degradation fragments of the CR2 Gag-Prot-RT construct become soluble.

Figure 1 depicts the soluble vs. insoluble fractions of CR2 Gag-Prot-RT induced at 28°C and 35°C. The putative 90 kDa full-sized CR2 Gag-Prot-RT construct, 75 kDa CR2 Gag-Prot degradation fragment, and 50 kDa CR2 Gag degradation fragment occur in the insoluble fraction. Another finding to note is the difference in intensity of the 75 and 50 kDa bands between the 28°C and 35°C inductions. It is possible that the protease is more active at lower temperatures and is therefore able to cleave the 75 kDa intermediate into the 50 kDa fragment at a higher rate, leading to the loss of the 75 kDa band and the gain of a more intense 50 kDa band at 28°C. These findings suggest that a lower temperature induction should be conducted to screen for the smaller fragments of the construct corresponding to protease, RT tail, capsid, spacer, and nucleocapsid. In future insoluble purifications to screen for these smaller fragments, lysozyme should be avoided, as it generates a very intense band in the 10-15 kDa region. We believe that the RT tail, protease, and nucleocapsid proteins are roughly this size. Protease, nucleocapsid, and spacer by themselves are not histidine tagged and so would likely be found in the washes of a purification, whereas the capsid and RT tail would be found in the elutions.

Figure 2 depicts the purification performed on the soluble fraction of the CR2 Gag-Prot-RT 28°C and 35°C cultures. Small amounts of the full-sized fragment do occur in the soluble fraction; however, the yield is very low and it is inadvisable to attempt future purifications from the soluble fraction. Washes were performed with wash buffer containing no imidazole. Future purifications will utilize 10 mM and 30 mM imidazole wash buffers to optimize the removal of non-histidine-tagged proteins from the nickel resin.

Figure 1. CR2 Gag-Prot-RT 28℃ / 35℃ Soluble vs. Insoluble fractions. (1) Marker [Precision Plus Protein Dual Color Standard] (2) 28℃ Total Protein (3) 28℃ Soluble (4) 28℃ Insoluble (5) Empty (6) 35℃ Total Protein (7) 35℃ Insoluble. 5% stacking gel at 80V; 15% resolving gel at 100V. The majority of induced protein occurs in the insoluble fraction. Note the possible degradation products at 75 and 50 kDa.

Figure 2. CR2 Gag-Prot-RT 28℃ / 35℃ Soluble Protein Purification. (1) Marker [Precision Plus Protein Dual Color Standard] (2) 28℃ Total Protein (3) 28℃ Wash (4) 28℃ Elution (5) 28℃ Elution 2 (6) 35℃ Wash 1 (7) 35℃ Elution 1 (8) 35℃ Elution. 1.5% stacking gel at 70V; 15% resolving gel at 100V. 5% stacking gel at 80V; 15% resolving gel at 100V. Washes were performed with wash buffers without imidazole. 250 mM imidazole elution buffer was used.

VI. CR2 Gag-Prot-RT Degradation Fragment Extraction for Mass Spectrometry

Purpose: To perform gel extraction of the putative degradation fragments of CR2 Gag-Prot-RT and send them to Taplin for analysis via mass spectrometry in order to determine the cleavage sites of the CR2 Gag-Prot-RT construct.

Attributions: Gel extraction was performed by Vishal Negi and Ryan Shontell.

Materials and Methods: A glass plate was sterilized with 70% ethanol and a separate surgical blade was prepared for each band we excised from the gels. The desired SDS-PAGE gel was then placed on the glass plate and the band of interest was excised. The excised band was then placed into a 1.5 mL microcentrifuge tube and 1 mL of Milli-Q H2O was added. The samples were then labeled and the caps were wrapped in parafilm. Bands were extracted from three separate gels. Gels were selected by Vishal Negi based on the resolution of the gel and intensity of the bands. The 90 kDa band (Band 1) likely corresponds to the full-sized CR2 Gag-Prot-RT. We believe that Band 2 (75 kDa) and Band 3 (50 kDa) are degradation products of the full-sized construct corresponding to CR2 Gag-Prot and CR2 Gag, respectively. Data from the mass spec will provide us with the cleavage sites necessary to accurately produce future constructs of the mature Gag and mature protease.

Results and Discussion: To determine protease cleavage sites, the ~100 kDa, ~75 kDa, and ~50 kDa bands, representing the full polyprotein (CR2 Gag-Protease-RT, ~90 kDa), the partial polyprotein (CR2 Gag-Protease, ~62-77 kDa), and the full Gag (~52-54 kDa), respectively, were extracted from three separate gels of cells induced at 35℃ selected by Vishal Negi and Ryan Shontell. Selection was based on the resolution of the gel and band intensity (Fig. 1). Samples were sent to the Taplin Mass Spectrometry Facility on June 1, 2018, and received by the facility on June 5, 2018. Results were obtained on June 29, 2018. We were notified that the samples were contaminated and contained fragments from all three bands. However, filtering allowed for the determination of the most relevant fragments for each sample.

Tryptic fragments for each band were mapped onto the CR2 Gag-Protease-RT GenScript construct using fragment data copied from the Taplin results page into an Excel sheet and sorted by position. The 100 kDa fragments covered 90.04 kDa, the 75 kDa fragments covered 56.76 kDa, and the 50 kDa fragments covered 47.88 kDa of the full construct. These fragments confirmed a protease cleavage site downstream of the nucleocapsid in the form of an 18 bp gap between the end of the last 50 kDa fragment (6 bp upstream of the 3’ end of the CCHC zinc finger domain) and the beginning of the next 75 kDa and 100 kDa fragments (12 bp after the 3’ end of the CCHC zinc finger). The second cleavage site, between the protease and reverse transcriptase domains, could not be identified, as the 75 kDa fragments did not extend beyond 56.76 kDa, with the most upstream protease cut site observed being located well within the protease domain. This suggests that either the wrong band was excised from the PAGE gel or that the protease has autodigestive activity under the conditions used.
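The mapping step described above was done in Excel; purely as an illustration, the same idea can be sketched in Python. The function names and the toy sequence and peptides are hypothetical (they are not CR2 data), and the sketch assumes each peptide matches the construct exactly and only its first occurrence is used.

```python
def map_peptides(protein_seq, peptides):
    """Locate each peptide in protein_seq.

    Returns a sorted list of (start, end, peptide) with 1-based inclusive
    coordinates. Peptides that are not found are skipped.
    """
    hits = []
    for peptide in peptides:
        index = protein_seq.find(peptide)  # first occurrence only
        if index != -1:
            hits.append((index + 1, index + len(peptide), peptide))
    return sorted(hits)


def coverage_gaps(hits, seq_len):
    """Return uncovered (start, end) intervals, 1-based inclusive."""
    gaps = []
    next_uncovered = 1
    for start, end, _ in hits:
        if start > next_uncovered:
            gaps.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= seq_len:
        gaps.append((next_uncovered, seq_len))
    return gaps


# Toy example (made-up sequence and peptides, for illustration only).
toy_protein = "MSTKAAGLRPEWCKDDFHR"
toy_peptides = ["AAGLR", "MSTK", "DDFHR"]
hits = map_peptides(toy_protein, toy_peptides)
print(hits)
print(coverage_gaps(hits, len(toy_protein)))
```

Comparing where coverage ends for one band and resumes for the larger bands is the same kind of gap analysis used above to locate the cleavage site downstream of the nucleocapsid.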

Figure 1. Image of gel and corresponding bands sent to Taplin for analysis by mass spectrometry. Bands were excised from the gel and then placed into 1.5 mL microcentrifuge tubes with 1 mL of Milli-Q H2O. Samples were shipped to Harvard on 6-1-18 and received at the facility 6-5-18.

Figure 2. Image of gel and corresponding bands saved for analysis by mass spectrometry at the local proteomics facility. Bands were excised from the gel and then placed into 1.5 mL microcentrifuge tubes with 1 mL of Milli-Q H2O. Samples were shipped to Harvard on 6-1-18 and received at the facility 6-5-18.

DESIGNING CONSTRUCTS FOR MATURE VLP FORMATION

I. Designing CR2 Gag Amplicons

Purpose: To design amplicons that will create the proteins necessary for VLP formation. Within the Gag region of this CR2 construct lies a capsid region (CA) and nucleocapsid region (NC). As we are unsure of the role of the CA and NC regions in VLP formation we created amplicons depending on where the protease cut sites are.

Attributions: These amplicons were designed by Gina Watanabe and finalized by the entire team, including Dr. Gernot Presting and Ryan Shontell.

Materials and Methods: To design amplicons that will create proteins necessary for VLP formation, we began with the original CR2 Gag-Prot-RT sequence. Figure 1 portrays the relationship between the Gag domain and the protease along with the N-terminal and C-terminal TEV/His-tag regions. We amplified from this sequence to obtain the amplicons described in Table 1.

Figure 1. Picture depicting the CR2 construct with the protease. It also highlights the N-terminal and C-terminal TEV and His-tag regions.

Table 1. All of the different amplicons that were created. NcoI and BamHI are the restriction enzyme cut sites used for the ligation protocol. HisTEV is a tag that is a key element in the protein purification: the nickel column we use binds to the His region, and the TEV region is a protease cleavage site used to cut off the His region when it is no longer needed. CA stands for capsid; NC stands for nucleocapsid; CA1, CA2, and CA3 are different portions of the construct obtained by looking at different subfamilies to determine a consensus. The blue highlighted portion represents the N-terminal amplicons and the red highlighted portion represents the C-terminal amplicons.

Assigned #	Amplicon	Size (bp)
1	NcoI + HisTEV - CA - NC + BamHI	1304
1a	NcoI + HisTEV - CA - NC + Stop + BamHI	1307
2	NcoI + HisTEV - CA1 + BamHI	971
2a	NcoI + HisTEV - CA1 + Stop + BamHI	974
3	NcoI + HisTEV - CA2 + BamHI	1154
3a	NcoI + HisTEV - CA2 + Stop + BamHI	1157
3.5a	NcoI + HisTEV - CA3 + Stop + BamHI	1187
4a	NcoI + CA - NC + TEVHis + Stop + BamHI	1298
5a	NcoI + CA1 + TEVHis + Stop + BamHI	965
7a	NcoI + CA3 + TEVHis + Stop + BamHI	1178
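As a quick consistency check on the table above, each "a" amplicon should be exactly one stop codon (3 bp) longer than its counterpart without a stop codon. The following minimal sketch uses the sizes listed in the table; the dictionary layout and function name are illustrative assumptions.

```python
# Amplicon sizes in bp, copied from the table above.
AMPLICON_SIZES = {
    "1": 1304, "1a": 1307,
    "2": 971, "2a": 974,
    "3": 1154, "3a": 1157,
    "3.5a": 1187, "4a": 1298, "5a": 965, "7a": 1178,
}


def stop_codon_deltas(sizes):
    """Size difference between each amplicon and its '+ Stop' ('a') variant."""
    deltas = {}
    for name, size_bp in sizes.items():
        variant = name + "a"
        if not name.endswith("a") and variant in sizes:
            deltas[name] = sizes[variant] - size_bp
    return deltas


print(stop_codon_deltas(AMPLICON_SIZES))
```

Each of the three pairs with both variants listed (1/1a, 2/2a, 3/3a) differs by 3 bp, as expected for the addition of a single stop codon.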

Results and Discussion: To amplify the C-terminal amplicons, a two-step PCR was performed to sequentially add a C-terminal purification tag consisting of a TEV cleavage site and an 8X histidine tag.

For the CA-NC amplicon, the cut site was determined using the mass spectrometry results from the protease cut site experiments (shown in Fig 2). For amplicons CA1, CA2, and CA3, the 3’ cut sites were determined by comparing different subfamilies of centromeric retrotransposons to determine a consensus, from which three hypotheses were derived. Fig 2 expands on Fig 1, showing the relationship between the 4 different amplicons.

Figure 2. Figures showing the relationship between the CA-NC, CA1, CA2, and CA3 amplicons. These figures depict the different protease cut sites on the 3’ end of the construct, which determine where each amplicon begins and ends. Bases 125-600 were omitted to fit on the page.

II. Transformation of Amplicons

Purpose: To transform the amplicons into BL21 so that the proteins can be expressed and purified for VLP formation experimentation.

Attributions: The N-terminus amplicon transformations were performed by John Banasihan, and the C-terminus transformations by Gina Watanabe. A second attempt for the C-terminus was conducted by Jonathan Tello and Shelby Roberson.

Methods and Materials: The major steps include: double digestion using NEB-HF restriction enzymes, vector-insert ligation, DH5-α transformation, colony PCR, plasmid sequencing, PCR product sequencing, BL21(DE3) pLysS transformation, and finally BL21(DE3) pLysS induction. These protocols can all be found on our protocol page.

In short, we used the amplicons mentioned above as template DNA in a PCR reaction. These products were then digested and ligated into a vector, which was then taken up by DH5-α. Because the vector carries an antibiotic resistance gene, the DH5-α colonies that grew had to be the ones that successfully took up the vector. A colony PCR was then performed on the successful colonies and the product from this PCR was sequenced to verify that it was in fact the correct amplicon. All of these steps were then repeated, except that the vector was transformed into BL21 cells.

Results and Discussion: The following figures are agarose gel runs of the colonies that successfully expressed the gene that was inserted. The reason there are multiple lanes with the same amplicon is that we chose multiple colonies expressing the same gene.

For the N-terminal amplicons we were able to successfully express the 1, 2, 2a, 3a, and 3.5a amplicons, and the proteins were then purified. Amplicons 1a, 2, and 3 were not completed due to time constraints and because the amplicons we did express were very similar. For the C-terminal amplicons, we were unable to express the proteins. We used the same protocols for the C-terminus as for the N-terminus, yet transformation into DH5-α was unsuccessful. C-terminus amplicon transformation was attempted multiple times and every attempt was unsuccessful. Possible reasons include: (1) improper parameters for the PCR reactions; (2) unsuccessful annealing of the second-step primers (as the first step was always successful, even for the C-terminus amplicons); (3) contamination that interfered with the PCR reaction; (4) problems with the DH5-α competent cells, e.g. they may have been too old; (5) the restriction enzymes did not properly digest the second-step PCR products.

Figure 1 is an agarose gel run of amplicons 1 and 2a. Amplicon 1 appears clearly on the gel while amplicon 2a does not. A possible reason for this is that the BL21 colony chosen for colony PCR did not contain the amplicon. It is possible that the colony chosen was a satellite colony, which would explain why it was able to grow even in the presence of the antibiotics. After this gel confirmation, the colony chosen for this gel was induced to express the amplicon 1 protein. This protein was then purified and stored for further VLP experimentation. For the 2a amplicon, the transformation process was repeated since it did not appear on the agarose gel.

Figure 2 is an agarose gel of amplicons 1, 3.5a, and 7. Amplicon 1 appears clearly on the gel while amplicons 3.5a and 7 do not. Possible reasons are given in the preceding paragraph. Since neither 3.5a nor 7 showed up, the transformation was repeated for both.

Figure 3 is an agarose gel of amplicons 3a and 2a. Both amplicons show up clearly in almost all lanes, indicating that nearly all of the colonies that grew carried the inserted gene. These colonies were then induced to produce more of their protein, which was purified and stored for further VLP experimentation.

Figure 4 is an agarose gel of amplicons 3a and 3.5a. Both amplicons show up clearly in almost all lanes, indicating that nearly all of the colonies that grew carried the inserted gene. These colonies were then induced to produce more of their protein, which was purified and stored for further VLP experimentation.

Figure 1. An agarose gel of the PCR products from BL21. The top portion of the gel shows the products of a colony PCR of BL21 transformed with amplicon 1. The bottom portion of the gel shows the products of a colony PCR of BL21 transformed with amplicon 2a. Samples were run on a 1% agarose gel at 90 V for 40 minutes.

Figure 2. An agarose gel of the PCR products from BL21. The top portion of the gel shows the products of a colony PCR of BL21 transformed with amplicon 1. The bottom portion of the gel shows the products of a colony PCR of BL21 transformed with construct 7 in lanes 2-6 and construct 3.5a in lanes 7-8. Samples were run on a 1% agarose gel at 90 V for 40 minutes.

Figure 3. An agarose gel of the PCR products from BL21. The top portion of the gel shows the products of a colony PCR of BL21 transformed with construct 3a. The bottom portion of the gel shows the products of a colony PCR of BL21 transformed with amplicon 2a. Samples were run on a 1% agarose gel at 90 V for 40 minutes.

Figure 4. An agarose gel of the PCR products from BL21. The top portion of the gel shows the products of a colony PCR of BL21 transformed with construct 3a. The bottom portion of the gel shows the products of a colony PCR of BL21 transformed with construct 3.5a. Samples were run on a 1% agarose gel at 90 V for 40 minutes.

VLP FORMATION

I. Purification of the Various Amplicons Proteins

Purpose: To purify the various Gag protein constructs and prepare them for VLP assembly. The proteins are stored in the designated VLP assembly buffer at -80 °C.

Attributions: John Banasihan was the primary lead on the VLP formation experiments. Ryan Shontell assisted him in the imaging of the VLPs.

Materials and Methods: Overnight (ON) cultures of each protein construct were prepared, and each was induced in a 100 ml culture for 2 hours at 28 °C. Insoluble protein was purified under denaturing conditions. Each protein was washed three times with denaturing wash buffer and eluted three times with denaturing elution buffer. Each wash and subsequent elution was run on a 15% PAGE gel to confirm the presence of the target protein.

Results and Discussion: The following images are SDS PAGE gels from the purification process. All of the amplicons listed above were successfully purified, and each construct gave bands corresponding to the expected protein size. The purified proteins were then used for further experimentation. The following figures show the purified proteins for constructs 1, 2a, 3, 3a, and 3.5a.

Figure 1. Construct 1 purified from BL21 culture induced at 28℃. Expected size of 48.90 kDa matches with the observed bands of protein on the gel. This was run at 70V stacking for 20 minutes and 100V resolving for 60 minutes on a 15% PAGE.

Figure 2. Construct 2a purified from BL21 culture induced at 28℃. Expected size of 36.98 kDa matches with the observed bands of protein on the gel. This was run at 70V stacking for 20 minutes and 100V resolving for 60 minutes on a 15% PAGE.

Figure 3. Construct 3 purified from BL21 culture induced at 28℃. Expected size of 43.44 kDa matches with the observed bands of protein on the gel. This was run at 70V stacking for 20 minutes and 100V resolving for 60 minutes on a 15% PAGE.

Figure 4. Construct 3a purified from BL21 culture induced at 28℃. Expected size of 43.44 kDa matches with the observed bands of protein on the gel. This was run at 70V stacking for 20 minutes and 100V resolving for 60 minutes on a 15% PAGE.

Figure 5. Construct 3.5a purified from BL21 culture induced at 28℃. Expected size of 43.44 kDa matches with the observed bands of protein on the gel. This was run at 70V stacking for 20 minutes and 100V resolving for 60 minutes on a 15% PAGE.
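The report does not state how the expected sizes in these captions were calculated. One common approach is to sum average amino acid residue masses over the construct's sequence and add one water. The sketch below illustrates that approach only. The tag sequence used in the example (eight histidines followed by the canonical TEV recognition site ENLYFQG) is a hypothetical stand-in, not the team's actual construct sequence.

```python
# Rough expected-mass estimate from an amino acid sequence.
# Average residue masses in Da (free amino acid minus one water), rounded to 2 decimals.
RESIDUE_MASS_DA = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_DA = 18.02  # one water for the free N- and C-termini


def expected_mass_kda(sequence: str) -> float:
    """Return the approximate mass of a linear peptide in kDa."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS_DA)
    if unknown:
        raise ValueError(f"Unrecognized residues: {sorted(unknown)}")
    total_da = sum(RESIDUE_MASS_DA[aa] for aa in seq) + WATER_DA
    return total_da / 1000.0


# Hypothetical tag: 8X His followed by a TEV site (assumption, not from the report).
HYPOTHETICAL_TAG = "HHHHHHHH" + "ENLYFQG"

if __name__ == "__main__":
    print(f"Hypothetical tag mass: {expected_mass_kda(HYPOTHETICAL_TAG):.2f} kDa")
```

Passing a full construct sequence to expected_mass_kda would give a value to compare against the band position on the gel. A tag of this kind adds only about 2 kDa, so tagged and untagged versions of a construct would sit close together on a 15% gel.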

II. Assembly of the Purified Proteins using Various Assembly Buffer Conditions

Purpose: To optimize Gag VLP assembly, to determine whether assembly is favored under acidic or basic conditions, and to determine the effect of an 8X His-tag on VLP assembly.

Attributions: John Banasihan followed a modified version of Ryan’s protocol on VLP assembly and created an assay that tested the tagged and untagged capsid-only protein (Construct 3a).

Materials and Methods: A 200 ml induction was performed for construct 3a (HisTEV-CA2). The inclusion bodies were purified according to the standard protocol without lysozyme. A portion of the protein was set aside for His-tag cleavage. For the tagged proteins, a buffer exchange was performed immediately to suspend the proteins in the three different VLP buffers: acidic, neutral, and basic. The proteins selected for His-tag cleavage were dialyzed against non-denaturing wash buffer and then cleaved with AcTEV protease overnight. The cleaved proteins were then dialyzed against the three different VLP buffers. All protein concentrations were determined via NanoDrop, and 100 ng of each protein was used for VLP assembly.

Each VLP assembly reaction contained 100 ng of protein, 20 mM DTT, 200 mM MgCl2, and the protein’s VLP assembly buffer to a final volume of 20 μl. Once the DTT and MgCl2 were added, the assembly reactions were incubated in a water bath at 30 °C for 3 hours. After incubation, 2 μl of 100% glycerol was added. 6 μl of each sample was taken for observation under the EM at the University of Hawaii at Manoa.
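As a rough illustration of how such a reaction could be pipetted, the sketch below works out component volumes for a 20 μl reaction. It assumes that 20 mM DTT and 200 mM MgCl2 are final concentrations, and the stock concentrations (1 M DTT, 1 M MgCl2) and the 50 ng/μl protein stock in the example are hypothetical values chosen for illustration, not values from this report.

```python
# Sketch of a reaction-mix calculation using C1 * V1 = C2 * V2.
# Stock concentrations below are assumptions, not taken from the protocol.

def assembly_mix(protein_stock_ng_per_ul: float,
                 dtt_stock_mM: float = 1000.0,    # hypothetical 1 M DTT stock
                 mgcl2_stock_mM: float = 1000.0,  # hypothetical 1 M MgCl2 stock
                 protein_ng: float = 100.0,
                 dtt_final_mM: float = 20.0,
                 mgcl2_final_mM: float = 200.0,
                 total_ul: float = 20.0) -> dict:
    """Return the volume (in ul) of each component for one assembly reaction."""
    protein_ul = protein_ng / protein_stock_ng_per_ul
    dtt_ul = dtt_final_mM * total_ul / dtt_stock_mM
    mgcl2_ul = mgcl2_final_mM * total_ul / mgcl2_stock_mM
    buffer_ul = total_ul - (protein_ul + dtt_ul + mgcl2_ul)
    if buffer_ul < 0:
        raise ValueError("Components exceed the total volume; use a more concentrated stock.")
    return {"protein_ul": protein_ul, "dtt_ul": dtt_ul,
            "mgcl2_ul": mgcl2_ul, "buffer_ul": buffer_ul}


def glycerol_percent(glycerol_ul: float = 2.0, reaction_ul: float = 20.0) -> float:
    """Final glycerol concentration (% v/v) after adding 100% glycerol to the reaction."""
    return 100.0 * glycerol_ul / (glycerol_ul + reaction_ul)


if __name__ == "__main__":
    # Example with a hypothetical NanoDrop reading of 50 ng/ul.
    for name, volume in assembly_mix(protein_stock_ng_per_ul=50.0).items():
        print(f"{name}: {volume:.2f} ul")
    print(f"glycerol after addition: {glycerol_percent():.1f}% v/v")
```

Under these assumptions, adding 2 μl of 100% glycerol to a 20 μl reaction gives roughly 9% glycerol by volume in the sample that goes to the EM grid.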

Results and Discussion: The VLP assay tested two factors: (1) the effect of acidity and (2) the effect of a His-tag on VLP assembly. First, we confirmed that our Gag protein variant, construct 3a (CA2), could form VLPs. The EM images of the CA2 VLPs revealed larger VLP formation under acidic conditions and little to no formation under basic conditions (Figures 1 and 3). Under basic conditions, the EM images showed mainly VLP intermediates of the capsid proteins. This suggests that Gag VLP formation may be hindered, but not completely prevented, under basic conditions. Both tagged and untagged VLPs formed under acidic and neutral conditions (Figures 1 and 2). This suggests that the His-tag does not entirely prevent VLP assembly. The average diameter for the tagged capsid proteins assembled in acidic conditions was 42.31 nm, while the average diameter for the untagged capsid proteins was 36.11 nm (Table 1). However, the sample size was too small to perform a true statistical analysis, so these are mainly observations.

Table 1. Average VLP diameter for all samples. Sample sizes (n) differ between conditions: acidic-tagged n = 10, acidic-untagged n = 8, neutral-untagged n = 3, and basic-tagged n = 5.

Assembly Buffer pH	Tagged VLP diameter (nm)	Untagged VLP diameter (nm)
acidic	42.31	36.11
neutral	No VLPs	28.8
basic	32.96	No VLPs

Figure 1. Construct 3a with no His-tag using an assembly buffer with acidic pH. VLPs were imaged in the same buffer they were assembled with. This image was captured on a scale of 50 nm at 40,000x magnification using the Biological Electron Microscopy Facility at the University of Hawaii at Mānoa. The VLPs measured in the image have diameters of 33.4 nm, 37.1 nm, 32.2 nm, 33.2 nm, and 39.5 nm.

Figure 2. Construct 3a with a His-tag using an assembly buffer with acidic pH. VLPs were imaged in the same buffer they were assembled with. This image was captured on a scale of 50 nm at 50,000x magnification using the Biological Electron Microscopy Facility at the University of Hawaii at Mānoa. The VLPs measured in the image have diameters of 28.1 nm, 44.7 nm, 44.9 nm, and 40.8 nm.

Figure 3. Construct 3a with a His-tag using an assembly buffer with basic pH. VLPs were imaged in the same buffer they were assembled with. This image was captured on a scale of 50 nm at 50,000x magnification using the Biological Electron Microscopy Facility at the University of Hawaii at Mānoa.

Figure 4. Construct 3a with no His-tag using an assembly buffer with basic pH. VLPs were imaged in the same buffer they were assembled with. This image was captured on a scale of 50 nm at 50,000x magnification using the Biological Electron Microscopy Facility at the University of Hawaii at Mānoa. Many capsid intermediates were observed, but no VLPs.

Figure 5. Construct 3a with a His-tag using an assembly buffer with basic pH. VLPs were imaged in the same buffer they were assembled with. This image was captured on a scale of 20 nm at 80,000x magnification using the Biological Electron Microscopy Facility at the University of Hawaii at Mānoa.

Figure 6. Construct 3a with no His-tag using an assembly buffer with a pH of 7.3. VLPs were imaged in the same buffer they were assembled with. This image was captured on a scale of 20 nm at 100,000x magnification using the Biological Electron Microscopy Facility at the University of Hawaii at Mānoa. The VLP measured in the image has a diameter of 35.4 nm.
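As a minimal sketch of how the diameter measurements could be summarized, the snippet below computes n, mean, and sample standard deviation using only the per-image diameters quoted in the Figure 1 and Figure 2 captions of this section. These are a subset of the measurements behind Table 1 (which reports n = 8 and n = 10 for the acidic conditions), so the values will not match the table exactly.

```python
# Summary statistics for the VLP diameters listed in the Figure 1 and 2 captions.
from statistics import mean, stdev

# Diameters in nm, copied from the figure captions (subset of the Table 1 data).
untagged_acidic_nm = [33.4, 37.1, 32.2, 33.2, 39.5]  # Figure 1
tagged_acidic_nm = [28.1, 44.7, 44.9, 40.8]          # Figure 2


def summarize(diameters_nm: list) -> dict:
    """Return sample size, mean, and sample standard deviation (n - 1 denominator)."""
    return {
        "n": len(diameters_nm),
        "mean_nm": mean(diameters_nm),
        "sd_nm": stdev(diameters_nm),
    }


if __name__ == "__main__":
    for label, data in [("untagged, acidic", untagged_acidic_nm),
                        ("tagged, acidic", tagged_acidic_nm)]:
        s = summarize(data)
        print(f"{label}: n={s['n']}, mean={s['mean_nm']:.2f} nm, sd={s['sd_nm']:.2f} nm")
```

With only four or five particles per image, the spread within each group is comparable to the difference between the group means, which is consistent with the report's caution that the sample sizes are too small for a formal statistical comparison.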

III. Assembly of the Purified Proteins Constructs and Assembly Time Assays

Purpose: To assemble all other protein constructs into VLPs using the standard VLP assembly buffer and to observe VLP assembly at different time points. Construct 3a was also purified and assembled under acidic and basic conditions to confirm the results from the first VLP experiment.

Materials and Methods: Construct 1 (HisTEV-CA-NC) insoluble proteins were purified into acidic VLP assembly buffer, and a portion was then set aside for His-tag cleavage. Construct 2a (His-TEV-CA1) was also purified and had its His-tag cleaved. Construct 3a proteins were taken from the previous protein purification stock and assembled into VLPs under acidic and basic conditions. During the water bath incubation, Construct 1 samples were removed at 1 hour, 2 hours, and 3 hours. Only 100 ng/μl of protein was used for all assemblies. All VLP samples were observed under the TEM at the University of Hawaii at Manoa.

Results and Discussion: Many of the samples observed under the EM showed little to no VLPs, suggesting that something had gone wrong during the assembly process. One possibility is that the full Gag construct does not assemble well in an acidic assembly buffer. However, very few VLPs were observed even for Construct 3a assembled under acidic conditions, a protein that had already been shown to assemble under those conditions. Some VLPs were nonetheless observed for Construct 1 (Figure 1). This experiment would have to be repeated in order to obtain more reliable results. Construct 1 may have to be assembled under neutral or basic conditions to find the optimal assembly conditions for the full Gag, or more protein should be used for the VLP assembly.

Figure 1. A potential VLP that was assembled in only 1 hour.

Figure 2. Some potential VLPs were observed for untagged construct 1 assembled at 2 hours.

Figure 3. Mostly VLP intermediates were observed for untagged construct 1 assembled at 3 hours. No possible VLPs were identified.

Team:Hawaii/Experiments

PROTEASE CLEAVAGE SITE ASSAY