Team:Duke/Model




Homology Modeling Overview

As mentioned in the project description, our goal is to link five genes from the taxol biosynthesis pathway. To better understand the behavior of the proteins we have isolated from this pathway, we are using homology modeling to learn about their active site architecture and catalytic functions. Homology modeling is based on the observation that proteins with related sequences tend to have similar 3-D structures and functions. During homology modeling, one or more template proteins of known structure are used to identify structurally conserved regions and to predict structurally variable regions, which often contain the residues that differ from the known structure. Through homology modeling, we can learn details about the protein such as active site architecture and ligand binding. As a rule of thumb, when the target sequence is 30-50% identical in amino acid sequence to the template, the two proteins share more than 80% of their 3-D structure. During the modeling process, we therefore look for templates with at least 30% sequence identity to the target.
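
The 30% identity cutoff can be checked on any pairwise alignment. The following is a minimal sketch, assuming the target and template have already been aligned (gaps written as "-") and that identity is counted over columns where neither sequence has a gap; other definitions of the denominator exist. The function names and the toy sequences are illustrative and are not taken from our project data.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent of aligned (non-gap) columns with identical residues."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_target.upper(), aligned_template.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_usable_template(aligned_target: str, aligned_template: str,
                       cutoff: float = 30.0) -> bool:
    """Apply the 30%+ identity rule of thumb to one alignment."""
    return percent_identity(aligned_target, aligned_template) >= cutoff


# Toy alignment (hypothetical sequences, for illustration only).
target = "MKT-AYIAKQR"
template = "MRTLAYLAK-R"
print(round(percent_identity(target, template), 1))
print(is_usable_template(target, template))
```

In practice the alignment would come from the modeling server or an alignment tool; this helper only scores an alignment that already exists.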

Tools

PyMOL: a molecular visualization application that allows users to view the 3-D structure of any protein, including secondary, tertiary, and quaternary structure and the molecular interactions between side chains. It is the main tool we use to visualize the active sites and termini of our proteins.

ModWeb: a web server for automated comparative modeling. Target sequences are matched against the available protein templates, and models built on reliable templates are returned.

Literature: previously published scholarly articles on proteins that provide reliable template structures for our specific models. Many mutations and active site features have already been described in these articles.

BAPT/DBAT

Template proteins: 5KJV/5KJS
Organism: A. thaliana
Belongs to BAHD family of acyltransferases (Members can be identified by sequence homology & universally conserved HXXXD & DFGWG motifs)
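
The motif-based identification mentioned above can be sketched as a simple sequence scan. This is a minimal example, assuming a one-letter amino acid sequence as input; the function names and the toy sequence are hypothetical, and finding both motifs is only a first-pass screen, not proof of BAHD membership.

```python
import re

# BAHD signature motifs; X stands for any residue.
# Lookaheads are used so that overlapping occurrences are also reported.
MOTIFS = {
    "HXXXD": re.compile(r"(?=H[A-Z]{3}D)"),
    "DFGWG": re.compile(r"(?=DFGWG)"),
}


def find_motifs(sequence: str) -> dict:
    """Return the 1-based start positions of each motif in the sequence."""
    seq = sequence.upper()
    return {
        name: [m.start() + 1 for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


def looks_like_bahd(sequence: str) -> bool:
    """True if both signature motifs occur at least once."""
    return all(find_motifs(sequence).values())


# Toy sequence (hypothetical, for illustration only).
toy = "MSAHHVADLLKDFGWGKP"
print(find_motifs(toy))
print(looks_like_bahd(toy))
```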

General features
2 quasi-symmetric N-terminal (1-171 & 374-394) & C-terminal (223-373 & 395-431) domains, connected by a long loop (172-222)

  • Each domain has beta sheet core flanked by alpha helices with similar spatial arrangement
  • The active site is at the interface of the domains
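
The residue ranges above can be turned into a small lookup so that any residue of interest is assigned to its domain. This is a sketch using only the ranges listed in these notes; the total length of 431 is an assumption taken from the last range, and the names `DOMAINS`, `domain_of`, and `ranges_tile_sequence` are illustrative.

```python
# Inclusive residue ranges, copied from the notes above.
DOMAINS = {
    "N-terminal domain": [(1, 171), (374, 394)],
    "interdomain loop": [(172, 222)],
    "C-terminal domain": [(223, 373), (395, 431)],
}


def domain_of(residue: int) -> str:
    """Return the name of the region that contains the given residue number."""
    for name, ranges in DOMAINS.items():
        if any(start <= residue <= end for start, end in ranges):
            return name
    raise ValueError(f"residue {residue} is outside all listed ranges")


def ranges_tile_sequence(length: int = 431) -> bool:
    """True if every residue 1..length falls in exactly one listed range."""
    counts = [0] * (length + 1)
    for ranges in DOMAINS.values():
        for start, end in ranges:
            if start < 1 or end > length:
                return False
            for i in range(start, end + 1):
                counts[i] += 1
    return all(c == 1 for c in counts[1:])


print(domain_of(153))
print(ranges_tile_sequence())
```

A check like `ranges_tile_sequence` is a cheap way to catch transcription errors in the ranges before they are used for coloring or selections.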

Structural features
Tau-nitrogen of His153, H-bond with shikimate 5-O of p-coumaroylshikimate

  • A base that deprotonates 5-hydroxyl of the acyl acceptor substrate
Indolic N of Trp371 within H-bond distance of the carbonyl oxygen of p-coumaroyl-CoA
  • An oxyanion hole that stabilizes the negative charge on the tetrahedral intermediate, which then collapses with CoA as the leaving group to produce the ester product p-coumaroylshikimate
Thr 369, H-bond with 3-hydroxyl of shikimate moiety
Arg 356, salt-bridge with carboxyl of shikimate moiety
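
To look at these four residues in PyMOL, a short command script can be generated from the residue numbers. The sketch below only builds the text of the script; it assumes that the residue numbering in the deposited 5KJV file matches the numbering used in these notes and that no chain restriction is needed, neither of which we have verified here. The function name and selection name are illustrative.

```python
def pymol_active_site_script(pdb_id, residues, selection_name="active_site"):
    """Build PyMOL command lines that highlight the given residue numbers."""
    # PyMOL accepts several residue numbers joined by "+" after "resi".
    resi = "+".join(str(r) for r in sorted(residues))
    return [
        f"fetch {pdb_id}",
        "hide everything",
        "show cartoon",
        f"select {selection_name}, resi {resi}",
        f"show sticks, {selection_name}",
        f"zoom {selection_name}",
    ]


# His153, Arg356, Thr369, Trp371 from the notes above.
script = pymol_active_site_script("5KJV", [153, 356, 369, 371])
print("\n".join(script))
```

The printed lines can be saved to a .pml file and opened with PyMOL, or pasted into the PyMOL command line one at a time.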

Distinctive active site conformational states in HCTs
Apo & Holo AtHCT structures have active site conformational changes
Catalytic His153 (switchlike conformational shift)

  • Imidazole side chain stabilized in a 180° rotation relative to apo conformation
  • In p-coumaroyl-CoA-bound AtHCT, His153 side chain is rotated ~90° relative to apo state
Arg 356
  • In presence of p-coumaroylshikimate, side chain stabilized inside active site
  • Likely mediated by electrostatic attraction between positively charged guanidinium group of the arginine side chain & negatively charged carboxyl group of shikimate
3 loops (L1 (31-34), L2 (357-365), L3 (392-398) )
  • Shift inward upon binding of various ligands, causing active site to shrink

Universally Conserved Residues
An arginine handle dictates acyl acceptor specificity in HCT

  • Universal conservation of Arg356 & surrounding residues (354-362)
Other universally conserved residues: Thr369 & Trp371

More roles of Arg 356
Electrostatic attraction between positively charged Arg356 side chain & negatively charged shikimate facilitates binding of acyl acceptor substrate to active site
Orients substrate’s functional groups properly relative to enzyme’s catalytic machinery
Contributes to catalytic mechanism

  • Salt bridge formation between arginine side chain and shikimate carboxyl confers acyl acceptor substrate binding affinity
  • Arg356 handle orients shikimate in a catalytically productive conformation in HCT’s active site, increasing the fraction of productive encounters
  • Mutating Arg356 resulted in a total loss of native enzymatic activity; mutating Arg356 to negatively charged residues increased specificity toward certain positively charged non-native substrates

badA

Template protein: 4RLQ
Organism: Rhodopseudomonas palustris (bacteria)
Class: PFAM00501 (ATP-dependent adenylation enzymes)
Plants and bacteria employ aroyl CoA thioesters for biosynthesis of specialized metabolites
Mechanism: ATP-dependent CoA ligases use 2 half-reactions to catalyze thioesterification

General Features
BadA has C-terminal/N-terminal domains joined by a flexible hinge

  • badA always in conformation primed for thioesterification
  • Nonconserved active site Lys427, present in badA active site in thiolation conformation, not required for adenylation but necessary for thioesterification
Our protein of interest 2-Me-BzO has characteristic N-/C-terminal domains of other ATP-dependent adenylases
  • N-terminal (1-434), C-terminal (435-522) domains fold analogously to the benzoate CoA ligase BCLM (2v7b)
  • N-terminal contains benzoate binding site
  • C-terminal domain contacts other edge of benzoate binding pocket, positions benzoate into active site through charged interaction between carboxylate of benzoate and Lys427

Features of active site
Largely hydrophobic
Para- carbon of BzO neighbors carbonyl of Leu332 along peptide backbone
2 meta- carbons of BzO point toward His333 & Ala227
Si- & re- faces of BzO positioned between backbone amide bonds, comprising Gly327, Ser328, Thr329 on 1 face, Tyr on other face
When Bz-AMP forms, Lys427 moves from interaction with carboxylate of substrate to several polar contacts with Bz-AMP & peptide backbone
Indicates Lys427 is needed for thiolation reaction

Rational mutation of the BadA active site
4 residues (Ala227, Leu332, His 333, Ile334) surround the phenyl ring of BzO in the active site
The residue positioned near a given carbon of BzO depends on whether BzO is in carboxylate-bound orientation or rotated Bz-AMP-bound orientation
Targeted mutations of Ala227Gly, Leu332Ala, His333Ala, Ile334Ala should show relative importance of 2 orientations
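
The targeted mutations above are written in three-letter notation (for example Ala227Gly). A small helper can convert them to one-letter notation and apply them to a sequence while checking that the wild-type residue is really at the stated position. This is a minimal sketch; the function names are illustrative, and the sequence used in the usage lines is a toy string, not the BadA sequence.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

MUTATION = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def parse_mutation(label: str):
    """Split e.g. 'Ala227Gly' into ('A', 227, 'G')."""
    m = MUTATION.fullmatch(label.strip())
    if m is None:
        raise ValueError(f"not a three-letter point mutation: {label!r}")
    wild_type, position, mutant = m.groups()
    return THREE_TO_ONE[wild_type], int(position), THREE_TO_ONE[mutant]


def short_name(label: str) -> str:
    """Convert 'Ala227Gly' to 'A227G'."""
    wild_type, position, mutant = parse_mutation(label)
    return f"{wild_type}{position}{mutant}"


def apply_mutation(sequence: str, label: str) -> str:
    """Return the sequence with the point mutation applied (1-based position)."""
    wild_type, position, mutant = parse_mutation(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, "
                         f"found {sequence[position - 1]}")
    return sequence[:position - 1] + mutant + sequence[position:]


targets = ["Ala227Gly", "Leu332Ala", "His333Ala", "Ile334Ala"]
print([short_name(t) for t in targets])
print(apply_mutation("MAL", "Ala2Gly"))  # toy sequence, illustration only
```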

BadA structures & Homology
Enzymes in this family typically fold into a larger N-terminal domain (400-550 residues) & a smaller C-terminal domain (~110 residues), with the active site lying in between
N-terminal domain contains residues that bind carboxylate substrate
C-terminal residues coordinate ribose & phosphate groups
Phe226 residue in badA offset by ~72° away from BzO, opening CoA binding channel

Catalytically Important Lysine Residues
Uses C-terminal lysine (Lys427/512 for BadA) to orient BzO substrate
Lys427 coordinates BzO in the active site, while Lys512 is far from the active site and solvent-exposed in the C-domain
Lys427 of BadA makes 4 polar contacts with Bz-AMP (1 with benzoyl oxygen, 3 with AMP moiety) and 2 with Gly303 & Gly430 (these contacts anchor Bz-AMP)
Prediction: a Lys427Ala badA mutant would be impaired in the second half-reaction (thioesterification) of badA

  • When badA assumes the adenylation conformation, Lys512 enters the active site

Tax10

Template protein: 4KE4
Organism: Sorghum bicolor
Hydroxycinnamoyltransferase (HCT) participates in an early step of the phenylpropanoid pathway

Structure of apo-form SbHCT
2 domains (I & II) with 16 beta-strands & 17 alpha-helices

  • Domain I (1-193, 389-409)
  • Domain II (194-388, 410-448)
  • Both domains have mixed beta-sheet flanked by alpha-helices on both sides of the sheet
Beta-strands in core sheet of domain I arranged in order 1-6-7-8-3-14
Beta-strands in core sheet of domain II arranged in order 9-16-15-13-10-11-12
Domain I has another anti-parallel beta-sheet (2-5-4) perpendicular to core beta-sheets
Residues of 2 core beta-sheets were hydrophobic surface residues
5 helices connecting beta12 & beta 13 were interrupted by Gly & Pro residues
2 regions of high temperature factors in SbHCT
  • 182-236 linked domains I & II
  • C-terminal end of alpha 10 & 257-267 (contained 3 Arg residues)

Substrate-binding Site
Visualize position of p-coumaroyl-CoA & shikimate

  • Both located between domains I & II
Binding pocket is a tunnel throughout the enzyme
  • One side formed by amino end of alpha7 & the carboxy ends of beta10 & beta13
  • Other side formed by amino ends of beta3, alpha15, alpha16, and the loop connecting beta14 and beta15
  • Residues lining the walls were hydrophobic
  • Tunnel had a preference for molecules capable of adopting a linear chain formation
Pantothenate unit located between beta10 & beta13 of domain II
  • Carbonyl oxygen of pantothenate interacted with backbone amide of Asp-300
  • Guanidinium side chains of Arg-252 & Arg-304 weakly interacted with phosphate groups of HS-CoA
Phenolic moiety of p-coumaroyl shikimate anchored to both the hydroxyl side chains of Ser38 & Tyr-40 by an ordered solvent molecule
Shikimate moiety also coordinated by the neighboring residues
  • Hydroxyl group on C3 atom forms H-bond with hydroxyl side chains of Thr-384
  • Carboxyl group forms salt bridge with guanidinium group of Arg-371

Characterization of steady-state kinetics for SbHCT and its mutant forms
6 mutants (T36A, S38A, Y40A, H162A, R371A, T384A)
They contain a similar amount of secondary structures

Identification of related BAHD family members via distance matrix alignment and BLAST sequences
All contain HXXXD motif (catalytic His site), some also share DFGWG motif (distal to active site)
CoA dependent enzymes had 2 mixed beta-sheets at core
162HHVAD166 located on alpha 6, at interface between 2 domains

  • His 162 at middle of tunnel
  • Carboxyl side chain of Asp-166 (salt bridge with Arg302)
395DFGWG399 motif located between beta13 and beta14
  • Core of domain I & II respectively
  • Likely involved in stabilizing 2-domain structure
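
The positional motif notation used above (start number, residues, end number, as in 162HHVAD166) can be expanded into individual numbered residues, which also checks that the numbers and the motif length agree. This is a small sketch; the function name is illustrative.

```python
import re


def parse_positional_motif(label: str) -> dict:
    """Expand e.g. '162HHVAD166' into {162: 'H', 163: 'H', ..., 166: 'D'}."""
    m = re.fullmatch(r"(\d+)([A-Z]+)(\d+)", label)
    if m is None:
        raise ValueError(f"unrecognized motif label: {label!r}")
    start, residues, end = int(m.group(1)), m.group(2), int(m.group(3))
    # The numbering must span exactly as many positions as there are residues.
    if end - start + 1 != len(residues):
        raise ValueError(f"numbering and motif length disagree in {label!r}")
    return {start + i: aa for i, aa in enumerate(residues)}


hxxxd = parse_positional_motif("162HHVAD166")
dfgwg = parse_positional_motif("395DFGWG399")
print(hxxxd)
print(dfgwg)
```

For the two motifs listed here, the expansion places His at 162 and Asp at 166, in agreement with the residues named in the bullets.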

Substrate binding & catalytic mechanism
Likely p-coumaroyl-CoA & shikimate enter active site on opposite faces of enzymes
Likely p-coumaroyl-CoA preconditions binding of shikimate
Binding pocket for shikimate hydrophobic

  • Val31, Pro32, Ala 298, Ile318, Phe376, Leu414, Phe416, Leu418
  • These residues not conserved
  • 2 polar residues Arg371, Thr384 at locations for binding a shikimate molecule
  • Proposed mechanism with H162 & T36

Substrate specificity of SbHCT and other members of BAHD family
Residual activity in S38A & Y40A is consistent with the subtle role of these residues in substrate binding
Mutants showed reduced activity relative to wild type: T36A (6%), T384A (9%), S38A (40%), Y40A (62%)
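
To compare the mutants on a common scale, the percentages can be converted to fold reductions. The sketch below assumes that each percentage is the activity remaining relative to wild type (100%); that reading is our interpretation of the numbers above and should be checked against the source paper. The names used are illustrative.

```python
# Percent values from the notes above, read as activity remaining (assumption).
RESIDUAL_ACTIVITY_PERCENT = {"T36A": 6, "T384A": 9, "S38A": 40, "Y40A": 62}


def fold_reduction(residual_percent: float) -> float:
    """Fold decrease in activity relative to wild type set at 100%."""
    if residual_percent <= 0:
        raise ValueError("residual activity must be positive")
    return 100.0 / residual_percent


# List mutants from most to least impaired.
for mutant, pct in sorted(RESIDUAL_ACTIVITY_PERCENT.items(), key=lambda kv: kv[1]):
    print(f"{mutant}: {pct}% of wild type, {fold_reduction(pct):.1f}-fold lower")
```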

TycA

Template proteins: 2VSQ, 5N81, 2XHG, 5ISW
Organism: Multiple

Overview
L-Phe-specific TycA initiation module (TycAf herein) from tyrocidine biosynthesis

  • Adenylation (A), thiolation (T), epimerization (E) domains
  • Preference for native substrate L-Phe
  • T-domains were subsequently primed with the ppant arm using Sfp 4’-phosphopantetheinyl transferase and coenzyme A (CoASH)

Reprogramming TycAF for (S)-beta-Phe
Simultaneous randomization of several positions in the tycA active site
Share conserved residues Asp235 & Lys517

  • Asp235 binds the backbone amino group of the substrate
  • Lys517 binds the carboxylate group of the substrate

Catalytic Functions
Replaced T328-S329-I330-C331 in beta13beta14 loop of TycApgamma with fully randomized tripeptide

  • A236 randomized for structural adjustments
Fully randomized beta13beta14 loop converged to a Xaa-Leu-Val motif
  • Xaa=Ala, Thr, Cys, Val, Leu
Chose variant TycAbetapgamma with A236V mutation, a Cys-Leu-Val beta13beta14 loop sequence, and auxiliary W239S substitution
These changes inverted the substrate preference in favor of the beta-amino acid
  • Reversion of W239S mutation afforded variant TycAbetaF, with a 220:1 preference for (S)-beta-Phe over L-Phe
  • This switch was achieved while maintaining high catalytic efficiency
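
A 220:1 preference can be expressed as a free energy difference, which makes it easier to compare with other engineered enzymes. The sketch below assumes that the ratio is a ratio of catalytic efficiencies (kcat/KM) for the two substrates and uses 298.15 K; both are our assumptions, and the function name is illustrative. It applies the standard relation ddG = RT ln(ratio).

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def selectivity_to_ddg(ratio: float, temperature_k: float = 298.15) -> float:
    """Free energy difference (kcal/mol) that corresponds to a selectivity ratio."""
    if ratio <= 0:
        raise ValueError("selectivity ratio must be positive")
    return R_KCAL * temperature_k * math.log(ratio)


print(round(selectivity_to_ddg(220), 2))
```

Under these assumptions, a 220-fold preference corresponds to roughly 3.2 kcal/mol.
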
Structural analysis of alpha/beta switch
C-terminally truncated A domain of TycAbetapgamma & TycAbetaF
Invariant residue Asp235 and auxiliary position Trp/Ser239, used to enlarge the binding pocket
TycAbetaF-AN & A domain of GrsA (a close homologue of TycAf)
  • Engineering of position 236 and the loop connecting beta-strands 13 & 14 resulted in a reshaped substrate binding pocket that accommodates the rotated aryl group
Beta-aminoacyl-AMP analogues
  • AMP moiety in conserved A-domain nucleotide pocket
  • The appended beta-amino acid docked in remodeled substrate recognition site
W239S mutation enlarges recognition site, providing space for the propargyl substituent used for screening
Beta-amine of (S)-beta-Phe forms the conserved salt bridge with Asp235, which orients it syn-periplanar to sulfamate NH

Downstream processing of beta-amino acids
Recombinantly produce tetramodular synthetase GrsB to test TycAbetaF
Engineered NRPS consisting of TycAbetaF & GrsB reconstituted in E. coli HM0079