As mentioned in the project description, our goal is to link five genes from the taxol biosynthesis pathway. To better understand the behavior of the proteins we have isolated from this pathway, we are using homology modeling to learn about their active site architecture and catalytic functions. Homology modeling is based on the observation that proteins with related sequences tend to have similar 3-D structures and functions. During homology modeling, one or more template proteins of known structure are used to identify structurally conserved regions and to predict structurally variable regions, which often contain the residues that differ from the known structure. Through homology modeling, we can learn details about the protein such as active site architecture and ligand binding. Usually, when the target sequence shares 30-50% identical amino acids with the template sequence, the two proteins share more than 80% of their 3-D structure. During the modeling process, we therefore look for template sequences with at least 30% sequence identity as good models for the target sequence.
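The 30% identity cutoff described above can be checked with a few lines of Python. This is a minimal sketch, not the scoring used by any particular modeling server: it assumes the target and template have already been aligned (gaps written as `-`), and it counts identity over columns where neither sequence has a gap, which is one of several common conventions. The names `percent_identity`, `is_usable_template`, and `IDENTITY_CUTOFF` are hypothetical.

```python
IDENTITY_CUTOFF = 30.0  # percent; the template threshold described in the text


def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent of gap-free alignment columns with identical residues.

    Assumption: both strings come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def is_usable_template(aligned_target: str, aligned_template: str,
                       cutoff: float = IDENTITY_CUTOFF) -> bool:
    """True if the template meets the sequence-identity cutoff."""
    return percent_identity(aligned_target, aligned_template) >= cutoff
```

Other definitions divide by the full alignment length or by the shorter sequence length, so the same pair can score a few percent differently depending on the tool.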

Tools

PyMOL: a molecular visualization application that lets users view the 3-D structure of a protein, including its secondary, tertiary, and quaternary structure and the molecular interactions between side chains. It is the main tool we use to visualize the active sites and termini of proteins.

ModWeb: a web-based comparative modeling server that matches a submitted target sequence against template proteins of known structure and returns models built from reliable templates.

Literature: previously published scholarly articles on related proteins, which point to reliable template sequences for our specific models. Many mutations and active site features have already been characterized in these articles.

BAPT/DBAT

Template proteins: 5KJV/5KJS

Organism: A. thaliana

Family: BAHD acyltransferases (members can be identified by sequence homology and the universally conserved HXXXD and DFGWG motifs)
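Because BAHD family members are recognized in part by the HXXXD and DFGWG motifs, a quick motif scan is a useful sanity check on a candidate sequence. The sketch below uses Python's standard `re` module. The function name `find_motifs` and the short test sequence are hypothetical and are not taken from BAPT, DBAT, or the templates.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
# "X" in HXXXD means any residue, written as "." in the regex.
MOTIF_PATTERNS = {
    "HXXXD": r"(?=(H.{3}D))",
    "DFGWG": r"(?=(DFGWG))",
}


def find_motifs(sequence: str) -> dict:
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    seq = sequence.upper()
    return {
        name: [(m.start() + 1, m.group(1)) for m in re.finditer(pattern, seq)]
        for name, pattern in MOTIF_PATTERNS.items()
    }


# Made-up sequence for illustration only (not a real protein).
toy_sequence = "MKHAGGDLLDFGWGK"
hits = find_motifs(toy_sequence)
```

HXXXD is short enough to occur by chance, so a hit is only suggestive; it should be read together with the DFGWG match and the overall sequence homology.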

Two quasi-symmetric domains, an N-terminal domain (residues 1-171 and 374-394) and a C-terminal domain (residues 223-373 and 395-431), connected by a long loop (residues 172-222)

Each domain has a beta-sheet core flanked by alpha helices, with a similar spatial arrangement in both domains
The active site lies at the interface between the two domains
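The domain boundaries listed above can be kept in a small lookup table, which makes it easy to ask which domain a residue belongs to or to build a selection string for coloring the domains in PyMOL. This is a sketch under the assumption that the loaded structure uses the same residue numbering as the ranges above. `DOMAIN_RANGES`, `domain_of`, and `pymol_selection` are hypothetical names.

```python
# Residue ranges as listed for the template structure.
DOMAIN_RANGES = {
    "n_terminal": [(1, 171), (374, 394)],
    "loop": [(172, 222)],
    "c_terminal": [(223, 373), (395, 431)],
}


def domain_of(residue: int):
    """Return the domain name containing this residue number, or None."""
    for name, ranges in DOMAIN_RANGES.items():
        if any(start <= residue <= end for start, end in ranges):
            return name
    return None


def pymol_selection(name: str) -> str:
    """Build a PyMOL 'resi' selection expression for one domain."""
    return "resi " + "+".join(f"{start}-{end}" for start, end in DOMAIN_RANGES[name])
```

For example, `pymol_selection("n_terminal")` gives `resi 1-171+374-394`, which could be pasted into a PyMOL command such as `select nterm, resi 1-171+374-394` once the template structure is loaded.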

Structural features
