(18 intermediate revisions by 2 users not shown) | |||
Line 92: | Line 92: | ||
<div id="logo"> | <div id="logo"> | ||
<a href="index.php" class="standard-logo" data-dark-logo="images/logo-dark.png"><img src="https://static.igem.org/mediawiki/2018/1/18/T--Vilnius-Lithuania-OG--logo-black.png"></a> | <a href="index.php" class="standard-logo" data-dark-logo="images/logo-dark.png"><img src="https://static.igem.org/mediawiki/2018/1/18/T--Vilnius-Lithuania-OG--logo-black.png"></a> | ||
− | + | ||
</div><!-- #logo end --> | </div><!-- #logo end --> | ||
Line 115: | Line 115: | ||
<div class="swiper-container swiper-parent"> | <div class="swiper-container swiper-parent"> | ||
<div class="swiper-wrapper"> | <div class="swiper-wrapper"> | ||
− | <div class="swiper-slide" style="background-image: url(' | + | <div class="swiper-slide" style="background-image: url('https://static.igem.org/mediawiki/2018/d/d7/T--Vilnius-Lithuania-OG--protgan.jpg');"> |
<div class="container clearfix"> | <div class="container clearfix"> | ||
<div class="slider-caption slider-caption-center"> | <div class="slider-caption slider-caption-center"> | ||
− | <h2 data-caption-animate="fadeInUp" style="color:aliceblue; "> | + | <h2 data-caption-animate="fadeInUp" style="color:aliceblue; ">Protein GAN</h2> |
<p class="d-none d-sm-block" data-caption-animate="fadeInUp" data-caption-delay="400"></p> | <p class="d-none d-sm-block" data-caption-animate="fadeInUp" data-caption-delay="400"></p> | ||
</div> | </div> | ||
Line 150: | Line 150: | ||
<li><a href="#" data-href="#section-building">Building ProteinGAN</a></li> | <li><a href="#" data-href="#section-building">Building ProteinGAN</a></li> | ||
<li><a href="#" data-href="#section-results">Results</a></li> | <li><a href="#" data-href="#section-results">Results</a></li> | ||
+ | <li><a href="#" data-href="#section-deeper">Deeper look at ProteinGAN</a></li> | ||
</ul> | </ul> | ||
</nav> | </nav> | ||
Line 183: | Line 184: | ||
<p>After seeing multiple successful applications of GANs (generative adversarial networks) in numerous fields, we decided to apply them to the <strong>field of synthetic biology</strong> for the <strong>creation of novel biological parts</strong> with useful functions. More specifically, we were interested in the world’s cleanest and most environmentally friendly catalysts: <strong>enzymes</strong>. For many important reactions used in research or industry, <strong>we don’t have the appropriate enzymes</strong> to catalyze them and have no option but to use chemical catalysts.</p> | <p>After seeing multiple successful applications of GANs (generative adversarial networks) in numerous fields, we decided to apply them to the <strong>field of synthetic biology</strong> for the <strong>creation of novel biological parts</strong> with useful functions. More specifically, we were interested in the world’s cleanest and most environmentally friendly catalysts: <strong>enzymes</strong>. For many important reactions used in research or industry, <strong>we don’t have the appropriate enzymes</strong> to catalyze them and have no option but to use chemical catalysts.</p> | ||
− | <p>Thus, we decided to build the <strong>world’s first Protein Generative Adversarial Network</strong> (ProteinGAN), which would be capable of learning “what makes a protein a protein”. We started by acquiring and standardizing a large number of protein sequences from public databases, each of which had a specific class attributed to it.</p> | + | <p>Thus, we decided to build the <strong>world’s first Protein sequence Generative Adversarial Network</strong> (ProteinGAN), which would be capable of learning “what makes a protein a protein”. We started by acquiring and standardizing a large number of protein sequences from public databases, each of which had a specific class attributed to it.</p> |
<p><br />After an in-depth literature analysis and a large number of in-silico prototypes, we built an appropriate GAN architecture for protein work. Finally, we trained the neural networks on specific classes of enzymes. We hoped they would learn to generate the class of enzymes they were trained on, yet also deliver unique protein sequences for that class.</p> | <p><br />After an in-depth literature analysis and a large number of in-silico prototypes, we built an appropriate GAN architecture for protein work. Finally, we trained the neural networks on specific classes of enzymes. We hoped they would learn to generate the class of enzymes they were trained on, yet also deliver unique protein sequences for that class.</p> | ||
− | <p>All important technical details, architectural choices and a detailed explanation of how ProteinGAN works can be found at the end of the page.</p> | + | <p>All important technical details, architectural choices and a detailed explanation of how ProteinGAN works can be found at the <a href="https://2018.igem.org/Team:Vilnius-Lithuania-OG/ProteinGAN#section-deeper">end of the page.</a></p> |
− | <p>In addition to that, we also provided <strong>a short guide on how to build and train your own ProteinGAN</strong>!</p> | + | <p>In addition to that, we also provided <strong> <a href="https://2018.igem.org/Team:Vilnius-Lithuania-OG/ReactionGAN#section-run">a short guide on how to build and train your own ProteinGAN </a></strong>!</p> |
Line 354: | Line 355: | ||
− | + | <section id="section-deeper" class="page-section"> | |
<div class="fancy-title title-border-color"> | <div class="fancy-title title-border-color"> | ||
<h2>Deeper look at ProteinGAN</h2> | <h2>Deeper look at ProteinGAN</h2> | ||
Line 463: | Line 464: | ||
<p>Using the scores from the discriminator, each part of the GAN is evaluated using a loss function. </p> | <p>Using the scores from the discriminator, each part of the GAN is evaluated using a loss function. </p> | ||
− | < | + | <img style=" width: 65%; display: block; margin-left: auto; margin-right: auto; margin-bottom: 5%;" src="https://static.igem.org/mediawiki/2018/e/ef/T--Vilnius-Lithuania-OG--Formula.gif"> |
+ | <img style=" width: 35%; display: block; margin-left: auto; margin-right: auto; margin-bottom: 5%;" src="https://static.igem.org/mediawiki/2018/4/40/T--Vilnius-Lithuania-OG--Formula2.gif"> | ||
Line 526: | Line 528: | ||
</div> | </div> | ||
− | <p>Given the original GAN formulation, there is nothing to prevent the generator from producing a single, very realistic example to fool the discriminator. This scenario is known as mode collapse. It happens when the generator learns to ignore its input (random numbers). For the generator, this is an efficient way to start producing examples that fool the discriminator. However, it is undesirable behaviour, and it eventually cripples training because the discriminator can easily memorize the generated examples. While working with proteins, we observed that this issue is even more severe than with images. Many approaches have been proposed in the scientific community to address mode collapse: Unrolled GAN (Metz et al., 2017), Dual Discriminator (Nguyen et al., 2017), and Mini Batch Discriminator (Salimans et al., 2016), to name a few. We preferred the Mini Batch Discriminator approach due to its simplicity and minimal overhead. The Mini Batch Discriminator works as an extra layer in the network that computes the standard deviation across the batch of examples (a batch contains only real or only fake sequences). If the batch contains a small variety of examples, the standard deviation will be low, and the discriminator will be able to use this information to lower the final score for each example in the batch. ProteinGAN | + | <p>Given the original GAN formulation, there is nothing to prevent the generator from producing a single, very realistic example to fool the discriminator. This scenario is known as mode collapse. It happens when the generator learns to ignore its input (random numbers). For the generator, this is an efficient way to start producing examples that fool the discriminator. However, it is undesirable behaviour, and it eventually cripples training because the discriminator can easily memorize the generated examples. While working with proteins, we observed that this issue is even more severe than with images.
Many approaches have been proposed in the scientific community to address mode collapse: Unrolled GAN (Metz et al., 2017), Dual Discriminator (Nguyen et al., 2017), and Mini Batch Discriminator (Salimans et al., 2016), to name a few. We preferred the Mini Batch Discriminator approach due to its simplicity and minimal overhead. The Mini Batch Discriminator works as an extra layer in the network that computes the standard deviation across the batch of examples (a batch contains only real or only fake sequences). If the batch contains a small variety of examples, the standard deviation will be low, and the discriminator will be able to use this information to lower the final score for each example in the batch. ProteinGAN follows the approach proposed by the authors of Progressively Growing GAN (Karras et al., 2018). |
</p> | </p> | ||
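<p>The batch standard deviation idea described above can be illustrated with a minimal NumPy sketch. This is not the ProteinGAN implementation: the function name <code>minibatch_stddev</code>, the <code>(batch, length, channels)</code> layout and the choice to average all standard deviations into one scalar are assumptions made for illustration, loosely following the description in Karras et al. (2018).</p>

```python
import numpy as np


def minibatch_stddev(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Append one batch-diversity channel to a (batch, length, channels) array.

    Hypothetical helper for illustration only; shapes and names are assumptions.
    """
    # Standard deviation of every feature at every position, taken over the batch axis.
    std_per_feature = np.sqrt(features.var(axis=0) + eps)
    # Collapse to a single scalar that summarises how varied the batch is.
    mean_std = std_per_feature.mean()
    # Broadcast the scalar to (batch, length, 1) and attach it as an extra channel,
    # so later discriminator layers can see the batch diversity for each example.
    batch, length, _ = features.shape
    extra = np.full((batch, length, 1), mean_std, dtype=features.dtype)
    return np.concatenate([features, extra], axis=-1)


rng = np.random.default_rng(0)
diverse = rng.normal(size=(8, 16, 4))          # a batch of varied examples
collapsed = np.repeat(diverse[:1], 8, axis=0)  # one example repeated: mode collapse

diverse_out = minibatch_stddev(diverse)
collapsed_out = minibatch_stddev(collapsed)
```

<p>In this sketch the extra channel is close to zero for the collapsed batch and clearly larger for the diverse one, which is the signal a discriminator could use to lower its score for every example in a low-variety batch.</p>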
− | + | <p><a href="https://2018.igem.org/Team:Vilnius-Lithuania-OG/ReactionGAN"> Click here to find what we did next </a></p> | |
<div class="toggle toggle-bg" style="margin-top: 10%;"> | <div class="toggle toggle-bg" style="margin-top: 10%;"> |
Latest revision as of 02:52, 18 October 2018