<!DOCTYPE html> <html lang="en" dir="ltr" xmlns="http://www.w3.org/1999/xhtml" xmlns:mml="http://www.w3.org/1998/Math/MathML"> <head prefix="og: http://ogp.me/ns# article: http://ogp.me/ns/article# book: http://ogp.me/ns/book#" > <!--[if IE]><![endif]--> <link rel="dns-prefetch" href="//d33xdlntwy0kbs.cloudfront.net" /> <link rel="dns-prefetch" href="//www.google.com" /> <link rel="dns-prefetch" href="//scholar.google.com" /> <link rel="dns-prefetch" href="//www.googletagmanager.com" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="shortcut icon" href="https://www.biorxiv.org/sites/default/files/images/favicon.ico" type="image/vnd.microsoft.icon" /> <meta name="viewport" content="width=device-width, initial-scale=1" /> <link rel="alternate" type="application/pdf" title="Full Text (PDF)" href="/content/10.1101/714402v2.full.pdf" /> <link rel="alternate" type="text/plain" title="Full Text (Plain)" href="/content/10.1101/714402v2.full.txt" /> <meta name="article_thumbnail" content="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/embed/inline-graphic-1.gif" /> <meta name="type" content="article" /> <meta name="category" content="article" /> <meta name="HW.identifier" content="/biorxiv/early/2019/07/27/714402.atom" /> <meta name="HW.pisa" content="biorxiv;714402v2" /> <meta name="DC.Format" content="text/html" /> <meta name="DC.Language" content="en" /> <meta name="DC.Title" content="Targeted optimization of regulatory DNA sequences with neural editing architectures" /> <meta name="DC.Identifier" content="10.1101/714402" /> <meta name="DC.Date" content="2019-07-28" /> <meta name="DC.Publisher" content="Cold Spring Harbor Laboratory" /> <meta name="DC.Rights" content="© 2019, Posted by Cold Spring Harbor Laboratory. The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission." 
/> <meta name="DC.AccessRights" content="restricted" /> <meta name="DC.Description" content="Targeted optimizing of existing DNA sequences for useful properties, has the potential to enable several synthetic biology applications from modifying DNA to treat genetic disorders to designing regulatory elements to fine tune context-specific gene expression. Current approaches for targeted genome editing are largely based on prior biological knowledge or ad-hoc rules. Few if any machine learning approaches exist for targeted optimization of regulatory DNA sequences. Here, we propose a novel generative neural network architecture for targeted DNA sequence editing – the EDA architecture – consisting of an encoder, decoder, and analyzer. We showcase the use of EDA to optimize regulatory DNA sequences to bind to the transcription factor SPI1. Compared to other state-of-the-art approaches such as a textual variational autoencoder and rule-based editing, EDA significantly improves predicted binding of SPI1 of genomic sequences with the minimal set of edits. We also use EDA to design regulatory elements with optimized grammars of CREB1 binding sites that can tune reporter expression levels as measured by massively parallel reporter assays (MPRA). We analyze the properties of the binding sites in the edited sequences and find patterns that are consistent with previously reported grammatical rules which tie gene expression to CRE binding site density, spacing and affinity." 
/> <meta name="DC.Contributor" content="Anvita Gupta" /> <meta name="DC.Contributor" content="Anshul Kundaje" /> <meta name="article:published_time" content="2019-07-28" /> <meta name="article:section" content="New Results" /> <meta name="citation_title" content="Targeted optimization of regulatory DNA sequences with neural editing architectures" /> <meta name="citation_abstract" lang="en" content="<h3>Abstract</h3> <p>Targeted optimizing of existing DNA sequences for useful properties, has the potential to enable several synthetic biology applications from modifying DNA to treat genetic disorders to designing regulatory elements to fine tune context-specific gene expression. Current approaches for targeted genome editing are largely based on prior biological knowledge or ad-hoc rules. Few if any machine learning approaches exist for targeted optimization of regulatory DNA sequences.</p><p>Here, we propose a novel generative neural network architecture for targeted DNA sequence editing – the EDA architecture – consisting of an encoder, decoder, and analyzer. We showcase the use of EDA to optimize regulatory DNA sequences to bind to the transcription factor SPI1. Compared to other state-of-the-art approaches such as a textual variational autoencoder and rule-based editing, EDA significantly improves predicted binding of SPI1 of genomic sequences with the minimal set of edits. We also use EDA to design regulatory elements with optimized grammars of CREB1 binding sites that can tune reporter expression levels as measured by massively parallel reporter assays (MPRA). 
We analyze the properties of the binding sites in the edited sequences and find patterns that are consistent with previously reported grammatical rules which tie gene expression to CRE binding site density, spacing and affinity.</p>" /> <meta name="citation_journal_title" content="bioRxiv" /> <meta name="citation_publisher" content="Cold Spring Harbor Laboratory" /> <meta name="citation_publication_date" content="2019/01/01" /> <meta name="citation_mjid" content="biorxiv;714402v2" /> <meta name="citation_id" content="714402v2" /> <meta name="citation_public_url" content="https://www.biorxiv.org/content/10.1101/714402v2" /> <meta name="citation_abstract_html_url" content="https://www.biorxiv.org/content/10.1101/714402v2.abstract" /> <meta name="citation_full_html_url" content="https://www.biorxiv.org/content/10.1101/714402v2.full" /> <meta name="citation_pdf_url" content="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402.full.pdf" /> <meta name="citation_doi" content="10.1101/714402" /> <meta name="citation_num_pages" content="7" /> <meta name="citation_article_type" content="Article" /> <meta name="citation_section" content="New Results" /> <meta name="citation_firstpage" content="714402" /> <meta name="citation_author" content="Anvita Gupta" /> <meta name="citation_author_institution" content="Department of Computer Science, Stanford University" /> <meta name="citation_author_email" content="avgupta@stanford.edu" /> <meta name="citation_author" content="Anshul Kundaje" /> <meta name="citation_author_institution" content="Department of Computer Science, Stanford University" /> <meta name="citation_author_institution" content="Department of Genetics, Stanford University" /> <meta name="citation_author_orcid" content="http://orcid.org/0000-0003-3084-2287" /> <meta name="citation_reference" content="E. P. Consortium et al. An integrated encyclopedia of dna elements in the human genome. Nature, 489(7414):57, 2012." 
/> <meta name="citation_reference" content="J. E. Davis, K. D. Insigne, E. M. Jones, Q. B. Hastings, and S. Kosuri. Multiplexed dissection of a model human transcription factor binding site architecture. bioRxiv, 2019. doi: 10.1101/625434. URL https://www.biorxiv.org/content/early/2019/05/02/625434." /> <meta name="citation_reference" content="M. Ghandi, D. Lee, M. Mohammad-Noori, and M. A. Beer. Enhanced regulatory sequence prediction using gapped k-mer features. PLoS computational biology, 10(7):e1003711, 2014." /> <meta name="citation_reference" content="I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. stat, 1050:20, 2015." /> <meta name="citation_reference" content="A. Gupta and J. Zou. Feedback gan for dna optimizes protein functions. Nature Machine Intelligence, 1(2):105, 2019." /> <meta name="citation_reference" content="A. Gupta, A. T. Müller, B. J. Huisman, J. A. Fuchs, P. Schneider, and G. Schneider. Generative recurrent networks for de novo drug design. Molecular informatics, 37(1-2):1700111, 2018." /> <meta name="citation_reference" content="K. Guu, T. B. Hashimoto, Y. Oren, and P. Liang. Generating sentences by editing prototypes. Transactions of the Association of Computational Linguistics, 6:437–450, 2018." /> <meta name="citation_reference" content="S. Heinz, C. Benner, N. Spann, E. Bertolino, Y. C. Lin, P. Laslo, J. X. Cheng, C. Murre, H. Singh, and C. K. Glass. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and b cell identities. Molecular cell, 38(4):576–589, 2010." 
/> <meta name="twitter:title" content="Targeted optimization of regulatory DNA sequences with neural editing architectures" /> <meta name="twitter:site" content="@biorxivpreprint" /> <meta name="twitter:card" content="summary" /> <meta name="twitter:image" content="https://www.biorxiv.org/sites/default/files/images/biorxiv_logo_homepage7-5-small.png" /> <meta name="twitter:description" content="Targeted optimizing of existing DNA sequences for useful properties, has the potential to enable several synthetic biology applications from modifying DNA to treat genetic disorders to designing regulatory elements to fine tune context-specific gene expression. Current approaches for targeted genome editing are largely based on prior biological knowledge or ad-hoc rules. Few if any machine learning approaches exist for targeted optimization of regulatory DNA sequences. Here, we propose a novel generative neural network architecture for targeted DNA sequence editing – the EDA architecture – consisting of an encoder, decoder, and analyzer. We showcase the use of EDA to optimize regulatory DNA sequences to bind to the transcription factor SPI1. Compared to other state-of-the-art approaches such as a textual variational autoencoder and rule-based editing, EDA significantly improves predicted binding of SPI1 of genomic sequences with the minimal set of edits. We also use EDA to design regulatory elements with optimized grammars of CREB1 binding sites that can tune reporter expression levels as measured by massively parallel reporter assays (MPRA). We analyze the properties of the binding sites in the edited sequences and find patterns that are consistent with previously reported grammatical rules which tie gene expression to CRE binding site density, spacing and affinity." 
/> <meta name="og-title" property="og:title" content="Targeted optimization of regulatory DNA sequences with neural editing architectures" /> <meta name="og-url" property="og:url" content="https://www.biorxiv.org/content/10.1101/714402v2" /> <meta name="og-site-name" property="og:site_name" content="bioRxiv" /> <meta name="og-description" property="og:description" content="Targeted optimizing of existing DNA sequences for useful properties, has the potential to enable several synthetic biology applications from modifying DNA to treat genetic disorders to designing regulatory elements to fine tune context-specific gene expression. Current approaches for targeted genome editing are largely based on prior biological knowledge or ad-hoc rules. Few if any machine learning approaches exist for targeted optimization of regulatory DNA sequences. Here, we propose a novel generative neural network architecture for targeted DNA sequence editing – the EDA architecture – consisting of an encoder, decoder, and analyzer. We showcase the use of EDA to optimize regulatory DNA sequences to bind to the transcription factor SPI1. Compared to other state-of-the-art approaches such as a textual variational autoencoder and rule-based editing, EDA significantly improves predicted binding of SPI1 of genomic sequences with the minimal set of edits. We also use EDA to design regulatory elements with optimized grammars of CREB1 binding sites that can tune reporter expression levels as measured by massively parallel reporter assays (MPRA). We analyze the properties of the binding sites in the edited sequences and find patterns that are consistent with previously reported grammatical rules which tie gene expression to CRE binding site density, spacing and affinity." 
/> <meta name="og-type" property="og:type" content="article" /> <meta name="og-image" property="og:image" content="https://www.biorxiv.org/sites/default/files/images/biorxiv_logo_homepage7-5-small.png" /> <meta name="citation_date" content="2019-07-28" /> <link rel="alternate" type="application/vnd.ms-powerpoint" title="Powerpoint" href="/content/10.1101/714402v2.ppt" /> <meta name="description" content="bioRxiv - the preprint server for biology, operated by Cold Spring Harbor Laboratory, a research and educational institution" /> <meta name="generator" content="Drupal 7 (http://drupal.org)" /> <link rel="canonical" href="https://www.biorxiv.org/content/10.1101/714402v2" /> <link rel="shortlink" href="https://www.biorxiv.org/node/843498" /> <title>Targeted optimization of regulatory DNA sequences with neural editing architectures | bioRxiv</title> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__7SC0i-kTgUlQGKuqbmyS18Sez8FDO-aG9FSHkGrLGl8__EBUojqg5W_1M2-aUl2Y1w9JxEEauoJ0pj29wOb-7Vz4__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <link type="text/css" rel="stylesheet" href="//cdn.jsdelivr.net/qtip2/2.2.1/jquery.qtip.min.css" media="all" /> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__JbFqFjYGp4Zx8gvmj6v5YmfNmbiFphGHPblyC9bfG5Y__OScmsb_1nSVmm_Ax3cJ5Rq7p081PahYkvF_YWQd5GtE__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <style type="text/css" media="all"> /* <![CDATA[ */ .panels-flexible-new .panels-flexible-region{padding:0}.panels-flexible-new .panels-flexible-region-inside{padding-right:.5em;padding-left:.5em}.panels-flexible-new .panels-flexible-region-inside-first{padding-left:0}.panels-flexible-new .panels-flexible-region-inside-last{padding-right:0}.panels-flexible-new .panels-flexible-column{padding:0}.panels-flexible-new .panels-flexible-column-inside{padding-right:.5em;padding-left:.5em}.panels-flexible-new 
.panels-flexible-column-inside-first{padding-left:0}.panels-flexible-new .panels-flexible-column-inside-last{padding-right:0}.panels-flexible-new .panels-flexible-row{padding:0 0 .5em;margin:0}.panels-flexible-new .panels-flexible-row-last{padding-bottom:0}.panels-flexible-column-new-main{float:left;width:99.0000%}.panels-flexible-new-inside{padding-right:0}.panels-flexible-new{width:auto}.panels-flexible-region-new-center{float:left;width:99.0000%}.panels-flexible-row-new-main-row-inside{padding-right:0} /* ]]> */ </style> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__PWuQ_RYRTJ4BLEKsbWQeHysPg0gOQ3571ruQa_rXvAo__pWyBeQHtoijrNnOoDz9ZPdiNPirDrSCPOz0Q1CDCeno__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <style type="text/css" media="all"> /* <![CDATA[ */ #sliding-popup.sliding-popup-bottom,#sliding-popup.sliding-popup-bottom .eu-cookie-withdraw-banner,.eu-cookie-withdraw-tab{background:gray}#sliding-popup.sliding-popup-bottom.eu-cookie-withdraw-wrapper{background:transparent}#sliding-popup .popup-content #popup-text h1,#sliding-popup .popup-content #popup-text h2,#sliding-popup .popup-content #popup-text h3,#sliding-popup .popup-content #popup-text p,.eu-cookie-compliance-secondary-button,.eu-cookie-withdraw-tab{color:#fff !important}.eu-cookie-withdraw-tab{border-color:#fff}.eu-cookie-compliance-more-button{color:#fff !important} /* ]]> */ </style> <!--[if lte IE 7]> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__ElJr3PIJEvw3qLXc1cnYiLj2G4KgDPSXFOfm6Phf8hw__JdWGm15cDWjsK6KrFlQVXQix9YgNeYysf22XZHj-Y-c__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <![endif]--> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__Wnlyen9qEpwh_Qaf9okEu4QdVGM0BDothxeqA6Nbvo8__EJmw6SZD9bYoS8jocCpPYS3JFRURpdzmuvJoAUNiI-g__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> 
<link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__ILEenG-KqqErAu310RXnS1Elhp5zIvbYwapZiD3LUWw__igagZiGB2PktJMjUpZ2u3tnc56saoUGIrD5N0KSRmI4__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <!--[if (lt IE 9)&(!IEMobile)]> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__XH6bpcI0f2dImc-p674DLCZtWBGb-QwxJK1YexVGtno__vUceGprdo5nIhV6DH93X7fI3r8RcTJbChbas9TQXeW4__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <![endif]--> <!--[if gte IE 9]><!--> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__2WBMox6sOrN42ss5lCnH7WWVRdFdJCxtTKnQJYRwTE4__yqNvNYLvMpjy3ffuJrjjm9uW2i-Me1c23KLYuWHaqio__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <!--<![endif]--> <link type="text/css" rel="stylesheet" href="https://www.biorxiv.org/sites/default/files/advagg_css/css__pdaTRVXQETcsPVwg9_CCQdY0-Qaoz2ABtoEZNiXGzuk__5c1ZiN_pM9tjxvpyA9VpEacScF0S_W4R222pC-s_-Pk__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.css" media="all" /> <link type="text/css" rel="stylesheet" href="https://d33xdlntwy0kbs.cloudfront.net/cshl_custom.css" media="all" /> <script type="text/javascript" src="https://www.biorxiv.org/sites/default/files/advagg_js/js__BKYqkKToQ7EjirB7eIdMEH5521EU3da9IpoOs8Ex2XI__aSjVoX8giBmLhN2EbCgIGQJNu89Mh5aVu1LvI_gkJ7Y__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js"></script> <script type="text/javascript" src="//cdn.jsdelivr.net/qtip2/2.2.1/jquery.qtip.min.js"></script> <script type="text/javascript" src="https://www.biorxiv.org/sites/default/files/advagg_js/js__4Cn2dxvNlsJ-sHe6QOTLREaQvcqb0Yh0Zm9tTOHtQow__JeZEUjzbaj_yX6UjCI8eBbXy_J64ZVuoWmc2fSpLZHo__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js"></script> <script type="text/javascript" src="https://www.google.com/recaptcha/api.js?hl=en&render=explicit&onload=drupalRecaptchaOnload"></script> <script type="text/javascript" 
src="https://www.biorxiv.org/sites/default/files/advagg_js/js__dGWpV57YWu3sX6UOe04RMH-iP9jSkEP7Ajt0caYXZZk__1l8Wa0iuIek7SEVmMuU0Y9TlAvRR-XZVfl1u9ezOPes__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js"></script> <script type="text/javascript" async="async" src="https://scholar.google.com/scholar_js/casa.js"></script> <script type="text/javascript" async="async" src="https://www.googletagmanager.com/gtag/js?id=G-RZD586MC3Q"></script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- /*! * yepnope1.5.4 * (c) WTFPL, GPLv2 */ (function(a,b,c){function d(a){return"[object Function]"==o.call(a)}function e(a){return"string"==typeof a}function f(){}function g(a){return!a||"loaded"==a||"complete"==a||"uninitialized"==a}function h(){var a=p.shift();q=1,a?a.t?m(function(){("c"==a.t?B.injectCss:B.injectJs)(a.s,0,a.a,a.x,a.e,1)},0):(a(),h()):q=0}function i(a,c,d,e,f,i,j){function k(b){if(!o&&g(l.readyState)&&(u.r=o=1,!q&&h(),l.onload=l.onreadystatechange=null,b)){"img"!=a&&m(function(){t.removeChild(l)},50);for(var d in y[c])y[c].hasOwnProperty(d)&&y[c][d].onload()}}var j=j||B.errorTimeout,l=b.createElement(a),o=0,r=0,u={t:d,s:c,e:f,a:i,x:j};1===y[c]&&(r=1,y[c]=[]),"object"==a?l.data=c:(l.src=c,l.type=a),l.width=l.height="0",l.onerror=l.onload=l.onreadystatechange=function(){k.call(this,r)},p.splice(e,0,u),"img"!=a&&(r||2===y[c]?(t.insertBefore(l,s?null:n),m(k,j)):y[c].push(l))}function j(a,b,c,d,f){return q=0,b=b||"j",e(a)?i("c"==b?v:u,a,b,this.i++,c,d,f):(p.splice(this.i++,0,a),1==p.length&&h()),this}function k(){var a=B;return a.loader={load:j,i:0},a}var l=b.documentElement,m=a.setTimeout,n=b.getElementsByTagName("script")[0],o={}.toString,p=[],q=0,r="MozAppearance"in l.style,s=r&&!!b.createRange().compareNode,t=s?l:n.parentNode,l=a.opera&&"[object Opera]"==o.call(a.opera),l=!!b.attachEvent&&!l,u=r?"object":l?"script":"img",v=l?"script":u,w=Array.isArray||function(a){return"[object Array]"==o.call(a)},x=[],y={},z={timeout:function(a,b){return 
b.length&&(a.timeout=b[0]),a}},A,B;B=function(a){function b(a){var a=a.split("!"),b=x.length,c=a.pop(),d=a.length,c={url:c,origUrl:c,prefixes:a},e,f,g;for(f=0;f<d;f++)g=a[f].split("="),(e=z[g.shift()])&&(c=e(c,g));for(f=0;f<b;f++)c=x[f](c);return c}function g(a,e,f,g,h){var i=b(a),j=i.autoCallback;i.url.split(".").pop().split("?").shift(),i.bypass||(e&&(e=d(e)?e:e[a]||e[g]||e[a.split("/").pop().split("?")[0]]),i.instead?i.instead(a,e,f,g,h):(y[i.url]?i.noexec=!0:y[i.url]=1,f.load(i.url,i.forceCSS||!i.forceJS&&"css"==i.url.split(".").pop().split("?").shift()?"c":c,i.noexec,i.attrs,i.timeout),(d(e)||d(j))&&f.load(function(){k(),e&&e(i.origUrl,h,g),j&&j(i.origUrl,h,g),y[i.url]=2})))}function h(a,b){function c(a,c){if(a){if(e(a))c||(j=function(){var a=[].slice.call(arguments);k.apply(this,a),l()}),g(a,j,b,0,h);else if(Object(a)===a)for(n in m=function(){var b=0,c;for(c in a)a.hasOwnProperty(c)&&b++;return b}(),a)a.hasOwnProperty(n)&&(!c&&!--m&&(d(j)?j=function(){var a=[].slice.call(arguments);k.apply(this,a),l()}:j[n]=function(a){return function(){var b=[].slice.call(arguments);a&&a.apply(this,b),l()}}(k[n])),g(a[n],j,b,n,h))}else!c&&l()}var h=!!a.test,i=a.load||a.both,j=a.callback||f,k=j,l=a.complete||f,m,n;c(h?a.yep:a.nope,!!i),i&&c(i)}var i,j,l=this.yepnope.loader;if(e(a))g(a,0,l,0);else if(w(a))for(i=0;i<a.length;i++)j=a[i],e(j)?g(j,0,l,0):w(j)?B(j):Object(j)===j&&h(j,l);else Object(a)===a&&h(a,l)},B.addPrefix=function(a,b){z[a]=b},B.addFilter=function(a){x.push(a)},B.errorTimeout=1e4,null==b.readyState&&b.addEventListener&&(b.readyState="loading",b.addEventListener("DOMContentLoaded",A=function(){b.removeEventListener("DOMContentLoaded",A,0),b.readyState="complete"},0)),a.yepnope=k(),a.yepnope.executeStack=h,a.yepnope.injectJs=function(a,c,d,e,i,j){var k=b.createElement("script"),l,o,e=e||B.errorTimeout;k.src=a;for(o in 
d)k.setAttribute(o,d[o]);c=j?h:c||f,k.onreadystatechange=k.onload=function(){!l&&g(k.readyState)&&(l=1,c(),k.onload=k.onreadystatechange=null)},m(function(){l||(l=1,c(1))},e),i?k.onload():n.parentNode.insertBefore(k,n)},a.yepnope.injectCss=function(a,c,d,e,g,i){var e=b.createElement("link"),j,c=i?h:c||f;e.href=a,e.rel="stylesheet",e.type="text/css";for(j in d)e.setAttribute(j,d[j]);g||(n.parentNode.insertBefore(e,n),m(c,0))}})(this,document); //--><!]]> </script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- yepnope({ test: Modernizr.matchmedia, nope: '/sites/all/libraries/media-match/media.match.min.js' }); //--><!]]> </script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- var _prum=[['id', '612e3d94173350001100005d'], ['mark', 'firstbyte', (new Date()).getTime()]]; (function() { var s=document.getElementsByTagName('script')[0], p=document.createElement('script'); p.async='async'; p.src='//rum-static.pingdom.net/prum.min.js';s.parentNode.insertBefore(p,s);})(); //--><!]]> </script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- if(typeof window.MathJax === "undefined") window.MathJax = { menuSettings: { zoom: "Click" } }; //--><!]]> </script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- window.dataLayer = window.dataLayer || [];function gtag(){dataLayer.push(arguments)};gtag("js", new Date());gtag("set", "developer_id.dMDhkMT", true);gtag("config", "G-RZD586MC3Q", {"groups":"default","anonymize_ip":true}); //--><!]]> </script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- jQuery.extend(Drupal.settings,{"basePath":"\/","pathPrefix":"","ajaxPageState":{"theme":"jcore_1","theme_token":"lznJ5Ilj00-ksBkvMqzQt63mJBfvRJDWRl2zXnBXHbE"},"colorbox":{"opacity":"0.85","current":"{current} of {total}","previous":"\u00ab Prev","next":"Next 
\u00bb","close":"Close","maxWidth":"98%","maxHeight":"98%","fixed":true,"mobiledetect":true,"mobiledevicewidth":"480px"},"highwire":{"nid":"843498","apath":"\/biorxiv\/early\/2019\/07\/27\/714402.atom","pisa":"biorxiv;714402v2","processed":["highwire_math"],"markup":[{"requested":"full-text","variant":"full-text","view":"full","pisa":"biorxiv;714402v2"}],"modal_window_width":"560","share_modal_width":"560","share_modal_title":"Share this Article"},"jcarousel":{"ajaxPath":"\/jcarousel\/ajax\/views"},"instances":"{\u0022highwire_abstract_tooltip\u0022:{\u0022content\u0022:{\u0022text\u0022:\u0022\u0022},\u0022style\u0022:{\u0022tip\u0022:{\u0022width\u0022:20,\u0022height\u0022:20,\u0022border\u0022:1,\u0022offset\u0022:0,\u0022corner\u0022:true},\u0022classes\u0022:\u0022qtip-custom hw-tooltip hw-abstract-tooltip qtip-shadow qtip-rounded\u0022,\u0022classes_custom\u0022:\u0022hw-tooltip hw-abstract-tooltip\u0022},\u0022position\u0022:{\u0022at\u0022:\u0022right center\u0022,\u0022my\u0022:\u0022left center\u0022,\u0022viewport\u0022:true,\u0022adjust\u0022:{\u0022method\u0022:\u0022shift\u0022}},\u0022show\u0022:{\u0022event\u0022:\u0022mouseenter click \u0022,\u0022solo\u0022:true},\u0022hide\u0022:{\u0022event\u0022:\u0022mouseleave \u0022,\u0022fixed\u0022:1,\u0022delay\u0022:\u0022100\u0022}},\u0022highwire_author_tooltip\u0022:{\u0022content\u0022:{\u0022text\u0022:\u0022\u0022},\u0022style\u0022:{\u0022tip\u0022:{\u0022width\u0022:15,\u0022height\u0022:15,\u0022border\u0022:1,\u0022offset\u0022:0,\u0022corner\u0022:true},\u0022classes\u0022:\u0022qtip-custom hw-tooltip hw-author-tooltip qtip-shadow qtip-rounded\u0022,\u0022classes_custom\u0022:\u0022hw-tooltip hw-author-tooltip\u0022},\u0022position\u0022:{\u0022at\u0022:\u0022top center\u0022,\u0022my\u0022:\u0022bottom center\u0022,\u0022viewport\u0022:true,\u0022adjust\u0022:{\u0022method\u0022:\u0022\u0022}},\u0022show\u0022:{\u0022event\u0022:\u0022mouseenter 
\u0022,\u0022solo\u0022:true},\u0022hide\u0022:{\u0022event\u0022:\u0022mouseleave \u0022,\u0022fixed\u0022:1,\u0022delay\u0022:\u0022100\u0022}},\u0022highwire_reflinks_tooltip\u0022:{\u0022content\u0022:{\u0022text\u0022:\u0022\u0022},\u0022style\u0022:{\u0022tip\u0022:{\u0022width\u0022:15,\u0022height\u0022:15,\u0022border\u0022:1,\u0022mimic\u0022:\u0022top center\u0022,\u0022offset\u0022:0,\u0022corner\u0022:true},\u0022classes\u0022:\u0022qtip-custom hw-tooltip hw-ref-link-tooltip qtip-shadow qtip-rounded\u0022,\u0022classes_custom\u0022:\u0022hw-tooltip hw-ref-link-tooltip\u0022},\u0022position\u0022:{\u0022at\u0022:\u0022bottom left\u0022,\u0022my\u0022:\u0022top left\u0022,\u0022viewport\u0022:true,\u0022adjust\u0022:{\u0022method\u0022:\u0022flip\u0022}},\u0022show\u0022:{\u0022event\u0022:\u0022mouseenter \u0022,\u0022solo\u0022:true},\u0022hide\u0022:{\u0022event\u0022:\u0022mouseleave \u0022,\u0022fixed\u0022:1,\u0022delay\u0022:\u0022100\u0022}}}","qtipDebug":"{\u0022leaveElement\u0022:0}","panel_ajax_tab":{"path":"sites\/all\/modules\/contrib\/panels_ajax_tab"},"disqus":{"domain":"biorxivstage","url":"https:\/\/www.biorxiv.org\/content\/10.1101\/714402v2","title":"Targeted optimization of regulatory DNA sequences with neural editing 
architectures","identifier":"node\/843498"},"panels_ajax_pane":{"new-28877068-b9cb-4641-830b-b6b4638c98bb":"{\u0022encrypted\u0022:\u0022{\\\u0022encrypted\\\u0022:\\\u0022P0MXxSXDsX8MHBJxITwWUGYgHgqe71GsDrPTHsIGenKHDu3hMxiL\\\\\\\/kzW+FXM2LV5e3+jyjmNO3BcfGe\\\\\\\/6w3TdjUj5GdXa1I4LaRh5QRthVKHEYcmAc\\\\\\\/L\\\\\\\/DJHnpyyIss87755ae\\\\\\\/aMZ9PsdWpcLwEnRbTEVFr5SHwxHVpY3IMNfDvRzYRhSHSiVmiSJwbHoS8Iz0jjwLWBW7AhLn0YCuDJkc9bzEs6uauPqy9loperg98kRVw9vz+3tdD6Bwiw17EMQWwSmLfg4xkeUZZZMxdABVIuZs9vbQ4psXIAK8mEkDw\\\\\\\/Y9Rr88czJwscRlK2jhGYSaYlUfkuR1c\\\\\\\/fUVWGqI2PNFQWeG1H\\\\\\\/7AEnmtmBHO1YCdR02L7UQyQbqFQSnem0Yslx5qlnQ6CDJMy1VkVLkI7EooWFJlkQd1od53\\\\\\\/xD2YA2rnW8EEy4vw0k\\\\\\\/9OHd3xaPHGhbnxtk8Akf+X4ueiFDbsD6q6ngoqZsHVAsCnEHlPEGTh8tViNOTthzgIeyYEPYQTg7x0MXy1ksBEJtoqp1tGMWlKPTa2gyyd02IBJnUsTB5wy6oAMrTqYnki75ST+qqOsMaJfUFZ0ctk1C0JAz4JTDcuMXdKr20LhYmzs9OVKOdzXmf4RtIepHmdFNInk617uOK9hgSp5SEXsf9u71yfUWpKwMT6BFaUfGl5cdYpfPWoD7S9wy7IziAj5LCQ2LACrw+sn5vIhgyiAZTMl8gp1OlcI4FMia2upFLN2mxdmkD0gTmetU0U\\\\\\\/qBhotmb37Qldb2rulHJG7WcUjDKjr9fZnj6bELqZdN5c8qP+S7KO0fFkMACMhfOB8yeyAcGSeuJS1CXZziXYQU+PjG+TL46VZlJ2feWPUQVUUX0wVDm+PV9cgHjSGqaYnlfNU08TlTcijlpCawWwhy0c23\\\\\\\/X\\\\\\\/GoeUnEmaf7dbo4z408h3\\\\\\\/5Yw8nagb73oVuZjFCBQFfcfzYhIKTon3V1N\\\\\\\/ontuLh08Iki8PshT548Z10QJ4CH7EJmsd6OwAtN0n7fTlYnMnRCTt9ElfW5qUVYt6BoplYC\\\\\\\/8kP056186dVGkqhAEhioR+6YNptyVzea0JOoVvbjI+acFMrWGhfVlwuHKJJpZpZ3bwTNcdQjz8yOSZn6RpnWygimyIV9Hz59ajNIVkijSdPx88DuzDMD3Pe1gBiVn1vtWG7jtZxqOFPD2ODvp1RuSmehLMc265JRM+vG4WR80xoz5qqIum54nE4BmzzcI1cL36DRB6XT2bc14s+GmNkya1IMhDQ1YlXpFnVUsLoSPUt1Qc9q\\\\\\\/DfKC5qCzHS3aVZfifPfaKl45\\\\\\\/Z2AX\\\\\\\/wdIFE\\\\\\\/jvRZ0Xaz4JI5yCKMY2T02HT7Ss6dwUMGwuMUds6wbX1UBW+Gi2VCW+pR4cGXRW+Q=\\\u0022,\\\u0022iv\\\u0022:\\\u0022Gfb\\\\\\\/qqgjSEGNGLdue177Mg==\\\u0022,\\\u0022salt\\\u0022:\\\u0022f9d8845712f72c9fee5db7d9ff111eab\\\u0022}\u0022,\u0022hmac\u0022:\u00228ee99dc5861b133285498ca4cc52331a42ec034561fe5da77513bf4b0a1ba0f9\u0022}"},"urlIsAjaxTrusted":{"\/content\/10.1101\/714402v2.
full":true},"ws_fl":{"width":100,"height":21},"ws_gpo":{"size":"","annotation":"","lang":"","callback":"","width":300},"color":{"logo":"https:\/\/www.biorxiv.org\/sites\/default\/files\/biorxiv_article.jpg"},"highwire_list_expand":{"is_collapsed":"1"},"highwireResponsive":{"enquire_enabled":1,"breakpoints_configured":1,"breakpoints":{"zero":"all and (min-width: 0px)","xsmall":"all and (min-width: 380px)","narrow":"all and (min-width: 768px) and (min-device-width: 768px), (max-device-width: 800px) and (min-width: 768px) and (orientation:landscape)","normal":"all and (min-width: 980px) and (min-device-width: 980px), all and (max-device-width: 1024px) and (min-width: 1024px) and (orientation:landscape)","wide":"all and (min-width: 1220px)"}},"eu_cookie_compliance":{"popup_enabled":1,"popup_agreed_enabled":0,"popup_hide_agreed":0,"popup_clicking_confirmation":1,"popup_scrolling_confirmation":false,"popup_html_info":"\u003Cdiv\u003E\n \u003Cdiv class=\u0022popup-content info\u0022\u003E\n \u003Cdiv id=\u0022popup-text\u0022\u003E\n \u003Cp\u003EWe use cookies on this site to enhance your user experience. 
By clicking any link on this page you are giving your consent for us to set cookies.\u003C\/p\u003E\n \u003C\/div\u003E\n \u003Cdiv id=\u0022popup-buttons\u0022\u003E\n \u003Cbutton type=\u0022button\u0022 role=\u0022dialog\u0022 aria-labelledby=\u0022Continue\u0022\n class=\u0022agree-button eu-cookie-compliance-default-button\u0022\u003EContinue\u003C\/button\u003E\n \u003Cbutton type=\u0022button\u0022 role=\u0022dialog\u0022 aria-labelledby=\u0022Find out more\u0022\n class=\u0022find-more-button eu-cookie-compliance-more-button\u0022\u003EFind out more\u003C\/button\u003E\n \u003C\/div\u003E\n \u003C\/div\u003E\n\u003C\/div\u003E","use_mobile_message":false,"mobile_popup_html_info":"\u003Cdiv\u003E\n \u003Cdiv class=\u0022popup-content info\u0022\u003E\n \u003Cdiv id=\u0022popup-text\u0022\u003E\n \u003C\/div\u003E\n \u003Cdiv id=\u0022popup-buttons\u0022\u003E\n \u003Cbutton type=\u0022button\u0022 role=\u0022dialog\u0022 aria-labelledby=\u0022Continue\u0022\n class=\u0022agree-button eu-cookie-compliance-default-button\u0022\u003EContinue\u003C\/button\u003E\n \u003Cbutton type=\u0022button\u0022 role=\u0022dialog\u0022 aria-labelledby=\u0022Find out more\u0022\n class=\u0022find-more-button eu-cookie-compliance-more-button\u0022\u003EFind out more\u003C\/button\u003E\n \u003C\/div\u003E\n \u003C\/div\u003E\n\u003C\/div\u003E","mobile_breakpoint":"768","popup_html_agreed":"\u003Cdiv\u003E\n \u003Cdiv class=\u0022popup-content agreed\u0022\u003E\n \u003Cdiv id=\u0022popup-text\u0022\u003E\n \u003Ch2\u003EThank you for accepting cookies\u003C\/h2\u003E\u003Cp\u003EYou can now hide this message or find out more about cookies.\u003C\/p\u003E \u003C\/div\u003E\n \u003Cdiv id=\u0022popup-buttons\u0022\u003E\n \u003Cbutton type=\u0022button\u0022 class=\u0022hide-popup-button eu-cookie-compliance-hide-button\u0022\u003EHide\u003C\/button\u003E\n \u003Cbutton type=\u0022button\u0022 class=\u0022find-more-button eu-cookie-compliance-more-button-thank-you\u0022 
\u003EMore info\u003C\/button\u003E\n \u003C\/div\u003E\n \u003C\/div\u003E\n\u003C\/div\u003E","popup_use_bare_css":false,"popup_height":"auto","popup_width":"100%","popup_delay":1000,"popup_link":"\/help\/cookie-policy","popup_link_new_window":1,"popup_position":null,"popup_language":"en","store_consent":false,"better_support_for_screen_readers":0,"reload_page":0,"domain":"","popup_eu_only_js":0,"cookie_lifetime":"365","cookie_session":false,"disagree_do_not_show_popup":0,"method":"default","whitelisted_cookies":"","withdraw_markup":"\u003Cbutton type=\u0022button\u0022 class=\u0022eu-cookie-withdraw-tab\u0022\u003EPrivacy settings\u003C\/button\u003E\n\u003Cdiv class=\u0022eu-cookie-withdraw-banner\u0022\u003E\n \u003Cdiv class=\u0022popup-content info\u0022\u003E\n \u003Cdiv id=\u0022popup-text\u0022\u003E\n \u003Cp\u003E\u0026lt;h2\u0026gt;We use cookies on this site to enhance your user experience\u0026lt;\/h2\u0026gt;\u0026lt;p\u0026gt;You have given your consent for us to set cookies.\u0026lt;\/p\u0026gt;\u003C\/p\u003E\n \u003C\/div\u003E\n \u003Cdiv id=\u0022popup-buttons\u0022\u003E\n \u003Cbutton type=\u0022button\u0022 class=\u0022eu-cookie-withdraw-button\u0022\u003EWithdraw consent\u003C\/button\u003E\n \u003C\/div\u003E\n \u003C\/div\u003E\n\u003C\/div\u003E\n","withdraw_enabled":false},"googleanalytics":{"account":["G-RZD586MC3Q"],"trackOutbound":1,"trackMailto":1,"trackDownload":1,"trackDownloadExtensions":"7z|aac|arc|arj|asf|asx|avi|bin|csv|doc(x|m)?|dot(x|m)?|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt(x|m)?|pot(x|m)?|pps(x|m)?|ppam|sld(x|m)?|thmx|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls(x|m|b)?|xlt(x|m)|xlam|xml|z|zip","trackColorbox":1},"jnl_biorxiv_styles":{"defaultJCode":"biorxiv"},"omega":{"layouts":{"primary":"normal","order":["narrow","normal","wide"],"queries":{"narrow":"all and (min-width: 768px) and (min-device-width: 768px), (max-device-width: 800px) and (min-width: 768px) 
and (orientation:landscape)","normal":"all and (min-width: 980px) and (min-device-width: 980px), all and (max-device-width: 1024px) and (min-width: 1024px) and (orientation:landscape)","wide":"all and (min-width: 1220px)"}}}}); //--><!]]> </script> <!--[if lt IE 9]><script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script><![endif]--> </head> <body class="html not-front not-logged-in page-node page-node- page-node-843498 node-type-highwire-article context-content hw-default-jcode-biorxiv hw-article-type-article hw-article-category-new-results"> <!-- Google Tag Manager --> <noscript><iframe src="//www.googletagmanager.com/ns.html?id=GTM-M677548" height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript> <script type="text/javascript">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0];var j=d.createElement(s);var dl=l!='dataLayer'?'&l='+l:'';j.src='//www.googletagmanager.com/gtm.js?id='+i+dl;j.type='text/javascript';j.async=true;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-M677548');</script> <!-- End Google Tag Manager --> <div id="skip-link"> <a href="#main-content" class="element-invisible element-focusable">Skip to main content</a> </div> <div class="page clearfix page-box-shadows footer-borders panels-page panels-layout-jcore_2col" id="page"> <header id="section-header" class="section section-header"> <div id="zone-branding" class="zone zone-branding clearfix print-display-block container-30"> <div class="grid-15 prefix-1 region region-branding print-display-block" id="region-branding"> <div class="region-inner region-branding-inner"> <div class="branding-data clearfix"> <div class="logo-img"> <a href="/" rel="home" class="" data-icon-position="" data-hide-link-title="0"><img alt="bioRxiv" src="https://www.biorxiv.org/sites/default/files/biorxiv_article.jpg" /></a> </div> </div> </div> </div><div class="grid-11 
suffix-1 region region-branding-second print-hidden" id="region-branding-second"> <div class="region-inner region-branding-second-inner"> <div class="block block-system block-menu block-main-menu block-system-main-menu odd block-without-title" id="block-system-main-menu"> <div class="block-inner clearfix"> <div class="content clearfix"> <nav class="menubar-nav"><ul class="menu" role="menu"><li class="first leaf" role="menuitem"><a href="/" title="" class="" data-icon-position="" data-hide-link-title="0">Home</a></li> <li class="leaf" role="menuitem"><a href="/about-biorxiv" class="" data-icon-position="" data-hide-link-title="0">About</a></li> <li class="leaf" role="menuitem"><a href="/submit-a-manuscript" class="" data-icon-position="" data-hide-link-title="0">Submit</a></li> <li class="last leaf" role="menuitem"><a href="/alertsrss" title="" class="" data-icon-position="" data-hide-link-title="0">ALERTS / RSS</a></li> </ul></nav> </div> </div> </div><div class="block block-panels-mini block-biorxiv-search-box block-panels-mini-biorxiv-search-box even block-without-title" id="block-panels-mini-biorxiv-search-box"> <div class="block-inner clearfix"> <div class="content clearfix"> <div class="panel-display panel-1col clearfix" id="mini-panel-biorxiv_search_box"> <div class="panel-panel panel-col"> <div><div class="panel-pane pane-highwire-seach-quicksearch" > <div class="pane-content"> <form class="highwire-quicksearch button-style-mini button-style-mini" action="/content/10.1101/714402v2.full" method="post" id="highwire-search-quicksearch-form-0" accept-charset="UTF-8"><div><div class="form-item form-item-label-invisible form-type-textfield form-item-keywords"> <label class="element-invisible" for="search_rightsidebar_keywords_1235243144">Search for this keyword </label> <input placeholder="Search" type="text" id="search_rightsidebar_keywords_1235243144" name="keywords" value="" size="60" maxlength="128" class="form-text" /> </div> <div class="button-wrapper 
button-mini"><span class="icon-search"></span><input data-icon-only="1" data-font-icon="icon-search" data-icon-position="after" type="submit" id="search_rightsidebar_submit_734609247" name="op" value="Search" class="form-submit" /></div><input type="hidden" name="form_build_id" value="form-sR4zTGhZPduuHxnqAA5GAImcY0sq86-s7gqhUjoTYjQ" /> <input type="hidden" name="form_id" value="highwire_search_quicksearch_form_0" /> </div></form> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-custom pane-2 advanced-search-link" > <div class="pane-content"> <a href="/search">Advanced Search</a> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <div id="zone-header" class="zone zone-header clearfix container-30"> </div> </header> <section id="section-content" class="section section-content"> <div id="zone-content" class="zone zone-content clearfix container-30"> <div class="grid-28 suffix-1 prefix-1 region region-content" id="region-content"> <div class="region-inner region-content-inner"> <a id="main-content"></a> <div class="block block-system block-main block-system-main odd block-without-title" id="block-system-main"> <div class="block-inner clearfix"> <div class="content clearfix"> <div class="panel-display panels-960-layout jcore-2col-layout" > <div class="panel-row-wrapper clearfix"> <div class="main-content-wrapper grid-17 suffix-1 alpha"> <div class="panel-panel panel-region-content"> <div class="inside"><div class="panel-pane pane-highwire-article-citation" > <div class="pane-content"> <div class="highwire-article-citation highwire-citation-type-highwire-article node843498" data-node-nid="843498" id="node-843498--2732797605" data-pisa="biorxiv;714402v2" data-pisa-master="biorxiv;714402" data-apath="/biorxiv/early/2019/07/27/714402.atom" data-hw-author-tooltip-instance="highwire_author_tooltip"><div class="highwire-cite highwire-cite-highwire-article highwire-citation-biorxiv-article-top clearfix has-author-tooltip" > 
<span class="biorxiv-article-type"> New Results </span> <h1 class="highwire-cite-title" id="page-title">Targeted optimization of regulatory DNA sequences with neural editing architectures</h1> <div class="highwire-cite-authors" ><span class="highwire-citation-authors"><span class="highwire-citation-author first" data-delta="0"><span class="nlm-given-names">Anvita</span> <span class="nlm-surname">Gupta</span></span>, <span class="highwire-citation-author hw-author-orcid-logo-wrapper" data-delta="1"><a href="http://orcid.org/0000-0003-3084-2287" target="_blank" class="hw-author-orcid-logo link-icon-only link-icon"><span class="hw-icon-orcid hw-icon-color-orcid"></span> <span class="title element-invisible">View ORCID Profile</span></a><span class="nlm-given-names">Anshul</span> <span class="nlm-surname">Kundaje</span></span></span></div> <div class="highwire-cite-metadata" ><span class="highwire-cite-metadata-doi highwire-cite-metadata"><span class="label">doi:</span> https://doi.org/10.1101/714402 </span></div> </div> <div id="hw-article-author-popups-node-843498--2732797605" style="display: none;"><div class="author-tooltip-0"><div class="author-tooltip-name">Anvita Gupta </div><div class="author-tooltip-affiliation"><span class="author-tooltip-text"><div class='author-affiliation'><span class='nlm-sup'>1</span><span class='nlm-institution'>Department of Computer Science, Stanford University</span></div></span></div><ul class="author-tooltip-find-more"><li class="author-tooltip-gs-link first"><a href="/lookup/google-scholar?link_type=googlescholar&gs_type=author&author%5B0%5D=Anvita%2BGupta%2B" target="_blank" class="" data-icon-position="" data-hide-link-title="0">Find this author on Google Scholar</a></li><li class="author-tooltip-pubmed-link"><a href="/lookup/external-ref?access_num=Gupta%20A&link_type=AUTHORSEARCH" target="_blank" class="" data-icon-position="" data-hide-link-title="0">Find this author on PubMed</a></li><li class="author-site-search-link"><a 
href="/search/author1%3AAnvita%2BGupta%2B" rel="nofollow" class="" data-icon-position="" data-hide-link-title="0">Search for this author on this site</a></li><li class="author-corresp-email-link last"><span>For correspondence: <a href="mailto:avgupta@stanford.edu" class="" data-icon-position="" data-hide-link-title="0">avgupta@stanford.edu</a></span></li></ul></div><div class="author-tooltip-1"><div class="author-tooltip-name">Anshul Kundaje </div><div class="author-tooltip-affiliation"><span class="author-tooltip-text"><div class='author-affiliation'><span class='nlm-sup'>1</span><span class='nlm-institution'>Department of Computer Science, Stanford University</span></div><div class='author-affiliation'><span class='nlm-sup'>2</span><span class='nlm-institution'>Department of Genetics, Stanford University</span></div></span></div><ul class="author-tooltip-find-more"><li class="author-tooltip-gs-link first"><a href="/lookup/google-scholar?link_type=googlescholar&gs_type=author&author%5B0%5D=Anshul%2BKundaje%2B" target="_blank" class="" data-icon-position="" data-hide-link-title="0">Find this author on Google Scholar</a></li><li class="author-tooltip-pubmed-link"><a href="/lookup/external-ref?access_num=Kundaje%20A&link_type=AUTHORSEARCH" target="_blank" class="" data-icon-position="" data-hide-link-title="0">Find this author on PubMed</a></li><li class="author-site-search-link"><a href="/search/author1%3AAnshul%2BKundaje%2B" rel="nofollow" class="" data-icon-position="" data-hide-link-title="0">Search for this author on this site</a></li><li class="author-orcid-link last"><a href="http://orcid.org/0000-0003-3084-2287" target="_blank" class="" data-icon-position="" data-hide-link-title="0">ORCID record for Anshul Kundaje</a></li></ul></div></div></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-highwire-panel-tabs pane-panels-ajax-tab-tabs" > <div class="pane-content"> <div class="item-list"><ul class="tabs inline 
panels-ajax-tab"><li class="first"><a href="/content/10.1101/714402v2" class="panels-ajax-tab-tab" data-panel-name="biorxiv_tab_art" data-target-id="highwire_article_tabs" data-entity-context="node:843498" data-trigger="" data-url-enabled="1">Abstract</a><a href="/panels_ajax_tab/biorxiv_tab_art/node:843498/1" rel="nofollow" style="display:none" class="js-crawler-link"></a></li><li><a href="/content/10.1101/714402v2.full-text" class="panels-ajax-tab-tab" data-panel-name="article_tab_full_text" data-target-id="highwire_article_tabs" data-entity-context="node:843498" data-trigger="full-text" data-url-enabled="1">Full Text</a><a href="/panels_ajax_tab/article_tab_full_text/node:843498/1" rel="nofollow" style="display:none" class="js-crawler-link"></a></li><li><a href="/content/10.1101/714402v2.article-info" class="panels-ajax-tab-tab" data-panel-name="biorxiv_tab_info" data-target-id="highwire_article_tabs" data-entity-context="node:843498" data-trigger="article-info" data-url-enabled="1">Info/History</a><a href="/panels_ajax_tab/biorxiv_tab_info/node:843498/1" rel="nofollow" style="display:none" class="js-crawler-link"></a></li><li><a href="/content/10.1101/714402v2.article-metrics" class="panels-ajax-tab-tab" data-panel-name="article_tab_metrics" data-target-id="highwire_article_tabs" data-entity-context="node:843498" data-trigger="article-metrics" data-url-enabled="1">Metrics</a><a href="/panels_ajax_tab/article_tab_metrics/node:843498/1" rel="nofollow" style="display:none" class="js-crawler-link"></a></li><li class="last"><a href="/content/10.1101/714402v2.full.pdf+html" class="panels-ajax-tab-tab" data-panel-name="biorxiv_tab_pdf" data-target-id="highwire_article_tabs" data-entity-context="node:843498" data-trigger="full.pdf+html" data-url-enabled="1"><i class="icon-file-alt"></i> Preview PDF</a><a href="/panels_ajax_tab/biorxiv_tab_pdf/node:843498/1" rel="nofollow" style="display:none" class="js-crawler-link"></a></li></ul></div> </div> </div> <div 
class="panel-separator"></div><div class="panel-pane pane-highwire-panel-tabs-container" > <div class="pane-content"> <div data-panels-ajax-tab-preloaded="article_tab_full_text" id="panels-ajax-tab-container-highwire_article_tabs" class="panels-ajax-tab-container"><div class="panels-ajax-tab-loading" style ="display:none"><img class="loading" src="https://www.biorxiv.org/sites/all/modules/contrib/panels_ajax_tab/images/loading.gif" alt="Loading" title="Loading" /></div><div class="panels-ajax-tab-wrap-article_tab_full_text"><div class="panel-display panel-1col clearfix" > <div class="panel-panel panel-col"> <div><div class="panel-pane pane-highwire-markup" > <div class="pane-content"> <div class="highwire-markup"><div xmlns="http://www.w3.org/1999/xhtml" data-highwire-cite-ref-tooltip-instance="highwire_reflinks_tooltip" class="content-block-markup" xmlns:xhtml="http://www.w3.org/1999/xhtml"><div class="article fulltext-view "><span class="highwire-journal-article-marker-start"></span><div class="section abstract" id="abstract-1"><h2 class="">Abstract</h2><p id="p-3">Targeted optimization of existing DNA sequences for useful properties has the potential to enable several synthetic biology applications, from modifying DNA to treat genetic disorders to designing regulatory elements that fine-tune context-specific gene expression. Current approaches for targeted genome editing are largely based on prior biological knowledge or ad hoc rules. Few if any machine learning approaches exist for targeted optimization of regulatory DNA sequences.</p><p id="p-4">Here, we propose a novel generative neural network architecture for targeted DNA sequence editing – the EDA architecture – consisting of an encoder, decoder, and analyzer. We showcase the use of EDA to optimize regulatory DNA sequences to bind to the transcription factor SPI1. 
Compared to other state-of-the-art approaches such as a textual variational autoencoder and rule-based editing, EDA significantly improves the predicted SPI1 binding of genomic sequences with a minimal set of edits. We also use EDA to design regulatory elements with optimized grammars of CREB1 binding sites that can tune reporter expression levels as measured by massively parallel reporter assays (MPRA). We analyze the properties of the binding sites in the edited sequences and find patterns that are consistent with previously reported grammatical rules that tie gene expression to CRE binding site density, spacing and affinity.</p></div><div class="section" id="sec-1"><h2 class="">1 Introduction</h2><p id="p-5">Recent generative models for genomic DNA sequences, such as generative adversarial networks, variational autoencoders, and recurrent neural networks, have largely focused on ab initio generation of biological sequences from distributions learned over a large collection of exemplar sequences [<a id="xref-ref-10-1" class="xref-bibr" href="#ref-10">10</a>, <a id="xref-ref-6-1" class="xref-bibr" href="#ref-6">6</a>, <a id="xref-ref-7-1" class="xref-bibr" href="#ref-7">7</a>]. However, generative models have been shown to suffer from low diversity – falling into the failure mode of producing generic sequences with high likelihood [<a id="xref-ref-8-1" class="xref-bibr" href="#ref-8">8</a>]. Generative models that are capable of editing an existing sequence, rather than generating an entirely new sequence from scratch, may be able to draw from the natural diversity present in biological sequences, while still allowing useful changes to the data. 
Also, many genome engineering applications require editing an existing DNA sequence in order to knock out or repair disease genes or modify regulatory DNA to modulate gene expression in specific cell types and states.</p><p id="p-6">Machine learning approaches for editing existing sequences for desired properties have been significantly less well studied than ab initio generative models. Guu <em>et al.</em> proposed a neural editor for natural language to transform an input sentence into an output based on a sampled edit vector; however, the edit vectors are latent and must be interpreted after training [<a id="xref-ref-8-2" class="xref-bibr" href="#ref-8">8</a>].</p><p id="p-7">Here, we propose a novel Encoder-Decoder-Analyzer (EDA) neural network architecture that radically departs from status quo methods [<a id="xref-ref-8-3" class="xref-bibr" href="#ref-8">8</a>]. EDA combines recurrent sequence-to-sequence models, latent vectors based on an explicit predictor of desired sequence properties and adversarial example generation techniques. EDA automatically generates candidate modifications to prespecified regulatory DNA sequences to optimize specific properties such as binding of transcription factors or reporter gene expression. This model represents a unique approach to editing sequences for desired properties by leveraging existing supervised learning models that can accurately map regulatory sequences to specific properties. We showcase the EDA model on two pilot case studies. In the first case study, we use EDA to edit regulatory DNA sequences to increase the binding probability of the transcription factor SPI1 by leveraging <em>in vivo</em> genome-wide binding profiles (ChIP-seq) for SPI1. 
In the second case study, we use EDA to generate candidate regulatory sequences containing configurations of binding sites of the CREB1 transcription factor that can produce a desired gene expression readout as measured by a massively parallel reporter assay (MPRA) from Davis <em>et al.,</em> 2019 [<a id="xref-ref-3-1" class="xref-bibr" href="#ref-3">3</a>]. The EDA approach significantly outperforms existing state-of-the-art approaches.</p></div><div class="section" id="sec-2"><h2 class="">2 Methods</h2><div id="sec-3" class="subsection"><h3>Sequence Variational Autoencoder (SVAE) as a baseline method</h3><p id="p-8">The sequence variational autoencoder (SVAE) for editing is based on the recurrent architecture described in [<a id="xref-ref-1-1" class="xref-bibr" href="#ref-1">1</a>]; the encoder produces the parameters (<em>μ</em>, Σ) of a Gaussian distribution in latent space, from which <em>z</em>, a latent vector encoding the sequence <em>x</em>, is sampled. The decoder attempts to reconstruct the input sequence from <em>z</em>. The loss function of the VAE is given in <a id="xref-disp-formula-1-1" class="xref-disp-formula" href="#disp-formula-1">Equation 1</a>. 
For editing, the latent space of the SVAE was perturbed through the addition of Gaussian Noise <span class="inline-formula" id="inline-formula-1"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-1.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-1.gif"/></noscript></span></span> where <span class="inline-formula" id="inline-formula-2"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-2.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-2.gif"/></noscript></span></span> as proposed in Guu <em>et al</em> [<a id="xref-ref-8-4" class="xref-bibr" href="#ref-8">8</a>]. <span class="disp-formula" id="disp-formula-1"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/graphic-1.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/graphic-1.gif"/></noscript></span> </span></p></div><div id="sec-4" class="subsection"><h3>Encoder Decoder Analyzer (EDA) Custom Architecture</h3><p id="p-9">Our novel architecture called EDA consists of three deep neural network components: an Encoder, Decoder, and Analyzer. The Encoder and Decoder are recurrent neural networks (RNNs) with attention. 
The encoder consists of an embedding layer followed by a recurrent layer. The embedding layer has learnable weights and an output size of <em>h</em> = 256, and the embedded outputs are fed into a one-layer bidirectional GRU with a dropout of 0.1. Similarly, the decoder consists of an embedding layer (output size <em>h</em> = 256 and dropout <em>p</em> = 0.1), which creates an embedding for the input base pair at each time step, followed by attention over the encoder outputs. The outputs from the attention layer are concatenated with the input at each time step and fed into a densely connected layer with a ReLU activation function. The outputs from this dense layer are fed into a one-layer GRU with dropout <em>p</em> = 0.1, followed by a fully connected layer with a softmax activation function. The output from the decoder at each time step is the predicted next token in the sequence. Conceptually, the Encoder learns to transform any one-hot encoded input DNA sequence into a compact latent representation. The Decoder learns to generate an output DNA sequence given a specific instantiation of the latent representation learned by the Encoder. The Analyzer is a neural network that maps the latent representation of a DNA sequence to a specific property that we are interested in optimizing. Here, we use a convolutional neural network (CNN) as the Analyzer, although it can be any differentiable architecture. The Analyzer consists of three convolutional layers (15 filters, kernel size of 3, padding of 1), an average pooling layer, and two densely connected layers. Each layer is followed by a ReLU activation, except for the last layer, which has a sigmoid activation function in the classification setting and no nonlinearity in the regression setting.</p><p>The procedure for editing in the EDA architecture consists of three stages. 
</p><ul class="list-simple " id="list-1"><li id="list-item-1"><p id="p-11"><strong>Stage 1: Training Encoder-Decoder</strong>. The Encoder-Decoder seq2seq architecture is trained as an autoencoder, with a loss equal to the categorical cross-entropy between the softmax outputs and the one-hot-encoded next base pair, summed over every position in the sequence. The loss is minimized with the Adam optimizer at a learning rate of 0.001. The Encoder takes in sequence <em>x</em> and produces a latent space embedding <em>z</em>, while the Decoder takes <em>z</em> and attempts to reproduce the original sequence <em>x</em>.</p></li><li id="list-item-2"><p id="p-12"><strong>Stage 2: Training Analyzer</strong>. The Encoder from Stage 1 is followed by an Analyzer module, which takes in the latent state <em>z</em> of a sequence from the Encoder and produces an output prediction <em>y</em> of a property of the sequence.</p></li><li id="list-item-3"><p id="p-13"><strong>Stage 3: Editing</strong>. Given an input sequence <em>x</em>, the Encoder produces the latent state embedding <em>z</em> for the sequence. 
We update the latent state to minimize the loss function <span class="inline-formula" id="inline-formula-3"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-3.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-3.gif"/></noscript></span></span>(the binary cross entropy loss) between the analyzer’s prediction <span class="inline-formula" id="inline-formula-4"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-4.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-4.gif"/></noscript></span></span> and the desired score <em>y</em> via the Fast Gradient Sign Method (FGSM) [<a id="xref-ref-5-1" class="xref-bibr" href="#ref-5">5</a>] as in <a id="xref-disp-formula-2-1" class="xref-disp-formula" href="#disp-formula-2">Equation 2</a>.</p></li></ul><p> <span class="disp-formula" id="disp-formula-2"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/graphic-2.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/graphic-2.gif"/></noscript></span> </span></p><div id="statement-1" class="statement"><span class="statement-label">Algorithm 1</span><h3>EDA Architecture Editing.</h3><div id="F1" class="fig pos-float type-figure odd"><div 
class="highwire-figure"><div class="fig-inline-img-wrapper"><div class="fig-inline-img"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F1.large.jpg?width=800&height=600&carousel=1" title="" class="highwire-fragment fragment-images colorbox-load" rel="gallery-fragment-images-1794575304" data-figure-caption="<div class="highwire-markup"></div>" data-icon-position="" data-hide-link-title="0"><span class="hw-responsive-img"><img class="highwire-fragment fragment-image lazyload" alt="Figure" src="" data-src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F1.medium.gif" width="440" height="219"/><noscript><img class="highwire-fragment fragment-image" alt="Figure" src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F1.medium.gif" width="440" height="219"/></noscript></span></a></div></div><ul class="highwire-figure-links inline"><li class="download-fig first"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F1.large.jpg?download=true" class="highwire-figure-link highwire-figure-link-download" title="Download Figure1" data-icon-position="" data-hide-link-title="0">Download figure</a></li><li class="new-tab last"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F1.large.jpg" class="highwire-figure-link highwire-figure-link-newtab" target="_blank" data-icon-position="" data-hide-link-title="0">Open in new tab</a></li></ul></div></div><p id="p-14"></p></div><p id="p-15">The latent state is updated until the loss is approximately <span class="inline-formula" id="inline-formula-5"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-5.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" 
src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-5.gif"/></noscript></span></span>. The decoder produces the edited sequence <span class="inline-formula" id="inline-formula-6"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-6.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-6.gif"/></noscript></span></span> from the modified latent representation <em>z</em>′. Epsilon (<em>ϵ</em>) is a hyperparameter between zero and one that sets the size of the steps taken in the latent space.</p><p id="p-16">The pseudocode for the EDA training is shown in Algorithm 1. The training algorithm includes an additional step in which the desired property (e.g. SPI1 binding or reporter expression) is predicted from the edited sequence <span class="inline-formula" id="inline-formula-7"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-7.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-7.gif"/></noscript></span></span>. 
This step is necessary because the latent representation <span class="inline-formula" id="inline-formula-8"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-8.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-8.gif"/></noscript></span></span> of the decoded sequence <span class="inline-formula" id="inline-formula-9"><span class="highwire-responsive-lazyload"><img src="" class="highwire-embed lazyload" alt="Embedded Image" data-src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-9.gif"/><noscript><img class="highwire-embed" alt="Embedded Image" src="https://www.biorxiv.org/sites/default/files/highwire/biorxiv/early/2019/07/27/714402/embed/inline-graphic-9.gif"/></noscript></span></span> may not be the same as the perturbed latent representation <em>z</em>′ due to noise in the decoding process.</p></div></div><div class="section" id="sec-5"><h2 class="">3 Optimizing regulatory DNA sequences for binding of the SPI1 transcription factor</h2><div id="sec-6" class="subsection"><h3>Dataset</h3><p id="p-17">43,787 reproducibly identified peaks from an ENCODE ChIP-seq experiment targeting the SPI1 transcription factor in the lymphoblastoid cell line GM12878 (GEO GSM803531) were used as the positive labeled set of putative SPI1-bound sequences. The negative labeled set was constructed from an equal number of non-overlapping unbound 200 bp sequences from the human genome. Datapoints from chromosomes 1 and 2 were used as the test and validation sets, respectively.</p></div><div id="sec-7" class="subsection"><h3>Baselines</h3><p id="p-18">An SVAE model was trained as a neural baseline, as described above. 
A simple rule-based editing model was also constructed that randomly adds the SPI1 consensus binding site (“AGGAA”) if it is not already present in the sequence.</p></div><div id="sec-8" class="subsection"><h3>Evaluation Method</h3><p id="p-19">Edited sequences are evaluated on three quantitative metrics: similarity to the original DNA sequence, predicted binding score (probability) of SPI1, and percent of sequences with matches to known SPI1 binding motifs [<a id="xref-ref-9-1" class="xref-bibr" href="#ref-9">9</a>]. Binding score is measured by an independent CNN model trained to discriminate 200 bp sequences labeled as bound by SPI1 ChIP-seq data from a balanced number of background sequences from the genome. This independent CNN model achieves an AUROC of 0.979 and an AUPRC of 0.978 on a held-out test set. For training the independent model, datapoints from chromosomes 1 and 2 were used as the test and validation sets, respectively, while all other sequences were used in the training set. Similarity of edited sequences to the original sequences was calculated with the gapped k-mer mismatch (GKM) kernel, which evaluates DNA sequence similarity based on gapped k-mer overlap [<a id="xref-ref-4-1" class="xref-bibr" href="#ref-4">4</a>]. We also used the BLEU-4 score, a metric more commonly used in NLP translation, as another measure of sequence similarity.</p></div><div id="sec-9" class="subsection"><h3>3.1 SPI1 Editing Results</h3><div id="sec-10" class="subsection"><h4>Training Results</h4><p id="p-20">The loss curves for the SVAE and for the Encoder-Decoder portion of the EDA architecture are shown in <a id="xref-fig-2-1" class="xref-fig" href="#F2">Figure 1</a>. 
Whereas the SVAE loss is extremely noisy and difficult to optimize during training, the Encoder-Decoder of EDA achieves an average edit distance of 8.8, out of a maximum possible edit distance of 150, between the input and output sequences after 20,000 iterations of training, which indicates that the autoencoder learns to reconstruct its input sequence accurately. The accuracy of the analyzer in the EDA architecture is 92.674%, with an AUROC of 0.979 and an AUPRC of 0.978.</p><div id="F2" class="fig pos-float type-figure odd"><div class="highwire-figure"><div class="fig-inline-img-wrapper"><div class="fig-inline-img"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F2.large.jpg?width=800&height=600&carousel=1" title="Training and validation loss curves of EDA Architecture (left) and VAE architecture (right)." class="highwire-fragment fragment-images colorbox-load" rel="gallery-fragment-images-1794575304" data-figure-caption="<div class="highwire-markup">Training and validation loss curves of EDA Architecture (left) and VAE architecture (right).</div>" data-icon-position="" data-hide-link-title="0"><span class="hw-responsive-img"><img class="highwire-fragment fragment-image lazyload" alt="Figure 1:" src="" data-src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F2.medium.gif" width="440" height="158"/><noscript><img class="highwire-fragment fragment-image" alt="Figure 1:" src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F2.medium.gif" width="440" height="158"/></noscript></span></a></div></div><ul class="highwire-figure-links inline"><li class="download-fig first"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F2.large.jpg?download=true" class="highwire-figure-link highwire-figure-link-download" title="Download Figure 1:" data-icon-position="" data-hide-link-title="0">Download figure</a></li><li class="new-tab last"><a 
href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F2.large.jpg" class="highwire-figure-link highwire-figure-link-newtab" target="_blank" data-icon-position="" data-hide-link-title="0">Open in new tab</a></li></ul></div><div class="fig-caption" xmlns:xhtml="http://www.w3.org/1999/xhtml"><span class="fig-label">Figure 1:</span> <p id="p-21" class="first-child">Training and validation loss curves of EDA Architecture (left) and VAE architecture (right).</p><div class="sb-div caption-clear"></div></div></div></div><div id="sec-11" class="subsection"><h4>EDA Edited Sequences</h4><p id="p-22">A total of 500 randomly selected sequences from the balanced test set were edited using EDA (bound sequences in the test set were classified based on overlap with SPI1 ChIP-seq peaks). Results are shown in <a id="xref-table-wrap-1-1" class="xref-table" href="#T1">Table 1</a>. Edits from the EDA architecture demonstrate high GKM similarity (52.39% on average) to the original sequences, whereas the SVAE edits achieve a similarity of only 5.2%. For comparison, any two random sequences from the set have a GKM similarity of 2.048%. 
Thus, rather than editing, the SVAE appears to be sampling separate sequences.</p><div id="T1" class="table pos-float"><div class="table-inline table-callout-links"><div class="callout"><span>View this table:</span><ul class="callout-links"><li class="view-inline first"><a href="" class="table-expand-inline" data-table-url="/highwire/markup/847962/expansion?postprocessors=highwire_tables%2Chighwire_reclass%2Chighwire_figures%2Chighwire_math%2Chighwire_inline_linked_media%2Chighwire_embed&table-expand-inline=1" data-icon-position="" data-hide-link-title="0">View inline</a></li><li class="view-popup"><a href="/highwire/markup/847962/expansion?width=1000&height=500&iframe=true&postprocessors=highwire_tables%2Chighwire_reclass%2Chighwire_figures%2Chighwire_math%2Chighwire_inline_linked_media%2Chighwire_embed" class="colorbox colorbox-load table-expand-popup" rel="gallery-fragment-tables" data-icon-position="" data-hide-link-title="0">View popup</a></li><li class="download-ppt last"><a href="/highwire/powerpoint/847962" class="highwire-figure-link highwire-figure-link-ppt" data-icon-position="" data-hide-link-title="0">Download powerpoint</a></li></ul></div></div><div class="table-caption"><span class="table-label">Table 1:</span> <p id="p-23" class="first-child">Comparison of editing methods in terms of sequence similarity (out of 1), BLEU-4 score (out of 100), and percentage of sequences predicted positive.</p><div class="sb-div caption-clear"></div></div></div><p id="p-24">Overall, 84.4% of EDA-edited sequences are predicted to bind SPI1 by the independent CNN model trained on SPI1 ChIP-seq data. 34.2% of these sequences contain an exact match to the SPI1 motif “AGGAA”, and 63.4% of sequences contain the “GGAA” portion, which has the highest information content in the SPI1 PWM [<a id="xref-ref-9-2" class="xref-bibr" href="#ref-9">9</a>]. 
The rule-based baseline achieves only 45% of sequences predicted positive, similar to the probability that a randomly chosen test set sequence would be predicted positive. Thus, editing these sequences to optimize binding score is more complex than simply inserting high-affinity SPI1 binding sites.</p><p id="p-25"><strong><a id="xref-table-wrap-2-1" class="xref-table" href="#T2">Table 2</a></strong> shows a DNA sequence that initially had a low binding score under the independent model and whose edit received a high score. The EDA model modifies the region of the initial sequence shown in gray into the full SPI1 binding motif shown in orange.</p><div id="T2" class="table pos-float"><div class="table-inline table-callout-links"><div class="callout"><span>View this table:</span><ul class="callout-links"><li class="view-inline first"><a href="" class="table-expand-inline" data-table-url="/highwire/markup/847960/expansion?postprocessors=highwire_tables%2Chighwire_reclass%2Chighwire_figures%2Chighwire_math%2Chighwire_inline_linked_media%2Chighwire_embed&table-expand-inline=1" data-icon-position="" data-hide-link-title="0">View inline</a></li><li class="view-popup"><a href="/highwire/markup/847960/expansion?width=1000&height=500&iframe=true&postprocessors=highwire_tables%2Chighwire_reclass%2Chighwire_figures%2Chighwire_math%2Chighwire_inline_linked_media%2Chighwire_embed" class="colorbox colorbox-load table-expand-popup" rel="gallery-fragment-tables" data-icon-position="" data-hide-link-title="0">View popup</a></li><li class="download-ppt last"><a href="/highwire/powerpoint/847960" class="highwire-figure-link highwire-figure-link-ppt" data-icon-position="" data-hide-link-title="0">Download powerpoint</a></li></ul></div></div><div class="table-caption"><span class="table-label">Table 2:</span> <p id="p-26" class="first-child">Original sequence and edited sequence from the EDA architecture. 
The SPI1 motif is highlighted in orange.</p><div class="sb-div caption-clear"></div></div></div></div></div></div><div class="section" id="sec-12"><h2 class="">4 Optimizing Reporter Expression of regulatory DNA sequences containing CREB1 binding site grammars</h2><div id="sec-13" class="subsection"><h3>Dataset</h3><p id="p-27">The CRE MPRA dataset from Davis et al. measures reporter gene expression of a library of DNA sequences with various configurations of CREB1 binding sites by varying motif strength, density, spacing, and distance from the core promoter[<a id="xref-ref-3-2" class="xref-bibr" href="#ref-3">3</a>]. The genomic MPRA dataset consists of 3480 sequences of 150bp within 3 backgrounds with different combinations and locations of CREB1 binding sites. Davis <em>et al.</em> define a strong CREB1 consensus binding site as “TGACGTCA”, and a weak binding site is “TGAAGTCA”. Reporter expression of the library is measured by the log ratio of counts of RNA barcode reads of a sequence to the count of DNA reads. A histogram of log fold change expression levels is shown in <a id="xref-fig-3-1" class="xref-fig" href="#F3">Figure 2</a>. Seventy percent of this dataset was randomly selected for training, with twenty percent for validation and ten percent for testing.</p><div id="F3" class="fig pos-float type-figure odd"><div class="highwire-figure"><div class="fig-inline-img-wrapper"><div class="fig-inline-img"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F3.large.jpg?width=800&height=600&carousel=1" title="Histogram of log(expression) levels for sequences in CREB1 MPRA dataset; expression ranges widely, from less than zero, meaning that number of RNA barcoded reads are less than the original DNA reads, to six." 
class="highwire-fragment fragment-images colorbox-load" rel="gallery-fragment-images-1794575304" data-figure-caption="<div class="highwire-markup">Histogram of log(expression) levels for sequences in CREB1 MPRA dataset; expression ranges widely, from less than zero, meaning that number of RNA barcoded reads are less than the original DNA reads, to six.</div>" data-icon-position="" data-hide-link-title="0"><span class="hw-responsive-img"><img class="highwire-fragment fragment-image lazyload" alt="Figure 2:" src="" data-src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F3.medium.gif" width="440" height="311"/><noscript><img class="highwire-fragment fragment-image" alt="Figure 2:" src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F3.medium.gif" width="440" height="311"/></noscript></span></a></div></div><ul class="highwire-figure-links inline"><li class="download-fig first"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F3.large.jpg?download=true" class="highwire-figure-link highwire-figure-link-download" title="Download Figure 2:" data-icon-position="" data-hide-link-title="0">Download figure</a></li><li class="new-tab last"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F3.large.jpg" class="highwire-figure-link highwire-figure-link-newtab" target="_blank" data-icon-position="" data-hide-link-title="0">Open in new tab</a></li></ul></div><div class="fig-caption"><span class="fig-label">Figure 2:</span> <p id="p-28" class="first-child">Histogram of log(expression) levels for sequences in CREB1 MPRA dataset; expression ranges widely, from less than zero, meaning that number of RNA barcoded reads are less than the original DNA reads, to six.</p><div class="sb-div caption-clear"></div></div></div><p id="p-29">From analysis of the MPRA dataset, Davis <em>et al</em> find four main correlations between CREB1 binding site configurations in the library and corresponding reporter 
expression levels: 1) the number of strong CRE binding sites correlates positively with expression, 2) weak binding sites increase expression given the presence of at least one strong binding site, 3) higher expression occurs with shorter distance of CRE binding sites to the core promoter, and 4) spacing between CRE binding sites modulates the periodicity of expression as two strong binding sites are moved along the sequence.</p><p id="p-30">Here, the editing task is to optimize sequences for particular expression profiles: specifically, to edit MPRA sequences that have high measured reporter expression (log(expression) ≥ 5) into new sequences that have low predicted expression, and vice versa. As several correlations between sequence properties and expression are already discussed by Davis <em>et al.,</em> we investigate whether the edited sequences show evidence of these previously discovered rules.</p></div><div id="sec-14" class="subsection"><h3>Independent Analyzer</h3><p id="p-31">As a predictor independent of the analyzer in the EDA architecture, we train a log-linear model of expression levels similar to that of Davis <em>et al.,</em> with expression predicted from the number of strong and weak binding sites, sequence background, average spacing between CREB1 sites, and distance from the minimal promoter element. In addition, polynomial features of degree 2 were used to model interaction terms. This simple model achieves <em>R</em><sup>2</sup> = 0.801 on a held-out test set consisting of 10 percent of the data, which was the same test set used for the EDA architecture (<strong><a id="xref-fig-4-1" class="xref-fig" href="#F4">Figure 3a</a></strong>). 
This independent log linear model was used to evaluate the edited sequences from the EDA model.</p><div id="F4" class="fig pos-float type-figure odd"><div class="highwire-figure"><div class="fig-inline-img-wrapper"><div class="fig-inline-img"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F4.large.jpg?width=800&height=600&carousel=1" title="Predicted expression versus measured expression for a) independent log-linear model, and b) analyzer from EDA architecture" class="highwire-fragment fragment-images colorbox-load" rel="gallery-fragment-images-1794575304" data-figure-caption="<div class="highwire-markup">Predicted expression versus measured expression for a) independent log-linear model, and b) analyzer from EDA architecture</div>" data-icon-position="" data-hide-link-title="0"><span class="hw-responsive-img"><img class="highwire-fragment fragment-image lazyload" alt="Figure 3:" src="" data-src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F4.medium.gif" width="440" height="179"/><noscript><img class="highwire-fragment fragment-image" alt="Figure 3:" src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F4.medium.gif" width="440" height="179"/></noscript></span></a></div></div><ul class="highwire-figure-links inline"><li class="download-fig first"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F4.large.jpg?download=true" class="highwire-figure-link highwire-figure-link-download" title="Download Figure 3:" data-icon-position="" data-hide-link-title="0">Download figure</a></li><li class="new-tab last"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F4.large.jpg" class="highwire-figure-link highwire-figure-link-newtab" target="_blank" data-icon-position="" data-hide-link-title="0">Open in new tab</a></li></ul></div><div class="fig-caption"><span class="fig-label">Figure 3:</span> <p id="p-32" class="first-child">Predicted expression versus measured 
expression for a) independent log-linear model, and b) analyzer from EDA architecture</p><div class="sb-div caption-clear"></div></div></div></div><div id="sec-15" class="subsection"><h3>Training Results</h3><p id="p-33">The components of the EDA model were trained on the CRE MPRA dataset to learn a latent representation of the sequences through the encoder-decoder portion, and to predict reporter expression levels from this latent representation through the analyzer. As described in <a id="xref-sec-2-1" class="xref-sec" href="#sec-2">Section 2</a>, the encoder and decoder were both recurrent neural networks, where the decoder has softmax attention over the encoder outputs. The seq2seq autoencoder achieved an average edit distance of 6.06 between input and output after training for 10,000 iterations. The analyzer, as above, was a CNN architecture with three convolutional layers, each followed by a ReLU activation, average pooling, and two dense layers. This architecture achieved <em>R</em><sup>2</sup> = 0.93 on a held-out test set, and the analyzer’s predicted expression values correlate well with measured expression, as shown in <strong><a id="xref-fig-4-2" class="xref-fig" href="#F4">Figure 3b</a></strong>.</p></div><div id="sec-16" class="subsection"><h3>EDA Editing Results</h3><p id="p-34">Here, our editing task is to edit CREB1 MPRA sequences from a held-out test set that exhibit high measured expression into new sequences with low expression, and vice versa. We further evaluate whether the resulting edited sequences display known patterns of CRE binding site placement elucidated by Davis <em>et al</em>. A total of 204 sequences with measured MPRA log(expression) ≤ 0 were edited using EDA toward a higher target level of log(expression) = 5.0. 
Similarly, 96 sequences with measured MPRA log(expression) ≥ 5 were edited using EDA toward a lower target level of log(expression) = 0.</p><p id="p-35">As evaluated by the independent analyzer, 79.4% of the sequences edited from low to high expression were predicted to have higher expression after editing, and 100% of the sequences edited from high to low expression were predicted to have lower expression after editing. The histograms of log expression levels both before and after editing are shown in <strong><a id="xref-fig-5-1" class="xref-fig" href="#F5">Figure 4</a></strong>.</p><div id="F5" class="fig pos-float type-figure odd"><div class="highwire-figure"><div class="fig-inline-img-wrapper"><div class="fig-inline-img"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F5.large.jpg?width=800&height=600&carousel=1" title="Histograms of log expression levels before and after editing by the EDA architecture and evaluation on an independent linear model. Log expression of original sequences is shown in orange, while log expression of edited sequences is shown in blue, where expression of edited sequences is predicted by the independent model. Edits targeted from low to high expression are shown in a, and edits from high to low expression are shown in b. Here, low expression was defined as log(expression) ≤ 0.0, and high expression was defined as log(expression) ≥ 5.0." class="highwire-fragment fragment-images colorbox-load" rel="gallery-fragment-images-1794575304" data-figure-caption="<div class="highwire-markup">Histograms of log expression levels before and after editing by the EDA architecture and evaluation on an independent linear model. Log expression of original sequences is shown in orange, while log expression of edited sequences is shown in blue, where expression of edited sequences is predicted by the independent model. Edits targeted from low to high expression are shown in a, and edits from high to low expression are shown in b. 
Here, low expression was defined as log(expression) ≤ 0.0, and high expression was defined as log(expression) ≥ 5.0.</div>" data-icon-position="" data-hide-link-title="0"><span class="hw-responsive-img"><img class="highwire-fragment fragment-image lazyload" alt="Figure 4:" src="" data-src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F5.medium.gif" width="440" height="161"/><noscript><img class="highwire-fragment fragment-image" alt="Figure 4:" src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F5.medium.gif" width="440" height="161"/></noscript></span></a></div></div><ul class="highwire-figure-links inline"><li class="download-fig first"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F5.large.jpg?download=true" class="highwire-figure-link highwire-figure-link-download" title="Download Figure 4:" data-icon-position="" data-hide-link-title="0">Download figure</a></li><li class="new-tab last"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F5.large.jpg" class="highwire-figure-link highwire-figure-link-newtab" target="_blank" data-icon-position="" data-hide-link-title="0">Open in new tab</a></li></ul></div><div class="fig-caption"><span class="fig-label">Figure 4:</span> <p id="p-36" class="first-child">Histograms of log expression levels before and after editing by the EDA architecture and evaluation on an independent linear model. Log expression of original sequences is shown in orange, while log expression of edited sequences is shown in blue, where expression of edited sequences is predicted by the independent model. Edits targeted from low to high expression are shown in a, and edits from high to low expression are shown in b. 
Here, low expression was defined as log(expression) ≤ 0.0, and high expression was defined as log(expression) ≥ 5.0.</p><div class="sb-div caption-clear"></div></div></div><p id="p-37">Next, we inspected several representative edited sequences in order to evaluate whether they contained CREB1 binding site configurations that were previously associated with high and low expression read outs, where examples are shown in <strong><a id="xref-fig-6-1" class="xref-fig" href="#F6">Figure 5</a></strong>.</p><div id="F6" class="fig pos-float type-figure odd"><div class="highwire-figure"><div class="fig-inline-img-wrapper"><div class="fig-inline-img"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F6.large.jpg?width=800&height=600&carousel=1" title="Strong CREB1 binding motif is highlighted in yellow, while weak motif is highlighted in orange. Predicted expression, measured by the log of the ratio of RNA to DNA counts, is shown above each sequence." class="highwire-fragment fragment-images colorbox-load" rel="gallery-fragment-images-1794575304" data-figure-caption="<div class="highwire-markup">Strong CREB1 binding motif is highlighted in yellow, while weak motif is highlighted in orange. 
Predicted expression, measured by the log of the ratio of RNA to DNA counts, is shown above each sequence.</div>" data-icon-position="" data-hide-link-title="0"><span class="hw-responsive-img"><img class="highwire-fragment fragment-image lazyload" alt="Figure 5:" src="" data-src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F6.medium.gif" width="440" height="424"/><noscript><img class="highwire-fragment fragment-image" alt="Figure 5:" src="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F6.medium.gif" width="440" height="424"/></noscript></span></a></div></div><ul class="highwire-figure-links inline"><li class="download-fig first"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F6.large.jpg?download=true" class="highwire-figure-link highwire-figure-link-download" title="Download Figure 5:" data-icon-position="" data-hide-link-title="0">Download figure</a></li><li class="new-tab last"><a href="https://www.biorxiv.org/content/biorxiv/early/2019/07/27/714402/F6.large.jpg" class="highwire-figure-link highwire-figure-link-newtab" target="_blank" data-icon-position="" data-hide-link-title="0">Open in new tab</a></li></ul></div><div class="fig-caption"><span class="fig-label">Figure 5:</span> <p id="p-38" class="first-child">Strong CREB1 binding motif is highlighted in yellow, while weak motif is highlighted in orange. Predicted expression, measured by the log of the ratio of RNA to DNA counts, is shown above each sequence.</p><div class="sb-div caption-clear"></div></div></div><p id="p-39">Example 1 illustrates the result that number of strong CREB1 binding sites is positively correlated with expression levels, as the edited sequence has the third strong CREB1 binding site changed to a weak site, resulting in expression being predicted to be more than 2x lower after editing.</p><p id="p-40">In example 2, the original sequence has four weak binding sites and exhibits low measured expression (−0.101). 
After editing, the third weak binding site is changed to a strong binding site, the fourth weak binding site is deleted, and four additional strong binding sites are added to the sequence. This edit, with a predicted expression of 4.58, is also consistent with the finding of Davis <em>et al.</em> that weak binding sites increase reporter gene expression given the presence of at least one strong binding site.</p><p id="p-41">In the third example, EDA moves a CREB1 motif further away from the minimal promoter element in order to reduce expression. This transformation of the sequence aligns with the previously reported observation from the MPRA study that the distance of motifs from the core promoter is negatively correlated with expression. These examples are presented with the caveat that they cannot be used to show that the model has “learned” particular rules; they show only that the results from the EDA architecture align with known experimental correlations between CREB1 binding sites and reporter expression from the MPRA study.</p></div></div><div class="section" id="sec-17"><h2 class="">5 Conclusion</h2><p id="p-42">The EDA architecture proposed for targeted genomic editing brings together a broad array of techniques from attention-based seq2seq models, adversarial example generation, and computer vision. The architecture leverages existing genomic predictors to generate candidate modifications of sequences with diverse properties such as binding probability of a transcription factor or reporter gene expression levels. 
In the first case study of optimizing binding of the SPI1 transcription factor, we compared the EDA model to a neural baseline, the Sequence VAE model, and to a rule-based baseline, and showed that EDA vastly improves upon existing models in both predicted binding affinity and similarity of original to edited sequences.</p><p id="p-43">In the second case study, where we optimized binding site configurations of the CREB1 transcription factor in regulatory DNA sequences to tune reporter expression levels, we showed that a high proportion of the edited sequences show the desired shift in expression as predicted by an independent model. Furthermore, several edited sequences displayed CREB1 motif configurations in terms of binding site strength, density and position that agreed with previously derived rules.</p><p id="p-44">This study primarily serves as a proof of concept and introduction of a novel neural architecture for targeted DNA sequence editing. In this work, we used independent predictors of the desired properties of DNA sequences to computationally validate the edited sequences. In the near future, we plan to provide experimental validation of the properties of edited sequences as more definitive support for our approach. EDA is flexible and can be adapted to other applications involving targeted DNA and RNA editing. We expect that further advances in generative models that can perform targeted editing of biological sequences such as DNA, RNA and proteins have the potential to complement and improve the precision of experimental approaches for genome engineering and synthetic biology.</p></div><div class="section ack" id="ack-1"><h2 class="">6 Acknowledgements</h2><p id="p-45">We would like to thank Georgi Marinov for his help with processing and understanding the MPRA dataset. 
We would like to thank the authors of Davis <em>et al.</em> [<a id="xref-ref-3-3" class="xref-bibr" href="#ref-3">3</a>] for sharing their MPRA data pre-publication. We would like to thank Avanti Shrikumar and other members of the Kundaje lab for helpful discussions.</p><p id="p-46">This work was supported by NIH grants 1DP2GM123485, 1U01HG009431 and 1R01HG00967401 awarded to AK.</p></div><div class="section fn-group" id="fn-group-1"><h2>Footnotes</h2><ul><li class="fn" id="fn-1"><p id="p-1"><span class="em-link"><span class="em-addr">akundaje{at}stanford.edu</span></span></p></li></ul></div><div class="section ref-list" id="ref-list-1"><h2 class="">References</h2><ol class="cit-list ref-use-labels"><li><span class="ref-label">[1].</span><a class="rev-xref-ref" href="#xref-ref-1-1" title="View reference [1] in text" id="ref-1">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.1"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">S. R.</span> <span class="cit-name-surname">Bowman</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">L.</span> <span class="cit-name-surname">Vilnis</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">O.</span> <span class="cit-name-surname">Vinyals</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">A. M.</span> <span class="cit-name-surname">Dai</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">R.</span> <span class="cit-name-surname">Jozefowicz</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">S.</span> <span class="cit-name-surname">Bengio</span></span>. <span class="cit-article-title">Generating sentences from a continuous space</span>. 
<abbr class="cit-jnl-abbrev">arXiv</abbr> preprint<span class="cit-pub-id-sep cit-pub-id-arxiv-sep"> </span><span class="cit-pub-id-scheme">arXiv:</span><span class="cit-pub-id cit-pub-id-arxiv">1511.06349</span>, <span class="cit-pub-date">2015</span>.</cite></div><div class="cit-extra"></div></div></li><li><span class="ref-label">[2].</span><div class="cit ref-cit ref-journal no-rev-xref" id="cit-714402v2.2" data-doi="10.1038/nature11247"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">E. P.</span> <span class="cit-name-surname">Consortium</span></span> <span class="cit-etal">et al.</span> <span class="cit-article-title">An integrated encyclopedia of dna elements in the human genome</span>. <abbr class="cit-jnl-abbrev">Nature</abbr>, <span class="cit-vol">489</span>(<span class="cit-issue">7414</span>):<span class="cit-fpage">57</span>, <span class="cit-pub-date">2012</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DNature%26rft.stitle%253DNature%26rft.aulast%253DBernstein%26rft.auinit1%253DB.%2BE.%26rft.volume%253D489%26rft.issue%253D7414%26rft.spage%253D57%26rft.epage%253D74%26rft.atitle%253DAn%2Bintegrated%2Bencyclopedia%2Bof%2BDNA%2Belements%2Bin%2Bthe%2Bhuman%2Bgenome.%26rft_id%253Dinfo%253Adoi%252F10.1038%252Fnature11247%26rft_id%253Dinfo%253Apmid%252F22955616%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a><a href="/lookup/external-ref?access_num=10.1038/nature11247&link_type=DOI" class="cit-ref-sprinkles cit-ref-sprinkles-doi cit-ref-sprinkles-crossref"><span>CrossRef</span></a><a href="/lookup/external-ref?access_num=22955616&link_type=MED&atom=%2Fbiorxiv%2Fearly%2F2019%2F07%2F27%2F714402.atom" class="cit-ref-sprinkles 
cit-ref-sprinkles-medline"><span>PubMed</span></a><a href="/lookup/external-ref?access_num=000308347000039&link_type=ISI" class="cit-ref-sprinkles cit-ref-sprinkles-newisilink cit-ref-sprinkles-webofscience"><span>Web of Science</span></a></div></div></li><li><span class="ref-label">[3].</span><a class="rev-xref-ref" href="#xref-ref-3-1" title="View reference [3] in text" id="ref-3">↵</a><div class="cit ref-cit ref-web" id="cit-714402v2.3" data-doi="10.1101/625434"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">J. E.</span> <span class="cit-name-surname">Davis</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">K. D.</span> <span class="cit-name-surname">Insigne</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">E. M.</span> <span class="cit-name-surname">Jones</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">Q. B.</span> <span class="cit-name-surname">Hastings</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">S.</span> <span class="cit-name-surname">Kosuri</span></span>. <span class="cit-article-title">Multiplexed dissection of a model human transcription factor binding site architecture</span>. 
<span class="cit-source">bioRxiv</span>, <span class="cit-pub-date">2019</span>.<span class="cit-pub-id-sep cit-pub-id-doi-sep"> </span><span class="cit-pub-id-scheme">doi: </span><span class="cit-pub-id cit-pub-id-doi">10.1101/625434</span><span class="cit-pub-id-sep cit-pub-id-doi-sep">.</span> URL <a href="https://www.biorxiv.org/content/early/2019/05/02/625434">https://www.biorxiv.org/content/early/2019/05/02/625434</a>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DbioRxiv%26rft_id%253Dinfo%253Adoi%252F10.1101%252F625434%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a><a href="/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiYmlvcnhpdiI7czo1OiJyZXNpZCI7czo4OiI2MjU0MzR2MiI7czo0OiJhdG9tIjtzOjM3OiIvYmlvcnhpdi9lYXJseS8yMDE5LzA3LzI3LzcxNDQwMi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=" class="cit-ref-sprinkles cit-ref-sprinkles-ijlink"><span><span class="cit-reflinks-abstract">Abstract</span><span class="cit-sep cit-reflinks-variant-name-sep">/</span><span class="cit-reflinks-full-text"><span class="free-full-text">FREE </span>Full Text</span></span></a></div></div></li><li><span class="ref-label">[4].</span><a class="rev-xref-ref" href="#xref-ref-4-1" title="View reference [4] in text" id="ref-4">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.4"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">M.</span> <span class="cit-name-surname">Ghandi</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">D.</span> <span class="cit-name-surname">Lee</span></span>, <span class="cit-auth"> <span 
class="cit-name-given-names">M.</span> <span class="cit-name-surname">Mohammad-Noori</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">M. A.</span> <span class="cit-name-surname">Beer</span></span>. <span class="cit-article-title">Enhanced regulatory sequence prediction using gapped k-mer features</span>. <abbr class="cit-jnl-abbrev">PLoS computational biology</abbr>, <span class="cit-vol">10</span>(<span class="cit-issue">7</span>):<span class="cit-fpage">e1003711</span>, <span class="cit-pub-date">2014</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DPLoS%2Bcomputational%2Bbiology%26rft.volume%253D10%26rft.spage%253D1003711e%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a></div></div></li><li><span class="ref-label">[5].</span><a class="rev-xref-ref" href="#xref-ref-5-1" title="View reference [5] in text" id="ref-5">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.5"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">I. J.</span> <span class="cit-name-surname">Goodfellow</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">J.</span> <span class="cit-name-surname">Shlens</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">C.</span> <span class="cit-name-surname">Szegedy</span></span>. <span class="cit-article-title">Explaining and harnessing adversarial examples</span>. 
<abbr class="cit-jnl-abbrev">stat</abbr>, <span class="cit-vol">1050</span>:<span class="cit-fpage">20</span>, <span class="cit-pub-date">2015</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253Dstat%26rft.volume%253D1050%26rft.spage%253D20%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a></div></div></li><li><span class="ref-label">[6].</span><a class="rev-xref-ref" href="#xref-ref-6-1" title="View reference [6] in text" id="ref-6">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.6"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">A.</span> <span class="cit-name-surname">Gupta</span></span> and <span class="cit-auth"> <span class="cit-name-given-names">J.</span> <span class="cit-name-surname">Zou</span></span>. <span class="cit-article-title">Feedback GAN for DNA optimizes protein functions</span>. 
<abbr class="cit-jnl-abbrev">Nature Machine Intelligence</abbr>, <span class="cit-vol">1</span>(<span class="cit-issue">2</span>):<span class="cit-fpage">105</span>, <span class="cit-pub-date">2019</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DNature%2BMachine%2BIntelligence%26rft.volume%253D1%26rft.spage%253D105%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a></div></div></li><li><span class="ref-label">[7].</span><a class="rev-xref-ref" href="#xref-ref-7-1" title="View reference [7] in text" id="ref-7">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.7"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">A.</span> <span class="cit-name-surname">Gupta</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">A. T.</span> <span class="cit-name-surname">Müller</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">B. J.</span> <span class="cit-name-surname">Huisman</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">J. A.</span> <span class="cit-name-surname">Fuchs</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">P.</span> <span class="cit-name-surname">Schneider</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">G.</span> <span class="cit-name-surname">Schneider</span></span>. <span class="cit-article-title">Generative recurrent networks for de novo drug design</span>. 
<abbr class="cit-jnl-abbrev">Molecular informatics</abbr>, <span class="cit-vol">37</span>(<span class="cit-issue">1-2</span>):<span class="cit-fpage">1700111</span>, <span class="cit-pub-date">2018</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DMolecular%2Binformatics%26rft.volume%253D37%26rft.spage%253D1700111%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a></div></div></li><li><span class="ref-label">[8].</span><a class="rev-xref-ref" href="#xref-ref-8-1" title="View reference [8] in text" id="ref-8">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.8"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">K.</span> <span class="cit-name-surname">Guu</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">T. B.</span> <span class="cit-name-surname">Hashimoto</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">Y.</span> <span class="cit-name-surname">Oren</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">P.</span> <span class="cit-name-surname">Liang</span></span>. <span class="cit-article-title">Generating sentences by editing prototypes</span>. 
<abbr class="cit-jnl-abbrev">Transactions of the Association for Computational Linguistics</abbr>, <span class="cit-vol">6</span>:<span class="cit-fpage">437</span>–<span class="cit-lpage">450</span>, <span class="cit-pub-date">2018</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DTransactions%2Bof%2Bthe%2BAssociation%2Bof%2BComputational%2BLinguistics%26rft.volume%253D6%26rft.spage%253D437%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a></div></div></li><li><span class="ref-label">[9].</span><a class="rev-xref-ref" href="#xref-ref-9-1" title="View reference [9] in text" id="ref-9">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.9" data-doi="10.1016/j.molcel.2010.05.004"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">S.</span> <span class="cit-name-surname">Heinz</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">C.</span> <span class="cit-name-surname">Benner</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">N.</span> <span class="cit-name-surname">Spann</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">E.</span> <span class="cit-name-surname">Bertolino</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">Y. C.</span> <span class="cit-name-surname">Lin</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">P.</span> <span class="cit-name-surname">Laslo</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">J. 
X.</span> <span class="cit-name-surname">Cheng</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">C.</span> <span class="cit-name-surname">Murre</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">H.</span> <span class="cit-name-surname">Singh</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">C. K.</span> <span class="cit-name-surname">Glass</span></span>. <span class="cit-article-title">Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities</span>. <abbr class="cit-jnl-abbrev">Molecular cell</abbr>, <span class="cit-vol">38</span>(<span class="cit-issue">4</span>):<span class="cit-fpage">576</span>–<span class="cit-lpage">589</span>, <span class="cit-pub-date">2010</span>.</cite></div><div class="cit-extra"><a href="{openurl}?query=rft.jtitle%253DMolecular%2Bcell%26rft.stitle%253DMol%2BCell%26rft.aulast%253DHeinz%26rft.auinit1%253DS.%26rft.volume%253D38%26rft.issue%253D4%26rft.spage%253D576%26rft.epage%253D589%26rft.atitle%253DSimple%2Bcombinations%2Bof%2Blineage-determining%2Btranscription%2Bfactors%2Bprime%2Bcis-regulatory%2Belements%2Brequired%2Bfor%2Bmacrophage%2Band%2BB%2Bcell%2Bidentities.%26rft_id%253Dinfo%253Adoi%252F10.1016%252Fj.molcel.2010.05.004%26rft_id%253Dinfo%253Apmid%252F20513432%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx" class="cit-ref-sprinkles cit-ref-sprinkles-openurl cit-ref-sprinkles-open-url"><span>OpenUrl</span></a><a href="/lookup/external-ref?access_num=10.1016/j.molcel.2010.05.004&link_type=DOI" class="cit-ref-sprinkles cit-ref-sprinkles-doi cit-ref-sprinkles-crossref"><span>CrossRef</span></a><a href="/lookup/external-ref?access_num=20513432&link_type=MED&atom=%2Fbiorxiv%2Fearly%2F2019%2F07%2F27%2F714402.atom" 
class="cit-ref-sprinkles cit-ref-sprinkles-medline"><span>PubMed</span></a><a href="/lookup/external-ref?access_num=000278448100012&link_type=ISI" class="cit-ref-sprinkles cit-ref-sprinkles-newisilink cit-ref-sprinkles-webofscience"><span>Web of Science</span></a></div></div></li><li><span class="ref-label">[10].</span><a class="rev-xref-ref" href="#xref-ref-10-1" title="View reference [10] in text" id="ref-10">↵</a><div class="cit ref-cit ref-journal" id="cit-714402v2.10"><div class="cit-metadata"><cite><span class="cit-auth"> <span class="cit-name-given-names">N.</span> <span class="cit-name-surname">Killoran</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">L. J.</span> <span class="cit-name-surname">Lee</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">A.</span> <span class="cit-name-surname">Delong</span></span>, <span class="cit-auth"> <span class="cit-name-given-names">D.</span> <span class="cit-name-surname">Duvenaud</span></span>, and <span class="cit-auth"> <span class="cit-name-given-names">B. J.</span> <span class="cit-name-surname">Frey</span></span>. <span class="cit-article-title">Generating and designing DNA with deep generative models</span>. 
<abbr class="cit-jnl-abbrev">arXiv</abbr> preprint<span class="cit-pub-id-sep cit-pub-id-arxiv-sep"> </span><span class="cit-pub-id-scheme">arXiv:</span><span class="cit-pub-id cit-pub-id-arxiv">1712.06148</span>, <span class="cit-pub-date">2017</span>.</cite></div><div class="cit-extra"></div></div></li></ol></div><span class="highwire-journal-article-marker-end"></span></div><span class="related-urls"></span></div></div> </div> </div> </div> </div> </div> </div></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-disqus-comment" > <div class="pane-content"> <div id="disqus_thread"><noscript><p><a href="http://biorxivstage.disqus.com/?url=https%3A%2F%2Fwww.biorxiv.org%2Fcontent%2F10.1101%2F714402v2" class="" data-icon-position="" data-hide-link-title="0">View the discussion thread.</a></p></noscript></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-highwire-back-to-top" > <div class="pane-content"> <a href="#page" class="back-to-top" data-icon-position="" data-hide-link-title="0"><span class="icon-chevron-up"></span> Back to top</a> </div> </div> </div> </div> </div> <div class="sidebar-right-wrapper grid-10 omega"> <div class="panel-panel panel-region-sidebar-right"> <div class="inside"><div class="panel-pane pane-highwire-node-pager" > <div class="pane-content"> <div class="pager highwire-pager pager-mini clearfix highwire-node-pager highwire-article-pager"><span class="pager-prev"><a href="/content/10.1101/702621v2" title="The development of methodology and techniques for crop disease identification" rel="prev" class="pager-link-prev link-icon"><span class="icon-circle-arrow-left"></span> <span class="title">Previous</span></a></span><span class="pager-next"><a href="/content/10.1101/569319v4" title="Neonatal morphometric similarity mapping for predicting brain age and characterizing neuroanatomic variation associated with preterm birth" rel="next" class="pager-link-next link-icon-right 
link-icon"><span class="title">Next</span> <span class="icon-circle-arrow-right"></span></a></span></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-custom pane-1" > <div class="pane-content"> Posted July 28, 2019. </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-panels-mini pane-biorxiv-art-tools" > <div class="pane-content"> <div id="mini-panel-biorxiv_art_tools" class="highwire-2col-stacked panel-display"> <div class="panel-row-wrapper clearfix"> <div class="content-left-wrapper content-column"> <div class="panel-panel panel-region-content-left"> <div class="inside"><div class="panel-pane pane-highwire-variant-link" > <div class="pane-content"> <a href="/content/10.1101/714402v2.full.pdf" target="_self" class="article-dl-pdf-link link-icon"><span class="icon-external-link-sign"></span> <span class="title">Download PDF</span></a> </div> </div> </div> </div> </div> <div class="content-right-wrapper content-column"> <div class="panel-panel panel-region-content-right"> <div class="inside"><div class="panel-pane pane-minipanel-dialog-link pane-biorxiv-art-email" > <div class="pane-content"> <div class='minipanel-dialog-wrapper'><div class='minipanel-dialog-link-link'><a href="/" oncontextmenu="javascript: return false;" class="minipanel-dialog-link-trigger" title="Email this Article" data-icon-position="" data-hide-link-title="0"><i class = 'icon-envelope'></i> Email</a></div><div class='minipanel-dialog-link-mini' style='display:none'><div class="panel-display panel-1col clearfix" id="mini-panel-biorxiv_art_email"> <div class="panel-panel panel-col"> <div><div class="panel-pane pane-block pane-forward-form pane-forward" > <div class="pane-content"> <form action="/content/10.1101/714402v2.full" method="post" id="forward-form" accept-charset="UTF-8"><div><div id="edit-instructions" class="form-item form-item-label-before form-type-item"> <p>Thank you for your interest in spreading the word about 
bioRxiv.</p><p>NOTE: Your email address is requested solely to identify you as the sender of this article.</p> </div> <div class="form-item form-item-label-before form-type-textfield form-item-email"> <label for="edit-email">Your Email <span class="form-required" title="This field is required.">*</span></label> <input type="text" id="edit-email" name="email" value="" size="58" maxlength="256" class="form-text required" /> </div> <div class="form-item form-item-label-before form-type-textfield form-item-name"> <label for="edit-name">Your Name <span class="form-required" title="This field is required.">*</span></label> <input type="text" id="edit-name" name="name" value="" size="58" maxlength="256" class="form-text required" /> </div> <div class="form-item form-item-label-before form-type-textarea form-item-recipients"> <label for="edit-recipients">Send To <span class="form-required" title="This field is required.">*</span></label> <div class="form-textarea-wrapper resizable"><textarea id="edit-recipients" name="recipients" cols="50" rows="5" class="form-textarea required"></textarea></div> <div class="description">Enter multiple addresses on separate lines or separate them with commas.</div> </div> <div id="edit-page" class="form-item form-item-label-before form-type-item"> <label for="edit-page">You are going to email the following </label> <a href="/content/10.1101/714402v2" class="active" data-icon-position="" data-hide-link-title="0">Targeted optimization of regulatory DNA sequences with neural editing architectures</a> </div> <div id="edit-subject" class="form-item form-item-label-before form-type-item"> <label for="edit-subject">Message Subject </label> (Your Name) has forwarded a page to you from bioRxiv </div> <div id="edit-body" class="form-item form-item-label-before form-type-item"> <label for="edit-body">Message Body </label> (Your Name) thought you would like to see this page from the bioRxiv website. 
</div> <div class="form-item form-item-label-before form-type-textarea form-item-message"> <label for="edit-message--2">Your Personal Message </label> <div class="form-textarea-wrapper resizable"><textarea id="edit-message--2" name="message" cols="50" rows="10" class="form-textarea"></textarea></div> </div> <input type="hidden" name="path" value="node/843498" /> <input type="hidden" name="path_cid" value="" /> <input type="hidden" name="forward_footer" value=" " /> <input type="hidden" name="form_build_id" value="form-EyjQYT3_xrxOrWSDbsrRqg9ExaOQFmZrv52bX8QXMc4" /> <input type="hidden" name="form_id" value="forward_form" /> <fieldset class="captcha form-wrapper"><legend><span class="fieldset-legend">CAPTCHA</span></legend><div class="fieldset-wrapper"><div class="fieldset-description">This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.</div><input type="hidden" name="captcha_sid" value="853345465" /> <input type="hidden" name="captcha_token" value="206df3429a8c0636a75757b9861df196" /> <input type="hidden" name="captcha_response" value="Google no captcha" /> <div class="g-recaptcha" data-sitekey="6LfnJVIUAAAAAE-bUOMg0MJGki4lqSvDmhJp19fN" data-theme="light" data-type="image"></div></div></fieldset> <div class="form-actions form-wrapper" id="edit-actions"><input type="submit" id="edit-submit" name="op" value="Send Message" class="form-submit" /></div></div></form> </div> </div> </div> </div> </div> </div></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-highwire-share-link highwire_clipboard_link_ajax" id="shareit"> <div class="pane-content"> <a href="/" class="link-icon"><span class="icon-share-alt"></span> <span class="title">Share</span></a> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-panels-mini pane-biorxiv-share highwire_clipboard_form_ajax_shareit" > <div class="pane-content"> <div class="panel-display omega-12-onecol" 
id="mini-panel-biorxiv_share"> <div class="panel-panel grid-12 panel-region-preface"> <div class="inside"><div class="panel-pane pane-highwire-article-citation" > <div class="pane-content"> <div class="highwire-article-citation highwire-citation-type-highwire-article node843498--3" data-node-nid="843498" id="node-843498--4816892418" data-pisa="biorxiv;714402v2" data-pisa-master="biorxiv;714402" data-seqnum="843498" data-apath="/biorxiv/early/2019/07/27/714402.atom"><div class="highwire-cite highwire-cite-highwire-article highwire-citation-biorxiv-article-pap-list clearfix" > <div class="highwire-cite-title" > <div class="highwire-cite-title">Targeted optimization of regulatory DNA sequences with neural editing architectures</div> </div> <div class="highwire-cite-authors" ><span class="highwire-citation-authors"><span class="highwire-citation-author first" data-delta="0"><span class="nlm-given-names">Anvita</span> <span class="nlm-surname">Gupta</span></span>, <span class="highwire-citation-author" data-delta="1"><span class="nlm-given-names">Anshul</span> <span class="nlm-surname">Kundaje</span></span></span></div> <div class="highwire-cite-metadata" ><span class="highwire-cite-metadata-journal highwire-cite-metadata">bioRxiv </span><span class="highwire-cite-metadata-pages highwire-cite-metadata">714402; </span><span class="highwire-cite-metadata-doi highwire-cite-metadata"><span class="doi_label">doi:</span> https://doi.org/10.1101/714402 </span></div> </div> </div> </div> </div> </div> </div> <div class="panel-panel grid-12 panel-region-content"> <div class="inside"><div class="panel-pane pane-highwire-article-clipboard-copy" > <div class="pane-content"> <div class = "clipboard-copy"> <span class="label-url"> <label for="dynamic">Share This Article:</label> </span> <span class="input-text-url"> <input type="text" id="dynamic" value="https://www.biorxiv.org/content/10.1101/714402v2" size="50"/> </span> <span class="copy-button button"> <button id="copy-dynamic" 
class="clipboardjs-button" data-clipboard-target="#dynamic" data-clipboard-alert-style="tooltip" data-clipboard-alert-text="Copied!">Copy</button> </span> </div> </div> </div> </div> </div> <div class="panel-panel grid-12 panel-region-postscript"> <div class="inside"><div class="panel-pane pane-service-links text-center" > <div class="pane-content"> <div class="service-links"><a href="http://twitter.com/share?url=https%3A//www.biorxiv.org/content/10.1101/714402v2&text=Targeted%20optimization%20of%20regulatory%20DNA%20sequences%20with%20neural%20editing%20architectures" id="twitter" title="Share this on Twitter" class="service-links-twitter" rel="nofollow" data-icon-position="" data-hide-link-title="0"><img src="https://www.biorxiv.org/sites/all/modules/highwire/highwire/images/twitter.png" alt="Twitter logo" /></a> <a href="http://www.facebook.com/sharer.php?u=https%3A//www.biorxiv.org/content/10.1101/714402v2&t=Targeted%20optimization%20of%20regulatory%20DNA%20sequences%20with%20neural%20editing%20architectures" id="facebook" title="Share on Facebook" class="service-links-facebook" rel="nofollow" data-icon-position="" data-hide-link-title="0"><img src="https://www.biorxiv.org/sites/all/modules/highwire/highwire/images/fb-blue.png" alt="Facebook logo" /></a> <a href="http://www.linkedin.com/shareArticle?mini=true&url=https%3A//www.biorxiv.org/content/10.1101/714402v2&title=Targeted%20optimization%20of%20regulatory%20DNA%20sequences%20with%20neural%20editing%20architectures&summary=&source=bioRxiv" id="linkedin" title="Publish this post to LinkedIn" class="service-links-linkedin" rel="nofollow" data-icon-position="" data-hide-link-title="0"><img src="https://www.biorxiv.org/sites/all/modules/highwire/highwire/images/linkedin-32px.png" alt="LinkedIn logo" /></a> <a href="http://www.mendeley.com/import/?url=https%3A//www.biorxiv.org/content/10.1101/714402v2&title=Targeted%20optimization%20of%20regulatory%20DNA%20sequences%20with%20neural%20editing%20architectures" 
id="mendeley" title="Share on Mendeley" class="service-links-mendeley" rel="nofollow" data-icon-position="" data-hide-link-title="0"><img src="https://www.biorxiv.org/sites/all/modules/highwire/highwire/images/mendeley.png" alt="Mendeley logo" /></a></div> </div> </div> </div> </div> </div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-minipanel-dialog-link pane-biorxiv-cite-tool" > <div class="pane-content"> <div class='minipanel-dialog-wrapper'><div class='minipanel-dialog-link-link'><a href="/" oncontextmenu="javascript: return false;" class="minipanel-dialog-link-trigger link-icon" title="Citation Tools"><span class="icon-globe"></span> <span class="title">Citation Tools</span></a></div><div class='minipanel-dialog-link-mini' style='display:none'><div class="panel-display panel-1col clearfix" id="mini-panel-biorxiv_cite_tool"> <div class="panel-panel panel-col"> <div><div class="panel-pane pane-highwire-citation-export" > <div class="pane-content"> <div class="highwire-citation-export"> <div class="highwire-citation-info"> <div class="highwire-article-citation highwire-citation-type-highwire-article cite-tool-node843498--5" data-node-nid="843498" id="citation-node-843498--61955336851" data-pisa="biorxiv;714402v2" data-pisa-master="biorxiv;714402" data-seqnum="843498" data-apath="/biorxiv/early/2019/07/27/714402.atom"><div class="highwire-cite highwire-cite-highwire-article highwire-citation-biorxiv-article-pap-list clearfix" > <div class="highwire-cite-title" > <div class="highwire-cite-title">Targeted optimization of regulatory DNA sequences with neural editing architectures</div> </div> <div class="highwire-cite-authors" ><span class="highwire-citation-authors"><span class="highwire-citation-author first" data-delta="0"><span class="nlm-given-names">Anvita</span> <span class="nlm-surname">Gupta</span></span>, <span class="highwire-citation-author" data-delta="1"><span class="nlm-given-names">Anshul</span> <span 
class="nlm-surname">Kundaje</span></span></span></div> <div class="highwire-cite-metadata" ><span class="highwire-cite-metadata-journal highwire-cite-metadata">bioRxiv </span><span class="highwire-cite-metadata-pages highwire-cite-metadata">714402; </span><span class="highwire-cite-metadata-doi highwire-cite-metadata"><span class="doi_label">doi:</span> https://doi.org/10.1101/714402 </span></div> </div> </div> </div> <div class="highwire-citation-formats"> <h2>Citation Manager Formats</h2> <div class="highwire-citation-formats-links"> <span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.spage&rft.epage&rft.atitle=Targeted%20optimization%20of%20regulatory%20DNA%20sequences%20with%20neural%20editing%20architectures&rft.volume&rft.issue&rft.date=2019-01-01%2000%3A00%3A00&rft.stitle&rft.jtitle=bioRxiv&rft.au=Gupta%2C+Anvita&rft.au=Kundaje%2C+Anshul"></span><ul class="hw-citation-links inline button button-alt button-grid clearfix"><li class="bibtext first"><a href="/highwire/citation/843498/bibtext" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">BibTeX</a></li><li class="bookends"><a href="/highwire/citation/843498/bookends" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">Bookends</a></li><li class="easybib"><a href="/highwire/citation/843498/easybib" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">EasyBib</a></li><li class="endnote-tagged"><a href="/highwire/citation/843498/endnote-tagged" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">EndNote (tagged)</a></li><li class="endnote-8-xml"><a href="/highwire/citation/843498/endnote-8-xml" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">EndNote 8 (xml)</a></li><li class="medlars"><a href="/highwire/citation/843498/medlars" class="hw-download-citation-link" data-icon-position="" 
data-hide-link-title="0">Medlars</a></li><li class="mendeley"><a href="/highwire/citation/843498/mendeley" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">Mendeley</a></li><li class="papers"><a href="/highwire/citation/843498/papers" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">Papers</a></li><li class="refworks-tagged"><a href="/highwire/citation/843498/refworks-tagged" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">RefWorks Tagged</a></li><li class="reference-manager"><a href="/highwire/citation/843498/reference-manager" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">Ref Manager</a></li><li class="ris"><a href="/highwire/citation/843498/ris" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">RIS</a></li><li class="zotero last"><a href="/highwire/citation/843498/zotero" class="hw-download-citation-link" data-icon-position="" data-hide-link-title="0">Zotero</a></li></ul> </div> </div> </div> </div> </div> </div> </div> </div> </div></div> </div> </div> </div> </div> </div> </div> <!-- /.panel-row-wrapper --> </div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-service-links" > <div class="pane-content"> <div class="service-links"><div class="item-list"><ul><li class="first"><a href="http://twitter.com/share?url=https%3A//www.biorxiv.org/content/10.1101/714402v2&count=horizontal&via=&text=Targeted%20optimization%20of%20regulatory%20DNA%20sequences%20with%20neural%20editing%20architectures&counturl=https%3A//www.biorxiv.org/content/10.1101/714402v2" class="twitter-share-button service-links-twitter-widget" id="twitter_widget" title="Tweet This" rel="nofollow" data-icon-position="" data-hide-link-title="0"><span class="element-invisible">Tweet Widget</span></a></li><li><a 
href="http://www.facebook.com/plugins/like.php?href=https%3A//www.biorxiv.org/content/10.1101/714402v2&layout=button_count&show_faces=false&action=like&colorscheme=light&width=100&height=21&font=&locale=" id="facebook_like" title="I Like it" class="service-links-facebook-like" rel="nofollow" data-icon-position="" data-hide-link-title="0"><span class="element-invisible">Facebook Like</span></a></li><li class="last"><a href="https://www.biorxiv.org/content/10.1101/714402v2" id="google_plus_one" title="Plus it" class="service-links-google-plus-one" rel="nofollow" data-icon-position="" data-hide-link-title="0"><span class="element-invisible">Google Plus One</span></a></li></ul></div></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-highwire-article-collections" > <h2 class="pane-title">Subject Area</h2> <div class="pane-content"> <div class="highwire-list-wrapper highwire-article-collections"><div class="highwire-list"><ul class="highwire-article-collection-term-list"><li class="first last odd"><span class="highwire-article-collection-term"><a href="/collection/bioinformatics" class="highlight" data-icon-position="" data-hide-link-title="0">Bioinformatics<i class="icon-caret-right"></i> </a></span></li></ul></div></div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-panels-mini pane-biorxiv-subject-collections block-style-col2" > <div class="pane-content"> <div class="panel-flexible panels-flexible-new clearfix" id="mini-panel-biorxiv_subject_collections"> <div class="panel-flexible-inside panels-flexible-new-inside"> <div class="panels-flexible-region panels-flexible-region-new-center panels-flexible-region-first panels-flexible-region-last"> <div class="inside panels-flexible-region-inside panels-flexible-region-new-center-inside panels-flexible-region-inside-first panels-flexible-region-inside-last"> <div class="panel-pane pane-snippet" > <div class="pane-content"> <div class="snippet 
biorxiv-subject-areas-table-title" id="biorxiv-subject-areas-table-title"> <div class="snippet-content"> <b>Subject Areas</b> </div> </div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-snippet" > <div class="pane-content"> <div class="snippet biorxiv-subject-areas-view-papers" id="biorxiv-subject-areas-view-papers"> <div class="snippet-content"> <a href="/content/early/recent"><strong>All Articles</strong></a> </div> </div> </div> </div> <div class="panel-separator"></div><div class="panel-pane pane-highwire-subject-collections" > <div class="pane-content"> <ul id="collection" class="collection highwire-list-expand"><li class="outer collection depth-2 child first"><div class = "data-wrapper"><a href="/collection/animal-behavior-and-cognition" class="" data-icon-position="" data-hide-link-title="0">Animal Behavior and Cognition</a> <span class = "article-count">(5936)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/biochemistry" class="" data-icon-position="" data-hide-link-title="0">Biochemistry</a> <span class = "article-count">(13463)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/bioengineering" class="" data-icon-position="" data-hide-link-title="0">Bioengineering</a> <span class = "article-count">(10243)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/bioinformatics" class="" data-icon-position="" data-hide-link-title="0">Bioinformatics</a> <span class = "article-count">(32711)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/biophysics" class="" data-icon-position="" data-hide-link-title="0">Biophysics</a> <span class = "article-count">(16857)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/cancer-biology" class="" 
data-icon-position="" data-hide-link-title="0">Cancer Biology</a> <span class = "article-count">(13919)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/cell-biology" class="" data-icon-position="" data-hide-link-title="0">Cell Biology</a> <span class = "article-count">(19777)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/clinical-trials" class="" data-icon-position="" data-hide-link-title="0">Clinical Trials</a> <span class = "article-count">(138)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/developmental-biology" class="" data-icon-position="" data-hide-link-title="0">Developmental Biology</a> <span class = "article-count">(10692)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/ecology" class="" data-icon-position="" data-hide-link-title="0">Ecology</a> <span class = "article-count">(15811)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/epidemiology" class="" data-icon-position="" data-hide-link-title="0">Epidemiology</a> <span class = "article-count">(2067)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/evolutionary-biology" class="" data-icon-position="" data-hide-link-title="0">Evolutionary Biology</a> <span class = "article-count">(20139)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/genetics" class="" data-icon-position="" data-hide-link-title="0">Genetics</a> <span class = "article-count">(13277)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/genomics" class="" data-icon-position="" data-hide-link-title="0">Genomics</a> <span class = 
"article-count">(18436)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/immunology" class="" data-icon-position="" data-hide-link-title="0">Immunology</a> <span class = "article-count">(13527)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/microbiology" class="" data-icon-position="" data-hide-link-title="0">Microbiology</a> <span class = "article-count">(31688)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/molecular-biology" class="" data-icon-position="" data-hide-link-title="0">Molecular Biology</a> <span class = "article-count">(13209)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/neuroscience" class="" data-icon-position="" data-hide-link-title="0">Neuroscience</a> <span class = "article-count">(69030)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/paleontology" class="" data-icon-position="" data-hide-link-title="0">Paleontology</a> <span class = "article-count">(512)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/pathology" class="" data-icon-position="" data-hide-link-title="0">Pathology</a> <span class = "article-count">(2148)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/pharmacology-and-toxicology" class="" data-icon-position="" data-hide-link-title="0">Pharmacology and Toxicology</a> <span class = "article-count">(3691)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/physiology" class="" data-icon-position="" data-hide-link-title="0">Physiology</a> <span class = "article-count">(5767)</span></div></li> <li class="outer collection depth-2 child"><div class = 
"data-wrapper"><a href="/collection/plant-biology" class="" data-icon-position="" data-hide-link-title="0">Plant Biology</a> <span class = "article-count">(11852)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/scientific-communication-and-education" class="" data-icon-position="" data-hide-link-title="0">Scientific Communication and Education</a> <span class = "article-count">(1795)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/synthetic-biology" class="" data-icon-position="" data-hide-link-title="0">Synthetic Biology</a> <span class = "article-count">(3323)</span></div></li> <li class="outer collection depth-2 child"><div class = "data-wrapper"><a href="/collection/systems-biology" class="" data-icon-position="" data-hide-link-title="0">Systems Biology</a> <span class = "article-count">(8074)</span></div></li> <li class="outer collection depth-2 child last"><div class = "data-wrapper"><a href="/collection/zoology" class="" data-icon-position="" data-hide-link-title="0">Zoology</a> <span class = "article-count">(1829)</span></div></li> </ul> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <!-- /.panel-row-wrapper --> </div> </div> </div> </div> </div> </div> </div> </section> </div> <div class="region region-page-bottom" id="region-page-bottom"> <div class="region-inner region-page-bottom-inner"> </div> </div><script type="text/javascript" src="https://www.biorxiv.org/sites/default/files/advagg_js/js__VNH5GD8Zz7g6_hGCOZjVjdMndxxC6naiExSSJKX3k_A__kzmEdtbqCsJx5Z9yg2Qy0PptB__ufXm-TxUOl0KZzUw__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js"></script> <script type="text/javascript" src="https://www.biorxiv.org/sites/default/files/advagg_js/js__2WRbxlwOW0MEUc_hSWU5MBepQg6Lch6O5SZwefpJ6IE__HCL0YQJqLkOhrLPZZYGqosGvtFsEHMGghHIkSx4y9vA__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js" defer="defer"></script> 
<script type="text/javascript" src="https://www.biorxiv.org/sites/default/files/advagg_js/js__N7ERJBYsOWyRnJgyoM125_Aiez2MOJGaUofG1JdWWBg__2cpwCQ7-xzTVeVvg_KOzwA1jka23oWApDPpgjoZKDCY__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js"></script> <script type="text/javascript" src="//d33xdlntwy0kbs.cloudfront.net/cshl_custom.js"></script> <script type="text/javascript" src="https://www.biorxiv.org/sites/default/files/advagg_js/js__BmqjBnkz3MgYCAoc25s1lDRMEjLhC3mEPVonUFIHi08__Unwv5-ZIuHBfFwytsjEx1niBVJ7n1T4lPws7VrkHXM4__zobGfsaKXlDabWfoD1KpWEvu77rXU7DVt-BhI_kXxVw.js"></script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- function euCookieComplianceLoadScripts() {} //--><!]]> </script> <script type="text/javascript"> <!--//--><![CDATA[//><!-- var eu_cookie_compliance_cookie_name = ""; //--><!]]> </script> </body> </html>