<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8" /> <meta property="og:title" content="Representative submissions of identical sequences for variation studies" /> <meta property="og:url" content="https://www.ddbj.nig.ac.jp/ddbj/representative-sequence-e.html" /> <meta property="og:description" content="Representative submissions of identical sequences for variation studiesRecently,..." /> <meta property="og:image" content="/images/thumbnail/logo_ddbj_fb.png" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>Representative submissions of identical sequences for variation studies</title> <script async src="https://www.google-analytics.com/analytics.js"></script> <script src="https://code.jquery.com/jquery-3.5.0.js" integrity="sha256-r/AaFHrszJtwpe+tHyNi/XCfMxYpbsRg2Uqn0x3s2zc=" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery.hoverintent/1.10.1/jquery.hoverIntent.min.js" integrity="sha512-gx3WTM6qxahpOC/hBNUvkdZARQ2ObXSp/m+jmsEN8ZNJPymj8/Jamf8+/3kJQY1RZA2DR+KQfT+b3JEB0r9YRg==" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/spin.js/4.1.0/spin.min.js" integrity="sha512-CbohqWjAgarTqRHcX1MbwkF2pujwbsCee1PABpnBWC+VqSldvlNEEI5+4OSsR/HbFQOFFpwY2YvZZNjBMxNnXg==" crossorigin="anonymous"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery.colorbox/1.6.4/jquery.colorbox-min.js"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery-deparam/0.5.3/jquery-deparam.min.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.trace.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.json_search.js"></script> <link rel="icon" href="https://www.ddbj.nig.ac.jp/assets/images/favicon_ddbj.ico"> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/colorbox.css" /> <link rel="stylesheet" 
href="https://www.ddbj.nig.ac.jp/assets/css/main.css" /> <link rel="alternate" type="application/rss+xml" title="My Site RSS" href="/feed.xml" /> <script src="https://www.ddbj.nig.ac.jp/assets/js/main.js"></script> </head> <body data-category="ddbj"> <script src="https://www.ddbj.nig.ac.jp/assets/js/ddbj_common_framework.js" id="DDBJ_common_framework" style="display: block; height: 40px;" data-bottom-menu="true" data-ddbj-home-page="true" data-search="true" ></script> <section class="top-news-view"> <div class="inner"> <ul> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-10-22-e">On Cyber Threats against DDBJ, a node of the International Nucleotide Sequence Database Collaboration</a> </li> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-11-22-e">(27th November 9:00-November 28th 12:00)Announcement of D-way/MSS suspension</a> </li> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/">(12th December 8:00-20th December 12:00(JST)) Suspension of DDBJ services due to NIG supercomputer maintenance</a> </li> </ul> </div> </section> <div id="primary"> <header id="PageHeader"> <div class="inner"> <div class="page-title"> <p class="title -normal">DDBJ Annotated/Assembled Sequences</p> </div> <nav class="tab-menu-view"> <ul class="tabmenucontainer"> <li class=" -current"> <a href="/ddbj/index-e.html">Home</a> </li> <li class=" -haschild"> <a href="/ddbj/submission-e.html">Submission</a> <ul> <li> <a href="/ddbj/submission-e.html">Before Submission</a> </li> <li> <a href="/ddbj/web-submission-e.html">Web submission</a> </li> <li> <a href="/ddbj/mss-e.html">Mass Submission</a> </li> <li> <a href="/ddbj/update-e.html">Data Update</a> </li> </ul> </li> <li class=" -haschild"> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">Search</a> <ul> <li> <a href="http://getentry.ddbj.nig.ac.jp/top-e.html">getentry</a> </li> <li> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">ARSA</a> </li> </ul> </li> <li class=" -haschild"> <a 
href="/ddbj/flat-file-e.html">Flat file</a> <ul> <li> <a href="/ddbj/feature-table-e.html">Feature Table</a> </li> <li> <a href="/ddbj/features-e.html">Feature key</a> </li> <li> <a href="/ddbj/qualifiers-e.html">Qualifier key</a> </li> <li> <a href="/ddbj/sequence-e.html">Nucleotide Sequences</a> </li> <li> <a href="/ddbj/organism-e.html">Organism qualifier</a> </li> <li> <a href="/ddbj/identifiers-e.html">Identifiers</a> </li> <li> <a href="/ddbj/location-e.html">Description of Location</a> </li> <li> <a href="/ddbj/cds-e.html">Protein Coding Sequence</a> </li> <li> <a href="/ddbj/geneticcode-e.html">The Genetic Codes</a> </li> <li> <a href="/ddbj/code-e.html">Codes Used in Sequence Description</a> </li> <li> <a href="/ddbj/example-e.html">Description Examples of Sequence Data</a> </li> </ul> </li> <li class=" -haschild"> <a href="/ddbj/data-categories-e.html">Data categories</a> <ul> <li> <a href="/ddbj/genome-e.html">Data Submission from Genome Project</a> </li> <li> <a href="/ddbj/pseudohaplotype-e.html">Pseudohaplotype</a> </li> <li> <a href="/ddbj/wgs-e.html">WGS</a> </li> <li> <a href="/ddbj/finished_level_genome-e.html">Finished level genomic sequences</a> </li> <li> <a href="/ddbj/metagenome-assembly-e.html">Metagenome Assembly</a> </li> <li> <a href="/ddbj/single-amplified-genome-e.html">Single amplified genome</a> </li> <li> <a href="/ddbj/htg-e.html">HTG</a> </li> <li> <a href="/ddbj/environmental-e.html">Environmental sample</a> </li> <li> <a href="/ddbj/env-e.html">ENV</a> </li> <li> <a href="/ddbj/tls-e.html">TLS</a> </li> <li> <a href="/ddbj/transcriptome-e.html">Data Submission from Transcriptome Project</a> </li> <li> <a href="/ddbj/tsa-e.html">TSA</a> </li> <li> <a href="/ddbj/est-e.html">EST</a> </li> <li> <a href="/ddbj/htc-e.html">HTC</a> </li> <li> <a href="/ddbj/tpa-e.html">Third Party Data (TPA)</a> </li> </ul> </li> <li class=""> <a href="/faq/en/index-e.html?tag=ddbj">FAQ</a> </li> <li class=" -haschild"> <a 
href="/ddbj/index-e.html">Other</a> <ul> <li> <a href="/ddbj/patent-data-e.html">Patent</a> </li> <li> <a href="/ddbj/mga-e.html">MGA</a> </li> </ul> </li> </ul> </nav> </div> </header> <section id="NavigationAndMainView"> <div class="inner"> <div class="subview"> <nav id="TableOfContents" class="internal-link"> </nav> </div> <section id="MainContentView" class="mainview"> <header class="header"> <nav class="breadcrumb-view"> <ul> <li> <a href="https://www.ddbj.nig.ac.jp/index-e.html">Home</a> </li> <li> <a href="https://www.ddbj.nig.ac.jp/ddbj/index-e.html">ddbj</a> </li> <li><a>Representative submissions of identical sequences for variation studies</a></li> </ul> </nav> <h1 class="title">Representative submissions of identical sequences for variation studies</h1> </header> <main class="md-content"> <h1 id="representative-submissions-of-identical-sequences-for-variation-studies">Representative submissions of identical sequences for variation studies</h1> <p>Recently, variation studies associated with re-sequencing projects have increased, and so has the volume of sequence data from these projects. 
<br /> <span style="color:red">DDBJ (INSDC) basically accepts all sequence data, regardless of source and sequence identity</span>. However, if this policy were applied strictly, some of the data would be highly redundant.</p> <p>To take advantage of normalisation in variation studies, DDBJ also accepts a single submission that represents multiple identical sequences, provided that the frequency and the total number of samples are described in the /<a href="/ddbj/qualifiers-e.html#haplotype">haplotype</a> qualifier of the <a href="/ddbj/features-e.html#source">source</a> feature and/or the /<a href="/ddbj/qualifiers-e.html#frequency">frequency</a> qualifier of the <a href="/ddbj/features-e.html#variation">variation</a> feature.</p> <p>Representative submission for variation studies does NOT mean that all identical (or similar) sequences derived from the same species should be represented by a single sequence entry. <br /> So that research data can be evaluated properly, DDBJ recommends normalising research data for variation studies into an appropriate set of entries; as a rule, the number of entries should equal the number of sequence polymorphisms multiplied by the number of sampled populations.</p> <dl> <dt>sequence polymorphism</dt> <dd>a unit of sequence variation that can be uniquely described by /<a href="/ddbj/qualifiers-e.html#haplotype">haplotype</a>, /<a href="/ddbj/qualifiers-e.html#allele">allele</a> and/or other qualifiers.</dd> <dt>sampled population</dt> <dd>a unit of observed samples that can be uniquely described by /<a href="/ddbj/qualifiers-e.html#geo_loc_name">geo_loc_name</a>, /<a href="/ddbj/qualifiers-e.html#lat_lon">lat_lon</a>, /<a href="/ddbj/qualifiers-e.html#collection_date">collection_date</a>, /<a href="/ddbj/qualifiers-e.html#host">host</a> and/or other qualifiers.</dd> </dl> <p>For example, suppose a study of a locus in cat genomes comparing Japan with the USA finds three haplotypes of a sequence polymorphism, as shown in the table below, and that within each haplotype the sequences 
are identical. DDBJ can accept these results as a submission of 231 sequence entries, one for each individual; however, such a set would be highly redundant for both submitters and users.</p> <table> <thead> <tr> <th>polymorphism (haplotype)</th> <th><span style="color:green">A</span></th> <th>B</th> <th>C</th> <th>total</th> </tr> </thead> <tbody> <tr> <th><span style="color:red">Japan</span></th> <td><span style="color:blue">75</span></td> <td>38</td> <td>0</td> <td><span style="color:blue">113</span></td> </tr> <tr> <th>USA</th> <td>26</td> <td>32</td> <td>60</td> <td>118</td> </tr> <tr> <th>total</th> <td>101</td> <td>70</td> <td>60</td> <td>231</td> </tr> </tbody> </table> <p>Since three types of identical sequences were observed, it would be possible to submit only three representative sequences to DDBJ for the publication of this study. <br /> However, users would then find it difficult to understand what kind of samples were used in the study. <br /> Therefore, it is strongly recommended to submit five representative entries to DDBJ (of the 6 possible patterns, i.e. 3 haplotypes x 2 countries, haplotype C was not observed in Japan), each with a source feature described as in the following example. 
<br /> Furthermore, if samples are observed over time, you may also wish to take the /collection_date qualifier into account.</p> <pre>
     <a href="/ddbj/features-e.html#source">source</a>          1..365
                     /<a href="/ddbj/qualifiers-e.html#collection_date">collection_date</a>="2007"
                     /<a href="/ddbj/qualifiers-e.html#geo_loc_name">geo_loc_name</a>="<span style="color:red">Japan</span>"
                     /<a href="/ddbj/qualifiers-e.html#haplotype">haplotype</a>="<span style="color:green">A</span> [<span style="color:blue">75</span> in <span style="color:blue">113</span>]"
                     /mol_type="genomic DNA"
                     /<a href="/ddbj/qualifiers-e.html#organism">organism</a>="Felis catus"
     <a href="/ddbj/features-e.html#variation">variation</a>       124
                     /<a href="/ddbj/qualifiers-e.html#frequency">frequency</a>="<span style="color:blue">75</span> in <span style="color:blue">113</span>"
                     /<a href="/ddbj/qualifiers-e.html#inference">inference</a>="similar to DNA sequence (same species):INSD:AB012345.1"
                     /<a href="/ddbj/qualifiers-e.html#replace">replace</a>="t"
</pre> </main> </section> </div> </section> </div> <footer></footer> <div id="back-top"></div> </body> </html>