<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8" /> <meta property="og:title" content="Haplotype" /> <meta property="og:url" content="https://www.ddbj.nig.ac.jp/ddbj/haplotype-e.html" /> <meta property="og:description" content="Historically, whole-genome sequencing generated a single consensus sequence with..." /> <meta property="og:image" content="/images/thumbnail/logo_ddbj_fb.png" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>Haplotype</title> <script async src="https://www.google-analytics.com/analytics.js"></script> <script src="https://code.jquery.com/jquery-3.5.0.js" integrity="sha256-r/AaFHrszJtwpe+tHyNi/XCfMxYpbsRg2Uqn0x3s2zc=" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery.hoverintent/1.10.1/jquery.hoverIntent.min.js" integrity="sha512-gx3WTM6qxahpOC/hBNUvkdZARQ2ObXSp/m+jmsEN8ZNJPymj8/Jamf8+/3kJQY1RZA2DR+KQfT+b3JEB0r9YRg==" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/spin.js/4.1.0/spin.min.js" integrity="sha512-CbohqWjAgarTqRHcX1MbwkF2pujwbsCee1PABpnBWC+VqSldvlNEEI5+4OSsR/HbFQOFFpwY2YvZZNjBMxNnXg==" crossorigin="anonymous"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery.colorbox/1.6.4/jquery.colorbox-min.js"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery-deparam/0.5.3/jquery-deparam.min.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.trace.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.json_search.js"></script> <link rel="icon" href="https://www.ddbj.nig.ac.jp/assets/images/favicon_ddbj.ico"> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/colorbox.css" /> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/main.css" /> <link rel="alternate" type="application/rss+xml" title="My Site RSS" href="/feed.xml" /> 
<script src="https://www.ddbj.nig.ac.jp/assets/js/main.js"></script> </head> <body data-category="ddbj"> <script src="https://www.ddbj.nig.ac.jp/assets/js/ddbj_common_framework.js" id="DDBJ_common_framework" style="display: block; height: 40px;" data-bottom-menu="true" data-ddbj-home-page="true" data-search="true" ></script> <section class="top-news-view"> <div class="inner"> <ul> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-10-22-e">On Cyber Threats against DDBJ, a node of the International Nucleotide Sequence Database Collaboration</a> </li> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-11-22-e">(27th November 9:00-November 28th 12:00)Announcement of D-way/MSS suspension</a> </li> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/">(12th December 8:00-20th December 12:00(JST)) Suspension of DDBJ services due to NIG supercomputer maintenance</a> </li> </ul> </div> </section> <div id="primary"> <header id="PageHeader"> <div class="inner"> <div class="page-title"> <p class="title -normal">DDBJ Annotated/Assembled Sequences</p> </div> <nav class="tab-menu-view"> <ul class="tabmenucontainer"> <li class=""> <a href="/ddbj/index-e.html">Home</a> </li> <li class=" -haschild"> <a href="/ddbj/submission-e.html">Submission</a> <ul> <li> <a href="/ddbj/submission-e.html">Before Submission</a> </li> <li> <a href="/ddbj/web-submission-e.html">Web submission</a> </li> <li> <a href="/ddbj/mss-e.html">Mass Submission</a> </li> <li> <a href="/ddbj/update-e.html">Data Update</a> </li> </ul> </li> <li class=" -haschild"> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">Search</a> <ul> <li> <a href="http://getentry.ddbj.nig.ac.jp/top-e.html">getentry</a> </li> <li> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">ARSA</a> </li> </ul> </li> <li class=" -haschild"> <a href="/ddbj/flat-file-e.html">Flat file</a> <ul> <li> <a href="/ddbj/feature-table-e.html">Feature Table</a> </li> <li> <a href="/ddbj/features-e.html">Feature key</a> 
</li> <li> <a href="/ddbj/qualifiers-e.html">Qualifier key</a> </li> <li> <a href="/ddbj/sequence-e.html">Nucleotide Sequences</a> </li> <li> <a href="/ddbj/organism-e.html">Organism qualifier</a> </li> <li> <a href="/ddbj/identifiers-e.html">Identifiers</a> </li> <li> <a href="/ddbj/location-e.html">Description of Location</a> </li> <li> <a href="/ddbj/cds-e.html">Protein Coding Sequence</a> </li> <li> <a href="/ddbj/geneticcode-e.html">The Genetic Codes</a> </li> <li> <a href="/ddbj/code-e.html">Codes Used in Sequence Description</a> </li> <li> <a href="/ddbj/example-e.html">Description Examples of Sequence Data</a> </li> </ul> </li> <li class=" -haschild -current"> <a href="/ddbj/data-categories-e.html">Data categories</a> <ul> <li> <a href="/ddbj/genome-e.html">Data Submission from Genome Project</a> </li> <li> <a href="/ddbj/pseudohaplotype-e.html">Pseudohaplotype</a> </li> <li> <a href="/ddbj/wgs-e.html">WGS</a> </li> <li> <a href="/ddbj/finished_level_genome-e.html">Finished level genomic sequences</a> </li> <li> <a href="/ddbj/metagenome-assembly-e.html">Metagenome Assembly</a> </li> <li> <a href="/ddbj/single-amplified-genome-e.html">Single amplified genome</a> </li> <li> <a href="/ddbj/htg-e.html">HTG</a> </li> <li> <a href="/ddbj/environmental-e.html">Environmental sample</a> </li> <li> <a href="/ddbj/env-e.html">ENV</a> </li> <li> <a href="/ddbj/tls-e.html">TLS</a> </li> <li> <a href="/ddbj/transcriptome-e.html">Data Submission from Transcriptome Project</a> </li> <li> <a href="/ddbj/tsa-e.html">TSA</a> </li> <li> <a href="/ddbj/est-e.html">EST</a> </li> <li> <a href="/ddbj/htc-e.html">HTC</a> </li> <li> <a href="/ddbj/tpa-e.html">Third Party Data (TPA)</a> </li> </ul> </li> <li class=""> <a href="/faq/en/index-e.html?tag=ddbj">FAQ</a> </li> <li class=" -haschild"> <a href="/ddbj/index-e.html">Other</a> <ul> <li> <a href="/ddbj/patent-data-e.html">Patent</a> </li> <li> <a href="/ddbj/mga-e.html">MGA</a> </li> </ul> </li> </ul> </nav> </div> </header> 
<section id="NavigationAndMainView"> <div class="inner"> <div class="subview"> <nav id="TableOfContents" class="internal-link"> </nav> </div> <section id="MainContentView" class="mainview"> <header class="header"> <nav class="breadcrumb-view"> <ul> <li> <a href="https://www.ddbj.nig.ac.jp/index-e.html">Home</a> </li> <li> <a href="https://www.ddbj.nig.ac.jp/ddbj/index-e.html">ddbj</a> </li> <li><a>Haplotype</a></li> </ul> </nav> <h1 class="title">Haplotype</h1> </header> <main class="md-content"> <p>Historically, whole-genome sequencing generated a single consensus sequence without distinguishing between alleles on homologous chromosomes. Long-read sequencing technologies can now distinguish the individual haploid chromosomes. Because haplotype sequencing produces two genome sequences from a single sample, INSDC has established a guideline for haplotype sequence submission. DDBJ previously used the term ‘pseudohaplotype’ for all of these, but now uses ‘haplotype’.</p> <h2 id="haplotype">Haplotype submission</h2> <p>This page describes a typical case of haplotype sequence submission. To distinguish the haplotype assemblies, this example names one assembly “Principal” and the other “Alternate”. Because both haplotype assemblies are derived from the same sample, they share the same BioSample. Because INSDC manages a genome assembly by a unique combination of BioProject and BioSample, create a separate BioProject for each of the principal and alternate haplotypes so that each combination is unique. Create an umbrella BioProject to group these projects.</p> <p>If the raw DRA sequencing data contain reads from both haplotypes, create a BioProject for DRA separate from those for the assemblies. If the DRA data are derived from the same sample as the assemblies, use the same BioSample.</p> <p>When more than one haplotype dataset exists (e.g. 
three haplotype datasets of species A, B and C), create a common umbrella project to group all of the primary projects.</p> <div class="figure"> <a class="group1" href="/assets/images/submission/haplotype.jpg" title="Haplotype data submission"> <figure class="image"> <img src="/assets/images/submission/haplotype.jpg" alt="Haplotype data submission" class="w600" /> <figcaption>Haplotype data submission</figcaption> </figure> </a> </div> <h3 id="naming">Naming haplotype assemblies</h3> <p>There are a few naming options to distinguish haplotype assemblies. The submitter must choose one of them.</p> <ul> <li>Principal haplotype/Alternate haplotype: use when one assembly is of much higher quality (Principal) than the other (Alternate).</li> <li>Haplotype 1/Haplotype 2: use when the assemblies are of similar quality. When more than two haplotypes are present, continue the numbering (Haplotype 3, Haplotype 4, and so on).</li> <li>Maternal haplotype/Paternal haplotype: use when that information is known.</li> </ul> <h3 id="bioproject">BioProject</h3> <p>Create a separate BioProject for each of the principal and alternate haplotypes, and an umbrella BioProject to group these projects. When <a href="/bioproject/submission-e.html#submit-umbrella-project">submitting the umbrella project</a>, enter the accession numbers of the primary BioProjects to be linked and their haplotype names (e.g. PRJDB1 Principal, PRJDB2 Alternate and PRJDB3 DRA).</p> <ul> <li>BioProject 1: Principal <ul> <li>Add phasing information in the title. For example, Principal haplotype or Primary haplotype.</li> </ul> </li> <li>BioProject 2: Alternate <ul> <li>Add phasing information in the title. 
For example, Alternate haplotype.</li> </ul> </li> <li>Umbrella BioProject <ul> <li>Groups BioProject 1, BioProject 2 and any other related BioProjects (BioProject 3 for DRA in the figure).</li> </ul> </li> </ul> <h3 id="biosample">BioSample</h3> <p>Because the sample is shared by the haplotypes, create a single BioSample.</p> <ul> <li>Select the <a href="/biosample/sample-info-e.html#Genomic_Sequences_Sample">MIGS</a> package.</li> <li>Create a common BioSample for the principal and alternate haplotypes.</li> <li>If you add gene annotations to the haplotype sequences, enter the <a href="/ddbj/locus_tag-e.html">locus tag prefix</a> you want to use for both the principal and the alternate haplotypes in the locus_tag_prefix attribute. The locus tag prefix is shared by the principal and the alternate haplotypes; loci can be distinguished by their tags, for example, A1C_p00001 (principal) and A1C_a00001 (alternate).</li> </ul> <h3 id="ddbj">DDBJ</h3> <p>Submit the principal and the alternate haplotype sequences.</p> <ul> <li>Principal haplotype <ul> <li>Reference BioProject 1 (Principal) in <a href="/ddbj/file-format-e.html#dblink">DBLINK</a>.</li> <li>Add the pre-defined comment in <a href="/ddbj/file-format-e.html#comment">ST_COMMENT</a>. Genome-Assembly-Data ST_COMMENT: Diploid :: Principal Haplotype</li> </ul> </li> <li>Alternate haplotype <ul> <li>Reference BioProject 2 (Alternate) in <a href="/ddbj/file-format-e.html#dblink">DBLINK</a>.</li> <li>Add the pre-defined comment in <a href="/ddbj/file-format-e.html#comment">ST_COMMENT</a>. 
Genome-Assembly-Data ST_COMMENT: Diploid :: Alternate Haplotype</li> </ul> </li> </ul> <h3 id="real-examples">Real-world examples</h3> <h4 id="common">Common</h4> <ul> <li>BioProject: <a href="https://www.ncbi.nlm.nih.gov/bioproject/PRJDB10054">PRJDB10054</a> (Umbrella)</li> <li>BioSample: <a href="https://www.ncbi.nlm.nih.gov/biosample/SAMD00229903">SAMD00229903</a></li> </ul> <h4 id="principal">Principal haplotype</h4> <ul> <li>BioProject: <a href="https://www.ncbi.nlm.nih.gov/bioproject/PRJDB10055">PRJDB10055</a></li> <li>DDBJ: <a href="https://www.ncbi.nlm.nih.gov/nuccore/BLYA00000000">BLYA01000001-BLYA01003780</a></li> </ul> <h4 id="alternate">Alternate haplotype</h4> <ul> <li>BioProject: <a href="https://www.ncbi.nlm.nih.gov/bioproject/PRJDB10056">PRJDB10056</a></li> <li>DDBJ: <a href="https://www.ncbi.nlm.nih.gov/nuccore/BLYB00000000">BLYB01000001-BLYB01003780</a></li> </ul> <h4 id="dra">DRA</h4> <ul> <li>BioProject: <a href="https://www.ncbi.nlm.nih.gov/bioproject/PRJDB9979">PRJDB9979</a></li> <li>DRA: <a href="https://www.ncbi.nlm.nih.gov/sra?term=DRP006217">DRR231909-DRR231923</a></li> </ul> </main> </section> </div> </section> </div> <footer></footer> <div id="back-top"></div> </body> </html>