<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8" /> <meta property="og:title" content="Data Categories" /> <meta property="og:url" content="https://www.ddbj.nig.ac.jp/ddbj/data-categories-e.html" /> <meta property="og:description" content="DivisionGeneral data: classified by source speciesThe data that are not classifi..." /> <meta property="og:image" content="/images/thumbnail/logo_ddbj_fb.png" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>Data Categories</title> <script async src="https://www.google-analytics.com/analytics.js"></script> <script src="https://code.jquery.com/jquery-3.5.0.js" integrity="sha256-r/AaFHrszJtwpe+tHyNi/XCfMxYpbsRg2Uqn0x3s2zc=" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery.hoverintent/1.10.1/jquery.hoverIntent.min.js" integrity="sha512-gx3WTM6qxahpOC/hBNUvkdZARQ2ObXSp/m+jmsEN8ZNJPymj8/Jamf8+/3kJQY1RZA2DR+KQfT+b3JEB0r9YRg==" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/spin.js/4.1.0/spin.min.js" integrity="sha512-CbohqWjAgarTqRHcX1MbwkF2pujwbsCee1PABpnBWC+VqSldvlNEEI5+4OSsR/HbFQOFFpwY2YvZZNjBMxNnXg==" crossorigin="anonymous"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery.colorbox/1.6.4/jquery.colorbox-min.js"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery-deparam/0.5.3/jquery-deparam.min.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.trace.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.json_search.js"></script> <link rel="icon" href="https://www.ddbj.nig.ac.jp/assets/images/favicon_ddbj.ico"> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/colorbox.css" /> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/main.css" /> <link rel="alternate" type="application/rss+xml" title="My Site RSS" 
href="/feed.xml" /> <script src="https://www.ddbj.nig.ac.jp/assets/js/main.js"></script> </head> <body data-category="ddbj"> <script src="https://www.ddbj.nig.ac.jp/assets/js/ddbj_common_framework.js" id="DDBJ_common_framework" style="display: block; height: 40px;" data-bottom-menu="true" data-ddbj-home-page="true" data-search="true" ></script> <section class="top-news-view"> <div class="inner"> <ul> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-10-22-e">On Cyber Threats against DDBJ, a node of the International Nucleotide Sequence Database Collaboration</a> </li> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-11-22-e">(November 27th 9:00-November 28th 12:00) Announcement of D-way/MSS suspension</a> </li> </ul> </div> </section> <div id="primary"> <header id="PageHeader"> <div class="inner"> <div class="page-title"> <p class="title -normal">DDBJ Annotated/Assembled Sequences</p> </div> <nav class="tab-menu-view"> <ul class="tabmenucontainer"> <li class=""> <a href="/ddbj/index-e.html">Home</a> </li> <li class=" -haschild"> <a href="/ddbj/submission-e.html">Submission</a> <ul> <li> <a href="/ddbj/submission-e.html">Before Submission</a> </li> <li> <a href="/ddbj/web-submission-e.html">Web submission</a> </li> <li> <a href="/ddbj/mss-e.html">Mass Submission</a> </li> <li> <a href="/ddbj/update-e.html">Data Update</a> </li> </ul> </li> <li class=" -haschild"> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">Search</a> <ul> <li> <a href="http://getentry.ddbj.nig.ac.jp/top-e.html">getentry</a> </li> <li> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">ARSA</a> </li> </ul> </li> <li class=" -haschild"> <a href="/ddbj/flat-file-e.html">Flat file</a> <ul> <li> <a href="/ddbj/feature-table-e.html">Feature Table</a> </li> <li> <a href="/ddbj/features-e.html">Feature key</a> </li> <li> <a href="/ddbj/qualifiers-e.html">Qualifier key</a> </li> <li> <a href="/ddbj/sequence-e.html">Nucleotide Sequences</a> </li> <li> <a 
href="/ddbj/organism-e.html">Organism qualifier</a> </li> <li> <a href="/ddbj/identifiers-e.html">Identifiers</a> </li> <li> <a href="/ddbj/location-e.html">Description of Location</a> </li> <li> <a href="/ddbj/cds-e.html">Protein Coding Sequence</a> </li> <li> <a href="/ddbj/geneticcode-e.html">The Genetic Codes</a> </li> <li> <a href="/ddbj/code-e.html">Codes Used in Sequence Description</a> </li> <li> <a href="/ddbj/example-e.html">Description Examples of Sequence Data</a> </li> </ul> </li> <li class=" -haschild -current"> <a href="/ddbj/data-categories-e.html">Data categories</a> <ul> <li> <a href="/ddbj/genome-e.html">Data Submission from Genome Project</a> </li> <li> <a href="/ddbj/pseudohaplotype-e.html">Pseudohaplotype</a> </li> <li> <a href="/ddbj/wgs-e.html">WGS</a> </li> <li> <a href="/ddbj/finished_level_genome-e.html">Finished level genomic sequences</a> </li> <li> <a href="/ddbj/metagenome-assembly-e.html">Metagenome Assembly</a> </li> <li> <a href="/ddbj/single-amplified-genome-e.html">Single amplified genome</a> </li> <li> <a href="/ddbj/htg-e.html">HTG</a> </li> <li> <a href="/ddbj/environmental-e.html">Environmental sample</a> </li> <li> <a href="/ddbj/env-e.html">ENV</a> </li> <li> <a href="/ddbj/tls-e.html">TLS</a> </li> <li> <a href="/ddbj/transcriptome-e.html">Data Submission from Transcriptome Project</a> </li> <li> <a href="/ddbj/tsa-e.html">TSA</a> </li> <li> <a href="/ddbj/est-e.html">EST</a> </li> <li> <a href="/ddbj/htc-e.html">HTC</a> </li> <li> <a href="/ddbj/tpa-e.html">Third Party Data (TPA)</a> </li> </ul> </li> <li class=""> <a href="/faq/en/index-e.html?tag=ddbj">FAQ</a> </li> <li class=" -haschild"> <a href="/ddbj/index-e.html">Other</a> <ul> <li> <a href="/ddbj/patent-data-e.html">Patent</a> </li> <li> <a href="/ddbj/mga-e.html">MGA</a> </li> </ul> </li> </ul> </nav> </div> </header> <section id="NavigationAndMainView"> <div class="inner"> <div class="subview"> <nav id="TableOfContents" class="internal-link"> </nav> </div> 
<section id="MainContentView" class="mainview"> <header class="header"> <nav class="breadcrumb-view"> <ul> <li> <a href="https://www.ddbj.nig.ac.jp/index-e.html">Home</a> </li> <li> <a href="https://www.ddbj.nig.ac.jp/ddbj/index-e.html">ddbj</a> </li> <li><a>Data Categories</a></li> </ul> </nav> <h1 class="title">Data Categories</h1> </header> <main class="md-content"> <h2 id="division">Division</h2> <h3 id="general">General data: classified by source species</h3> <p>Data that are not classified into any of the categories described in the sections below are called general data and belong here.<br /> In principle, general data must have at least one source feature and at least one other <a href="/ddbj/file-format-e.html#biological_feature">Biological feature</a>.<br /> Submitted sequences are automatically classified into one of the following divisions on the basis of the taxonomy of the source organism.</p> <table> <thead> <tr> <th>Division</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td>HUM</td> <td>Human</td> </tr> <tr> <td>PRI</td> <td>Primates (other than human)</td> </tr> <tr> <td>ROD</td> <td>Rodents</td> </tr> <tr> <td>MAM</td> <td>Mammals (other than primates or rodents)</td> </tr> <tr> <td>VRT</td> <td>Vertebrates (other than mammals)</td> </tr> <tr> <td>INV</td> <td>Invertebrates</td> </tr> <tr> <td>PLN</td> <td>Plants or fungi</td> </tr> <tr> <td>BCT</td> <td>Bacteria</td> </tr> <tr> <td>VRL</td> <td>Viruses</td> </tr> <tr> <td>PHG</td> <td>Phages</td> </tr> </tbody> </table> <h3 id="env">ENV/SYN: Source Species Cannot Be Identified; Environmental Samples and Synthetic Constructs</h3> <p>Environmental samples and artificially constructed sequences are classified into the <a href="/ddbj/env-e.html">ENV</a> and SYN divisions, respectively.<br /> In principle, ENV and SYN data must have at least one source feature and at least one other <a href="/ddbj/file-format-e.html#biological_feature">Biological feature</a>.</p> <table> 
<thead> <tr> <th>Division</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td><a href="/ddbj/env-e.html">ENV</a></td> <td>Sequences obtained via environmental sampling methods, direct PCR, DGGE, etc.<br />For ENV submissions, the source feature must include an <a href="/ddbj/qualifiers-e.html#environmental_sample">environmental_sample qualifier</a>.</td> </tr> <tr> <td>SYN</td> <td>Synthetic constructs; sequences constructed by artificial manipulations<br />SYN entries often have multiple source features, so take care when describing them.<br /> See also <a href="/ddbj/example-e.html#E05">Description Examples of Sequence Data: E05) synthetic construct</a>.</td> </tr> </tbody> </table> <!-- ### CON: Contig/Constructed, Tiling of Entries {#con} --> <!-- Many genome projects submitting a lot of [HTG](/ddbj/htg-e.html) and/or --> <!-- [WGS](/ddbj/wgs-e.html) entries can often provide the information to assemble a series of their entries and reconstruct a genome structure. --> <!-- An accession number would be assigned for such contig tiling path, so called "[CON entry](/ddbj/con-e.html)", which is classified into CON division. --> <!-- See also [Steps of genome sequencing, categories of sequence data and their correspondences.](/ddbj/genome-e.html ) --> <!-- At first you have to submit all piece entries to construct the contig, then a CON entry will be constructed. --> <!-- [AGP file](/ddbj/file-format-e.html#agp) is required to submit CON entries. --> <h3 id="est">EST/GSS/HTC/HTG: Divisions for Feasibility of Sequencing</h3> <p>Sequences derived from high-throughput projects, such as large-scale EST analyses or ongoing whole-genome sequencing, are classified into the following divisions. <br /> As a rule, an entry in these divisions should describe only one source feature. 
<br /> However, entries in the HTC or HTG divisions can have <a href="/ddbj/file-format-e.html#biological_feature">Biological features</a>, as general data do, if necessary.</p> <table> <thead> <tr> <th>Division</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td><a href="/ddbj/est-e.html">EST</a></td> <td>Expressed sequence tags: short, single-pass cDNA sequence reads.</td> </tr> <tr> <td><a href="/ddbj/gss-e.html">GSS</a></td> <td>Genome survey sequences: short, single-pass genome sequence reads.</td> </tr> <tr> <td><a href="/ddbj/htc-e.html">HTC</a></td> <td>High throughput cDNA sequences from cDNA sequencing projects, not EST.<br /> This division includes unfinished high throughput cDNA sequences.</td> </tr> <tr> <td><a href="/ddbj/htg-e.html">HTG</a></td> <td>High throughput genomic sequences mainly from genome sequencing projects.<br /> Unfinished HTG entries are classified into different levels, as follows:<ul><li>phase0: Survey sequence generated for the purpose of library quality assessment and detection of overlaps with other clones before construction of piece contig(s)</li><li>phase1: Unfinished sequence having contigs that have NOT been ordered and oriented</li><li>phase2: Unfinished sequence having contigs that have been ordered and oriented</li></ul></td> </tr> </tbody> </table> <h2 id="data_type">Data type, bulk sequence data</h2> <h3 id="wgs">WGS: Fragment Sequences during WGS Assembling Process</h3> <p>A large set of contigs from an ongoing genome project can be submitted as a type of bulk sequence data, <a href="/ddbj/wgs-e.html">Whole Genome Shotgun (WGS)</a>.<br /> Please note that WGS data differ from other data in the <a href="/ddbj/flat-file-e.html#Accession">format of accession number</a>.<br /> See also <a href="/ddbj/genome-e.html">Steps of genome sequencing, categories of sequence data and their correspondences</a>.</p> <h3 id="tsa">TSA: Transcriptome Shotgun Assembly</h3> <p>Since 2008, we have 
accepted <a href="/ddbj/tsa-e.html">Transcriptome Shotgun Assembly (TSA)</a>, a type of bulk sequence data for assembled RNA transcript sequences.<br /> As a rule, a TSA entry should describe only one source feature.<br /> TSA entries can have <a href="/ddbj/file-format-e.html#biological_feature">Biological features</a>, as general data do, if necessary.<br /> Please note that TSA data may differ from other data in the <a href="/ddbj/flat-file-e.html#Accession">format of accession number</a>.<br /> See also <a href="/ddbj/transcriptome-e.html">steps of transcriptome project, categories of sequence data and their correspondences</a>.</p> <h3 id="tls">TLS: Targeted Locus Study</h3> <p>Since 2016, we have accepted <a href="/ddbj/tls-e.html">Targeted Locus Study (TLS)</a>, a type of bulk sequence data that includes 16S rRNA or other targeted loci, mainly for clustering into operational taxonomic units.<br /> TLS entries can have <a href="/ddbj/file-format-e.html#biological_feature">Biological features</a>, as general data do.<br /> Please note that TLS data differ from other data in the <a href="/ddbj/flat-file-e.html#Accession">format of accession number</a>.</p> <h2 id="whom">Distinguishing nucleotide sequences that were not determined by the submitters</h2> <h3 id="tpa">TPA: Third Party Data and primary sequence data</h3> <p><a href="/ddbj/tpa-e.html">TPA (Third Party Data)</a> is a nucleotide sequence data collection in which each entry is obtained by assembling primary entries publicized by DDBJ/ENA/GenBank and/or the <a href="/dra/index-e.html">Sequence Read Archive</a>, with additional feature annotation(s) determined by the TPA submitter through experimental or inferential methods.<br /> These assemblies cover two cases: one or more primary entries are used, or newly determined sequence is also contained.<br /> TPA sequence data should be submitted to DDBJ/ENA/GenBank as part of the process of publishing biological research on primary 
nucleotide sequences.<br /> See also <a href="/ddbj/tpa-table-e.html">TPA Submission Guidelines</a>.</p> <h2 id="sub">Data types in MSS submission</h2> <table> <thead> <tr> <th>Type</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td>WGS: Whole Genome Shotgun</td> <td>The sequences are <a href="/ddbj/wgs-e.html">WGS (draft genome)</a> excluding MAG or SAG.</td> </tr> <tr> <td>GNM: Finished Level Genome Sequence, non-WGS</td> <td>The sequences are <a href="/ddbj/finished_level_genome-e.html">Finished Level Genomic Sequences (not WGS)</a> excluding MAG or SAG.</td> </tr> <tr> <td>MAG: Metagenome-Assembled Genome</td> <td>The sequences are <a href="/ddbj/metagenome-assembly-e.html">MAG</a>.</td> </tr> <tr> <td>SAG: Single Amplified Genome</td> <td>The sequences are <a href="/ddbj/single-amplified-genome-e.html">SAG</a>.</td> </tr> <tr> <td>TLS: Targeted Locus Study</td> <td>The sequences are <a href="/ddbj/tls-e.html">TLS</a>.</td> </tr> <tr> <td>HTG: High Throughput Genomic Sequences</td> <td>The sequences are <a href="/ddbj/htg-e.html">HTG</a>.</td> </tr> <tr> <td>TSA: Transcriptome Shotgun Assembly</td> <td>The sequences are <a href="/ddbj/tsa-e.html">TSA</a>.</td> </tr> <tr> <td>HTC: High Throughput cDNA Sequences</td> <td>The sequences are <a href="/ddbj/htc-e.html">HTC</a>.</td> </tr> <tr> <td>EST: Expressed Sequence Tags</td> <td>The sequences are <a href="/ddbj/est-e.html">EST</a>.</td> </tr> <tr> <td>MISC: Sequences that are not included in the above types</td> <td>The sequences do not match any of the above types.</td> </tr> <tr> <td>ASK: Ask a DDBJ curator to determine the correct data type</td> <td>Consult DDBJ curators about the data type.</td> </tr> </tbody> </table> <h2 id="decision-of-the-data-type-and-the-registration-site-for-submitting-the-nucleotide-sequences">Deciding the data type and the registration site for submitting nucleotide sequences</h2> <ul> <li><a href="/ddbj/genome-e.html">Steps of genome sequencing, categories of sequence data and their 
correspondences</a></li> <li><a href="/ddbj/transcriptome-e.html">Steps of transcriptome project, categories of sequence data and their correspondences</a></li> <li><a href="/submission-navigation-e.html">Navigation</a></li> </ul> </main> <aside class="related-pages"> <h2 class="caption">Related pages</h2> <div class="navigation"> <nav> <ul> <li> <a href="/ddbj/genome-e.html">Data Submission from Genome Project</a> </li> <li> <a href="/ddbj/environmental-e.html">Submission of environmental sequences</a> </li> <li> <a href="/ddbj/transcriptome-e.html">Data Submission from Transcriptome Project</a> </li> <li> <a href="/ddbj/tpa-e.html">Third Party Data (TPA)</a> </li> </ul> </nav> </div> </aside> </section> </div> </section> </div> <footer></footer> <div id="back-top"></div> </body> </html>