<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8" /> <meta property="og:title" content="Metagenome Assembly" /> <meta property="og:url" content="https://www.ddbj.nig.ac.jp/ddbj/metagenome-assembly-e.html" /> <meta property="og:description" content="Microorganisms comprise the majority of the planet’s biological diversity, howeve..." /> <meta property="og:image" content="/images/thumbnail/logo_ddbj_fb.png" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>Metagenome Assembly</title> <script async src="https://www.google-analytics.com/analytics.js"></script> <script src="https://code.jquery.com/jquery-3.5.0.js" integrity="sha256-r/AaFHrszJtwpe+tHyNi/XCfMxYpbsRg2Uqn0x3s2zc=" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery.hoverintent/1.10.1/jquery.hoverIntent.min.js" integrity="sha512-gx3WTM6qxahpOC/hBNUvkdZARQ2ObXSp/m+jmsEN8ZNJPymj8/Jamf8+/3kJQY1RZA2DR+KQfT+b3JEB0r9YRg==" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/spin.js/4.1.0/spin.min.js" integrity="sha512-CbohqWjAgarTqRHcX1MbwkF2pujwbsCee1PABpnBWC+VqSldvlNEEI5+4OSsR/HbFQOFFpwY2YvZZNjBMxNnXg==" crossorigin="anonymous"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery.colorbox/1.6.4/jquery.colorbox-min.js"></script> <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery-deparam/0.5.3/jquery-deparam.min.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.trace.js"></script> <script type="text/javascript" src="https://www.ddbj.nig.ac.jp/assets/js/jquery.json_search.js"></script> <link rel="icon" href="https://www.ddbj.nig.ac.jp/assets/images/favicon_ddbj.ico"> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/colorbox.css" /> <link rel="stylesheet" href="https://www.ddbj.nig.ac.jp/assets/css/main.css" /> <link rel="alternate" type="application/rss+xml" title="My 
Site RSS" href="/feed.xml" /> <script src="https://www.ddbj.nig.ac.jp/assets/js/main.js"></script> </head> <body data-category="ddbj"> <script src="https://www.ddbj.nig.ac.jp/assets/js/ddbj_common_framework.js" id="DDBJ_common_framework" style="display: block; height: 40px;" data-bottom-menu="true" data-ddbj-home-page="true" data-search="true" ></script> <section class="top-news-view"> <div class="inner"> <ul> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-10-22-e">On Cyber Threats against DDBJ, a node of the International Nucleotide Sequence Database Collaboration</a> </li> <li class="item"> <a href="https://www.ddbj.nig.ac.jp/news/en/2024-11-22-e">(November 27th 9:00-November 28th 12:00) Announcement of D-way/MSS suspension</a> </li> </ul> </div> </section> <div id="primary"> <header id="PageHeader"> <div class="inner"> <div class="page-title"> <p class="title -normal">DDBJ Annotated/Assembled Sequences</p> </div> <nav class="tab-menu-view"> <ul class="tabmenucontainer"> <li class=""> <a href="/ddbj/index-e.html">Home</a> </li> <li class=" -haschild"> <a href="/ddbj/submission-e.html">Submission</a> <ul> <li> <a href="/ddbj/submission-e.html">Before Submission</a> </li> <li> <a href="/ddbj/web-submission-e.html">Web submission</a> </li> <li> <a href="/ddbj/mss-e.html">Mass Submission</a> </li> <li> <a href="/ddbj/update-e.html">Data Update</a> </li> </ul> </li> <li class=" -haschild"> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">Search</a> <ul> <li> <a href="http://getentry.ddbj.nig.ac.jp/top-e.html">getentry</a> </li> <li> <a href="http://ddbj.nig.ac.jp/arsa/?lang=en">ARSA</a> </li> </ul> </li> <li class=" -haschild"> <a href="/ddbj/flat-file-e.html">Flat file</a> <ul> <li> <a href="/ddbj/feature-table-e.html">Feature Table</a> </li> <li> <a href="/ddbj/features-e.html">Feature key</a> </li> <li> <a href="/ddbj/qualifiers-e.html">Qualifier key</a> </li> <li> <a href="/ddbj/sequence-e.html">Nucleotide Sequences</a> </li> <li> <a 
href="/ddbj/organism-e.html">Organism qualifier</a> </li> <li> <a href="/ddbj/identifiers-e.html">Identifiers</a> </li> <li> <a href="/ddbj/location-e.html">Description of Location</a> </li> <li> <a href="/ddbj/cds-e.html">Protein Coding Sequence</a> </li> <li> <a href="/ddbj/geneticcode-e.html">The Genetic Codes</a> </li> <li> <a href="/ddbj/code-e.html">Codes Used in Sequence Description</a> </li> <li> <a href="/ddbj/example-e.html">Description Examples of Sequence Data</a> </li> </ul> </li> <li class=" -haschild -current"> <a href="/ddbj/data-categories-e.html">Data categories</a> <ul> <li> <a href="/ddbj/genome-e.html">Data Submission from Genome Project</a> </li> <li> <a href="/ddbj/pseudohaplotype-e.html">Pseudohaplotype</a> </li> <li> <a href="/ddbj/wgs-e.html">WGS</a> </li> <li> <a href="/ddbj/finished_level_genome-e.html">Finished level genomic sequences</a> </li> <li> <a href="/ddbj/metagenome-assembly-e.html">Metagenome Assembly</a> </li> <li> <a href="/ddbj/single-amplified-genome-e.html">Single amplified genome</a> </li> <li> <a href="/ddbj/htg-e.html">HTG</a> </li> <li> <a href="/ddbj/environmental-e.html">Environmental sample</a> </li> <li> <a href="/ddbj/env-e.html">ENV</a> </li> <li> <a href="/ddbj/tls-e.html">TLS</a> </li> <li> <a href="/ddbj/transcriptome-e.html">Data Submission from Transcriptome Project</a> </li> <li> <a href="/ddbj/tsa-e.html">TSA</a> </li> <li> <a href="/ddbj/est-e.html">EST</a> </li> <li> <a href="/ddbj/htc-e.html">HTC</a> </li> <li> <a href="/ddbj/tpa-e.html">Third Party Data (TPA)</a> </li> </ul> </li> <li class=""> <a href="/faq/en/index-e.html?tag=ddbj">FAQ</a> </li> <li class=" -haschild"> <a href="/ddbj/index-e.html">Other</a> <ul> <li> <a href="/ddbj/patent-data-e.html">Patent</a> </li> <li> <a href="/ddbj/mga-e.html">MGA</a> </li> </ul> </li> </ul> </nav> </div> </header> <section id="NavigationAndMainView"> <div class="inner"> <div class="subview"> <nav id="TableOfContents" class="internal-link"> </nav> </div> 
<section id="MainContentView" class="mainview"> <header class="header"> <nav class="breadcrumb-view"> <ul> <li> <a href="https://www.ddbj.nig.ac.jp/index-e.html">Home</a> </li> <li> <a href="https://www.ddbj.nig.ac.jp/ddbj/index-e.html">ddbj</a> </li> <li><a>Metagenome Assembly</a></li> </ul> </nav> <h1 class="title">Metagenome Assembly</h1> </header> <main class="md-content"> <p>Microorganisms comprise the majority of the planet’s biological diversity. However, because of the varied environments and conditions in which they live, many of them cannot be cultured. Standard genome analysis methods, which require isolation and laboratory cultivation, have therefore yielded only limited knowledge about these uncultured microorganisms. Metagenomics is a culture-independent genomic analysis method that surveys the genomes of uncultured microorganisms, and it has brought new discoveries about their genetic diversity, population structure and ecological roles.</p> <p>Data from metagenome projects are classified into four categories according to their assembly level.</p> <p>(1) NGS raw reads before assembly. (2) Assembled contigs of unknown taxa (Primary metagenome). (3) Binned assemblies assigned to known taxa (Binned metagenome). (4) The highest-quality (in terms of completeness and contamination) representative binned assembly for each predicted species (Metagenome-Assembled Genome, MAG).</p> <p>DDBJ Center accepts (1)-(3) in DRA and (4) in DDBJ. For the quality criteria of MAG assemblies, please refer to <a href="https://www.nature.com/articles/nbt.3893">this publication</a>.</p> <p>This guide explains how to submit these metagenomic sequencing data to BioProject, BioSample, DRA and DDBJ. 
In principle, the raw sequencing data must also be deposited in DRA.</p> <h2 id="mag-submission">Submission of metagenome assembly data</h2> <div class="figure"> <a class="group1" href="/assets/images/submission/mag-e.jpg" title="Submission of metagenome assembly data"> <figure class="image"> <img src="/assets/images/submission/mag-e.jpg" alt="Submission of metagenome assembly data" class="w600" /> <figcaption>Submission of metagenome assembly data</figcaption> </figure> </a> </div> <h3 id="raw-reads">(1) Raw reads</h3> <p>Unassembled raw sequence data should be submitted to <a href="/dra/submission-e.html">DRA Run</a>.</p> <h4 id="raw-reads-bioproject">BioProject</h4> <p>Register your BioProject as a <a href="/bioproject/project-info-e.html#Project-type">metagenome/environmental project</a>. For the organism name, choose the most appropriate “xyz metagenome” (e.g., soil metagenome) from this list of <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&amp;id=408169&amp;lvl=3&amp;p=mapview&amp;p=has_linkout&amp;p=blast_url&amp;p=genome_blast&amp;keep=1&amp;srchmode=3&amp;unlock/">metagenome organism names</a> in the taxonomy database.</p> <h4 id="raw-reads-biosample">BioSample</h4> <p>Register your BioSample by using the <a href="/biosample/sample-info-e.html#mixs">MIxS MIMS.me</a> package. For the organism name, choose the most appropriate “xyz metagenome” (e.g., soil metagenome) from this list of <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&amp;id=408169&amp;lvl=3&amp;p=mapview&amp;p=has_linkout&amp;p=blast_url&amp;p=genome_blast&amp;keep=1&amp;srchmode=3&amp;unlock/">metagenome organism names</a> in the taxonomy database. 
Please provide as much sample metadata as possible to give context for the experimental data.</p> <h4 id="raw-reads-dra">DRA</h4> <p>Submit unassembled raw sequence data to <a href="/dra/submission-e.html">DRA Run</a>.</p> <h3 id="primary-metagenome">(2) Primary metagenome</h3> <p>Assembled contigs derived from the raw sequence data should be submitted to <a href="/dra/submission-e.html">DRA Analysis</a>.</p> <h4 id="primary-metagenome-bioproject">BioProject</h4> <p>Same as (1) Raw reads.</p> <h4 id="primary-metagenome-biosample">BioSample</h4> <p>Same as (1) Raw reads.</p> <h4 id="primary-metagenome-dra">DRA</h4> <p>Submit assembled contigs derived from the raw sequence data as fasta/bam files to <a href="/dra/metadata-e.html#Analysis_Type">DRA Analysis</a> (Analysis type = ‘De Novo Assembly’) along with the Run registered in (1). When using <a href="/dra/submission-e.html#excel">the Excel file for DRA submission</a>, describe the analysis software in Analysis step and the quality metrics in Attributes. <br /> When using the DRA submission web interface, include the accession of the referenced BioSample, the analysis software used and the assembly quality metrics in the description.</p> <ul> <li>BioSample: SAMD00000001</li> <li>Analysis step: canu 2.1, pilon 1.24, CheckM 1.1.3</li> <li>Quality: completeness 85.3, contamination 0</li> </ul> <p>Please note that Analysis data are not shared with NCBI/ENA and are not indexed by <a href="https://ddbj.nig.ac.jp/search">DDBJ Search</a>. Only the analysis metadata XML and data files are available via ftp. 
(For example, <a href="https://ddbj.nig.ac.jp/public/ddbj_database/dra/fastq/DRA000/DRA000072/">DRZ000001</a>.)</p> <h3 id="binned-metagenome">(3) Binned metagenome</h3> <p>Binned metagenome assemblies derived from a subset of the raw sequence data should be submitted to <a href="/dra/submission-e.html">DRA Analysis</a>.</p> <h4 id="binned-metagenome-bioproject">BioProject</h4> <p>Same as (1) Raw reads.</p> <h4 id="binned-metagenome-biosample">BioSample</h4> <p>Register a virtual BioSample by using the <a href="/biosample/sample-info-e.html#mixs">“MIMAG”</a> package. For the organism name, enter the name in the taxonomy database of the organism from which the binned assembly was derived, without ‘uncultured’ (e.g., “Agrobacterium tumefaciens”, “Agrobacterium sp.”, “Rhizobiaceae bacterium”). Please note that a binned submission requires a virtual BioSample derived from the MIMS metagenomic sample used in (1).</p> <p>Among organism names assigned by <a href="https://gtdb.ecogenomic.org/">GTDB</a>, please convert those not registered in <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi">NCBI Taxonomy</a> to the corresponding NCBI Taxonomy names.</p> <p>Please provide the following attributes to indicate the sample source.</p> <p>Describe the metagenome source in metagenome_source by using one of the <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&amp;id=408169&amp;lvl=3&amp;p=mapview&amp;p=has_linkout&amp;p=blast_url&amp;p=genome_blast&amp;keep=1&amp;srchmode=3&amp;unlock/">metagenome organism names</a>. Example: metagenome_source: soil metagenome</p> <p>Indicate the source metagenome sample(s) registered in (1) by entering the BioSample accession(s) in derived_from. 
Examples: <br /> derived_from: SAMD00000001 <br /> derived_from: SAMD00000002,SAMD00000003,SAMD00000010-SAMD00000015</p> <h4 id="binned-metagenome-dra">DRA</h4> <p>Submit binned assemblies derived from the raw sequence data as fasta/bam files to <a href="/dra/metadata-e.html#Analysis_Type">DRA Analysis</a> (Analysis type = ‘De Novo Assembly’) along with the Run registered in (1). When using <a href="/dra/submission-e.html#excel">the Excel file for DRA submission</a>, describe the analysis software in Analysis step, and the quality metrics and binning information in Attributes. <br /> When using the DRA submission web interface, include the accession of the referenced BioSample, the analysis software used, the assembly quality metrics and the binning information in the description.</p> <ul> <li>BioSample: SAMD00000001</li> <li>Analysis step: canu 2.1, pilon 1.24, CheckM 1.1.3</li> <li>Quality: completeness 85.3, contamination 0</li> </ul> <p>Please note that Analysis data are not shared with NCBI/ENA and are not indexed by <a href="https://ddbj.nig.ac.jp/search">DDBJ Search</a>. Only the analysis metadata XML and data files are available via ftp. (For example, <a href="https://ddbj.nig.ac.jp/public/ddbj_database/dra/fastq/DRA000/DRA000072/">DRZ000001</a>.)</p> <h3 id="mag">(4) MAG</h3> <p>Metagenomic assemblies (Metagenome-Assembled Genomes, MAGs) predicted to be derived from taxonomically defined organisms should be submitted to DDBJ as genome entries of the <a href="/ddbj/env-e.html">ENV division</a>.</p> <h4 id="mag-bioproject">BioProject</h4> <p>Register your BioProject as a <a href="/bioproject/project-info-e.html#Project-type">metagenome/environmental project</a>. If you have already registered a BioProject for submission of the corresponding raw reads to DRA, you should, in general, use the same BioProject when you submit the MAG to DDBJ.</p> <h4 id="mag-biosample">BioSample</h4> <p>Register a virtual BioSample by using the <a href="/biosample/sample-info-e.html#mixs">“MIMAG”</a> package. 
For the organism name, enter the name in the taxonomy database of the organism from which the MAG was derived, without ‘uncultured’ (e.g., Agrobacterium tumefaciens). Please note that a MAG submission requires a virtual BioSample derived from the MIMS metagenomic sample used in (1).</p> <p>Among organism names assigned by <a href="https://gtdb.ecogenomic.org/">GTDB</a>, please convert those not registered in <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi">NCBI Taxonomy</a> to the corresponding NCBI Taxonomy names.</p> <p>Please provide the following attributes to indicate the sample source.</p> <p>Describe the metagenome source in metagenome_source by using one of the <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&amp;id=408169&amp;lvl=3&amp;p=mapview&amp;p=has_linkout&amp;p=blast_url&amp;p=genome_blast&amp;keep=1&amp;srchmode=3&amp;unlock/">metagenome organism names</a>. Example: metagenome_source: soil metagenome</p> <p>Indicate the source metagenome sample(s) registered in (1) by entering the BioSample accession(s) in derived_from. Examples: <br /> derived_from: SAMD00000001 <br /> derived_from: SAMD00000002,SAMD00000003,SAMD00000010-SAMD00000015</p> <p><a href="https://docs.google.com/spreadsheets/d/1VCCuSwvIRfp5-DT8cnvvAwWH4C7wbDFSjHQ_q3f3BII/edit#gid=272411182">Example BioSample</a></p> <h4 id="mag-dra">DRA</h4> <p>The raw sequence data used for the MAG assembly should be submitted to DRA Run.</p> <h4 id="mag-ddbj">DDBJ</h4> <p>Submit the MAG as a genome entry of the <a href="/ddbj/env-e.html">ENV division</a> through the <a href="/ddbj/mss-e.html">Mass Submission System (MSS)</a>. 
The following <a href="/ddbj/qualifiers-e.html">qualifiers</a> of the <a href="/ddbj/features-e.html#source">source feature</a> are required for the MAG submission.</p> <p>Required for the MAG entry.</p> <ul> <li><a href="/ddbj/qualifiers-e.html#metagenome_source">/metagenome_source</a> = ‘xyz metagenome’ (‘xyz metagenome’ should be from this list of <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&amp;id=408169&amp;lvl=3&amp;p=mapview&amp;p=has_linkout&amp;p=blast_url&amp;p=genome_blast&amp;keep=1&amp;srchmode=3&amp;unlock/">metagenome organism names</a> in the taxonomy database).</li> </ul> <p>Required for the ENV division entry.</p> <ul> <li><a href="/ddbj/qualifiers-e.html#environmental_sample">/environmental_sample</a></li> <li><a href="/ddbj/qualifiers-e.html#isolation_source">/isolation_source</a></li> <li><a href="/ddbj/qualifiers-e.html#isolate">/isolate</a></li> </ul> <p>Required for all entries.</p> <ul> <li><a href="/ddbj/organism-e.html#MAG">/organism</a></li> <li><a href="/ddbj/qualifiers-e.html#mol_type">/mol_type</a> = “genomic DNA”</li> </ul> <p>As a genome entry, the MAG must include the following assembly information in <a href="/ddbj/file-format-e.html#describing_st_comment">ST_COMMENT</a>.</p> <ul> <li>Assembly Method</li> <li>Genome Coverage</li> <li>Sequencing Technology</li> <li>Assembly Name (required in the case of eukaryotes)</li> </ul> <p>In the MAG (ENV division) entry, <a href="/ddbj/qualifiers-e.html#strain">/strain</a> cannot be used. <br /> Please describe the natural host of the organism from which the sequenced molecule was obtained in /host.</p> </main> </section> </div> </section> </div> <footer></footer> <div id="back-top"></div> </body> </html>
