<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>Year-letter Genetic Clade Naming for SARS-CoV-2 on Nextstrain.org - Software and Tools - Virological</title> <meta name="description" content="Emma B Hodcroft, James Hadfield, Richard A Neher, Trevor Bedford Nextstrain provides clade or lineage information for a number of pathogens it already tracks, including influenza and enterovirus D68. Nextstrain also int&hellip;"> <meta name="generator" content="Discourse 3.3.0.beta1 - https://github.com/discourse/discourse version 9c110ccbd87277af793bc93441348e4e1839de81"> <link rel="icon" type="image/png" href="https://virological.org/uploads/default/optimized/1X/7897c4e655617c4b6fe71a98eaa97a38778f0623_2_32x32.png"> <link rel="apple-touch-icon" type="image/png" href="https://virological.org/uploads/default/optimized/1X/7897c4e655617c4b6fe71a98eaa97a38778f0623_2_180x180.png"> <meta name="theme-color" media="all" content="#ffffff"> <meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0, user-scalable=yes, viewport-fit=cover"> <link rel="canonical" href="https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498" /> <link rel="search" type="application/opensearchdescription+xml" href="https://virological.org/opensearch.xml" title="Virological Search"> <link href="/stylesheets/color_definitions_base__2_2ce88ee63cb4b6c91353165224be363ba88fb04f.css?__ws=virological.org" media="all" rel="stylesheet" class="light-scheme"/> <link href="/stylesheets/desktop_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="desktop" /> <link href="/stylesheets/chat_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="chat" /> <link href="/stylesheets/checklist_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="checklist" /> <link 
href="/stylesheets/discourse-details_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="discourse-details" /> <link href="/stylesheets/discourse-lazy-videos_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="discourse-lazy-videos" /> <link href="/stylesheets/discourse-local-dates_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="discourse-local-dates" /> <link href="/stylesheets/discourse-narrative-bot_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="discourse-narrative-bot" /> <link href="/stylesheets/discourse-presence_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="discourse-presence" /> <link href="/stylesheets/docker_manager_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="docker_manager" /> <link href="/stylesheets/footnote_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="footnote" /> <link href="/stylesheets/poll_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="poll" /> <link href="/stylesheets/spoiler-alert_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="spoiler-alert" /> <link href="/stylesheets/chat_desktop_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="chat_desktop" /> <link href="/stylesheets/poll_desktop_33be3dc8682f6a0819c96460049aa59027e7d5f4.css?__ws=virological.org" media="all" rel="stylesheet" data-target="poll_desktop" /> <link href="/stylesheets/desktop_theme_2_ea55fd37c8940e8b1da4dc94924da96f05d2abc8.css?__ws=virological.org" media="all" rel="stylesheet" 
data-target="desktop_theme" data-theme-id="2" data-theme-name="default"/> <meta id="data-ga-universal-analytics" data-tracking-code="UA-277246-10" data-json="{"cookieDomain":"auto"}" data-auto-link-domains=""> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-277246-10" nonce="PKIJ1AbMhwxMs6vTdYOz3flPc"></script> <script defer src="/assets/google-universal-analytics-v4-e154af4adb3c483a3aba7f9a7229b8881cdc5cf369290923d965a2ad30163ae8.js" data-discourse-entrypoint="google-universal-analytics-v4" nonce="PKIJ1AbMhwxMs6vTdYOz3flPc"></script> <link rel="alternate nofollow" type="application/rss+xml" title="RSS feed of 'Year-letter Genetic Clade Naming for SARS-CoV-2 on Nextstrain.org'" href="https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498.rss" /> <meta property="og:site_name" content="Virological" /> <meta property="og:type" content="website" /> <meta name="twitter:card" content="summary" /> <meta name="twitter:image" content="https://virological.org/uploads/default/optimized/1X/235e2ec3b5c2f1f73466386737393cbb51987759_2_1024x559.png" /> <meta property="og:image" content="https://virological.org/uploads/default/optimized/1X/235e2ec3b5c2f1f73466386737393cbb51987759_2_1024x559.png" /> <meta property="og:url" content="https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498" /> <meta name="twitter:url" content="https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498" /> <meta property="og:title" content="Year-letter Genetic Clade Naming for SARS-CoV-2 on Nextstrain.org" /> <meta name="twitter:title" content="Year-letter Genetic Clade Naming for SARS-CoV-2 on Nextstrain.org" /> <meta property="og:description" content="Emma B Hodcroft, James Hadfield, Richard A Neher, Trevor Bedford Nextstrain provides clade or lineage information for a number of pathogens it already tracks, including influenza and enterovirus D68. 
Nextstrain also introduced informal clade designations for SARS-CoV-2 on 4 March 2020, largely to aid internal discussions and to create URL links allowing ‘automatic zoom’ to an area of the tree that was of interest. These clades names were ad-hoc letter-number combinations (e.g. A2a) and were nev..." /> <meta name="twitter:description" content="Emma B Hodcroft, James Hadfield, Richard A Neher, Trevor Bedford Nextstrain provides clade or lineage information for a number of pathogens it already tracks, including influenza and enterovirus D68. Nextstrain also introduced informal clade designations for SARS-CoV-2 on 4 March 2020, largely to aid internal discussions and to create URL links allowing ‘automatic zoom’ to an area of the tree that was of interest. These clades names were ad-hoc letter-number combinations (e.g. A2a) and were nev..." /> <meta property="og:article:section" content="SARS-CoV-2 coronavirus" /> <meta property="og:article:section:color" content="3AB54A" /> <meta property="og:article:section" content="Software and Tools" /> <meta property="og:article:section:color" content="F7941D" /> <meta name="twitter:label1" value="Reading time" /> <meta name="twitter:data1" value="2 mins 🕑" /> <meta name="twitter:label2" value="Likes" /> <meta name="twitter:data2" value="2 ❤" /> <meta property="article:published_time" content="2020-06-02T17:41:19+00:00" /> <meta property="og:ignore_canonical" content="true" /> </head> <body class="crawler browser-update"> <header> <a href="/"> Virological </a> </header> <div id="main-outlet" class="wrap" role="main"> <div id="topic-title"> <h1> <a href="/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498">Year-letter Genetic Clade Naming for SARS-CoV-2 on Nextstrain.org</a> </h1> <div class="topic-category" itemscope itemtype="http://schema.org/BreadcrumbList"> <span itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem"> <a href="/c/novel-2019-coronavirus/software-and-tools/42" 
class="badge-wrapper bullet" itemprop="item"> <span class='badge-category-bg' style='background-color: #3AB54A'></span> <span class='badge-category clear-badge'> <span class='category-name' itemprop='name'>SARS-CoV-2 coronavirus</span> </span> </a> <meta itemprop="position" content="1" /> </span> <span itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem"> <a href="/c/novel-2019-coronavirus/software-and-tools/42" class="badge-wrapper bullet" itemprop="item"> <span class='badge-category-bg' style='background-color: #F7941D'></span> <span class='badge-category clear-badge'> <span class='category-name' itemprop='name'>Software and Tools</span> </span> </a> <meta itemprop="position" content="2" /> </span> </div> </div> <div itemscope itemtype='http://schema.org/DiscussionForumPosting'> <meta itemprop='headline' content='Year-letter Genetic Clade Naming for SARS-CoV-2 on Nextstrain.org'> <link itemprop='url' href='https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498'> <meta itemprop='datePublished' content='2020-06-02T17:41:18Z'> <meta itemprop='articleSection' content='Software and Tools'> <meta itemprop='keywords' content=''> <div itemprop='publisher' itemscope itemtype="http://schema.org/Organization"> <meta itemprop='name' content='Virological'> <div itemprop='logo' itemscope itemtype="http://schema.org/ImageObject"> <meta itemprop='url' content='https://virological.org/uploads/default/original/1X/7897c4e655617c4b6fe71a98eaa97a38778f0623.png'> </div> </div> <div id='post_1' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" href='https://virological.org/u/emmahodcroft'><span itemprop='name'>emmahodcroft</span></a> </span> <link itemprop="mainEntityOfPage" href="https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstrain-org/498"> <link itemprop="image" 
href="https://virological.org/uploads/default/original/1X/235e2ec3b5c2f1f73466386737393cbb51987759.png"> <span class="crawler-post-infos"> <time datetime='2020-06-02T17:41:19Z' class='post-time'> June 2, 2020, 5:41pm </time> <meta itemprop='dateModified' content='2020-06-26T08:59:25Z'> <span itemprop='position'>1</span> </span> </div> <div class='post' itemprop='text'> <p><strong>Emma B Hodcroft, James Hadfield, Richard A Neher, Trevor Bedford</strong></p> <p>Nextstrain provides clade or lineage information for a number of pathogens it already tracks, including influenza and enterovirus D68. Nextstrain also introduced informal clade designations for SARS-CoV-2 on 4 March 2020, largely to aid internal discussions and to create URL links allowing ‘automatic zoom’ to an area of the tree that was of interest. These clade names were ad-hoc letter-number combinations (e.g. A2a) and were never intended to be a permanent naming system (and were never visible by default). Nevertheless, these clades have been used by some to discuss different aspects of the phylogeny on Nextstrain, underscoring the need for a more long-term, formal proposal to designate SARS-CoV-2 clades.</p> <p>Our objective here is to introduce a naming system that facilitates discussion of large-scale diversity patterns of SARS-CoV-2 and labels clades that persist for at least several months and have significant geographic spread. This objective is distinct from and complementary to Rambaut et al. 2020 (1), who proposed a dynamic system for labeling transient lineages that have local epidemiological significance. 
Their proposal results in a large number of short-lived labels that maintain some information on the hierarchical structure and facilitate discussion of local short-term dynamics, but are not intended to provide large-scale structure.</p> <p>Our nomenclature proposal has the following objectives:</p> <ul> <li>label genetically well-defined clades that have reached significant frequency and geographic spread,</li> <li>allow for transient clade designations that are elevated to major clades if they persist and rise in frequency,</li> <li>provide memorable but informative names,</li> <li>gracefully handle clade naming in the upcoming years as SARS-CoV-2 becomes a seasonal virus.</li> </ul> <p>To provide memorable and pronounceable names, we want to avoid the common pattern of stringing together alternating numbers and letters, as this pattern tends to become arcane with time, e.g. 3c2.A1b for A/H3N2 influenza. Instead, we propose to name major clades by the year they are estimated to have emerged and a letter, e.g. 19A, 19B, 20A. The yearly reset of letters will ensure that we don’t progress too far into the alphabet, while the year prefix provides immediate context on the origin of the clade, which will become increasingly important going forward. We aim for ~5 clades per year. These are meant as major genetic groupings and are not intended to completely resolve genetic diversity. The hierarchical structure of clades is sometimes of interest. Here, the “derivation” of a major clade can be labeled with the familiar “.” notation, as in 19A.20A.20C for the major clade 20C.</p> <p>Within these major clades, we will monitor potential ‘emerging clades’, which we will label by their parent clade and the nucleotide mutation(s) that define them (e.g. 19A/28688C). Note, however, that these mutations are only meaningful in that they define the clade; no other functional or epidemiological significance should be attributed to them. 
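</p> <p>The label formats described so far (year-letter major clades, the “.” derivation notation, and parent/mutation labels for emerging clades) can be sketched as a small parser. This is only an illustration: <code>parse_label</code>, <code>CladeLabel</code>, and the regular expressions are hypothetical names and assumptions of this sketch, not part of the Nextstrain tooling, and the comma-separated form for several mutations follows the 19A/28688C,8653T example given in this post.</p>

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed shapes: two-digit year plus one capital letter ("19A"),
# and a genome position followed by the derived base ("28688C").
MAJOR_CLADE = re.compile(r"^\d{2}[A-Z]$")
MUTATION = re.compile(r"^\d+[ACGT]$")


@dataclass
class CladeLabel:
    lineage: List[str]    # e.g. ["19A", "20A", "20C"] for "19A.20A.20C"
    mutations: List[str]  # e.g. ["28688C"]; empty for a plain major clade

    @property
    def major(self) -> str:
        # The last element of the derivation is the clade being named.
        return self.lineage[-1]


def parse_label(label: str) -> CladeLabel:
    """Split a label into its major-clade derivation and defining mutations."""
    path, _, mutation_part = label.partition("/")
    lineage = path.split(".")
    for name in lineage:
        if MAJOR_CLADE.match(name) is None:
            raise ValueError(f"not a year-letter clade name: {name!r}")
    mutations = mutation_part.split(",") if mutation_part else []
    for mutation in mutations:
        if MUTATION.match(mutation) is None:
            raise ValueError(f"not a nucleotide mutation: {mutation!r}")
    return CladeLabel(lineage, mutations)


if __name__ == "__main__":
    print(parse_label("19A.20A.20C"))
    print(parse_label("19A/28688C"))
```

<p>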
Once a subclade reaches (soft) criteria on frequency, spread, and genetic distinctiveness, it will be renamed as a major clade (hypothetically 19A/28688C to 20D). The use of mutations as emerging clade labels has the desirable property that the labels stay meaningful after a renaming. Additionally, anyone can refer to any clade in the SARS-CoV-2 phylogeny using this nomenclature. For example, we can use 19A/28688C,8653T to refer to a narrower subclade. This is verbose, but it is fully descriptive, and the definition is embedded in the name itself, which allows subclades to be discussed by the community without needing a centralized database to be updated.</p> <p><strong>Definition of major clades</strong><br> We propose to name a new major clade when it reaches a frequency of 20% globally. When calculating these frequencies, care has to be taken to achieve approximately even sampling of sequences in time and space, since sequencing effort varies strongly between countries. A clade name consists of the year the clade emerged and the next available letter in the alphabet. 
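</p> <p>The naming rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions: <code>next_clade_name</code> and <code>reaches_major_frequency</code> are hypothetical helpers, the global frequency is assumed to have been estimated elsewhere from evenly sampled sequences, and “next available letter” is read as the first letter not yet used for that year.</p>

```python
import string
from typing import Iterable

# From the proposal: a new major clade is named at 20% global frequency.
GLOBAL_FREQUENCY_THRESHOLD = 0.20


def reaches_major_frequency(global_frequency: float) -> bool:
    """Check the frequency criterion only; other criteria are not modelled here."""
    return global_frequency >= GLOBAL_FREQUENCY_THRESHOLD


def next_clade_name(year: int, existing: Iterable[str]) -> str:
    """Return the year prefix plus the first letter unused for that year."""
    prefix = f"{year % 100:02d}"
    used = {name[2:] for name in existing if name.startswith(prefix)}
    for letter in string.ascii_uppercase:
        if letter not in used:
            return prefix + letter
    raise ValueError(f"no letters left for year prefix {prefix}")


if __name__ == "__main__":
    current = ["19A", "19B", "20A", "20B", "20C"]
    if reaches_major_frequency(0.23):
        print(next_clade_name(2020, current))
```

<p>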
A new clade should be at least 2 mutations away from its parent major clade.</p> <p><div class="lightbox-wrapper"><a class="lightbox" href="https://virological.org/uploads/default/original/1X/235e2ec3b5c2f1f73466386737393cbb51987759.png" data-download-href="https://virological.org/uploads/default/235e2ec3b5c2f1f73466386737393cbb51987759" title="image"><img src="https://virological.org/uploads/default/optimized/1X/235e2ec3b5c2f1f73466386737393cbb51987759_2_690x376.png" alt="image" data-base62-sha1="52SuO8rQPq19FQGL4zwg2gDqtfz" width="690" height="376" srcset="https://virological.org/uploads/default/optimized/1X/235e2ec3b5c2f1f73466386737393cbb51987759_2_690x376.png, https://virological.org/uploads/default/optimized/1X/235e2ec3b5c2f1f73466386737393cbb51987759_2_1035x564.png 1.5x, https://virological.org/uploads/default/original/1X/235e2ec3b5c2f1f73466386737393cbb51987759.png 2x" data-dominant-color="DBDFCF"></a></div><br> <strong>Fig 1.</strong> Nextstrain ‘global’ run with the new Nextstrain major clades labelled.</p> <p>By these criteria, the first two clades are 19A and 19B, which correspond to the split marked by mutations C8782T and T28144C. These clades were both prevalent in Asia during the first months of the outbreak. The next clade that was named is 20A, corresponding to the clade that dominated the large European outbreak in early 2020. 
It is distinguished from its parent 19A by the mutations C3037T, C14408T and A23403G.</p> <p>After this, we’ve seen two further clades appear: 20B, another European clade separated clearly by three consecutive mutations (G28881A, G28882A, and G28883C), and 20C, a largely North American clade distinguished by mutations C1059T and G25563T.</p> <p>The clade definitions are encoded in a tabular file that defines a genotypic signature for each clade. We provide a script that generates a table with clade assignments for a set of sequences.</p> <p>Additionally, definitions of the clades, as well as a table of the current clades with some detail on their characteristics and definition, are available in the ‘ncov’ GitHub repository documentation (<a href="https://github.com/nextstrain/ncov" class="inline-onebox" rel="noopener nofollow ugc">GitHub - nextstrain/ncov: Nextstrain build for novel coronavirus SARS-CoV-2</a>) <a href="https://github.com/nextstrain/ncov/blob/master/docs/naming_clades.md" rel="noopener nofollow ugc">here</a>.</p> <p>In parallel to the large-scale clade annotation, Nextstrain will support the Pangolin lineage nomenclature developed by Rambaut et al. 2020 (1), and continue to offer color-by for the ‘old’ Nextstrain clades so that existing references to these clades are still identifiable.</p> <p><strong>Using Nextstrain Clade Definitions</strong><br> To make it easy for users to identify the Nextstrain clade of their own sequences, we provide a <a href="https://github.com/nextstrain/ncov/blob/master/assign_clades.py" rel="noopener nofollow ugc">simple Python script</a> that can be run on any FASTA file to assign appropriate clades. This script is part of the ‘ncov’ GitHub repository, but does not require running any other part of the pipeline. However, ‘augur’ must be installed to run the script. 
This can be done <a href="https://nextstrain.org/docs/getting-started/local-installation#install-augur-with-python" rel="noopener nofollow ugc">a number of different ways</a>, but is often most easily done <a href="https://nextstrain-augur.readthedocs.io/en/stable/installation/installation.html#using-pip-from-pypi" rel="noopener nofollow ugc">using ‘pip’</a>.</p> <p><strong>Links:</strong></p> <ul> <li>An up-to-date description of our approach to clade names and the current clades can be found on the Nextstrain ‘ncov’ GitHub <a href="https://github.com/nextstrain/ncov/blob/master/docs/naming_clades.md" rel="noopener nofollow ugc">here</a>.</li> <li>Clades shown on the current Nextstrain ‘Global’ SARS-CoV-2 tree can be viewed <a href="https://nextstrain.org/ncov/global?branchLabel=clade&c=clade_membership" rel="noopener nofollow ugc">here</a>.</li> <li>Pangolin lineages can be displayed on the Nextstrain runs, as demonstrated <a href="https://nextstrain.org/ncov/global?branchLabel=none&c=pangolin_lineage" rel="noopener nofollow ugc">here</a>.</li> <li>The ‘old’ Nextstrain clades can still be viewed <a href="https://nextstrain.org/ncov/global?branchLabel=none&c=legacy_clade_membership" rel="noopener nofollow ugc">here</a> as a color-by for those papers/references that have used them.</li> </ul> <p>References:</p> <p>(1) Andrew Rambaut, Edward C. Holmes, Verity Hill, Áine O’Toole, JT McCrone, Chris Ruis, Louis du Plessis, Oliver G. Pybus. “A dynamic nomenclature proposal for SARS-CoV-2 to assist genomic epidemiology”. 
bioRxiv 2020.04.17.046086; doi: <a href="https://doi.org/10.1101/2020.04.17.046086" class="inline-onebox" rel="noopener nofollow ugc">A dynamic nomenclature proposal for SARS-CoV-2 to assist genomic epidemiology | bioRxiv</a></p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="2" /> <span class='post-likes'>2 Likes</span> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/CommentAction"/> <meta itemprop="userInteractionCount" content="0" /> </div> </div> <div id='post_2' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" href='https://virological.org/u/trvrb'><span itemprop='name'>trvrb</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2020-06-02T20:20:17Z' class='post-time'> June 2, 2020, 8:20pm </time> <meta itemprop='dateModified' content='2020-06-02T20:20:17Z'> <span itemprop='position'>2</span> </span> </div> <div class='post' itemprop='text'> <p>Feedback is of course appreciated <img src="//virological.org/images/emoji/twitter/slight_smile.png?v=9" title=":slight_smile:" class="emoji" alt=":slight_smile:"></p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/CommentAction"/> <meta 
itemprop="userInteractionCount" content="0" /> </div> </div> </div> </div> </body> </html>