<!DOCTYPE html> <html> <head> <script src="https://www.googletagmanager.com/gtag/js?id=G-EPQ9202NVY" async></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); //- gtag('config', 'UA-71059016-1'); gtag('config', 'G-EPQ9202NVY'); </script> <title>PDB-101: Learn: Guide to Understanding PDB Data: Primary Sequences</title> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta property="og:title" content="PDB101: Learn: Guide to Understanding PDB Data: Primary Sequences"> <meta property="og:description" content="PDB-101: Training, Outreach, and Education portal of RCSB PDB"> <meta property="og:image" content="https://cdn.rcsb.org/pdb101/common/images/pdb101-logo-sm.png"> <meta property="og:site_name" content="RCSB: PDB-101"> <meta name="twitter:card" content="summary"> <meta name="twitter:title" content="PDB101: Learn: Guide to Understanding PDB Data: Primary Sequences"> <meta name="twitter:description" content="PDB-101: Training, Outreach, and Education portal of RCSB PDB"> <meta name="description" content="PDB-101: Training, Outreach, and Education portal of RCSB PDB"> <meta name="keywords" content="protein sequence,dna sequence,rna sequence,seqres,fasta,residue name,amino acid,nucleotide"> <!-- associate with our Google Analytics--> <meta name="google-site-verification" content="A8M31jAX8SUgQYzbnF5r-wgykna2i5Hp4J9fziVD9Sg"> <link href="https://cdn.rcsb.org/pdb101/common/jquery-ui-1.11.4/jquery-ui.min.css" rel="stylesheet"> <link href="https://cdn.rcsb.org/javascript/bootstrap/latest/css/bootstrap.min.css" rel="stylesheet"> <link href="/ekko-lightbox/ekko-lightbox.css" rel="stylesheet"> <link href="https://cdn.rcsb.org/javascript/fontawesome/latest/css/font-awesome.min.css" rel="stylesheet"> <link href="https://cdn.rcsb.org/javascript/timeline3/css/timeline.css" rel="stylesheet"> <link 
href="/css/style.css?v=20171201" rel="stylesheet"> <link href="https://cdn.rcsb.org/jira-feedback/css/jira-fdbck.css" rel="stylesheet"><!--[if lt IE 9]> <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script><![endif]--> <script src="https://cdn.rcsb.org/javascript/jquery/jquery-2.2.4.min.js"></script> <script src="https://cdn.rcsb.org/pdb101/common/jquery-ui-1.11.4/jquery-ui.min.js"></script> <script src="https://cdn.rcsb.org/javascript/bootstrap/latest/js/bootstrap.min.js"></script> <script src="/ekko-lightbox/ekko-lightbox.min.js"></script> <script src="/js/common.js"></script> <script src="https://cdn.rcsb.org/jira-feedback/js/jira-fdbck.min.js"></script> </head> <body> <div data-elastic-exclude> <nav class="navbar navbar-inverse navbar-fixed-top hidden-print"> <div class="container"> <div class="navbar-header"> <button type="button" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar" class="navbar-toggle collapsed"><span class="sr-only">Toggle navigation</span><span class="icon-bar"></span><span class="icon-bar"></span><span class="icon-bar"></span></button><a href="/" class="navbar-brand">PDB-101</a> </div> <div id="navbar" class="collapse navbar-collapse"> <ul class="nav navbar-nav"> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle"><span class="hidden-sm hidden-xs">Molecule of the Month</span><span class="hidden-lg hidden-md">MotM</span><b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/motm">Current Feature</a></li> <li><a href="/motm/motm-by-category">By Category</a></li> <li><a href="/motm/motm-by-date">By Date</a></li> <li><a href="/motm/motm-by-title">By Title</a></li> <li><a href="/motm/motm-about">About Molecule of the Month</a></li> <li><a href="/motm/motm-image-download">Image Download</a></li> <li><a href="https://my.forms.app/rcsb-pdb/motm-newsletter" 
target="_blank">Newsletter Subscription</a></li> </ul> </li> <li><a href="/browse">Browse</a></li> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle">Learn<b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/learn/paper-models">Paper Models</a></li> <li><a href="/learn/flyers-posters-and-calendars">Flyers, Posters, & Calendars</a></li> <li><a href="/learn/videos">Videos</a></li> <li><a href="/learn/interactive-animations">Interactive Animations</a></li> <li><a href="/learn/coloring-books">Coloring Books</a></li> <li><a href="/learn/structural-biology-highlights">Structural Biology Highlights</a></li> <li><a href="/learn/3d-printing">3D Printing</a></li> <li><a href="/learn/exploring-the-structural-biology-of-cancer">Exploring the Structural Biology of Cancer</a></li> <li><a href="/learn/exploring-the-structural-biology-of-bioenergy">Exploring the Structural Biology of Bioenergy</a></li> <li><a href="/learn/exploring-the-structural-biology-of-viruses">Exploring the Structural Biology of Viruses</a></li> <li><a href="/learn/exploring-the-structural-biology-of-health-and-nutrition">Exploring the Structural Biology of Health and Nutrition</a></li> <li><a href="/learn/exploring-the-structural-biology-of-evolution">Exploring the Structural Biology of Evolution</a></li> <li><a href="/learn/exploring-structural-biology-with-computed-structure-models-csms">Exploring Structural Biology with Computed Structure Models (CSMs)</a></li> <li><a href="/learn/resources-to-fight-the-covid-19-pandemic">COVID-19 Pandemic Resources</a></li> <li><a href="/learn/other-resources">Other Resources</a></li> </ul> </li> <li class="dropdown"> <a href="#" data-toggle="dropdown" class="dropdown-toggle">Train <b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/train/guide-to-understanding-pdb-data/introduction">Guide to Understanding PDB Data</a></li> <li><a href="/train/training-events">Training Courses</a></li> <li><a 
href="/train/education-corner">Education Corner</a></li> <li><a href="/train/pdb-and-data-archiving-curriculum/about">PDB and Data Archiving Curriculum</a></li> </ul> </li> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle">Teach<b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/teach/overview">Overview of Curriculum Modules</a></li> <li role="separator" class="divider"></li> <!--each item, i in pdb101.teach.itemsli: a(href='/teach/' + item.id_string, style='text-indent: 20px;')= item.name --> <li><a href="/teach/biomolecular-structures-and-models" style="text-indent: 20px;">Biomolecular Structures and Models</a></li> <li><a href="/teach/covid-19/topics/getting-started-hand-washing" style="text-indent: 20px;">COVID-19 in Molecular Detail</a></li> <li><a href="/teach/diabetes-at-a-molecular-level" style="text-indent: 20px;">Diabetes at a Molecular Level</a></li> <li><a href="/teach/molecular-immunology" style="text-indent: 20px;">Molecular Immunology</a></li> <li><a href="/teach/molecular-view-of-hiv-aids" style="text-indent: 20px;">Molecular View of HIV/AIDS</a></li> <li><a href="/teach/box-of-lessons/topics/biological-macromolecules" style="text-indent: 20px;">Box of Lessons</a></li> </ul> </li> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle"><span class="hidden-sm hidden-xs">Global Health</span><span class="hidden-lg hidden-md">Health</span><b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/global-health/diabetes-mellitus/about/what-is-diabetes">Diabetes Mellitus</a></li> <li><a href="/global-health/cancer/about/what-is-cancer">Cancer </a></li> </ul> </li> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle">SciArt<b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/sci-art/geis-archive/about">Irving Geis</a></li> <li><a href="/sci-art/goodsell-gallery">David Goodsell</a></li> <li><a href="/sci-art/bezsonova-gallery">Irina 
Bezsonova</a></li> </ul> </li> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle">Events<b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/events/art-of-science">Art of Science</a></li> <li><a href="http://www.rcsb.org/pages/awards/poster_prize" target="_blank">Poster Prize </a></li> <p style="font-size: 90%; margin: 8px 15px 0 20px; font-weight: bold; margin-bottom: 0; border-top: 1px solid #ddd; padding-top: 5px">Past events</p> <li><a href="https://www.rcsb.org/pages/pdb50" target="_blank"> PDB50</a></li> <li><a href="/events/science-olympiad">Science Olympiad</a></li> <li><a href="/events/video-challenge/the-challenge">Video Challenge</a></li> </ul> </li> <li class="dropdown"><a href="#" data-toggle="dropdown" class="dropdown-toggle">About<b class="caret"></b></a> <ul class="dropdown-menu"> <li><a href="/more/about-pdb-101">About PDB-101</a></li> <li><a href="/more/contact-us">Contact us</a></li> <li><a href="/more/how-to-cite">How to Cite</a></li> <li><a href="https://www.rcsb.org/pages/about-us/pdb-user-community">User Community</a></li> <li><a href="https://www.rcsb.org/pages/about-us/deia">Diversity, Equity, Inclusion, and Access</a></li> </ul> </li> </ul> </div> </div> </nav> <style> /* search section */ #search-button { padding: 8px 15px; border-top-left-radius: 0; border-bottom-left-radius: 0; } #query-input { width: 100%; border: 1px solid #ccc; padding: 8px; } #query-input:focus { outline: none; } #query-table { width: 100%; } #query-table td:first-child { width: 100%; } #autosuggest { position: relative; width: 100%; text-align: left; } #autosuggest-items { position: absolute; border: 1px solid #eee; border-bottom: none; border-top: none; z-index: 99; top: 100%; left: 0; right: 0; } #autosuggest-items div { padding: 10px; cursor: pointer; background-color: #fff; border-bottom: 1px solid #d4d4d4; } #autosuggest-items div:hover { background-color: #e9e9e9; } .autosuggest-active { background-color: DodgerBlue 
!important; color: #fff; } </style> <div id="print-header" class="row"> <div class="col-xs-6"> <div id="pdb101-logo"><a href="/"><img src="https://cdn.rcsb.org/pdb101/common/images/logo-pdb101.png" alt="RCSB PDB"></a></div> </div> <div id="rcsb-logo" class="col-xs-6 text-right">Training and outreach portal of <a href='http://www.rcsb.org' target='_blank'><img src='https://cdn.rcsb.org/pdb101/common/images/logo-rcsb.png' height='57'></a></div> </div> <div id="header" class="hidden-print"> <div class="container"> <div style="margin-top:10px;" class="row"> <div id="logo_container" class="col-sm-12 col-md-6 hidden-xs"> <div id="pdb101-logo"><a href="/"><img src="https://cdn.rcsb.org/pdb101/common/images/logo-pdb101.png" alt="RCSB PDB"></a></div> <p style="margin-top: 7px;"><em>Molecular explorations<br />through biology and medicine</em></p> </div> <div style="margin-bottom:10px;" class="col-sm-12 col-md-6 text-right"> <form id="searchForm" action="/search" method="post" onsubmit="return validateQueryInput();" autocomplete="off"> <input type="hidden" name="querySource" value="user"> <table id="query-table"> <tr> <td> <div id="autosuggest"> <input id="query-input" type="text" name="query" value="" placeholder="Search Molecule of the Month articles and more"> <div id="autosuggest-items"></div> </div> </td> <td><span id="search-button" onclick="doSearch();" class="btn btn-primary">Go</span></td> </tr> </table> </form> </div> </div> <!--* for mobile view, user survey--> <!--* for mobile view, 50 years of PDB--> <div class="row"> <div style="margin:5px 0px;" class="col-xs-6 col-sm-6 col-md-6"> <table> <tr> <td>Training and outreach portal of </td> <td rowspan="2"><a href='http://www.rcsb.org' target='_blank'><img src='https://cdn.rcsb.org/pdb101/common/images/logo-rcsb.png' width='75'></a></td> </tr> </table> </div> <div class="hidden-xs col-sm-6 col-md-6 col-lg-6 col-xl-6 text-right"> <!--if (pdb101.instance_type == 'local')--> <!-- div.local local--> <!--if 
(pdb101.instance_type == 'local' || pdb101.instance_type == 'beta')--> <!--div.survey <a class='no-underline' href='//www.surveymonkey.com/r/G5N8BKC' target='_blank'><img src='//cdn.rcsb.org/pdb101/common/images/checkbox-marked-outline (11).png' style='margin-top:-12px;'>Take the RCSB PDB User Survey</a>--> <!--* user survey for all screen except extra-small/mobile--> <!--* for 50 years of PDB header logo--> <div class="social-media"><a target="_blank" href="/motm/rss.xml" data-original-title="Molecule of the Month RSS Feed" data-toggle="tooltip"><i class="fa fa-rss-square fa-lg"></i></a><a target="_blank" href="https://www.facebook.com/RCSBPDB" data-original-title="Facebook" data-toggle="tooltip"><i class="fa fa-facebook-square fa-lg"></i></a><a target="_blank" href="https://twitter.com/buildmodels" data-original-title="Twitter" data-toggle="tooltip"><i class="fa fa-twitter fa-lg"></i></a><a target="_blank" href="https://www.youtube.com/user/RCSBProteinDataBank" data-original-title="YouTube" data-toggle="tooltip"><i class="fa fa-youtube-play fa-lg"></i></a><a target="_blank" href="https://github.com/rcsb" data-original-title="Github" data-toggle="tooltip"><i class="fa fa-github fa-lg"></i></a><a target="_blank" href="https://www.linkedin.com/company/rcsb-protein-data-bank/" data-original-title="LinkedIn" data-toggle="tooltip"><i class="fa fa-linkedin fa-lg"></i></a></div> </div> <div id="socialMedaiMobile" class="col-xs-6 hidden-sm hidden-md hidden-lg hidden-xl text-right"> <div class="social-media"><a target="_blank" href="/motm/rss.xml" data-original-title="Molecule of the Month RSS Feed" data-toggle="tooltip"><i class="fa fa-rss-square fa-lg"></i></a><a target="_blank" href="https://www.facebook.com/RCSBPDB" data-original-title="Facebook" data-toggle="tooltip"><i class="fa fa-facebook-square fa-lg"></i></a><a target="_blank" href="https://twitter.com/buildmodels" data-original-title="Twitter" data-toggle="tooltip"><i class="fa fa-twitter fa-lg"></i></a><a 
target="_blank" href="https://www.youtube.com/user/RCSBProteinDataBank" data-original-title="YouTube" data-toggle="tooltip"><i class="fa fa-youtube-play fa-lg"></i></a><a target="_blank" href="https://www.linkedin.com/company/rcsb-protein-data-bank/" data-original-title="LinkedIn" data-toggle="tooltip"><i class="fa fa-linkedin fa-lg"></i></a></div> </div> </div> </div> </div> <div id="main_content" class="container"> <style> .table-display { width:100%; border-collapse:true; margin-bottom:20px; } .table-display th { background-color: #def; text-align: center; } .table-display th, .table-display td { border:1px solid #ddd; padding: 8px; } table.data { width:100%; border-collapse:true; margin:10px 0; } table.data td { border:1px solid #ddd; padding: 4px; } div.figures { margin-bottom:20px; border: 1px solid #ccc; border-radius: 4px; } div.figures th, div.figures td { text-align: center; padding: 10px; } div.figures tr { vertical-align: top; } .caption { font-style: italic; margin-top: 10px; } .figure { border: 1px solid #ccc; border-radius: 4px; padding:10px; margin: 10px 0; } img.img-border { border: 1px solid #ccc; border-radius: 4px; padding:10px; margin: 10px 0; } .tip { border: 1px solid #ccc; border-radius: 4px; padding:10px; background-color: #ffe; margin:10px 0; } .img-pc { display:block; max-width:60%; height:auto; } td.content-img-caption { font-size: 12px; font-style: italic; padding: 5px 0; } img.content-img-border { border: 1px solid #ccc; } /* div.ekko-lightbox div.modal-dialog { width: auto; max-width: 1170px; height: auto; max-height: 2340px; } */ </style> <div class="row"> <div class="col-sm-12 col-md-3 hidden-print"><style> div.pdb101-page-content { border-left:1px dotted #ddd; } div.page-content-left-half { border-left:1px dotted #ddd; } .pdb101-left-menu { margin-top: 20px; font-size: 16px; color:#337ac7; } .pdb101-left-menu-header { font-size: 16px; color: #333; background-color: #eee; border:1px solid #ddd; padding:10px 20px; 
border-top-left-radius:4px; border-top-right-radius:4px; } .pdb101-left-menu-item { border-bottom: 1px dashed #ccc; cursor: pointer; text-align:right; } .pdb101-left-menu-item:hover { background-color: #f8f8f8; } .pdb101-left-menu-item-text { display: inline-block; width: 85%; text-align: right; padding: 10px; } .pdb101-left-menu-item-chevron { display: inline-block; width: 10%; text-align: center; padding: 10px 0; vertical-align: top; margin-top:4px; font-size: 12px; } .fade { opacity: 0.25; } </style> <div class="pdb101-left-menu"> <div class="pdb101-left-menu-header">Guide to Understanding PDB Data</div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/introduction';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Introduction</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/pdb-overview';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">PDB Overview</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/beginner’s-guide-to-pdbx-mmcif';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Beginner’s Guide to PDBx/mmCIF</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/dealing-with-coordinates';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Dealing with Coordinates</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/biological-assemblies';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Biological Assemblies</div> <div 
class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/missing-coordinates';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Missing Coordinates</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/computed-structure-models';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Computed Structure Models</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/primary-sequences';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Primary Sequences</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/protein-hierarchical-structure';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Protein Hierarchical Structure</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/small-molecule-ligands';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Small Molecule Ligands</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/exploring-carbohydrates';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Exploring Carbohydrates</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/methods-for-determining-structure';" 
class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Methods for Determining Structure</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/crystallographic-data';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Crystallographic Data</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/molecular-graphics-programs';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Molecular Graphics Programs</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> <div onclick="location.href='/learn/guide-to-understanding-pdb-data/introduction-to-rcsb-pdb-apis';" class="pdb101-left-menu-item"> <div class="pdb101-left-menu-item-text">Introduction to RCSB PDB APIs</div> <div class="pdb101-left-menu-item-chevron"><span class="fa fa-chevron-right fade"></span></div> </div> </div> </div> <div data-elastic-include class="col-sm-12 col-md-9 pdb101-page-content"><h1>Primary Sequences</h1> <p> The primary sequences of the polymeric molecules contained in an entry are given mainly in the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/entity_poly.html" target="_blank">_entity_poly</a> and <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/entity_poly_seq.html" target="_blank">entity_poly_seq</a> categories of the PDBx/mmCIF file. These listings include the sequence of each chain of linear, covalently linked standard or modified amino acids or nucleotides. They may also include other residues that are linked to the standard backbone in the polymer. 
 </p> <p> As described in the “<a href="/learn/guide-to-understanding-pdb-data/beginner%E2%80%99s-guide-to-pdbx-mmcif">Beginner’s Guide to PDB Structures and the PDBx/mmCIF Format</a>”, a tabular style is used because each data item has multiple values. Here, a loop_ token is followed by rows of data item names and then by whitespace-delimited data values. Additional information (including correspondence with the legacy PDB format) can be found in the <a href="https://mmcif.wwpdb.org/docs/user-guide/guide.html" target="_blank">PDBx/mmCIF User Guide</a>, and complete <a href="https://mmcif.wwpdb.org/" target="_blank">file format documentation</a> is also available. </p> <p> The example below from entry <a href="https://www.rcsb.org/structure/4HHB" target="_blank">4HHB</a> shows the one-letter code sequence given in the <strong>_entity_poly</strong> category. The residues of chains A and C (entity 1), and then those of chains B and D (entity 2), are listed in sequential order in <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entity_poly.pdbx_seq_one_letter_code.html">_entity_poly.pdbx_seq_one_letter_code</a>. Modified residues are represented by their canonical parent residue in <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entity_poly.pdbx_seq_one_letter_code_can.html">_entity_poly.pdbx_seq_one_letter_code_can</a>. 
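</p>
<p>The loop_ layout described above can be read with a short script. The following Python sketch is a simplified illustration, not a full mmCIF parser. It embeds the 4HHB <strong>_entity_poly</strong> loop from this guide as a string, uses the standard-library shlex module as a stand-in for mmCIF quoting rules (an assumption that holds for this excerpt but not for every file), and treats a line that starts with a semicolon as the start or end of a multi-line text value. The helper names tokenize and read_loop are hypothetical and were introduced only for this example.</p>

```python
import shlex

# The _entity_poly loop for entry 4HHB, copied from this guide.
ENTITY_POLY_LOOP = """\
loop_
_entity_poly.entity_id
_entity_poly.type
_entity_poly.nstd_linkage
_entity_poly.nstd_monomer
_entity_poly.pdbx_seq_one_letter_code
_entity_poly.pdbx_seq_one_letter_code_can
_entity_poly.pdbx_strand_id
_entity_poly.pdbx_target_identifier
1 'polypeptide(L)' no no
;VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNAL
SALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
;
;VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNAL
SALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
;
A,C ?
2 'polypeptide(L)' no no
;VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDN
LKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
;
;VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDN
LKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
;
B,D ?
#
"""


def tokenize(text):
    """Split mmCIF text into tokens (simplified; assumptions are in the lead-in)."""
    tokens = []
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith(";"):
            # Multi-line text value: collect lines until the closing ';'.
            parts = [line[1:]]
            for continuation in lines:
                if continuation.startswith(";"):
                    break
                parts.append(continuation)
            # Sequence values wrap across lines, so join them without separators.
            tokens.append("".join(part.strip() for part in parts))
        elif line.startswith("#"):
            continue  # '#' lines separate categories
        else:
            # Assumption: shlex quoting is close enough for values
            # such as 'polypeptide(L)' in this excerpt.
            tokens.extend(shlex.split(line))
    return tokens


def read_loop(text):
    """Return the rows of a single loop_ as a list of dictionaries."""
    tokens = tokenize(text)
    if not tokens or tokens[0] != "loop_":
        raise ValueError("expected the text to start with loop_")
    names = []
    index = 1
    # Data item names come first; each one starts with an underscore.
    while index != len(tokens) and tokens[index].startswith("_"):
        names.append(tokens[index])
        index += 1
    values = tokens[index:]
    if not names or len(values) % len(names) != 0:
        raise ValueError("value count is not a multiple of the item count")
    # Every group of len(names) values forms one row of the table.
    return [
        dict(zip(names, values[start:start + len(names)]))
        for start in range(0, len(values), len(names))
    ]


rows = read_loop(ENTITY_POLY_LOOP)
for row in rows:
    sequence = row["_entity_poly.pdbx_seq_one_letter_code"]
    print(row["_entity_poly.entity_id"],
          row["_entity_poly.pdbx_strand_id"],
          len(sequence))
```

<p>The script prints the entity ID, the chain IDs, and the sequence length for each polymer entity. For routine work, a dedicated mmCIF library is a safer choice than this sketch.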
</p> <div class="quoted"> <p style='text-autospace:none'> <span style='font-size:10.0pt;font-family:Courier'>loop_</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.entity_id</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.type</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.nstd_linkage</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.nstd_monomer</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.pdbx_seq_one_letter_code</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.pdbx_seq_one_letter_code_can</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.pdbx_strand_id</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly</span>.pdbx_target_identifier</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 'polypeptide(L)' no no</span><br> <span style='font-size:10.0pt;font-family:Courier'>;VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNAL</span><br> <span style='font-size:10.0pt;font-family:Courier'>SALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR</span><br> <span style='font-size:10.0pt;font-family:Courier'>;</span><br> <span style='font-size:10.0pt;font-family:Courier'>;VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNAL</span><br> <span style='font-size:10.0pt;font-family:Courier'>SALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR</span><br> <span style='font-size:10.0pt;font-family:Courier'>;</span><br> <span style='font-size:10.0pt;font-family:Courier'>A,C ?</span><br> <span style='font-size:10.0pt;font-family:Courier'>2 'polypeptide(L)' no 
no</span><br> <span style='font-size:10.0pt;font-family:Courier'>;VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDN</span><br> <span style='font-size:10.0pt;font-family:Courier'>LKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH</span><br> <span style='font-size:10.0pt;font-family:Courier'>;</span><br> <span style='font-size:10.0pt;font-family:Courier'>;VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDN</span><br> <span style='font-size:10.0pt;font-family:Courier'>LKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH</span><br> <span style='font-size:10.0pt;font-family:Courier'>;</span><br> <span style='font-size:10.0pt;font-family:Courier'>B,D ?</span><br> <span style='font-size:10.0pt;font-family:Courier'>#</span><br> </p> </div> <p>The sequence in three-letter code format can be found in the <strong>entity_poly_seq</strong> category. Here again from entry 4HHB.</p> <div class="quoted"> <p style='text-autospace:none'> <span style='font-size:10.0pt;font-family:Courier'>loop_</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly_seq</span>.entity_id</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly_seq</span>.num</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly_seq</span>.mon_id</span><br> <span style='font-size:10.0pt;font-family:Courier'><span style='color:red'>_entity_poly_seq</span>.hetero</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 1 VAL n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 2 LEU n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 3 SER n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 4 PRO n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 5 ALA n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 6 ASP n</span><br> <span 
style='font-size:10.0pt;font-family:Courier'>1 7 LYS n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 8 THR n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 9 ASN n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 10 VAL n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 11 LYS n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 12 ALA n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 13 ALA n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 14 TRP n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 15 GLY n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 16 LYS n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 17 VAL n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 18 GLY n</span><br> <span style='font-size:10.0pt;font-family:Courier'>1 19 ALA n</span><br> <span style='font-size:10.0pt;font-family:Courier'>&lt;&lt;truncated for brevity&gt;&gt;</span><br> </p> </div> <p>The <strong>_entity_poly</strong> and <strong>entity_poly_seq</strong> categories provide the correspondence between the one-letter and three-letter formats of the primary sequence. They are equivalent to what is reported in the FASTA sequence and in the “SEQRES” records of the legacy PDB file format.</p> <p>The three-letter residue code found in the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entity_poly_seq.mon_id.html" target="_blank">_entity_poly_seq.mon_id</a> item is a pointer to <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_chem_comp.id.html" target="_blank">_chem_comp.id</a> in the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/chem_comp.html" target="_blank">chem_comp</a> category. 
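</p>
<p>The correspondence between the two formats can be illustrated with a few lines of Python. This minimal sketch converts the first 19 <strong>entity_poly_seq</strong> rows shown above for entity 1 of 4HHB into one-letter code. The THREE_TO_ONE table covers only the 20 standard amino acids, and mapping any other component to X is an assumption made for this example, not a rule taken from the mmCIF dictionary. The function name one_letter_sequence is likewise hypothetical.</p>

```python
# Standard amino acids only; anything else is reported as "X" (an assumption).
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# First 19 _entity_poly_seq rows of entry 4HHB, copied from this guide.
# Columns: entity_id, num, mon_id, hetero.
ENTITY_POLY_SEQ_ROWS = """\
1 1 VAL n
1 2 LEU n
1 3 SER n
1 4 PRO n
1 5 ALA n
1 6 ASP n
1 7 LYS n
1 8 THR n
1 9 ASN n
1 10 VAL n
1 11 LYS n
1 12 ALA n
1 13 ALA n
1 14 TRP n
1 15 GLY n
1 16 LYS n
1 17 VAL n
1 18 GLY n
1 19 ALA n
"""


def one_letter_sequence(rows_text, entity_id):
    """Build the one-letter sequence of one entity from entity_poly_seq rows."""
    letters = []
    for line in rows_text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        row_entity, _num, mon_id, _hetero = fields
        if row_entity == entity_id:
            letters.append(THREE_TO_ONE.get(mon_id, "X"))
    return "".join(letters)


# Prints VLSPADKTNVKAAWGKVGA, the start of the entity 1 one-letter sequence.
print(one_letter_sequence(ENTITY_POLY_SEQ_ROWS, "1"))
```

<p>In an actual file, a mon_id value is not looked up in a hard-coded table like this one; it points to the <strong>_chem_comp.id</strong> row that describes the component.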
This is analogous to <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_atom_site.label_comp_id.html" target="_blank">_atom_site.label_comp_id</a> in the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/atom_site.html" target="_blank">atom_site</a> category, which is also a pointer to <strong>_chem_comp.id</strong> in the <strong>chem_comp</strong> category. </p> <p>Here is an example from entry <a href="https://files.rcsb.org/view/4hhb.cif" target="_blank">4HHB</a>:</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <span>loop_</span><br> <span>_chem_comp.id</span><br> <span>_chem_comp.type</span><br> <span>_chem_comp.mon_nstd_flag</span><br> <span>_chem_comp.name</span><br> <span>_chem_comp.pdbx_synonyms</span><br> <span>_chem_comp.formula</span><br> <span>_chem_comp.formula_weight</span><br> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
ALA 'L-peptide linking' y ALANINE ? 'C3 H7 N O2' 89.093
ARG 'L-peptide linking' y ARGININE ? 'C6 H15 N4 O2 1' 175.209
ASN 'L-peptide linking' y ASPARAGINE ? 'C4 H8 N2 O3' 132.118
ASP 'L-peptide linking' y 'ASPARTIC ACID' ? 'C4 H7 N O4' 133.103
CYS 'L-peptide linking' y CYSTEINE ? 'C3 H7 N O2 S' 121.158
GLN 'L-peptide linking' y GLUTAMINE ? 'C5 H10 N2 O3' 146.144
GLU 'L-peptide linking' y 'GLUTAMIC ACID' ? 'C5 H9 N O4' 147.129
GLY 'peptide linking' y GLYCINE ? 'C2 H5 N O2' 75.067
HEM non-polymer . 'PROTOPORPHYRIN IX CONTAINING FE' HEME 'C34 H32 Fe N4 O4' 616.487
HIS 'L-peptide linking' y HISTIDINE ? 'C6 H10 N3 O2 1' 156.162
HOH non-polymer . WATER ? 'H2 O' 18.015
LEU 'L-peptide linking' y LEUCINE ? 'C6 H13 N O2' 131.173
LYS 'L-peptide linking' y LYSINE ? 'C6 H15 N2 O2 1' 147.195
MET 'L-peptide linking' y METHIONINE ? 'C5 H11 N O2 S' 149.211
PHE 'L-peptide linking' y PHENYLALANINE ? 'C9 H11 N O2' 165.189
PO4 non-polymer . 'PHOSPHATE ION' ? 'O4 P -3' 94.971
PRO 'L-peptide linking' y PROLINE ? 'C5 H9 N O2' 115.130
SER 'L-peptide linking' y SERINE ? 'C3 H7 N O3' 105.093
THR 'L-peptide linking' y THREONINE ? 'C4 H9 N O3' 119.119
TRP 'L-peptide linking' y TRYPTOPHAN ? 'C11 H12 N2 O2' 204.225
TYR 'L-peptide linking' y TYROSINE ? 'C9 H11 N O3' 181.189
VAL 'L-peptide linking' y VALINE ? 'C5 H11 N O2' 117.146
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
</pre> </div> <p>In many cases, the coordinates presented in the <strong>atom_site</strong> records of the mmCIF file will not exactly match the sequence in the <strong>_entity_poly</strong> and <strong>entity_poly_seq</strong> categories. The ends of chains and mobile loops are often not observed in PDB experimental structures, so no coordinates for them are included as <strong>atom_site</strong> records in the file. However, these residues will often be included in the sequence records, since that portion of the chain was present during the experiment. In these cases, information is included in the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/pdbx_unobs_or_zero_occ_residues.html" target="_blank">pdbx_unobs_or_zero_occ_residues</a> category to identify each missing residue. This category is analogous to the information presented in REMARK 465 of legacy PDB format files. </p> <p> You may also notice some differences from the sequences in other databases. For example, a researcher may change or mutate particular residues to see the effect on the overall structure or on a particular portion of it. The <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/struct_ref_seq.html" target="_blank">_struct_ref_seq</a> category (corresponding to the DBREF record in legacy PDB format files) provides cross-reference information between the sequence studied and a corresponding database sequence. 
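</p>
<p>As an illustration of how this cross-reference can be used, the Python sketch below maps a position in the PDB entity sequence to the matching position in the database sequence. It uses the values of the 4HHB chain A row of <strong>_struct_ref_seq</strong> (the row appears in the example that follows) and assumes that the aligned segment contains no insertions or deletions, so that the two numbering schemes differ by a constant offset. The class and function names are hypothetical and were chosen for this example only.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefSeqAlignment:
    """A subset of one _struct_ref_seq row (field names shortened)."""
    strand_id: str        # _struct_ref_seq.pdbx_strand_id
    db_accession: str     # _struct_ref_seq.pdbx_db_accession
    seq_align_beg: int    # _struct_ref_seq.seq_align_beg
    seq_align_end: int    # _struct_ref_seq.seq_align_end
    db_align_beg: int     # _struct_ref_seq.db_align_beg
    db_align_end: int     # _struct_ref_seq.db_align_end


def to_db_position(alignment, seq_position):
    """Map an entity sequence position to the database sequence position.

    Assumes a gap-free aligned segment, so a constant offset applies.
    """
    aligned = range(alignment.seq_align_beg, alignment.seq_align_end + 1)
    if seq_position not in aligned:
        raise ValueError(f"position {seq_position} is outside the aligned segment")
    return alignment.db_align_beg + (seq_position - alignment.seq_align_beg)


# Values taken from the 4HHB chain A row: 1 1 4HHB A 1 ? 141 ? P69905 2 ? 142 ? 1 141
CHAIN_A = RefSeqAlignment(
    strand_id="A",
    db_accession="P69905",
    seq_align_beg=1,
    seq_align_end=141,
    db_align_beg=2,
    db_align_end=142,
)

print(to_db_position(CHAIN_A, 1))    # prints 2
print(to_db_position(CHAIN_A, 141))  # prints 142
```

<p>For chain A, position 1 of the entity sequence corresponds to position 2 of database entry P69905, and position 141 corresponds to position 142.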
</p> <p>Here is an example from entry <a href="https://files.rcsb.org/view/4hhb.cif" target="_blank">4HHB</a>:</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <span>loop_</span><br> <span><span style='color:red'>_struct_ref_seq</span>.align_id</span><br> <span><span style='color:red'>_struct_ref_seq</span>.ref_id</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_PDB_id_code</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_strand_id</span><br> <span><span style='color:red'>_struct_ref_seq</span>.seq_align_beg</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_seq_align_beg_ins_code</span><br> <span><span style='color:red'>_struct_ref_seq</span>.seq_align_end</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_seq_align_end_ins_code</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_db_accession</span><br> <span><span style='color:red'>_struct_ref_seq</span>.db_align_beg</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_db_align_beg_ins_code</span><br> <span><span style='color:red'>_struct_ref_seq</span>.db_align_end</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_db_align_end_ins_code</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_auth_seq_align_beg</span><br> <span><span style='color:red'>_struct_ref_seq</span>.pdbx_auth_seq_align_end</span><br> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
1 1 4HHB A 1 ? 141 ? P69905 2 ? 142 ? 1 141
2 2 4HHB B 1 ? 146 ? P68871 2 ? 147 ? 1 146
3 1 4HHB C 1 ? 141 ? P69905 2 ? 142 ? 1 141
4 2 4HHB D 1 ? 146 ? P68871 2 ? 147 ? 1 146
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
</pre> </div> <p>The <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Categories/struct_ref_seq_dif.html" target="_blank">_struct_ref_seq_dif</a> category (corresponding to the SEQADV record in legacy PDB format files) identifies differences between sequence information in the sequence records (<strong>_entity_poly</strong> and <strong>entity_poly_seq</strong> categories) of the PDB entry and the sequence database entry given in <strong>_struct_ref_seq</strong>.</p> <p>Here is an example from entry <a href="https://www.rcsb.org/structure/8JK1" target="_blank">8JK1</a>:</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <span>loop_</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.align_id</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_pdb_id_code</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.mon_id</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_pdb_strand_id</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.seq_num</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_pdb_ins_code</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_seq_db_name</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_seq_db_accession_code</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.db_mon_id</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_seq_db_seq_num</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.details</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_auth_seq_num</span><br> <span><span style='color:red'>_struct_ref_seq_dif</span>.pdbx_ordinal</span><br> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
1 8JK1 GLN A 1 ? UNP A0A024R9I8 ? ? 'expression tag' -1 1
1 8JK1 ALA A 2 ? UNP A0A024R9I8 ? ? 'expression tag' 0 2
1 8JK1 SER A 30 ? UNP A0A024R9I8 CYS 28 'engineered mutation' 28 3
1 8JK1 SER A 95 ? UNP A0A024R9I8 ASN 93 variant 93 4
1 8JK1 ILE A 118 ? UNP A0A024R9I8 PHE 116 variant 116 5
1 8JK1 CYS A 136 ? UNP A0A024R9I8 ARG 134 variant 134 6
2 8JK1 GLN B 1 ? UNP A0A024R9I8 ? ? 'expression tag' -1 7
2 8JK1 ALA B 2 ? UNP A0A024R9I8 ? ? 'expression tag' 0 8
2 8JK1 SER B 30 ? UNP A0A024R9I8 CYS 28 'engineered mutation' 28 9
2 8JK1 SER B 95 ? UNP A0A024R9I8 ASN 93 variant 93 10
2 8JK1 ILE B 118 ? UNP A0A024R9I8 PHE 116 variant 116 11
2 8JK1 CYS B 136 ? UNP A0A024R9I8 ARG 134 variant 134 12
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
</pre> </div> <p> Structural biologists often work with fragments of macromolecules, which are more amenable to study than the full macromolecule. Thus, the <strong>_entity_poly</strong>, <strong>entity_poly_seq</strong>, and <strong>atom_site</strong> records may include only a portion of the molecule, not the whole protein. The numbering of residues can also provide an additional complication. In some cases, researchers number <strong>atom_site</strong> records based on the numbering of the whole protein, while in other cases, they number the chain based on the fragment. Any number (negative, 0, positive) can be used. 
The numbering in the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entity_poly_seq.num.html" target="_blank">_entity_poly_seq.num</a> and <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_atom_site.label_seq_id.html" target="_blank">_atom_site.label_seq_id</a> items is always sequential, beginning with “1”, while the <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_atom_site.auth_seq_id.html" target="_blank">_atom_site.auth_seq_id</a> item provides the author’s residue numbering, with any insertion codes given in <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_atom_site.pdbx_PDB_ins_code.html" target="_blank">_atom_site.pdbx_PDB_ins_code</a>. </p> <p> The example below, from entry <a href="https://www.rcsb.org/structure/5JZY" target="_blank">5JZY</a>, shows author residue numbering in the coordinates starting at 0 (in <span style='color:blue'>blue</span>) and sequential numbering (in <span style='color:#FF69B4'>pink</span>) starting with 5. Residues 1-4 in sequential numbering (corresponding to residues -4, -3, -2, and -1 in author numbering) were not experimentally observed. An insertion (in <span style='color:red'>red</span>) begins at sequential residue number 23 (corresponding to author numbering 14): sequential residues 23, 24, and 25 all carry author number 14 and are distinguished by insertion codes A, B, and C. 
</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <span>loop_</span><br> <span>_atom_site.group_PDB</span><br> <span>_atom_site.id</span><br> <span>_atom_site.type_symbol</span><br> <span>_atom_site.label_atom_id</span><br> <span>_atom_site.label_alt_id</span><br> <span>_atom_site.label_comp_id</span><br> <span>_atom_site.label_asym_id</span><br> <span>_atom_site.label_entity_id</span><br> <span><span style='color:#FF69B4'>_atom_site.label_seq_id</span></span><br> <span><span style='color:red'>_atom_site.pdbx_PDB_ins_code</span></span><br> <span>_atom_site.Cartn_x</span><br> <span>_atom_site.Cartn_y</span><br> <span>_atom_site.Cartn_z</span><br> <span>_atom_site.occupancy</span><br> <span>_atom_site.B_iso_or_equiv</span><br> <span>_atom_site.pdbx_formal_charge</span><br> <span><span style='color:blue'>_atom_site.auth_seq_id</span></span><br> <span>_atom_site.auth_comp_id</span><br> <span>_atom_site.auth_asym_id</span><br> <span>_atom_site.auth_atom_id</span><br> <span>_atom_site.pdbx_PDB_model_num</span><br> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
ATOM 1 N N . GLY A 1 <span style="color:#FF69B4">5</span> <span style="color:red">?</span> 2.001 16.815 86.112 1.00 44.36 ? <span style="color:blue">0</span> GLY L N 1
ATOM 2 C CA . GLY A 1 <span style="color:#FF69B4">5</span> <span style="color:red">?</span> 1.482 18.136 86.547 1.00 45.80 ? <span style="color:blue">0</span> GLY L CA 1
ATOM 3 C C . GLY A 1 <span style="color:#FF69B4">5</span> <span style="color:red">?</span> 1.277 18.094 88.040 1.00 46.03 ? <span style="color:blue">0</span> GLY L C 1
ATOM 4 O O . GLY A 1 <span style="color:#FF69B4">5</span> <span style="color:red">?</span> 0.165 17.865 88.516 1.00 47.41 ? <span style="color:blue">0</span> GLY L O 1
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
ATOM 317 N N . LYS A 1 <span style="color:#FF69B4">23</span> <span style="color:red">A</span> -18.855 6.491 92.634 1.00 13.38 ? <span style="color:blue">14</span> LYS L N 1
ATOM 318 C CA . LYS A 1 <span style="color:#FF69B4">23</span> <span style="color:red">A</span> -19.863 5.707 93.333 1.00 15.01 ? <span style="color:blue">14</span> LYS L CA 1
ATOM 319 C C . LYS A 1 <span style="color:#FF69B4">23</span> <span style="color:red">A</span> -19.372 5.069 94.627 1.00 14.76 ? <span style="color:blue">14</span> LYS L C 1
ATOM 320 O O . LYS A 1 <span style="color:#FF69B4">23</span> <span style="color:red">A</span> -20.210 4.671 95.446 1.00 16.30 ? <span style="color:blue">14</span> LYS L O 1
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
ATOM 330 N N . THR A 1 <span style="color:#FF69B4">24</span> <span style="color:red">B</span> -18.053 4.956 94.853 1.00 12.77 ? <span style="color:blue">14</span> THR L N 1
ATOM 331 C CA . THR A 1 <span style="color:#FF69B4">24</span> <span style="color:red">B</span> -17.592 4.286 96.064 1.00 13.26 ? <span style="color:blue">14</span> THR L CA 1
ATOM 332 C C . THR A 1 <span style="color:#FF69B4">24</span> <span style="color:red">B</span> -16.544 5.065 96.851 1.00 12.53 ? <span style="color:blue">14</span> THR L C 1
ATOM 333 O O . THR A 1 <span style="color:#FF69B4">24</span> <span style="color:red">B</span> -16.062 4.553 97.860 1.00 12.69 ? <span style="color:blue">14</span> THR L O 1
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
ATOM 344 N N . GLU A 1 <span style="color:#FF69B4">25</span> <span style="color:red">C</span> -16.216 6.305 96.471 1.00 11.57 ? <span style="color:blue">14</span> GLU L N 1
ATOM 345 C CA . GLU A 1 <span style="color:#FF69B4">25</span> <span style="color:red">C</span> -15.205 7.037 97.226 1.00 10.78 ? <span style="color:blue">14</span> GLU L CA 1
ATOM 346 C C . GLU A 1 <span style="color:#FF69B4">25</span> <span style="color:red">C</span> -15.638 7.307 98.654 1.00 10.66 ? <span style="color:blue">14</span> GLU L C 1
ATOM 347 O O . GLU A 1 <span style="color:#FF69B4">25</span> <span style="color:red">C</span> -14.787 7.406 99.548 1.00 11.35 ? <span style="color:blue">14</span> GLU L O 1
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
</pre> </div> <p> For more information, see the <a href="/learn/guide-to-understanding-pdb-data/missing-coordinates">“Missing Loops and Tails” and “Fragments and Domains”</a> sections (in this Guide) and the “Macromolecules” &gt;&gt; “Sample Sequence” section of the <a href="https://mmcif.wwpdb.org/docs/user-guide/guide.html#" target="_blank">PDBx/mmCIF User Guide</a>. 
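</p> <p>The relationship between the two numbering schemes can be sketched in a few lines of Python. This is a minimal illustration rather than an official tool: the rows are copied from the 5JZY excerpt above (one atom per residue), the helper name <strong>numbering_map</strong> is hypothetical, and the simple whitespace split assumes that no value is quoted, so a dedicated mmCIF parser is the safer choice for complete files.</p>

```python
# Hypothetical helper, not part of any PDB toolkit: map sequential numbering
# (label_seq_id) to author numbering (auth_seq_id plus insertion code).

# Column order copied from the _atom_site loop in the 5JZY excerpt.
COLUMNS = [
    "group_PDB", "id", "type_symbol", "label_atom_id", "label_alt_id",
    "label_comp_id", "label_asym_id", "label_entity_id", "label_seq_id",
    "pdbx_PDB_ins_code", "Cartn_x", "Cartn_y", "Cartn_z", "occupancy",
    "B_iso_or_equiv", "pdbx_formal_charge", "auth_seq_id", "auth_comp_id",
    "auth_asym_id", "auth_atom_id", "pdbx_PDB_model_num",
]

# One backbone nitrogen per residue, taken from the same excerpt.
ROWS = """\
ATOM 1 N N . GLY A 1 5 ? 2.001 16.815 86.112 1.00 44.36 ? 0 GLY L N 1
ATOM 317 N N . LYS A 1 23 A -18.855 6.491 92.634 1.00 13.38 ? 14 LYS L N 1
ATOM 330 N N . THR A 1 24 B -18.053 4.956 94.853 1.00 12.77 ? 14 THR L N 1
ATOM 344 N N . GLU A 1 25 C -16.216 6.305 96.471 1.00 11.57 ? 14 GLU L N 1
"""


def numbering_map(text):
    """Return {label_seq_id: author label} for whitespace-separated rows."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        record = dict(zip(COLUMNS, line.split()))
        ins_code = record["pdbx_PDB_ins_code"]
        # "?" and "." stand for a missing value in mmCIF, so drop them.
        suffix = "" if ins_code in ("?", ".") else ins_code
        mapping[int(record["label_seq_id"])] = record["auth_seq_id"] + suffix
    return mapping


print(numbering_map(ROWS))
```

<p>For these four rows, the sketch maps sequential residue 5 to author residue 0, and sequential residues 23, 24, and 25 to author residues 14A, 14B, and 14C.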
</p> <h4>Amino Acid and Nucleotide Nomenclature</h4> <p> In the sequence records (SEQRES in legacy PDB format files), standard amino acids are specified by the standard 3-character code, and standard nucleotides are specified by 1 or 2 characters:</p> <table class="data">
<tr> <td><p>Standard (L-) Amino Acids</p></td> <td><p>ALA, CYS, ASP, GLU, PHE, GLY, HIS, ILE, LYS, LEU, MET, ASN, PRO, GLN, ARG, SER, THR, VAL, TRP, TYR, PYL (pyrrolysine)*, SEC (selenocysteine)*</p></td> </tr>
<tr> <td><p>D-Amino Acids (present in the PDB Archive)</p></td> <td><p>DAL ('ALA'), DSN ('SER'), DCY ('CYS'), DPR ('PRO'), DVA ('VAL'), DTH ('THR'), DLE ('LEU'), DIL ('ILE'), DSG ('ASN'), DAS ('ASP'), MED ('MET'), DGN ('GLN'), DGL ('GLU'), DLY ('LYS'), DHI ('HIS'), DPN ('PHE'), DAR ('ARG'), DTY ('TYR'), DTR ('TRP')</p></td> </tr>
<tr> <td><p>Deoxyribonucleotides</p></td> <td><p>DA, DC, DG, DT, DI</p></td> </tr>
<tr> <td><p>Ribonucleotides</p></td> <td><p>A, C, G, U, I</p></td> </tr>
</table> <p><sup>*</sup> SEC and PYL are treated as standard amino acids, as <a href="http://www.wwpdb.org/news/news?year=2014#5764490799cccf749a90cddf" target="_blank">announced</a> by the wwPDB. </p> <p class=FreeForm>Other codes are used for modified amino acids (such as <a href="https://www.rcsb.org/ligand/MSE" target="_blank">MSE</a> for selenomethionine) and for modified nucleotides (such as <a href="https://www.rcsb.org/ligand/CBR" target="_blank">CBR</a> for bromocytosine). 
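</p> <p>As a rough illustration of how the table above might be used when reading sequence records, the Python sketch below sorts a residue code into one of the four listed groups. The code lists are copied from the table; the function name <strong>classify_residue</strong> and the returned labels are hypothetical choices made for this example, not wwPDB terminology.</p>

```python
# Code lists copied from the nomenclature table. The function name and the
# returned labels are illustrative only.
L_AMINO_ACIDS = {
    "ALA", "CYS", "ASP", "GLU", "PHE", "GLY", "HIS", "ILE", "LYS", "LEU",
    "MET", "ASN", "PRO", "GLN", "ARG", "SER", "THR", "VAL", "TRP", "TYR",
    "PYL", "SEC",
}
# Each D-amino acid code mapped to the L-amino acid code given in the table.
D_AMINO_ACIDS = {
    "DAL": "ALA", "DSN": "SER", "DCY": "CYS", "DPR": "PRO", "DVA": "VAL",
    "DTH": "THR", "DLE": "LEU", "DIL": "ILE", "DSG": "ASN", "DAS": "ASP",
    "MED": "MET", "DGN": "GLN", "DGL": "GLU", "DLY": "LYS", "DHI": "HIS",
    "DPN": "PHE", "DAR": "ARG", "DTY": "TYR", "DTR": "TRP",
}
DEOXYRIBONUCLEOTIDES = {"DA", "DC", "DG", "DT", "DI"}
RIBONUCLEOTIDES = {"A", "C", "G", "U", "I"}


def classify_residue(code):
    """Return a short description of a residue code from a sequence record."""
    code = code.strip().upper()
    if code in L_AMINO_ACIDS:
        return "standard L-amino acid"
    if code in D_AMINO_ACIDS:
        return "D-amino acid corresponding to " + D_AMINO_ACIDS[code]
    if code in DEOXYRIBONUCLEOTIDES:
        return "deoxyribonucleotide"
    if code in RIBONUCLEOTIDES:
        return "ribonucleotide"
    # Anything else (for example MSE or CBR) is not in the table and needs
    # a lookup in the chem_comp category.
    return "other"


for example in ("ALA", "DAL", "DT", "U", "MSE"):
    print(example, "-", classify_residue(example))
```

<p>Codes that fall into the “other” group, such as MSE or CBR, are the modified residues and ligands described next.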
</p> <p class=FreeForm>Additional records define modified residues as they appear in the coordinate records: the <strong>_pdbx_struct_mod_residue</strong> category (corresponding to the MODRES record in legacy PDB format files).</p> <p>As an example, here are the records that describe HYP (hydroxyproline, a modified version of PRO, or proline) in the coordinate records for collagen entry <a href="https://www.rcsb.org/structure/1CAG#smallMoleculespanel">1CAG</a>:</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <span>loop_</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.label_asym_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.label_comp_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.label_seq_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.auth_asym_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.auth_comp_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.auth_seq_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.PDB_ins_code</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.parent_comp_id</span><br> <span><span style='color:red'>_pdbx_struct_mod_residue</span>.details</span><br> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
1 A HYP 2 A HYP 2 ? PRO 4-HYDROXYPROLINE
2 A HYP 5 A HYP 5 ? PRO 4-HYDROXYPROLINE
3 A HYP 8 A HYP 8 ? PRO 4-HYDROXYPROLINE
4 A HYP 11 A HYP 11 ? PRO 4-HYDROXYPROLINE
5 A HYP 14 A HYP 14 ? PRO 4-HYDROXYPROLINE
6 A HYP 17 A HYP 17 ? PRO 4-HYDROXYPROLINE
7 A HYP 20 A HYP 20 ? PRO 4-HYDROXYPROLINE
8 A HYP 23 A HYP 23 ? PRO 4-HYDROXYPROLINE
9 A HYP 26 A HYP 26 ? PRO 4-HYDROXYPROLINE
10 A HYP 29 A HYP 29 ? PRO 4-HYDROXYPROLINE
11 B HYP 2 B HYP 32 ? PRO 4-HYDROXYPROLINE
12 B HYP 5 B HYP 35 ? PRO 4-HYDROXYPROLINE
<span>&lt;&lt;truncated for brevity&gt;&gt;</span>
</pre> </div> <p>Additional specifics about the nature of the modified residue can be found in the <strong>_chem_comp</strong> category:</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <span>loop_</span><br> <span><span style='color:red'>_chem_comp</span>.id</span><br> <span><span style='color:red'>_chem_comp</span>.type</span><br> <span><span style='color:red'>_chem_comp</span>.mon_nstd_flag</span><br> <span><span style='color:red'>_chem_comp</span>.name</span><br> <span><span style='color:red'>_chem_comp</span>.pdbx_synonyms</span><br> <span><span style='color:red'>_chem_comp</span>.formula</span><br> <span><span style='color:red'>_chem_comp</span>.formula_weight</span><br> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
ACY non-polymer . 'ACETIC ACID' ? 'C2 H4 O2' 60.052
ALA 'L-peptide linking' y ALANINE ? 'C3 H7 N O2' 89.093
GLY 'peptide linking' y GLYCINE ? 'C2 H5 N O2' 75.067
HOH non-polymer . WATER ? 'H2 O' 18.015
<span style="color:red">HYP 'L-peptide linking' n 4-HYDROXYPROLINE HYDROXYPROLINE 'C5 H9 N O3' 131.130</span>
PRO 'L-peptide linking' y PROLINE ? 'C5 H9 N O2' 115.130
</pre> </div> <p>A modified residue is represented in <a href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entity_poly.pdbx_seq_one_letter_code.html" target="_blank">_entity_poly.pdbx_seq_one_letter_code</a> by its three-letter code in parentheses. 
Again from entry <a href="https://www.rcsb.org/structure/1CAG#smallMoleculespanel" target="_blank">1CAG</a>:</p> <div class="quoted" style="border: 1px solid #ccc; padding: 10px; font-family: Courier; font-size: 10.0pt;"> <pre style="margin: 0; padding: 0; border: none; font-family: inherit; font-size: inherit;">
_entity_poly.entity_id 1
_entity_poly.type 'polypeptide(L)'
_entity_poly.nstd_linkage no
_entity_poly.nstd_monomer yes
<span style='color:red'>_entity_poly</span>.pdbx_seq_one_letter_code 'P<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>AP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>GP<span style='color:red'>(HYP)</span>G'
_entity_poly.pdbx_seq_one_letter_code_can PPGPPGPPGPPGPPAPPGPPGPPGPPGPPG
_entity_poly.pdbx_strand_id A,B,C
_entity_poly.pdbx_target_identifier ?
</pre> </div> </div> </div> </div> <div id="footer_main" class="hidden-print"> <div class="container"> <div class="row"> <div class="col-sm-12 col-md-7"> <p><strong>About PDB-101</strong></p> <p>Researchers around the globe make 3D structures freely available from the Protein Data Bank (PDB) archive. PDB-101 training materials help graduate students, postdoctoral scholars, and researchers use PDB data and RCSB PDB tools. Outreach content demonstrates how PDB data impact fundamental biology, biomedicine, bioengineering/biotechnology, and energy sciences in 3D for a diverse and multidisciplinary user community. 
Education Materials provide lessons and activities for teaching and learning.</p> <p>PDB-101 is developed by the <a href="https://rcsb.org" target="_blank">RCSB PDB</a>.</p> </div> <div class="col-sm-12 col-md-5 text-center"> <p>RCSB PDB (<a href="http://nar.oxfordjournals.org/content/28/1/235.abstract" target="_blank">citation</a>) is hosted by</p><img src="//cdn.rcsb.org/pdb101/common/images/Logo_Rutgers-2024.png" alt="Rutgers University logo" usemap="#rutgers-map" height="32" style="padding:0px 10px; border-right: 1px solid #000000;"><img src="//cdn.rcsb.org/pdb101/common/images/Logo_UCSD-SDSC.png" alt="University of California San Diego/San Diego Supercomputer Center logos" usemap="#UCSD-SDSC-map" height="26" style="padding: 0px 10px; border-right: 1px solid #000000;"><img src="//cdn.rcsb.org/pdb101/common/images/Logo_UCSF.png" alt="University of California San Francisco logo" usemap="#UCSF-map" height="26" style="padding-left: 10px;"><map name="rutgers-map"> <area shape="rect" coords="0,0,115,26" href="http://www.rutgers.edu/" alt="Rutgers" target="_blank"></map><map name="UCSD-SDSC-map"> <area shape="rect" coords="0,0,150,26" href="http://ucsd.edu/" alt="UCSD" target="_blank"> <area shape="rect" coords="151,0,255,26" href="http://www.sdsc.edu/" alt="SDSC" target="_blank"></map><map name="UCSF-map"> <area shape="rect" coords="0,0,64,26" href="http://www.ucsf.edu/" alt="UCSF" target="_blank"></map><br><br> <p>RCSB PDB is a member of <span id='pdbmembers_footer'><a href='http://www.wwpdb.org/' target='_blank'><img src='//cdn.rcsb.org/pdb101/common/images/Logo_wwpdb.png' width='70'></a><span class='pipe'></span><a href='http://www.emdatabank.org/' target='_blank'><img src='https://cdn.rcsb.org/pdb101/common/images/EMDataResourcelogo.png' width='100'></a></span></p> </div> </div> </div> </div> <div id="footer_grant" class="hidden-print"> <div class="container"> <div class="row"> <p>RCSB PDB Core Operations are funded by the <a href="http://www.nsf.gov/" target="_blank">U.S. 
National Science Foundation</a> (DBI-2321666), the <a href="http://science.energy.gov/" target="_blank">US Department of Energy</a> (DE-SC0019749), and the <a href="http://www.cancer.gov/" target="_blank">National Cancer Institute</a>, <a href="http://www.niaid.nih.gov/" target="_blank">National Institute of Allergy and Infectious Diseases</a>, and <a href="http://www.nigms.nih.gov/" target="_blank">National Institute of General Medical Sciences</a> of the <a href="http://www.nih.gov/" target="_blank">National Institutes of Health</a> under grant R01GM157729.</p> </div> </div> </div> <div id="jira-fdbck"></div> </div> </body> </html>