<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"><html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta http-equiv="Content-Language" content="en-us"> <title>Unicode Character Database</title> <link rel="stylesheet" type="text/css" href="http://www.unicode.org/webscripts/standard_styles.css"> </head> <body> <table width="100%" cellpadding="0" cellspacing="0" border="0"> <!-- BEGIN HEADER BAR --> <tr> <td colspan="2"> <table width="100%" border="0" cellpadding="0" cellspacing="0"> <tr> <td class="icon" style="width:38px; height:35px"> <a href="https://www.unicode.org/"> <img border="0" src="https://www.unicode.org/webscripts/logo60s2.gif" align="middle" alt="[Unicode]" width="34" height="33"></a> </td> <td class="icon" style="vertical-align:middle"> <a class="bar"> </a> <a class="bar" href="https://www.unicode.org/ucd"><font size="3">Unicode Character Database</font></a> </td> <td class="bar"> <a href="https://www.unicode.org/main.html" class="bar">Tech Site</a> | <a href="https://www.unicode.org/sitemap/" class="bar">Site Map</a> | <a href="https://www.unicode.org/search" class="bar">Search </a> </td> </tr> </table> </td> </tr> <tr> <td colspan="2" class="gray"> </td> </tr> <!-- END HEADER BAR --> <tr> <!-- BEGIN CONTENTS --> <td class="contents" valign="top"> <blockquote> <h1>About the Unicode Character Database</h1> <p>The Unicode Character Database (UCD) consists of a number of data files listing Unicode character properties and related data. It also includes data files containing test data for conformance to several important Unicode algorithms. Full documentation for the UCD can be found in <a href="https://www.unicode.org/reports/tr44/">Unicode Standard Annex #44, Unicode Character Database</a>. 
</p> <h2><a name="Latest">Latest Version of the Unicode Character Database</a></h2> <p>All files for the <b>most up-to-date</b> version of the Unicode Character Database can be found at: <a href="https://www.unicode.org/Public/UCD/latest/">https://www.unicode.org/Public/UCD/latest/</a>. </p> <p>Files in the UCD/latest/ subdirectories are <i>unversioned:</i> they do not contain any version indicator in their file names. However, most of the data files contain a file header in a standard format, which indicates the Unicode version and the date of last revision of that file.</p> <p>The latest version of the Unicode Standard, which corresponds to the latest version of the UCD, can be found at: <a href="https://www.unicode.org/versions/latest/">https://www.unicode.org/versions/latest/</a>.</p> <h2><a name="UCDVersions">Specific Versions of the UCD</a></h2> <p>Each specific version of the UCD is available for archival access in a versioned directory. For example, the UCD for Unicode 14.0 is available at:<br> <a href="https://www.unicode.org/Public/14.0.0/">https://www.unicode.org/Public/14.0.0/</a> </p> <p>The UCD for Unicode 13.0 is available at:<br> <a href="https://www.unicode.org/Public/13.0.0/">https://www.unicode.org/Public/13.0.0/</a>, and so on for each earlier version of the standard. </p> <p>For versions of the UCD earlier than Version 4.1, the structure of the archival directories differed somewhat. For full details, see <a href="https://www.unicode.org/reports/tr44/#Directory_Structure">Unicode Standard Annex #44, Unicode Character Database</a>.</p> <p>A comprehensive list of the exact data files that make up a given version of the UCD can be found in the component lists at <a href="https://www.unicode.org/versions/enumeratedversions.html">Enumerated Versions of the Unicode Standard</a>.</p> <h2><a name="UCDinXML">The UCD in XML</a></h2> <p>The contents of each version of the UCD are also available in XML format. 
The XML files are in zipped format and are stored in a subdirectory for each version. For example, the XML version of UCD Version 14.0 can be found at:<br> <a href="https://www.unicode.org/Public/14.0.0/ucdxml/">https://www.unicode.org/Public/14.0.0/ucdxml/</a></p> <p>Full documentation about the XML versions of the UCD can be found in <a href="https://www.unicode.org/reports/tr42/">Unicode Standard Annex #42, Unicode Character Database in XML</a>.</p> <h2><a name="Beta">BETA Versions</a></h2> <p>During periods when a preliminary (beta) version of the standard is being released for public comment, Public Beta files are available. For more information about any ongoing public betas, see the <a href="https://www.unicode.org/versions/beta.html">BETA notice</a> as well as <a href="https://www.unicode.org/review/">Public Review Issues</a>.</p> <h2><a name="FTP">FTP Access</a></h2> <p>All files and directories in the Unicode Character Database are accessible via both HTTPS and FTP. For FTP access, use an FTP client with anonymous access.</p> <p>For example, to access the contents of <a href="https://www.unicode.org/Public/UCD/latest/">https://www.unicode.org/Public/UCD/latest/</a> by FTP, point an FTP client to www.unicode.org as the host and /Public/UCD/latest as the path.</p> </blockquote> </td> </tr> <tr> <td align="center"> <hr width="50%"> <div align="center"> <center> <table cellspacing="0" cellpadding="0" border="0"> <tr> <td><a href="https://www.unicode.org/copyright.html"> <img src="https://www.unicode.org/img/hb_notice.gif" border="0" alt="Access to Copyright and terms of use" width="216" height="50"></a></td> </tr> </table> </center> </div> </td> </tr> </table> </body> </html>