<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>Unicode</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <meta name="keywords" content="unicode"> <link rel="stylesheet" href="/cms/assets/misc/css/default.css" type="text/css"> <link rel="stylesheet" href="/cms/sites/nrsi/themes/default/_css/default.css" type="text/css"> <style type="text/css"> <!-- A.GlobalNavLink, A.GlobalNavLink:visited { color: #FFFF00; font-size: smaller; font-weight: bold; } --> </style> <!-- 2023-05-25 PKM Added for Google Analytics 4 --> <!-- Google tag (gtag.js) --> <script async src="https://www.googletagmanager.com/gtag/js?id=G-FVXRGR2Q9V"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-FVXRGR2Q9V'); </script> <title>Unicode</title> </head> <body style="padding:0; margin:0"> <style> .archive_notice { /* box-shadow: black 0pt 4pt 20px -8px inset; */ display: block; background-color: orange; font-size: 12pt; font-style: normal; font-weight: lighter; line-height: 100%; padding: 5pt; text-align: center; width: auto; } form { display: none } .webform::before { content: "Forms are disabled on this static version of the site."; display: block; width: fit-content; } </style> <div class="archive_notice"> This is an archive of the original scripts.sil.org site, preserved as a historical reference. Some of the content is outdated. 
Please consult our other sites for more current information: <a href="https://software.sil.org">software.sil.org</a>, <a href="https://scriptsource.org">ScriptSource</a>, <a href="https://silnrsi.github.io/FDBP/">FDBP</a>, and <a href="https://silnrsi.github.io/silfontdev/">silfontdev</a> </div> <table width="100%" height="100%" border="0" cellspacing="0" cellpadding="0"> <tr> <td style="background: #0068a6; padding-left:20; padding-top:10; white-space:nowrap;" width="110" valign="top"> <p><a href="http://www.sil.org/"> <!-- <img src="/cms/sites/nrsi/themes/default/_media/SIL_logo_left_column.gif" width="86" height="80" border="0"> --> <img src="/cms/sites/nrsi/themes/default/_media/SIL_Logo_TM_Blue_2014.png" width="85" height="95" border="0" alt=""> </a><br><br></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dhome%26site_id%3Dnrsi.html">Home</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dcontactus%26site_id%3Dnrsi.html">Contact Us</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dgeneral%26site_id%3Dnrsi.html">General</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dbabel%26site_id%3Dnrsi.html">Initiative B@bel</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dwsi_guidelines%26site_id%3Dnrsi.html">WSI Guidelines</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dencoding%26site_id%3Dnrsi.html">Encoding</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dencodingprinciples%26site_id%3Dnrsi.html">Principles</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dunicode%26site_id%3Dnrsi.html">Unicode</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dunicodetraining%26site_id%3Dnrsi.html">Training</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dunicodetutorials%26site_id%3Dnrsi.html">Tutorials</a></p> <p class="Cat3"><a 
class="Cat3" href="/cms/scripts/page.php%3Fid%3Dunicodepua%26site_id%3Dnrsi.html">PUA</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dconversion%26site_id%3Dnrsi.html">Conversion</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dencconvres%26site_id%3Dnrsi.html">Resources</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dconversionutilities%26site_id%3Dnrsi.html">Utilities</a></p> <p class="Cat4"><a class="Cat4" href="/cms/scripts/page.php%3Fid%3Dteckit%26site_id%3Dnrsi.html">TECkit</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dconversionmaps%26site_id%3Dnrsi.html">Maps</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dencodingresources%26site_id%3Dnrsi.html">Resources</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dinput%26site_id%3Dnrsi.html">Input</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dinputprinciples%26site_id%3Dnrsi.html">Principles</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dinpututilities%26site_id%3Dnrsi.html">Utilities</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dinputtutorials%26site_id%3Dnrsi.html">Tutorials</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dinputresources%26site_id%3Dnrsi.html">Resources</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dtypedesign%26site_id%3Dnrsi.html">Type Design</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dtypedesignprinciples%26site_id%3Dnrsi.html">Principles</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dfontdesigntools%26site_id%3Dnrsi.html">Design Tools</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Dfontformats%26site_id%3Dnrsi.html">Formats</a></p> <p class="Cat2"><a class="Cat2" 
href="/cms/scripts/page.php%3Fid%3Dtypedesignresources%26site_id%3Dnrsi.html">Resources</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dfontdownloads%26site_id%3Dnrsi.html">Font Downloads</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dfontdownloadsgentium%26site_id%3Dnrsi.html">Gentium</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dfontdownloadsdoulos%26site_id%3Dnrsi.html">Doulos</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dfontdownloadsipa%26site_id%3Dnrsi.html">IPA</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Drendering%26site_id%3Dnrsi.html">Rendering</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Drenderingprinciples%26site_id%3Dnrsi.html">Principles</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Drenderingtechnologies%26site_id%3Dnrsi.html">Technologies</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Drenderingopentype%26site_id%3Dnrsi.html">OpenType</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Drenderinggraphite%26site_id%3Dnrsi.html">Graphite</a></p> <p class="Cat2"><a class="Cat2" href="/cms/scripts/page.php%3Fid%3Drenderingresources%26site_id%3Dnrsi.html">Resources</a></p> <p class="Cat3"><a class="Cat3" href="/cms/scripts/page.php%3Fid%3Dfontfaq%26site_id%3Dnrsi.html">Font FAQ</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dlinks%26site_id%3Dnrsi.html">Links</a></p> <p class="Cat1"><a class="Cat1" href="/cms/scripts/page.php%3Fid%3Dglossary%26site_id%3Dnrsi.html">Glossary</a></p> <br> </td> <td valign="top" style="padding:0" xwidth="650"> <div style="background: #6699CC url(/cms/sites/nrsi/themes/default/_media/home_banner_gradient.gif) no-repeat right; padding:0 0 0 25; height:36px; margin:0; color:#FFFFFF;"> <p style="font-family:Times New Roman; font-size:25px; color:#FFFFFF; 
padding:10 0 0 0; margin:0 0 0 0">Computers & Writing Systems</p> </div> <div style="padding:0 0 0 0; background-color:#000000; color:#FFFFFF"> <table width='100%'> <tr> <td style="padding: 0 0 0 25px"><a class="GlobalNavLink" href="http://www.sil.org/">SIL HOME</a> | <a class="GlobalNavLink" href="https://software.sil.org/products/">SIL SOFTWARE</a> | <a class="GlobalNavLink" href="/support.html">SUPPORT</a> | <a class="GlobalNavLink" href="https://www.givedirect.org/donate/?cid=13536">DONATE</a> | <a class="GlobalNavLink" href="/privacy-policy.html">PRIVACY POLICY</a> </td> <td align='right' width='20%'> <script async src="https://cse.google.com/cse.js?cx=0760bf09a6bff4b0c"></script><style>.gsc-control-cse {padding: 0.6em; min-width: 10em; width: 18em; max-width: 20em} form.gsc-search-box {display: unset;}</style><div class="gcse-search"></div> </td> </tr> </table> </div> <div style="padding:0 25 25 25"> <p class='CategoryPath'>You are here: <a class='CategoryPath' href='/cms/scripts/page.php%3Fid%3Dencoding%26site_id%3Dnrsi.html'>Encoding</a> &gt; <a class='CategoryPath' href='/cms/scripts/page.php%3Fid%3Dunicode%26site_id%3Dnrsi.html'>Unicode</a><br> Short URL: <a href='/catunicode.html'>https://scripts.sil.org/CatUnicode</a></p> <!-- --> <!-- <div class='Warning' > <p class='Warning_heading' > Site unavailability </p> <p> Due to essential repairs, this website may be unavailable at times during September 6 (Tue) and 7 (Wed). We apologize for the inconvenience. </p> </div> --> <h1>Unicode </h1> <p></p><p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dorthographydev%26site_id%3Dnrsi.html'>Orthography development in relation to Unicode</a></span> <span class='author_date_hits'>Lorna A. Priest, 2004-11-18</span> <br>It is out of our scope to give complete guidelines for developing an orthography. However, we would like to give you a process to work through from a Unicode perspective. 
<br>In designing a writing system, one must decide what symbols will be used and how. Here we list Unicode factors that should be taken into account.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Diws-chapter04b%26site_id%3Dnrsi.html'>Understanding Unicode™ - II</a></span> <span class='author_date_hits'>Peter Constable, 2001-06-13</span> <br>Unicode is a hot topic these days among computer users that work with multilingual text. They know it is important, and they hear it will solve problems, especially for dealing with text involving multiple scripts. They may not know where to go to learn about it, though. Or they may have read a few things about it and perhaps have seen some code charts, but they are at a point at which they need to gain a firmer understanding so that they can start to develop implementations or create content. This introduction is intended to give such people the basic grounding that they need.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Diws-chapter04a%26site_id%3Dnrsi.html'>Understanding Unicode™ - I</a></span> <span class='author_date_hits'>Peter Constable, 2001-06-13</span> <br>Unicode is a hot topic these days among computer users that work with multilingual text. They know it is important, and they hear it will solve problems, especially for dealing with text involving multiple scripts. They may not know where to go to learn about it, though. Or they may have read a few things about it and perhaps have seen some code charts, but they are at a point at which they need to gain a firmer understanding so that they can start to develop implementations or create content. This introduction is intended to give such people the basic grounding that they need.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dunicodesupport%26site_id%3Dnrsi.html'>Software requirements for different levels of Unicode Support</a></span> <span class='author_date_hits'>Lorna A. 
Priest, 2009-11-30</span> <br>This page provides information on levels of Unicode support provided by different software applications.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dreversednun_inthebhs%26site_id%3Dnrsi.html'>Reversed Nun in the BHS</a></span> <span class='author_date_hits'>Joan Wardell, Peter Constable and Christopher Samuel, 2003-11-05</span> <br>This short discussion of reversed nun explains how it is used in the Ezra SIL fonts.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dpunctum_inthebhs%26site_id%3Dnrsi.html'>Puncta in the BHS</a></span> <span class='author_date_hits'>Joan Wardell and Christopher Samuel, 2003-09-30</span> <br>This short discussion on Puncta dots in biblical Hebrew and Unicode explains how they are used in the Ezra SIL fonts.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dmeteg_inthebhs%26site_id%3Dnrsi.html'>Meteg and Siluq in the BHS</a></span> <span class='author_date_hits'>Joan Wardell and Christopher Samuel, 2003-09-30</span> <br>This short discussion on Meteg in biblical Hebrew explains how to encode various placements with a single codepoint.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dpsgsymbolsvstus4%26site_id%3Dnrsi.html'>Symbols in Phonetic Symbol Guide 2nd edn. in relation to Unicode 5.1</a></span> <span class='author_date_hits'>Peter Constable, 2009-01-21</span> <br>Assesses the symbols listed in the Phonetic Symbol Guide, by Pullum and Ladusaw, giving mappings through Unicode 5.1, and comments on those symbols now supported in Unicode 5.1.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dencodingfaq%26site_id%3Dnrsi.html'>How do I encode...?</a></span> <span class='author_date_hits'>Lorna A. Priest, 2009-03-31</span> <br>These are questions (and the answers!) people have asked about what the encoding is for various characters in the Unicode standard. 
This page is also a helpful teaching tool for understanding Unicode.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dunicode40_sorted%26site_id%3Dnrsi.html'>Unicode 8.0 Latin and Cyrillic characters – sorted</a></span> <span class='author_date_hits'>Lorna Evans, 2016-04-05</span> <br>PDF documents with tables of Latin and Cyrillic characters from Unicode 8.0 sorted in Unicode Collation Algorithm default order. Useful for finding characters in Unicode.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Diws-appendixa%26site_id%3Dnrsi.html'>Mapping codepoints to Unicode encoding forms</a></span> <span class='author_date_hits'>Peter Constable, 2001-06-13</span> <br>This appendix describes in detail the mappings from Unicode codepoints to the code unit sequences used in each encoding form.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Diws-appendixb%26site_id%3Dnrsi.html'>A review of characters with compatibility decompositions</a></span> <span class='author_date_hits'>Peter Constable, 2003-06-09</span> <br>This appendix is intended, therefore, to provide an introduction to this set of characters, which constitute perhaps the least principled elements of the Standard.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dunicode5quotemirroring%26site_id%3Dnrsi.html'>Why are my quote marks backwards?</a></span> <span class='author_date_hits'>Bob Hallissy, 2009-03-02</span> <br>If you work with right-to-left text in Unicode and have certain quote marks in your text, then this article is for you. More specifically: if you suddenly see your smart quotes reverse direction, this article will explain why and what to do (or not do) about it.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dvaiunicode%26site_id%3Dnrsi.html'>Encoding the Vai Syllabary in Unicode</a></span> <span class='author_date_hits'>Lorna A. 
Priest, 2004-12-01</span> <br>This document is a reference for those who are interested in encoding the Vai Syllabary in Unicode. It contains information compiled during the time SIL was working on the Vai fonts.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dpcunicodedocs%26site_id%3Dnrsi.html'>SIL Unicode proposals and other standards-related documents</a></span> <span class='author_date_hits'>NRSI staff, 2009-01-20</span> <br>Unicode proposals and other standards-related documents</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dutconvertq2%26site_id%3Dnrsi.html'>Is Unicode ready for you?</a></span> <span class='author_date_hits'>Albert Bickford, Jim Brase and Lorna Priest, 2007-05-11</span> <br>This article will help you decide whether Unicode will meet your orthography or character needs.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dthailaoseq%26site_id%3Dnrsi.html'>Sequence Checking in Thai &amp; Lao</a></span> <span class='author_date_hits'>Martin Hosken, 2008-04-25</span> <br>With the Unicode character set being so large, it is natural for system and application implementors to want to provide some mechanism for indicating what are clearly illegal sequences of Unicode characters...</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dunicodewordmacrosintro%26site_id%3Dnrsi.html'>Unicode Word Macros Template</a></span> <span class='author_date_hits'>Peter G. Constable, 2007-11-14</span> <br>This template provides some VBA macros designed to deal with various Unicode-related issues in Word 97 and later versions. The macros provide a means to display the Unicode value of any character, to enter any Unicode character, and to search for any Unicode character. 
Each of the macros can be accessed from a toolbar that is provided.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dunicodepagesofinterest%26site_id%3Dnrsi.html'>Unicode Web site pages of interest</a></span> <span class='author_date_hits'>Peter Constable, 2002-10-06</span> <br>Links to pages of interest on the Unicode web site</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dcharstories_0140%26site_id%3Dnrsi.html'>Character Stories: U+013F, U+0140 Latin Capital / Small L with Middle Dot</a></span> <span class='author_date_hits'>Peter Constable, 2004-04-16</span> <br>The Catalan l-middle dot is usually encoded as a sequence of two characters. U+0140 was added to Unicode for compatibility with ISO 6937.</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dcatunicodecharacterstories%26site_id%3Dnrsi.html'>Unicode Character Stories</a></span> <span class='author_date_hits'>Peter Constable, 2003-06-19</span></p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dcharstories_2024%26site_id%3Dnrsi.html'>Character Stories: U+2024 ONE DOT LEADER</a></span> <span class='author_date_hits'>Peter Constable, 2003-06-02</span> <br>U+2024 ONE DOT LEADER is a graphic character, whose glyph consists of a small baseline dot, and whose General Category is Po (Other Punctuation).</p> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dcharstories_02ea%26site_id%3Dnrsi.html'>Character Stories: U+02EA, U+02EB Yin / Yang Departing Tone Marks</a></span> <span class='author_date_hits'>Peter Constable, 2003-06-19</span> <br>U+02EB and U+02EA come from the TCA submissions regarding the Minnan and Hakka languages, for use with extended Bopomofo.</p> <a name='130ddfce'></a> <h2>Tutorials</h2> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dutwtutoriallinks%26site_id%3Dnrsi.html'>Unicode Transition Tutorial Links</a></span> <span class='author_date_hits'>Lorna 
A. Priest, 2009-02-17</span> <br>Because of special character needs, SIL teams have long used custom encoded fonts. This was often the only solution and worked fairly well until newer software began "breaking" our solutions. Unicode obviates the need for custom encoded fonts. The tutorials in this section were developed to help people transition to Unicode. You will find tools to help you work out what the Unicode encoding should be, tutorials and tools for converting legacy-encoded documents to Unicode, and tutorials to help with keyboarding issues.</p> <a name='740f5988'></a> <h2>Helpful Utilities</h2> <p><span class='ItemLeaderTitle'><a href='/cms/scripts/page.php%3Fid%3Dexcelunicodedata%26site_id%3Dnrsi.html'>Unicode Character Properties Excel and LibreOffice Calc spreadsheet</a></span> <span class='author_date_hits'>Peter Constable, Bob Hallissy, and Bobby de Vos, 2022-10-04</span> <br>Various files from the Unicode Character Database compiled into a spreadsheet (Excel and LibreOffice Calc) workbook.</p> <p><a href='http://earthlingsoft.net/UnicodeChecker' target='_blank'><img src='/cms/assets/icons/offsite_link.png'>&nbsp;UnicodeChecker</a> &mdash; UnicodeChecker for Mac OS X is an application that displays information for every code point from the Unicode Standard.</p> <p><a href='http://www.unicode.org/unibook/' target='_blank'><img src='/cms/assets/icons/offsite_link.png'>&nbsp;Unibook</a> &mdash; The Unibook Character browser is a small utility for offline viewing of the character charts and character properties for The Unicode Standard. 
</p> <p><a href='http://www.ling.upenn.edu/unicode/' target='_blank'><img src='/cms/assets/icons/offsite_link.png'>&nbsp;http://www.ling.upenn.edu/unicode/</a> &mdash; Unicode Character Finder is a web page that lets you search for Unicode characters, view characters by Unicode block, and copy characters to the clipboard for use in other applications.</p> <hr> <p><small>© 2003-2024 <a href='http://www.sil.org/' target='_blank'>SIL International</a>, all rights reserved, unless otherwise noted elsewhere on this page.<br> Provided by SIL's Writing Systems Technology team (formerly known as NRSI). Read our <a href="/privacy-policy.html">Privacy Policy</a>. <a href='/support.html'>Contact us here.</a></small></p> </div> </td> </table> </body> </html>
