<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="viewport" content="width=device-width, initial-scale=1"> <title>utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source code - Perldoc Browser</title> <link rel="search" href="/opensearch.xml" type="application/opensearchdescription+xml" title="Perldoc Browser"> <link rel="canonical" href="https://perldoc.perl.org/utf8"> <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css" integrity="sha384-JcKb8q3iqJ61gNV9KGb8thSsNjpSL0n8PARn9HuZOnIxN0hoP+VmmDGMN5t9UJ0Z" crossorigin="anonymous"> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.5.0/styles/stackoverflow-light.min.css" integrity="sha512-cG1IdFxqipi3gqLmksLtuk13C+hBa57a6zpWxMeoY3Q9O6ooFxq50DayCdm0QrDgZjMUn23z/0PMZlgft7Yp5Q==" crossorigin="anonymous" /> <style> body { background: #f4f4f5; color: #020202; } .navbar-dark { background-image: -webkit-linear-gradient(top, #005f85 0, #002e49 100%); background-image: -o-linear-gradient(top, #005f85 0, #002e49 100%); background-image: linear-gradient(to bottom, #005f85 0, #002e49 100%); filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ff005f85', endColorstr='#ff002e49', GradientType=0); background-repeat: repeat-x; } .navbar-dark .navbar-nav .nav-link, .navbar-dark .navbar-nav .nav-link:focus { color: #fff } .navbar-dark .navbar-nav .nav-link:hover { color: #ffef68 } #wrapperlicious { margin: 0 auto; font: 0.9em 'Helvetica Neue', Helvetica, sans-serif; font-weight: normal; line-height: 1.5em; margin: 0; padding: 0; } #wrapperlicious h1 { font-size: 1.5em } #wrapperlicious h2 { font-size: 1.3em } #wrapperlicious h3 { font-size: 1.1em } #wrapperlicious h4 { font-size: 0.9em } #wrapperlicious h1, #wrapperlicious h2, #wrapperlicious h3, #wrapperlicious h4, #wrapperlicious dt { color: #020202; margin-top: 1em; margin-bottom: 1em; position: 
relative; font-weight: bold; } #wrapperlicious a { color: inherit; text-decoration: underline } #wrapperlicious #toc { text-decoration: none } #wrapperlicious a:hover { color: #2a2a2a } #wrapperlicious a img { border: 0 } #wrapperlicious :not(pre) > code { color: inherit; background-color: rgba(0, 0, 0, 0.04); border-radius: 3px; font: 0.9em Consolas, Menlo, Monaco, monospace; padding: 0.3em; } #wrapperlicious dd { margin: 0; margin-left: 2em; } #wrapperlicious dt { color: #2a2a2a; font-weight: bold; margin-left: 0.9em; } #wrapperlicious p { margin-bottom: 1em; margin-top: 1em; } #wrapperlicious li > p { margin-bottom: 0; margin-top: 0; } #wrapperlicious pre { border: 1px solid #c1c1c1; border-radius: 3px; font: 100% Consolas, Menlo, Monaco, monospace; margin-bottom: 1em; margin-top: 1em; } #wrapperlicious pre > code { display: block; background-color: #f6f6f6; font: 0.9em Consolas, Menlo, Monaco, monospace; line-height: 1.5em; text-align: left; white-space: pre; padding: 1em; } #wrapperlicious dl, #wrapperlicious ol, #wrapperlicious ul { margin-bottom: 1em; margin-top: 1em; } #wrapperlicious ul { list-style-type: square; } #wrapperlicious ul ul { margin-bottom: 0px; margin-top: 0px; } #footer { font-size: 0.8em; padding-top: 0.5em; text-align: center; } #more { display: inline; font-size: 0.8em; } #perldocdiv { background-color: #fff; border: 1px solid #c1c1c1; border-bottom-left-radius: 5px; border-bottom-right-radius: 5px; margin-left: auto; margin-right: auto; padding: 3em; padding-top: 1em; max-width: 960px; } #moduleversion { float: right } #wrapperlicious .leading-notice { font-style: italic; padding-left: 1em; margin-top: 1em; margin-bottom: 1em; } #wrapperlicious .permalink { display: none; left: -0.75em; position: absolute; padding-right: 0.25em; text-decoration: none; } #wrapperlicious h1:hover .permalink, #wrapperlicious h2:hover .permalink, #wrapperlicious h3:hover .permalink, #wrapperlicious h4:hover .permalink, #wrapperlicious dt:hover .permalink { 
display: block; } </style> </head> <body> <div id="wrapperlicious" class="container-fluid"> <div id="perldocdiv"> <div id="links"> <a href="/5.20.1/utf8">utf8</a> <div id="more"> (<a href="/5.20.1/utf8.txt">source</a>, <a href="https://metacpan.org/pod/utf8">CPAN</a>) </div> <div id="moduleversion">version 1.13_01</div> </div> <div class="leading-notice"> You are viewing the version of this documentation from Perl 5.20.1. 
<a href="/utf8">View the latest version</a> </div> <h1><a id="toc">CONTENTS</a></h1> <ul> <li> <a class="text-decoration-none" href="#NAME">NAME</a> </li> <li> <a class="text-decoration-none" href="#SYNOPSIS">SYNOPSIS</a> </li> <li> <a class="text-decoration-none" href="#DESCRIPTION">DESCRIPTION</a> <ul> <li> <a class="text-decoration-none" href="#Utility-functions">Utility functions</a> </li> </ul> </li> <li> <a class="text-decoration-none" href="#BUGS">BUGS</a> </li> <li> <a class="text-decoration-none" href="#SEE-ALSO">SEE ALSO</a> </li> </ul> <h1 id="NAME"><a class="permalink" href="#NAME">#</a>NAME</h1> <p>utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source code</p> <h1 id="SYNOPSIS"><a class="permalink" href="#SYNOPSIS">#</a>SYNOPSIS</h1>
<pre><code>use utf8;
no utf8;

# Convert the internal representation of a Perl scalar to/from UTF-8.

$num_octets = utf8::upgrade($string);
$success    = utf8::downgrade($string[, $fail_ok]);

# Change each character of a Perl scalar to/from a series of
# characters that represent the UTF-8 bytes of each original character.

utf8::encode($string);  # "\x{100}"  becomes "\xc4\x80"
utf8::decode($string);  # "\xc4\x80" becomes "\x{100}"

$flag = utf8::is_utf8($string); # since Perl 5.8.1
$flag = utf8::valid($string);</code></pre>
<h1 id="DESCRIPTION"><a class="permalink" href="#DESCRIPTION">#</a>DESCRIPTION</h1> <p>The <code>use utf8</code> pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope (allow UTF-EBCDIC on EBCDIC based platforms). 
The <code>no utf8</code> pragma tells Perl to switch back to treating the source text as literal bytes in the current lexical scope.</p> <p><b>Do not use this pragma for anything other than telling Perl that your script is written in UTF-8.</b> The utility functions described below are directly usable without <code>use utf8;</code>.</p> <p>Because it is not possible to reliably tell UTF-8 from native 8-bit encodings, you need either a Byte Order Mark at the beginning of your source code, or <code>use utf8;</code>, to tell perl that the source is UTF-8.</p> <p>When UTF-8 becomes the standard source format, this pragma will effectively become a no-op. For convenience, in what follows the term <i>UTF-X</i> is used to refer to UTF-8 on ASCII and ISO Latin based platforms and UTF-EBCDIC on EBCDIC based platforms.</p> <p>See also the effects of the <code>-C</code> switch and its cousin, <code>$ENV{PERL_UNICODE}</code>, in <a href="/5.20.1/perlrun">perlrun</a>.</p> <p>Enabling the <code>utf8</code> pragma has the following effect:</p> <ul> <li><p>Bytes in the source text that have their high bit set will be treated as being part of a literal UTF-X sequence. This includes most literals such as identifier names, string constants, and constant regular expression patterns.</p> <p>On EBCDIC platforms, characters in the Latin 1 character set are treated as being part of a literal UTF-EBCDIC character.</p> </li> </ul> <p>Note that if you have bytes with the eighth bit on in your script (for example embedded Latin-1 in your string literals), <code>use utf8</code> will likely complain, since the bytes are most probably not well-formed UTF-X. 
If you want to have such bytes under <code>use utf8</code>, you can disable this pragma until the end of the block (or file, if at top level) with <code>no utf8;</code>.</p> <h2 id="Utility-functions"><a class="permalink" href="#Utility-functions">#</a><a id="Utility"></a>Utility functions</h2> <p>The following functions are defined in the <code>utf8::</code> package by the Perl core. You do not need to say <code>use utf8</code> to use these, and in fact you should not say that unless you really want to have UTF-8 source code.</p> <ul> <li><p><code>$num_octets = utf8::upgrade($string)</code></p> <p>Converts in-place the internal representation of the string from an octet sequence in the native encoding (Latin-1 or EBCDIC) to <i>UTF-X</i>. The logical character sequence itself is unchanged. If <i>$string</i> is already stored as <i>UTF-X</i>, then this is a no-op. Returns the number of octets necessary to represent the string as <i>UTF-X</i>. Can be used to make sure that the UTF-8 flag is on, so that <code>\w</code> or <code>lc()</code> work as Unicode on strings containing characters in the range 0x80-0xFF (on ASCII and derivatives).</p> <p><b>Note that this function does not handle arbitrary encodings.</b> Therefore Encode is recommended for general purposes; see also <a href="/5.20.1/Encode">Encode</a>.</p> </li> <li><p><code>$success = utf8::downgrade($string[, $fail_ok])</code></p> <p>Converts in-place the internal representation of the string from <i>UTF-X</i> to the equivalent octet sequence in the native encoding (Latin-1 or EBCDIC). The logical character sequence itself is unchanged. If <i>$string</i> is already stored as native 8-bit, then this is a no-op. Can be used to make sure that the UTF-8 flag is off, e.g. when you want to make sure that the <code>substr()</code> or <code>length()</code> function works with the usually faster byte algorithm.</p> <p>Fails if the original <i>UTF-X</i> sequence cannot be represented in the native 8-bit encoding. 
On failure, dies or, if <i>$fail_ok</i> is true, returns false.</p> <p>Returns true on success.</p> <p><b>Note that this function does not handle arbitrary encodings.</b> Therefore Encode is recommended for general purposes; see also <a href="/5.20.1/Encode">Encode</a>.</p> </li> <li><p><code>utf8::encode($string)</code></p> <p>Converts in-place the character sequence to the corresponding octet sequence in <i>UTF-X</i>. That is, every (possibly wide) character gets replaced with a sequence of one or more characters that represent the individual <i>UTF-X</i> bytes of the character. The UTF-8 flag is turned off. Returns nothing.</p>
<pre><code>my $a = "\x{100}"; # $a contains one character, with ord 0x100
utf8::encode($a);  # $a contains two characters, with ords 0xc4 and
                   # 0x80</code></pre>
<p><b>Note that this function does not handle arbitrary encodings.</b> Therefore Encode is recommended for general purposes; see also <a href="/5.20.1/Encode">Encode</a>.</p> </li> <li><p><code>$success = utf8::decode($string)</code></p> <p>Attempts to convert in-place the octet sequence encoded as <i>UTF-X</i> to the corresponding character sequence. That is, it replaces each sequence of characters in the string whose ords represent a valid UTF-X byte sequence with the corresponding single character. The UTF-8 flag is turned on only if the source string contains multiple-byte <i>UTF-X</i> characters. 
If <i>$string</i> is invalid as <i>UTF-X</i>, returns false; otherwise returns true.</p>
<pre><code>my $a = "\xc4\x80"; # $a contains two characters, with ords
                    # 0xc4 and 0x80
utf8::decode($a);   # $a contains one character, with ord 0x100</code></pre>
<p><b>Note that this function does not handle arbitrary encodings.</b> Therefore Encode is recommended for general purposes; see also <a href="/5.20.1/Encode">Encode</a>.</p> </li> <li><p><code>$flag = utf8::is_utf8($string)</code></p> <p>(Since Perl 5.8.1) Tests whether <i>$string</i> is marked internally as encoded in UTF-8. Functionally the same as <code>Encode::is_utf8()</code>.</p> </li> <li><p><code>$flag = utf8::valid($string)</code></p> <p>[INTERNAL] Tests whether <i>$string</i> is in a consistent state regarding UTF-8. Returns true if it is well-formed UTF-8 and has the UTF-8 flag on <b>or</b> if <i>$string</i> is held as bytes (both these states are 'consistent'). The main reason for this routine is to allow Perl's test suite to check that operations have left strings in a consistent state. You most probably want to use <code>utf8::is_utf8()</code> instead.</p> </li> </ul> <p><code>utf8::encode</code> is like <code>utf8::upgrade</code>, but the UTF-8 flag is cleared. See <a href="/5.20.1/perlunicode">perlunicode</a> for more on the UTF-8 flag and the C API functions <code>sv_utf8_upgrade</code>, <code>sv_utf8_downgrade</code>, <code>sv_utf8_encode</code>, and <code>sv_utf8_decode</code>, which are wrapped by the Perl functions <code>utf8::upgrade</code>, <code>utf8::downgrade</code>, <code>utf8::encode</code> and <code>utf8::decode</code>. Also, the functions <code>utf8::is_utf8</code>, <code>utf8::valid</code>, <code>utf8::encode</code>, <code>utf8::decode</code>, <code>utf8::upgrade</code>, and <code>utf8::downgrade</code> are actually internal, and thus always available, without a <code>require utf8</code> statement.</p> <h1 id="BUGS"><a class="permalink" href="#BUGS">#</a>BUGS</h1> <p>One can have Unicode in identifier names, but not in package/class or subroutine names. 
While some limited functionality towards this does exist as of Perl 5.8.0, that is more accidental than designed; use of Unicode for these purposes is unsupported.</p> <p>One reason for this unfinished state is its (currently) inherent unportability: since both package names and subroutine names may need to be mapped to file and directory names, the Unicode capability of the filesystem becomes important, and unfortunately there are no portable answers.</p> <h1 id="SEE-ALSO"><a class="permalink" href="#SEE-ALSO">#</a><a id="SEE"></a>SEE ALSO</h1> <p><a href="/5.20.1/perlunitut">perlunitut</a>, <a href="/5.20.1/perluniintro">perluniintro</a>, <a href="/5.20.1/perlrun">perlrun</a>, <a href="/5.20.1/bytes">bytes</a>, <a href="/5.20.1/perlunicode">perlunicode</a></p> </div> <div id="footer"> <p>Perldoc Browser is maintained by Dan Book (<a href="https://metacpan.org/author/DBOOK">DBOOK</a>). Please contact him via the <a href="https://github.com/Grinnz/perldoc-browser/issues">GitHub issue tracker</a> or <a href="mailto:dbook@cpan.org">email</a> regarding any issues with the site itself, search, or rendering of documentation.</p> <p>The Perl documentation is maintained by the Perl 5 Porters in the development of Perl. 
Please contact them via the <a href="https://github.com/Perl/perl5/issues">Perl issue tracker</a>, the <a href="https://lists.perl.org/list/perl5-porters.html">mailing list</a>, or <a href="https://kiwiirc.com/client/irc.perl.org/p5p">IRC</a> to report any issues with the contents or format of the documentation.</p> </div> </div> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.5.1/jquery.slim.min.js" integrity="sha512-/DXTXr6nQodMUiq+IUJYCt2PPOUjrHJ9wFrqpJ3XkgPNOZVfMok7cRw6CSxyCQxXn6ozlESsSh1/sMCTF1rL/g==" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.16.1/umd/popper.min.js" integrity="sha512-ubuT8Z88WxezgSqf3RLuNi5lmjstiJcyezx34yIU2gAHonIi27Na7atqzUZCOoY4CExaoFumzOsFQ2Ch+I/HCw==" crossorigin="anonymous"></script> <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js" integrity="sha384-B4gt1jrGC7Jh4AgTPSdUtOBvfO8shuf57BaghqFfPlYxofvL8/KUEfYiJOMMV+rV" crossorigin="anonymous"></script> <script src="/js/highlight.pack.js"></script> <script>hljs.highlightAll();</script> </body> </html>