<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="color-scheme" content="light dark"> <title>PEP 349 – Allow str() to return unicode strings | peps.python.org</title> <link rel="shortcut icon" href="../_static/py.png"> <link rel="canonical" href="https://peps.python.org/pep-0349/"> <link rel="stylesheet" href="../_static/style.css" type="text/css"> <link rel="stylesheet" href="../_static/mq.css" type="text/css"> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light"> <link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark"> <link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss"> <meta property="og:title" content='PEP 349 – Allow str() to return unicode strings | peps.python.org'> <meta property="og:description" content="This PEP proposes to change the str() built-in function so that it can return unicode strings. This change would make it easier to write code that works with either string type and would also make some existing code handle unicode strings. The C funct..."> <meta property="og:type" content="website"> <meta property="og:url" content="https://peps.python.org/pep-0349/"> <meta property="og:site_name" content="Python Enhancement Proposals (PEPs)"> <meta property="og:image" content="https://peps.python.org/_static/og-image.png"> <meta property="og:image:alt" content="Python PEPs"> <meta property="og:image:width" content="200"> <meta property="og:image:height" content="200"> <meta name="description" content="This PEP proposes to change the str() built-in function so that it can return unicode strings. This change would make it easier to write code that works with either string type and would also make some existing code handle unicode strings. 
The C funct..."> <meta name="theme-color" content="#3776ab"> </head> <body> <svg xmlns="http://www.w3.org/2000/svg" style="display: none;"> <symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all"> <title>Following system colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="9"></circle> <path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path> </svg> </symbol> <symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all"> <title>Selected dark colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <path stroke="none" d="M0 0h24v24H0z" fill="none"></path> <path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path> </svg> </symbol> <symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all"> <title>Selected light colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="5"></circle> <line x1="12" y1="1" x2="12" y2="3"></line> <line x1="12" y1="21" x2="12" y2="23"></line> <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line> <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line> <line x1="1" y1="12" x2="3" y2="12"></line> <line x1="21" y1="12" x2="23" y2="12"></line> <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line> <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line> </svg> </symbol> </svg> <script> document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto" </script> <section id="pep-page-section"> <header> <h1>Python Enhancement Proposals</h1> <ul class="breadcrumbs"> <li><a href="https://www.python.org/" title="The Python 
Programming Language">Python</a> » </li> <li><a href="../pep-0000/">PEP Index</a> » </li> <li>PEP 349</li> </ul> <button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())"> <svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg> <span class="visually-hidden">Toggle light / dark / auto colour theme</span> </button> </header> <article> <section id="pep-content"> <h1 class="page-title">PEP 349 – Allow str() to return unicode strings</h1> <dl class="rfc2822 field-list simple"> <dt class="field-odd">Author<span class="colon">:</span></dt> <dd class="field-odd">Neil Schemenauer &lt;nas at arctrix.com&gt;</dd> <dt class="field-even">Status<span class="colon">:</span></dt> <dd class="field-even"><abbr title="Formally declined and will not be accepted">Rejected</abbr></dd> <dt class="field-odd">Type<span class="colon">:</span></dt> <dd class="field-odd"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd> <dt class="field-even">Created<span class="colon">:</span></dt> <dd class="field-even">02-Aug-2005</dd> <dt class="field-odd">Python-Version<span class="colon">:</span></dt> <dd class="field-odd">2.5</dd> <dt class="field-even">Post-History<span class="colon">:</span></dt> <dd class="field-even">06-Aug-2005</dd> <dt class="field-odd">Resolution<span class="colon">:</span></dt> <dd class="field-odd"><a class="reference external" href="https://mail.python.org/archives/list/python-dev@python.org/message/M2Y3PUFLAE23NPRJPVBYF6P5LW5LVN6F/">Python-Dev message</a></dd> </dl> <hr class="docutils" /> <section id="contents"> <details><summary>Table of Contents</summary><ul class="simple"> <li><a class="reference internal" 
href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#specification">Specification</a></li> <li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a></li> <li><a class="reference internal" href="#alternative-solutions">Alternative Solutions</a></li> <li><a class="reference internal" href="#references">References</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> </details></section> <section id="abstract"> <h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2> <p>This PEP proposes to change the <code class="docutils literal notranslate"><span class="pre">str()</span></code> built-in function so that it can return unicode strings. This change would make it easier to write code that works with either string type and would also make some existing code handle unicode strings. The C function <code class="docutils literal notranslate"><span class="pre">PyObject_Str()</span></code> would remain unchanged and the function <code class="docutils literal notranslate"><span class="pre">PyString_New()</span></code> would be added instead.</p> </section> <section id="rationale"> <h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2> <p>Python has had a Unicode string type for some time now, but its use is not yet widespread. There is a large amount of Python code that assumes that string data is represented as str instances. The long-term plan for Python is to phase out the str type and use unicode for all string data. Clearly, a smooth migration path must be provided.</p> <p>We need to upgrade existing libraries, written for str instances, so that they can operate in an all-unicode string world. We can’t change to an all-unicode world until all essential libraries are capable of it. Upgrading the libraries in one shot does not seem feasible. 
A more realistic strategy is to individually make the libraries capable of operating on unicode strings while preserving their current all-str environment behaviour.</p> <p>First, we need to be able to write code that can accept unicode instances without attempting to coerce them to str instances. Let us label such code as Unicode-safe. Unicode-safe libraries can be used in an all-unicode world.</p> <p>Second, we need to be able to write code that, when provided only str instances, will not create unicode results. Let us label such code as str-stable. Libraries that are str-stable can be used by libraries and applications that are not yet Unicode-safe.</p> <p>Sometimes it is simple to write code that is both str-stable and Unicode-safe. For example, the following function just works:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">def</span><span class="w"> </span><span class="nf">appendx</span><span class="p">(</span><span class="n">s</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">s</span> <span class="o">+</span> <span class="s1">'x'</span>
</pre></div> </div> <p>That’s not too surprising since the unicode type is designed to make the task easier. The principle is that when str and unicode instances meet, the result is a unicode instance. One notable difficulty arises when code requires a string representation of an object, an operation traditionally accomplished by using the <code class="docutils literal notranslate"><span class="pre">str()</span></code> built-in function.</p> <p>Using the current <code class="docutils literal notranslate"><span class="pre">str()</span></code> function makes the code not Unicode-safe. Replacing a <code class="docutils literal notranslate"><span class="pre">str()</span></code> call with a <code class="docutils literal notranslate"><span class="pre">unicode()</span></code> call makes the code not str-stable. 
Changing <code class="docutils literal notranslate"><span class="pre">str()</span></code> so that it could return unicode instances would solve this problem. As a further benefit, some code that is currently not Unicode-safe because it uses <code class="docutils literal notranslate"><span class="pre">str()</span></code> would become Unicode-safe.</p> </section> <section id="specification"> <h2><a class="toc-backref" href="#specification" role="doc-backlink">Specification</a></h2> <p>A Python implementation of the <code class="docutils literal notranslate"><span class="pre">str()</span></code> built-in follows:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">def</span><span class="w"> </span><span class="nf">str</span><span class="p">(</span><span class="n">s</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""Return a nice string representation of the object.  The</span>
<span class="sd">    return value is a str or unicode instance.</span>
<span class="sd">    """</span>
    <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="ow">is</span> <span class="nb">str</span> <span class="ow">or</span> <span class="nb">type</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="ow">is</span> <span class="n">unicode</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">s</span>
    <span class="n">r</span> <span class="o">=</span> <span class="n">s</span><span class="o">.</span><span class="fm">__str__</span><span class="p">()</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">r</span><span class="p">,</span> <span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">unicode</span><span class="p">)):</span>
        <span class="k">raise</span> <span class="ne">TypeError</span><span class="p">(</span><span class="s1">'__str__ returned non-string'</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">r</span>
</pre></div> </div> <p>The following function would be added to the C API and would be equivalent to the <code class="docutils literal notranslate"><span class="pre">str()</span></code> built-in (ideally it would be called <code class="docutils literal notranslate"><span class="pre">PyObject_Str</span></code>, but changing that function could cause a massive number of compatibility problems):</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">PyObject</span> <span class="o">*</span><span class="n">PyString_New</span><span class="p">(</span><span class="n">PyObject</span> <span class="o">*</span><span class="p">);</span>
</pre></div> </div> <p>A reference implementation is available on SourceForge <a class="footnote-reference brackets" href="#id2" id="id1">[1]</a> as a patch.</p> </section> <section id="backwards-compatibility"> <h2><a class="toc-backref" href="#backwards-compatibility" role="doc-backlink">Backwards Compatibility</a></h2> <p>Some code may require that <code class="docutils literal notranslate"><span class="pre">str()</span></code> returns a str instance. In the standard library, only one such case has been found so far. The function <code class="docutils literal notranslate"><span class="pre">email.header_decode()</span></code> requires a str instance and the <code class="docutils literal notranslate"><span class="pre">email.Header.decode_header()</span></code> function tries to ensure this by calling <code class="docutils literal notranslate"><span class="pre">str()</span></code> on its argument. 
The code was fixed by changing the line “header = str(header)” to:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">unicode</span><span class="p">):</span>
    <span class="n">header</span> <span class="o">=</span> <span class="n">header</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s1">'ascii'</span><span class="p">)</span>
</pre></div> </div> <p>Whether this is truly a bug is questionable since <code class="docutils literal notranslate"><span class="pre">decode_header()</span></code> really operates on byte strings, not character strings. Code that passes it a unicode instance could itself be considered buggy.</p> </section> <section id="alternative-solutions"> <h2><a class="toc-backref" href="#alternative-solutions" role="doc-backlink">Alternative Solutions</a></h2> <p>A new built-in function could be added instead of changing <code class="docutils literal notranslate"><span class="pre">str()</span></code>. Doing so would introduce virtually no backwards compatibility problems. However, since the compatibility problems are expected to be rare, changing <code class="docutils literal notranslate"><span class="pre">str()</span></code> seems preferable to adding a new built-in.</p> <p>The basestring type could be changed to have the proposed behaviour, rather than changing <code class="docutils literal notranslate"><span class="pre">str()</span></code>. 
However, that would be confusing behaviour for an abstract base type.</p> </section> <section id="references"> <h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2> <aside class="footnote-list brackets"> <aside class="footnote brackets" id="id2" role="doc-footnote"> <dt class="label" id="id2">[<a href="#id1">1</a>]</dt> <dd><a class="reference external" href="https://bugs.python.org/issue1266570">https://bugs.python.org/issue1266570</a></aside> </aside> </section> <section id="copyright"> <h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2> <p>This document has been placed in the public domain.</p> </section> </section> <hr class="docutils" /> <p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0349.rst">https://github.com/python/peps/blob/main/peps/pep-0349.rst</a></p> <p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0349.rst">2025-02-01 08:59:27 GMT</a></p> </article> <nav id="pep-sidebar"> <h2>Contents</h2> <ul> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#specification">Specification</a></li> <li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a></li> <li><a class="reference internal" href="#alternative-solutions">Alternative Solutions</a></li> <li><a class="reference internal" href="#references">References</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> <br> <a id="source" href="https://github.com/python/peps/blob/main/peps/pep-0349.rst">Page Source (GitHub)</a> </nav> </section> <script src="../_static/colour_scheme.js"></script> <script src="../_static/wrap_tables.js"></script> <script src="../_static/sticky_banner.js"></script> </body> </html>