

<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="color-scheme" content="light dark"> <title>PEP 277 – Unicode file name support for Windows NT | peps.python.org</title> <link rel="shortcut icon" href="../_static/py.png"> <link rel="canonical" href="https://peps.python.org/pep-0277/"> <link rel="stylesheet" href="../_static/style.css" type="text/css"> <link rel="stylesheet" href="../_static/mq.css" type="text/css"> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light"> <link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark"> <link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss"> <meta property="og:title" content='PEP 277 – Unicode file name support for Windows NT | peps.python.org'> <meta property="og:description" content="This PEP discusses supporting access to all files possible on Windows NT by passing Unicode file names directly to the system’s wide-character functions."> <meta property="og:type" content="website"> <meta property="og:url" content="https://peps.python.org/pep-0277/"> <meta property="og:site_name" content="Python Enhancement Proposals (PEPs)"> <meta property="og:image" content="https://peps.python.org/_static/og-image.png"> <meta property="og:image:alt" content="Python PEPs"> <meta property="og:image:width" content="200"> <meta property="og:image:height" content="200"> <meta name="description" content="This PEP discusses supporting access to all files possible on Windows NT by passing Unicode file names directly to the system’s wide-character functions."> <meta name="theme-color" content="#3776ab"> </head> <body> <svg xmlns="http://www.w3.org/2000/svg" style="display: none;"> <symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all"> <title>Following 
system colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="9"></circle> <path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path> </svg> </symbol> <symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all"> <title>Selected dark colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <path stroke="none" d="M0 0h24v24H0z" fill="none"></path> <path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path> </svg> </symbol> <symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all"> <title>Selected light colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="5"></circle> <line x1="12" y1="1" x2="12" y2="3"></line> <line x1="12" y1="21" x2="12" y2="23"></line> <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line> <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line> <line x1="1" y1="12" x2="3" y2="12"></line> <line x1="21" y1="12" x2="23" y2="12"></line> <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line> <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line> </svg> </symbol> </svg> <script> document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto" </script> <section id="pep-page-section"> <header> <h1>Python Enhancement Proposals</h1> <ul class="breadcrumbs"> <li><a href="https://www.python.org/" title="The Python Programming Language">Python</a> &raquo; </li> <li><a href="../pep-0000/">PEP Index</a> &raquo; </li> <li>PEP 277</li> </ul> <button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())"> <svg 
aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg> <span class="visually-hidden">Toggle light / dark / auto colour theme</span> </button> </header> <article> <section id="pep-content"> <h1 class="page-title">PEP 277 – Unicode file name support for Windows NT</h1> <dl class="rfc2822 field-list simple"> <dt class="field-odd">Author<span class="colon">:</span></dt> <dd class="field-odd">Neil Hodgson &lt;neilh&#32;&#97;t&#32;scintilla.org&gt;</dd> <dt class="field-even">Status<span class="colon">:</span></dt> <dd class="field-even"><abbr title="Accepted and implementation complete, or no longer active">Final</abbr></dd> <dt class="field-odd">Type<span class="colon">:</span></dt> <dd class="field-odd"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd> <dt class="field-even">Created<span class="colon">:</span></dt> <dd class="field-even">11-Jan-2002</dd> <dt class="field-odd">Python-Version<span class="colon">:</span></dt> <dd class="field-odd">2.3</dd> <dt class="field-even">Post-History<span class="colon">:</span></dt> <dd class="field-even"><p></p></dd> </dl> <hr class="docutils" /> <section id="contents"> <details><summary>Table of Contents</summary><ul class="simple"> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#specification">Specification</a></li> <li><a class="reference internal" href="#restrictions">Restrictions</a></li> <li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li> <li><a class="reference internal" 
href="#references">References</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> </details></section> <section id="abstract"> <h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2> <p>This PEP discusses supporting access to all files possible on Windows NT by passing Unicode file names directly to the system’s wide-character functions.</p> </section> <section id="rationale"> <h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2> <p>Python 2.2 on Win32 platforms converts Unicode file names passed to open and to functions in the <code class="docutils literal notranslate"><span class="pre">os</span></code> module into the ‘mbcs’ encoding before passing the result to the operating system. This usually succeeds in the common case where the script runs with the locale set to the same value as when the file was created. Most machines are set up with one locale that is rarely, if ever, changed. For some users, however, the locale changes more often, and servers often hold files saved by users working in different locales.</p> <p>On Windows NT and descendant operating systems, including Windows 2000 and Windows XP, wide-character APIs are available that provide direct access to all file names, including those that are not representable using the current locale.
The purpose of this proposal is to provide access to these wide-character APIs through the standard Python file object and posix module, and so provide access to all files on Windows NT.</p> </section> <section id="specification"> <h2><a class="toc-backref" href="#specification" role="doc-backlink">Specification</a></h2> <p>On Windows platforms that provide wide-character file APIs, when Unicode arguments are provided to file APIs, wide-character calls are made instead of the standard C library and posix calls.</p> <p>The Python file object is extended to use a Unicode file name argument directly rather than converting it. This affects the file object constructor <code class="docutils literal notranslate"><span class="pre">file(filename[,</span> <span class="pre">mode[,</span> <span class="pre">bufsize]])</span></code> and also the <code class="docutils literal notranslate"><span class="pre">open</span></code> function, which is an alias of this constructor. When a Unicode filename argument is used here, the <code class="docutils literal notranslate"><span class="pre">name</span></code> attribute of the file object will be Unicode.
The representation of a file object, <code class="docutils literal notranslate"><span class="pre">repr(f)</span></code>, will display Unicode file names as an escaped string in a similar manner to the representation of Unicode strings.</p> <p>The <code class="docutils literal notranslate"><span class="pre">posix</span></code> module contains functions that take file or directory names: <code class="docutils literal notranslate"><span class="pre">chdir</span></code>, <code class="docutils literal notranslate"><span class="pre">listdir</span></code>, <code class="docutils literal notranslate"><span class="pre">mkdir</span></code>, <code class="docutils literal notranslate"><span class="pre">open</span></code>, <code class="docutils literal notranslate"><span class="pre">remove</span></code>, <code class="docutils literal notranslate"><span class="pre">rename</span></code>, <code class="docutils literal notranslate"><span class="pre">rmdir</span></code>, <code class="docutils literal notranslate"><span class="pre">stat</span></code>, and <code class="docutils literal notranslate"><span class="pre">_getfullpathname</span></code>. These will use Unicode arguments directly rather than converting them. For the <code class="docutils literal notranslate"><span class="pre">rename</span></code> function, this behaviour is triggered when either of the arguments is Unicode, and the other argument is then converted to Unicode using the default encoding.</p> <p>The <code class="docutils literal notranslate"><span class="pre">listdir</span></code> function currently returns a list of strings. Under this proposal, it will return a list of Unicode strings when its path argument is Unicode.</p> </section> <section id="restrictions"> <h2><a class="toc-backref" href="#restrictions" role="doc-backlink">Restrictions</a></h2> <p>On the consumer Windows operating systems, Windows 95, Windows 98, and Windows ME, there are no wide-character file APIs, so behaviour is unchanged under this proposal.
It may be possible in the future to extend this proposal to cover these operating systems, as the VFAT-32 file system they use does support Unicode file names. Access is difficult, however, so implementing this would require much work. The “Microsoft Layer for Unicode” could be a starting point for implementing this.</p> <p>Python can be compiled with the size of Unicode characters set to 4 bytes rather than 2 by defining <code class="docutils literal notranslate"><span class="pre">PY_UNICODE_TYPE</span></code> to be a 4-byte type and <code class="docutils literal notranslate"><span class="pre">Py_UNICODE_SIZE</span></code> to be 4. As the Windows API does not accept 4-byte characters, the features described in this proposal will not work in this mode, so the implementation falls back to the current ‘mbcs’ encoding technique. This restriction could be lifted in the future by performing extra conversions using <code class="docutils literal notranslate"><span class="pre">PyUnicode_AsWideChar</span></code>, but for now that would add too much complexity for a very rarely used feature.</p> </section> <section id="reference-implementation"> <h2><a class="toc-backref" href="#reference-implementation" role="doc-backlink">Reference Implementation</a></h2> <p>The implementation is available at <a class="footnote-reference brackets" href="#id2" id="id1">[2]</a>.</p> </section> <section id="references"> <h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2> <p>[1] Microsoft Windows APIs <a class="reference external" href="https://msdn.microsoft.com/">https://msdn.microsoft.com/</a></p> <aside class="footnote-list brackets"> <aside class="footnote brackets" id="id2" role="doc-footnote"> <dt class="label" id="id2">[<a href="#id1">2</a>]</dt> <dd><a class="reference external" href="https://github.com/python/cpython/issues/37017">https://github.com/python/cpython/issues/37017</a></dd></aside> </aside> </section> <section id="copyright"> <h2><a
class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2> <p>This document has been placed in the public domain.</p> </section> </section> <hr class="docutils" /> <p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0277.rst">https://github.com/python/peps/blob/main/peps/pep-0277.rst</a></p> <p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0277.rst">2025-02-01 08:55:40 GMT</a></p> </article> <nav id="pep-sidebar"> <h2>Contents</h2> <ul> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#specification">Specification</a></li> <li><a class="reference internal" href="#restrictions">Restrictions</a></li> <li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li> <li><a class="reference internal" href="#references">References</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> <br> <a id="source" href="https://github.com/python/peps/blob/main/peps/pep-0277.rst">Page Source (GitHub)</a> </nav> </section> <script src="../_static/colour_scheme.js"></script> <script src="../_static/wrap_tables.js"></script> <script src="../_static/sticky_banner.js"></script> </body> </html>
