<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="color-scheme" content="light dark"> <title>PEP 710 – Recording the provenance of installed packages | peps.python.org</title> <link rel="shortcut icon" href="../_static/py.png"> <link rel="canonical" href="https://peps.python.org/pep-0710/"> <link rel="stylesheet" href="../_static/style.css" type="text/css"> <link rel="stylesheet" href="../_static/mq.css" type="text/css"> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light"> <link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark"> <link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss"> <meta property="og:title" content='PEP 710 – Recording the provenance of installed packages | peps.python.org'> <meta property="og:description" content="This PEP describes a way to record the provenance of installed Python distributions. The record is created by an installer and is available to users in the form of a JSON file provenance_url.json in the .dist-info directory. The mentioned JSON file capt..."> <meta property="og:type" content="website"> <meta property="og:url" content="https://peps.python.org/pep-0710/"> <meta property="og:site_name" content="Python Enhancement Proposals (PEPs)"> <meta property="og:image" content="https://peps.python.org/_static/og-image.png"> <meta property="og:image:alt" content="Python PEPs"> <meta property="og:image:width" content="200"> <meta property="og:image:height" content="200"> <meta name="description" content="This PEP describes a way to record the provenance of installed Python distributions. The record is created by an installer and is available to users in the form of a JSON file provenance_url.json in the .dist-info directory. 
The mentioned JSON file capt..."> <meta name="theme-color" content="#3776ab"> </head> <body> <svg xmlns="http://www.w3.org/2000/svg" style="display: none;"> <symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all"> <title>Following system colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="9"></circle> <path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path> </svg> </symbol> <symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all"> <title>Selected dark colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <path stroke="none" d="M0 0h24v24H0z" fill="none"></path> <path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path> </svg> </symbol> <symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all"> <title>Selected light colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="5"></circle> <line x1="12" y1="1" x2="12" y2="3"></line> <line x1="12" y1="21" x2="12" y2="23"></line> <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line> <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line> <line x1="1" y1="12" x2="3" y2="12"></line> <line x1="21" y1="12" x2="23" y2="12"></line> <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line> <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line> </svg> </symbol> </svg> <script> document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto" </script> <section id="pep-page-section"> <header> <h1>Python Enhancement Proposals</h1> <ul class="breadcrumbs"> <li><a href="https://www.python.org/" 
title="The Python Programming Language">Python</a> &raquo; </li> <li><a href="../pep-0000/">PEP Index</a> &raquo; </li> <li>PEP 710</li> </ul> <button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())"> <svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg> <span class="visually-hidden">Toggle light / dark / auto colour theme</span> </button> </header> <article> <section id="pep-content"> <h1 class="page-title">PEP 710 – Recording the provenance of installed packages</h1> <dl class="rfc2822 field-list simple"> <dt class="field-odd">Author<span class="colon">:</span></dt> <dd class="field-odd">Fridolín Pokorný &lt;fridolin.pokorny at gmail.com&gt;</dd> <dt class="field-even">Sponsor<span class="colon">:</span></dt> <dd class="field-even">Donald Stufft &lt;donald&#32;&#97;t&#32;stufft.io&gt;</dd> <dt class="field-odd">PEP-Delegate<span class="colon">:</span></dt> <dd class="field-odd">Paul Moore &lt;p.f.moore&#32;&#97;t&#32;gmail.com&gt;</dd> <dt class="field-even">Discussions-To<span class="colon">:</span></dt> <dd class="field-even"><a class="reference external" href="https://discuss.python.org/t/pep-710-recording-the-provenance-of-installed-packages/25428">Discourse thread</a></dd> <dt class="field-odd">Status<span class="colon">:</span></dt> <dd class="field-odd"><abbr title="Proposal under active discussion and revision">Draft</abbr></dd> <dt class="field-even">Type<span class="colon">:</span></dt> <dd class="field-even"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd> <dt class="field-odd">Topic<span class="colon">:</span></dt> <dd class="field-odd"><a class="reference external" 
href="../topic/packaging/">Packaging</a></dd> <dt class="field-even">Created<span class="colon">:</span></dt> <dd class="field-even">27-Mar-2023</dd> <dt class="field-odd">Post-History<span class="colon">:</span></dt> <dd class="field-odd"><a class="reference external" href="https://discuss.python.org/t/pip-installation-reports/12316" title="Discourse thread">03-Dec-2021</a>, <a class="reference external" href="https://discuss.python.org/t/pre-pep-recording-provenance-of-installed-packages/23340" title="Discourse thread">30-Jan-2023</a>, <a class="reference external" href="https://discuss.python.org/t/draft-pep-recording-provenance-of-installed-packages/24838" title="Discourse thread">14-Mar-2023</a>, <a class="reference external" href="https://discuss.python.org/t/pep-710-recording-the-provenance-of-installed-packages/25428" title="Discourse thread">03-Apr-2023</a></dd> </dl> <hr class="docutils" /> <section id="contents"> <details><summary>Table of Contents</summary><ul class="simple"> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#motivation">Motivation</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#specification">Specification</a></li> <li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a><ul> <li><a class="reference internal" href="#presence-of-provenance-url-json-in-installers-and-libraries">Presence of provenance_url.json in installers and libraries</a></li> <li><a class="reference internal" href="#compatibility-with-direct-url-json">Compatibility with direct_url.json</a></li> </ul> </li> <li><a class="reference internal" href="#security-implications">Security Implications</a></li> <li><a class="reference internal" href="#how-to-teach-this">How to Teach This</a></li> <li><a class="reference internal" href="#examples">Examples</a><ul> <li><a class="reference internal" 
href="#examples-of-a-valid-provenance-url-json">Examples of a valid provenance_url.json</a></li> <li><a class="reference internal" href="#examples-of-an-invalid-provenance-url-json">Examples of an invalid provenance_url.json</a></li> <li><a class="reference internal" href="#example-pip-commands-and-their-effect-on-provenance-url-json-and-direct-url-json">Example pip commands and their effect on provenance_url.json and direct_url.json</a></li> </ul> </li> <li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li> <li><a class="reference internal" href="#rejected-ideas">Rejected Ideas</a><ul> <li><a class="reference internal" href="#naming-the-file-direct-url-json-instead-of-provenance-url-json">Naming the file direct_url.json instead of provenance_url.json</a></li> <li><a class="reference internal" href="#deprecating-direct-url-json-and-using-only-provenance-url-json">Deprecating direct_url.json and using only provenance_url.json</a></li> <li><a class="reference internal" href="#keeping-the-hash-key-in-the-archive-info-dictionary">Keeping the hash key in the archive_info dictionary</a></li> <li><a class="reference internal" href="#allowing-no-hashes-stated">Allowing no hashes stated</a></li> <li><a class="reference internal" href="#making-the-hashes-key-optional">Making the hashes key optional</a></li> <li><a class="reference internal" href="#storing-index-url">Storing index URL</a></li> </ul> </li> <li><a class="reference internal" href="#open-issues">Open Issues</a><ul> <li><a class="reference internal" href="#availability-of-the-provenance-url-json-file-in-conda">Availability of the provenance_url.json file in Conda</a></li> <li><a class="reference internal" href="#using-provenance-url-json-in-downstream-installers">Using provenance_url.json in downstream installers</a></li> </ul> </li> <li><a class="reference internal" href="#appendix-survey-of-installers-and-libraries">Appendix: Survey of installers and 
libraries</a><ul> <li><a class="reference internal" href="#pip">pip</a></li> <li><a class="reference internal" href="#distlib">distlib</a></li> <li><a class="reference internal" href="#pipenv">Pipenv</a></li> <li><a class="reference internal" href="#installer">installer</a></li> <li><a class="reference internal" href="#poetry">Poetry</a></li> <li><a class="reference internal" href="#conda">Conda</a></li> <li><a class="reference internal" href="#hatch">Hatch</a></li> <li><a class="reference internal" href="#micropipenv">micropipenv</a></li> <li><a class="reference internal" href="#thamos">Thamos</a></li> <li><a class="reference internal" href="#pdm">PDM</a></li> <li><a class="reference internal" href="#uv">uv</a></li> </ul> </li> <li><a class="reference internal" href="#acknowledgements">Acknowledgements</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> </details></section> <section id="abstract"> <h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2> <p>This PEP describes a way to record the provenance of installed Python distributions. The record is created by an installer and is available to users in the form of a JSON file <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory. The mentioned JSON file captures additional metadata to allow recording a URL to a <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Distribution-Package" title="(in Python Packaging User Guide)"><span class="xref std std-term">distribution package</span></a> together with the installed distribution hash. 
This proposal is built on top of <a class="pep reference internal" href="../pep-0610/" title="PEP 610 – Recording the Direct URL Origin of installed distributions">PEP 610</a> following <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">its corresponding canonical PyPA spec</span></a> and complements <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> with <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> for when packages are identified by a name, and optionally a version.</p> </section> <section id="motivation"> <h2><a class="toc-backref" href="#motivation" role="doc-backlink">Motivation</a></h2> <p>Installing a Python <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Project" title="(in Python Packaging User Guide)"><span class="xref std std-term">Project</span></a> involves downloading a <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Distribution-Package" title="(in Python Packaging User Guide)"><span class="xref std std-term">Distribution Package</span></a> from a <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Package-Index" title="(in Python Packaging User Guide)"><span class="xref std std-term">Package Index</span></a> and extracting its content to an appropriate place. After the installation process is done, information about the release artifact used as well as its source is generally lost. However, there are use cases for keeping records of distributions used for installing packages and their provenance.</p> <p>Python wheels can be built with different compiler flags or supporting different wheel tags. 
In both cases, installers might consider multiple wheels (possibly from different package indexes), so being able to immediately find out which wheel file was actually used during the installation can be helpful. This way, developers can use information about wheels to debug issues and make sure the desired wheel was actually installed. Another use case is tools that report installed software, such as tools producing an SBOM (Software Bill of Materials), which could give more accurate reports. Yet another use case is reconstructing the Python environment by pinning each installed package to a specific distribution artifact consumed from a Python package index.</p> </section> <section id="rationale"> <h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2> <p>The motivation described in this PEP is an extension of the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Recording the Direct URL Origin of installed distributions</span></a> specification. In addition to recording provenance information for packages installed using a direct URL, installers should also do so for packages installed by name (and optionally version) from Python package indexes.</p> <p>The idea described in this PEP originated in a tool called <a class="reference external" href="https://github.com/thoth-station/micropipenv">micropipenv</a> that is used to install <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Distribution-Package" title="(in Python Packaging User Guide)"><span class="xref std std-term">distribution packages</span></a> in containerized environments (see the reported issue <a class="reference external" href="https://github.com/thoth-station/micropipenv/issues/206">thoth-station/micropipenv#206</a>). 
Currently, the assembled containerized application does not implicitly carry information about the provenance of installed distribution packages (unless these are installed from full URLs and recorded via <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code>). This requires container image suppliers to link container images with the corresponding build process, its configuration and the application source code for checking requirements files in cases when software present in containerized environments needs to be audited.</p> <p>The <a class="reference external" href="https://discuss.python.org/t/12316">subsequent discussion in the Discourse thread</a> also brought up pip’s new <code class="docutils literal notranslate"><span class="pre">--report</span></code> option that can <a class="reference external" href="https://pip.pypa.io/en/stable/reference/installation-report/">generate a detailed JSON report</a> about the installation process. This option could help with the provenance problem this PEP approaches. Nevertheless, this option needs to be <em>explicitly</em> passed to pip to obtain the provenance information, and includes additional metadata that might not be necessary for checking the provenance (such as Python version requirements of each distribution package). Also, this option is specific to pip as of the writing of this PEP.</p> <p>Note the current <a class="reference external" href="https://packaging.python.org/en/latest/specifications/recording-installed-packages/#recording-installed-packages" title="(in Python Packaging User Guide)"><span class="xref std std-ref">spec for recording installed packages</span></a> defines a <code class="docutils literal notranslate"><span class="pre">RECORD</span></code> file that records installed files, but not the distribution artifact from which these files were obtained. 
Auditing installed artifacts can be performed by matching the entries listed in the <code class="docutils literal notranslate"><span class="pre">RECORD</span></code> file. However, this technique requires a pre-computed database of files each artifact provides or a comparison with the actual artifact content. Both approaches are relatively expensive and time-consuming operations that could be eliminated with the proposed <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file.</p> <p>Recording provenance information for installed distribution packages, both those obtained from direct URLs and by name/version from an index, can simplify auditing Python environments in general, beyond just the specific use case for containerized applications mentioned earlier. The community project <a class="reference external" href="https://github.com/pypa/pip-audit">pip-audit</a> expressed possible interest in this capability in <a class="reference external" href="https://github.com/pypa/pip-audit/issues/170">pypa/pip-audit#170</a>.</p> </section> <section id="specification"> <h2><a class="toc-backref" href="#specification" role="doc-backlink">Specification</a></h2> <p>The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in <span class="target" id="index-0"></span><a class="rfc reference external" href="https://datatracker.ietf.org/doc/html/rfc2119.html"><strong>RFC 2119</strong></a>.</p> <p>The <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file SHOULD be created in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory by installers when installing a <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Distribution-Package" title="(in Python Packaging User Guide)"><span class="xref std std-term">Distribution 
Package</span></a> specified by name (and optionally by <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Version-Specifier" title="(in Python Packaging User Guide)"><span class="xref std std-term">Version Specifier</span></a>).</p> <p>This file MUST NOT be created when installing a distribution package from a requirement specifying a direct URL reference (including a VCS URL).</p> <p>Only one of the files <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> and <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> (from <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Recording the Direct URL Origin of installed distributions</span></a> specification and the corresponding specification of the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Direct URL Data Structure</span></a>), may be present in a given <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory; installers MUST NOT add both.</p> <p>The <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> JSON file MUST be a dictionary, compliant with <span class="target" id="index-1"></span><a class="rfc reference external" href="https://datatracker.ietf.org/doc/html/rfc8259.html"><strong>RFC 8259</strong></a> and UTF-8 encoded.</p> <p>If present, it MUST contain exactly two keys. The first MUST be <code class="docutils literal notranslate"><span class="pre">url</span></code>, with type <code class="docutils literal notranslate"><span class="pre">string</span></code>. 
The second key MUST be <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> with a value defined below.</p> <p>The value of the <code class="docutils literal notranslate"><span class="pre">url</span></code> key MUST be the URL from which the distribution package was downloaded. If a wheel is built from a source distribution, the <code class="docutils literal notranslate"><span class="pre">url</span></code> value MUST be the URL from which the source distribution was downloaded. If a wheel is downloaded and installed directly, the <code class="docutils literal notranslate"><span class="pre">url</span></code> field MUST be the URL from which the wheel was downloaded. As in the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Direct URL Data Structure</span></a> specification, the <code class="docutils literal notranslate"><span class="pre">url</span></code> value MUST be stripped of any sensitive authentication information for security reasons.</p> <p>The user:password section of the URL MAY however be composed of environment variables, matching the following regular expression:</p> <div class="highlight-text notranslate"><div class="highlight"><pre><span></span>\$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})? </pre></div> </div> <p>Additionally, the user:password section of the URL MAY be a well-known, non-security sensitive string. 
A typical example is <code class="docutils literal notranslate"><span class="pre">git</span></code> in the case of a URL such as <code class="docutils literal notranslate"><span class="pre">ssh://git&#64;gitlab.com</span></code>.</p> <p>The value of <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> MUST be a dictionary with a single key <code class="docutils literal notranslate"><span class="pre">hashes</span></code>. The value of <code class="docutils literal notranslate"><span class="pre">hashes</span></code> is a dictionary mapping hash function names to a hex-encoded digest of the file referenced by the <code class="docutils literal notranslate"><span class="pre">url</span></code> value. At least one hash MUST be recorded. Multiple hashes MAY be included, and it is up to the consumer to decide what to do with multiple hashes (it may validate all of them, a subset of them, or none at all).</p> <p>Each hash MUST be one of the single-argument hashes provided by <a class="reference external" href="https://docs.python.org/3.11/library/hashlib.html#hashlib.algorithms_guaranteed" title="(in Python v3.11)"><code class="docutils literal notranslate"><span class="pre">hashlib.algorithms_guaranteed</span></code></a>, excluding <code class="docutils literal notranslate"><span class="pre">sha1</span></code> and <code class="docutils literal notranslate"><span class="pre">md5</span></code>, which MUST NOT be used. 
As of Python 3.11, with <code class="docutils literal notranslate"><span class="pre">shake_128</span></code> and <code class="docutils literal notranslate"><span class="pre">shake_256</span></code> excluded for being multi-argument, the allowed set of hashes is:</p> <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span><span class="w"> </span><span class="nn">hashlib</span> <span class="gp">&gt;&gt;&gt; </span><span class="nb">sorted</span><span class="p">(</span><span class="n">hashlib</span><span class="o">.</span><span class="n">algorithms_guaranteed</span> <span class="o">-</span> <span class="p">{</span><span class="s2">&quot;shake_128&quot;</span><span class="p">,</span> <span class="s2">&quot;shake_256&quot;</span><span class="p">,</span> <span class="s2">&quot;sha1&quot;</span><span class="p">,</span> <span class="s2">&quot;md5&quot;</span><span class="p">})</span> <span class="go">[&#39;blake2b&#39;, &#39;blake2s&#39;, &#39;sha224&#39;, &#39;sha256&#39;, &#39;sha384&#39;, &#39;sha3_224&#39;, &#39;sha3_256&#39;, &#39;sha3_384&#39;, &#39;sha3_512&#39;, &#39;sha512&#39;]</span> </pre></div> </div> <p>Each hash MUST be referenced by the canonical name of the hash, always lower case.</p> <p>Hashes <code class="docutils literal notranslate"><span class="pre">sha1</span></code> and <code class="docutils literal notranslate"><span class="pre">md5</span></code> MUST NOT be present, due to the security limitations of these hash algorithms. 
Conversely, hash <code class="docutils literal notranslate"><span class="pre">sha256</span></code> SHOULD be included.</p> <p>Installers that cache distribution packages from an index SHOULD keep information related to the cached distribution artifact, so that the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file can be created even when installing distribution packages from the installer’s cache.</p> </section> <section id="backwards-compatibility"> <h2><a class="toc-backref" href="#backwards-compatibility" role="doc-backlink">Backwards Compatibility</a></h2> <p>Following the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/recording-installed-packages/#recording-installed-packages" title="(in Python Packaging User Guide)"><span>Recording installed projects</span></a> specification, installers may keep additional installer-specific files in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory. To check that this PEP does not cause any backwards compatibility issues, a <a class="reference internal" href="#tool-survey">comprehensive survey of installers and libraries</a> was conducted; it found no current tools using a similarly named file and no other major feasibility concerns.</p> <p>The <a class="reference external" href="https://packaging.python.org/en/latest/specifications/binary-distribution-format/#binary-distribution-format" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Wheel specification</span></a> lists files that can be present in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory. 
None of these file names collide with the proposed <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file from this PEP.</p> <section id="presence-of-provenance-url-json-in-installers-and-libraries"> <h3><a class="toc-backref" href="#presence-of-provenance-url-json-in-installers-and-libraries" role="doc-backlink">Presence of provenance_url.json in installers and libraries</a></h3> <p>A comprehensive survey of the existing installers, libraries, and dependency managers in the Python ecosystem analyzed the implications of adding support for <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> to each tool. In summary, no major backwards compatibility issues, conflicts or feasibility blockers were found as of the time of writing of this PEP. More details about the survey can be found in the <a class="reference internal" href="#appendix-survey-of-installers-and-libraries">Appendix: Survey of installers and libraries</a> section.</p> </section> <section id="compatibility-with-direct-url-json"> <h3><a class="toc-backref" href="#compatibility-with-direct-url-json" role="doc-backlink">Compatibility with direct_url.json</a></h3> <p>This proposal does not make any changes to the <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> file described in <a class="pep reference internal" href="../pep-0610/" title="PEP 610 – Recording the Direct URL Origin of installed distributions">PEP 610</a> and <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">its corresponding canonical PyPA spec</span></a>.</p> <p>The content of the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file was designed to eventually allow installers to reuse some of the logic supporting <code class="docutils 
literal notranslate"><span class="pre">direct_url.json</span></code> when a direct URL refers to a source archive or a wheel.</p> <p>The main difference between the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> and <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> files is the mandatory keys and their values in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file. This helps make sure consumers of the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file can rely on its content if the file is present in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory.</p> </section> </section> <section id="security-implications"> <h2><a class="toc-backref" href="#security-implications" role="doc-backlink">Security Implications</a></h2> <p>One of the main security features of the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file is the ability to audit installed artifacts in Python environments. Tools can check which Python package indexes were used to install Python <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Distribution-Package" title="(in Python Packaging User Guide)"><span class="xref std std-term">distribution packages</span></a> as well as the hash digests of their release artifacts.</p> <p>As an example, we can take the recent compromised dependency chain in <a class="reference external" href="https://pytorch.org/blog/compromised-nightly-dependency/">the PyTorch incident</a>. The PyTorch index provided a package named <code class="docutils literal notranslate"><span class="pre">torchtriton</span></code>. 
An attacker published <code class="docutils literal notranslate"><span class="pre">torchtriton</span></code> on PyPI, which ran a malicious binary. By checking the URL of the installed Python distribution stated in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file, tools can automatically check the source of the installed Python distribution. In the case of the PyTorch incident, the URL of <code class="docutils literal notranslate"><span class="pre">torchtriton</span></code> should point to the PyTorch index, not PyPI. Tools can help identify such malicious installed Python distributions by checking the installed Python distribution URL. A more exact check can also include the hash of the installed Python distribution stated in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file. Such hash checks can be helpful for mirrored Python package indexes, where Python distributions are not distinguishable by their source URLs, to make sure only the desired Python distributions are installed.</p> <p>A malicious actor can intentionally alter the content of <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> to hide provenance information of the installed Python distribution. 
A security check which would uncover such malicious activity is beyond the scope of this PEP as it would require monitoring actions on the filesystem and possibly reviewing user or file permissions.</p> </section> <section id="how-to-teach-this"> <h2><a class="toc-backref" href="#how-to-teach-this" role="doc-backlink">How to Teach This</a></h2> <p>The <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> metadata file is intended for tools and is not directly visible to end users.</p> </section> <section id="examples"> <h2><a class="toc-backref" href="#examples" role="doc-backlink">Examples</a></h2> <section id="examples-of-a-valid-provenance-url-json"> <h3><a class="toc-backref" href="#examples-of-a-valid-provenance-url-json" role="doc-backlink">Examples of a valid provenance_url.json</a></h3> <p>A valid <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> listing multiple hashes:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;archive_info&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;hashes&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;blake2s&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;fffeaf3d0bd71dc960ca2113af890a2f2198f2466f8cd58ce4b77c1fc54601ff&quot;</span><span class="p">,</span> <span class="w"> </span><span class="nt">&quot;sha256&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f&quot;</span><span class="p">,</span> <span class="w"> </span><span class="nt">&quot;sha3_256&quot;</span><span class="p">:</span><span class="w"> </span><span 
class="s2">&quot;c856930e0f707266d30e5b48c667a843d45e79bb30473c464e92dfa158285eab&quot;</span><span class="p">,</span> <span class="w"> </span><span class="nt">&quot;sha512&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;6bad5536c30a0b2d5905318a1592948929fbac9baf3bcf2e7faeaf90f445f82bc2b656d0a89070d8a6a9395761f4793c83187bd640c64b2656a112b5be41f73d&quot;</span> <span class="w"> </span><span class="p">}</span> <span class="w"> </span><span class="p">},</span> <span class="w"> </span><span class="nt">&quot;url&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl&quot;</span> <span class="p">}</span> </pre></div> </div> <p>A valid <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> listing a single hash entry:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;archive_info&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;hashes&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;sha256&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f&quot;</span> <span class="w"> </span><span class="p">}</span> <span class="w"> </span><span class="p">},</span> <span class="w"> </span><span class="nt">&quot;url&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl&quot;</span> <span class="p">}</span> </pre></div> </div> <p>A 
valid <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> listing a source distribution which was used to build and install a wheel:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;archive_info&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;hashes&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;sha256&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;8bfe29f17c10e2f2e619de8033a07a224058d96b3bfe2ed61777596f7ffd7fa9&quot;</span> <span class="w"> </span><span class="p">}</span> <span class="w"> </span><span class="p">},</span> <span class="w"> </span><span class="nt">&quot;url&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;https://files.pythonhosted.org/packages/1d/43/ad8ae671de795ec2eafd86515ef9842ab68455009d864c058d0c3dcf680d/micropipenv-0.0.1.tar.gz&quot;</span> <span class="p">}</span> </pre></div> </div> </section> <section id="examples-of-an-invalid-provenance-url-json"> <h3><a class="toc-backref" href="#examples-of-an-invalid-provenance-url-json" role="doc-backlink">Examples of an invalid provenance_url.json</a></h3> <p>The following example includes a <code class="docutils literal notranslate"><span class="pre">hash</span></code> key in the <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> dictionary as originally designed in the data structure documented in <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span>Recording the Direct URL Origin of installed distributions</span></a>. 
The <code class="docutils literal notranslate"><span class="pre">hash</span></code> key MUST NOT be present, to prevent any possible confusion with <code class="docutils literal notranslate"><span class="pre">hashes</span></code> and to avoid the additional checks that would be required to keep hash values in sync.</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;archive_info&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;hash&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;sha256=236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f&quot;</span><span class="p">,</span> <span class="w"> </span><span class="nt">&quot;hashes&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;sha256&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f&quot;</span> <span class="w"> </span><span class="p">}</span> <span class="w"> </span><span class="p">},</span> <span class="w"> </span><span class="nt">&quot;url&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl&quot;</span> <span class="p">}</span> </pre></div> </div> <p>Another example demonstrates an invalid hash name. 
The referenced hash name does not correspond to the canonical hash names described in this PEP and in the Python docs under <a class="reference external" href="https://docs.python.org/3.11/library/hashlib.html#hashlib.hash.name" title="(in Python v3.11)"><code class="docutils literal notranslate"><span class="pre">hashlib.hash.name</span></code></a>.</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;archive_info&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;hashes&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;SHA-256&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f&quot;</span> <span class="w"> </span><span class="p">}</span> <span class="w"> </span><span class="p">},</span> <span class="w"> </span><span class="nt">&quot;url&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl&quot;</span> <span class="p">}</span> </pre></div> </div> <p>The last example demonstrates a <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file with no hashes available for the downloaded artifact:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;archive_info&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;hashes&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{}</span> <span class="w"> 
</span><span class="p">},</span> <span class="w"> </span><span class="nt">&quot;url&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl&quot;</span> <span class="p">}</span> </pre></div> </div> </section> <section id="example-pip-commands-and-their-effect-on-provenance-url-json-and-direct-url-json"> <h3><a class="toc-backref" href="#example-pip-commands-and-their-effect-on-provenance-url-json-and-direct-url-json" role="doc-backlink">Example pip commands and their effect on provenance_url.json and direct_url.json</a></h3> <p>These commands generate a <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> file but do not generate a <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file. These examples follow the examples from the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Direct URL Data Structure</span></a> specification:</p> <ul class="simple"> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">https://example.com/app-1.0.tgz</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">https://example.com/app-1.0.whl</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">&quot;git+https://example.com/repo/app.git#egg=app&amp;subdirectory=setup&quot;</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span 
class="pre">./app</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">file:///home/user/app</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">--editable</span> <span class="pre">&quot;git+https://example.com/repo/app.git#egg=app&amp;subdirectory=setup&quot;</span></code> (in which case, <code class="docutils literal notranslate"><span class="pre">url</span></code> will be the local directory where the git repository has been cloned to, and <code class="docutils literal notranslate"><span class="pre">dir_info</span></code> will be present with <code class="docutils literal notranslate"><span class="pre">&quot;editable&quot;:</span> <span class="pre">true</span></code> and no <code class="docutils literal notranslate"><span class="pre">vcs_info</span></code> will be set)</li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">-e</span> <span class="pre">./app</span></code></li> </ul> <p>Commands that generate a <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file but do not generate a <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> file:</p> <ul class="simple"> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">app</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">app~=2.2.0</span></code></li> <li><code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">app</span> <span class="pre">--no-index</span> <span class="pre">--find-links</span> <span 
class="pre">&quot;https://example.com/&quot;</span></code></li> </ul> <p>This behaviour can be tested using changes to pip implemented in the PR <a class="reference external" href="https://github.com/pypa/pip/pull/11865">pypa/pip#11865</a>.</p> </section> </section> <section id="reference-implementation"> <h2><a class="toc-backref" href="#reference-implementation" role="doc-backlink">Reference Implementation</a></h2> <p>A proof-of-concept for creating the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> metadata file when installing a Python <a class="reference external" href="https://packaging.python.org/en/latest/glossary/#term-Distribution-Package" title="(in Python Packaging User Guide)"><span class="xref std std-term">Distribution Package</span></a> is available in the PR to pip <a class="reference external" href="https://github.com/pypa/pip/pull/11865">pypa/pip#11865</a>. It reuses the already available implementation for the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">direct URL data structure</span></a> to provide the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> metadata file for cases when <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> is not created.</p> <p>A reference implementation for supporting the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file in PDM is available in <a class="reference external" href="https://github.com/pdm-project/pdm/pull/3013">pdm-project/pdm#3013</a>.</p> <p>A prototype called <a class="reference external" href="https://pypi.org/project/pip-preserve/">pip-preserve</a> was developed to demonstrate creation of <code class="docutils literal notranslate"><span 
class="pre">requirements.txt</span></code> files considering <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> and <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> metadata files. This tool mimics the <code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">freeze</span></code> functionality, but the listing of installed packages also includes the hashes of the Python distribution artifacts.</p> <p>To further support this proposal, <a class="reference external" href="https://github.com/sethmlarson/pip-sbom">pip-sbom</a> demonstrates creation of SBOM in the SPDX format. The tool uses information stored in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file.</p> </section> <section id="rejected-ideas"> <h2><a class="toc-backref" href="#rejected-ideas" role="doc-backlink">Rejected Ideas</a></h2> <section id="naming-the-file-direct-url-json-instead-of-provenance-url-json"> <h3><a class="toc-backref" href="#naming-the-file-direct-url-json-instead-of-provenance-url-json" role="doc-backlink">Naming the file direct_url.json instead of provenance_url.json</a></h3> <p>To preserve backwards compatibility with the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Recording the Direct URL Origin of installed distributions</span></a>, the file cannot be named <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code>, as per the text of that specification:</p> <blockquote> <div>This file MUST NOT be created when installing a distribution from an other type of requirement (i.e. 
name plus version specifier).</div></blockquote> <p>Such a change might introduce backwards compatibility issues for consumers of <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> who rely on its presence only when distributions are installed using a direct URL reference.</p> </section> <section id="deprecating-direct-url-json-and-using-only-provenance-url-json"> <h3><a class="toc-backref" href="#deprecating-direct-url-json-and-using-only-provenance-url-json" role="doc-backlink">Deprecating direct_url.json and using only provenance_url.json</a></h3> <p>File <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> is already well established by the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Direct URL Data Structure</span></a> specification and is already used by installers. For example, <code class="docutils literal notranslate"><span class="pre">pip</span></code> uses <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> to report a direct URL reference on <code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">freeze</span></code>. 
Deprecating <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> would require additional changes to the <code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">freeze</span></code> implementation in pip (see PR <a class="reference external" href="https://github.com/fridex/pip/pull/2/">fridex/pip#2</a>) and could introduce backwards compatibility issues for already existing <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> consumers.</p> </section> <section id="keeping-the-hash-key-in-the-archive-info-dictionary"> <h3><a class="toc-backref" href="#keeping-the-hash-key-in-the-archive-info-dictionary" role="doc-backlink">Keeping the hash key in the archive_info dictionary</a></h3> <p><a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Direct URL Data Structure</span></a> specification discusses the possibility to include the <code class="docutils literal notranslate"><span class="pre">hash</span></code> key alongside the <code class="docutils literal notranslate"><span class="pre">hashes</span></code> key in the <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> dictionary. This PEP explicitly does not include the <code class="docutils literal notranslate"><span class="pre">hash</span></code> key in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file and allows only the <code class="docutils literal notranslate"><span class="pre">hashes</span></code> key to be present. 
By doing so we eliminate possible redundancy in the file, possible confusion, and any additional checks that would need to be done to make sure the hashes are in sync.</p> </section> <section id="allowing-no-hashes-stated"> <h3><a class="toc-backref" href="#allowing-no-hashes-stated" role="doc-backlink">Allowing no hashes stated</a></h3> <p>For cases when a wheel file is installed from pip’s cache and was built using an older version of pip, pip does not record hashes of the downloaded source distributions. As we do not have hashes of these downloaded source distributions, the <code class="docutils literal notranslate"><span class="pre">hashes</span></code> key in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file would not contain any entries. In such cases, pip does not create any <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file as the provenance information is not complete. Consumers are encouraged to rebuild wheels with a newer version of pip in these cases.</p> </section> <section id="making-the-hashes-key-optional"> <h3><a class="toc-backref" href="#making-the-hashes-key-optional" role="doc-backlink">Making the hashes key optional</a></h3> <p><a class="pep reference internal" href="../pep-0610/" title="PEP 610 – Recording the Direct URL Origin of installed distributions">PEP 610</a> and <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">its corresponding canonical PyPA spec</span></a> recommend including the <code class="docutils literal notranslate"><span class="pre">hashes</span></code> key of the <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> in the <code class="docutils literal notranslate"><span class="pre">direct_url.json</span></code> file but it is not required (per the 
<span class="target" id="index-2"></span><a class="rfc reference external" href="https://datatracker.ietf.org/doc/html/rfc2119.html"><strong>RFC 2119</strong></a> language):</p> <blockquote> <div>A hashes key SHOULD be present as a dictionary mapping a hash name to a hex encoded digest of the file.</div></blockquote> <p>This PEP requires the <code class="docutils literal notranslate"><span class="pre">hashes</span></code> key be included in <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file if that file is created; per this PEP:</p> <blockquote> <div>The value of <code class="docutils literal notranslate"><span class="pre">archive_info</span></code> MUST be a dictionary with a single key <code class="docutils literal notranslate"><span class="pre">hashes</span></code>.</div></blockquote> <p>By doing so, consumers of <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> can check artifact digests when the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file is created by installers.</p> </section> <section id="storing-index-url"> <h3><a class="toc-backref" href="#storing-index-url" role="doc-backlink">Storing index URL</a></h3> <p>A possibility was raised for storing the index URL as part of the file content. This index URL would represent the index configured in pip’s configuration or specified using the <code class="docutils literal notranslate"><span class="pre">--index-url</span></code> or <code class="docutils literal notranslate"><span class="pre">--extra-index-url</span></code> options. Storing this information was considered confusing, especially when using other installation options like <code class="docutils literal notranslate"><span class="pre">--find-links</span></code>. 
Since the actual index URL is not strictly bound to the location from which the wheel file was downloaded, we decided not to store the index URL in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file.</p> </section> </section> <section id="open-issues"> <h2><a class="toc-backref" href="#open-issues" role="doc-backlink">Open Issues</a></h2> <section id="availability-of-the-provenance-url-json-file-in-conda"> <h3><a class="toc-backref" href="#availability-of-the-provenance-url-json-file-in-conda" role="doc-backlink">Availability of the provenance_url.json file in Conda</a></h3> <p>We would like to get feedback on the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file from the Conda maintainers. It is not clear whether Conda would like to adopt the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file. Conda already stores provenance related information (similar to the provenance information proposed in this PEP) in JSON files located in the <code class="docutils literal notranslate"><span class="pre">conda-meta</span></code> directory <a class="reference external" href="https://conda.io/projects/conda/en/latest/dev-guide/deep-dives/install.html">following its actions during installation</a>.</p> </section> <section id="using-provenance-url-json-in-downstream-installers"> <h3><a class="toc-backref" href="#using-provenance-url-json-in-downstream-installers" role="doc-backlink">Using provenance_url.json in downstream installers</a></h3> <p>The proposed <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file was meant to be adopted primarily by Python installers. Other installers, such as APT or DNF, might record the provenance of the installed downstream Python distributions in their own way specific to downstream package management. 
The proposed file is not expected to be created by these downstream package installers and thus they were intentionally left out of this PEP. However, any input by developers or maintainers of these installers is valuable to possibly enrich the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file with information that would help in some way.</p> </section> </section> <section id="appendix-survey-of-installers-and-libraries"> <span id="tool-survey"></span><h2><a class="toc-backref" href="#appendix-survey-of-installers-and-libraries" role="doc-backlink">Appendix: Survey of installers and libraries</a></h2> <section id="pip"> <h3><a class="toc-backref" href="#pip" role="doc-backlink">pip</a></h3> <p>The function from pip’s internal API responsible for installing wheels, named <a class="reference external" href="https://github.com/pypa/pip/blob/10d9cbc601e5cadc45163452b1bc463d8ad2c1f7/src/pip/_internal/operations/install/wheel.py#L432">_install_wheel</a>, does not store any <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory. Additionally, a prototype introducing the mentioned file to pip in <a class="reference external" href="https://github.com/pypa/pip/pull/11865">pypa/pip#11865</a> demonstrates incorporating logic for handling the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file in pip’s source code.</p> <p>As pip is used by some of the tools mentioned below to install Python package distributions, the findings for pip apply to these tools as well, since pip does not allow parametrizing the creation of files in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory in its internal API. 
Most of the tools mentioned below that use pip invoke pip as a subprocess, which has no effect on the eventual presence of the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory.</p> </section> <section id="distlib"> <h3><a class="toc-backref" href="#distlib" role="doc-backlink">distlib</a></h3> <p><a class="reference external" href="https://distlib.readthedocs.io/">distlib</a> implements low-level functionality to manipulate the <code class="docutils literal notranslate"><span class="pre">dist-info</span></code> directory. The database of installed distributions does not use any file named <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code>, based on <a class="reference external" href="https://github.com/pypa/distlib/blob/05375908c1b2d6b0e74bdeb574569d3609db9f56/distlib/database.py#L39-L40">distlib’s source code</a>.</p> </section> <section id="pipenv"> <h3><a class="toc-backref" href="#pipenv" role="doc-backlink">Pipenv</a></h3> <p><a class="reference external" href="https://pipenv.pypa.io/">Pipenv</a> uses pip <a class="reference external" href="https://github.com/pypa/pipenv/blob/babd428d8ee3c5caeb818d746f715c02f338839b/pipenv/routines/install.py#L262">to install Python package distributions</a>. 
No additional logic was identified that would cause backwards compatibility issues when introducing the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory.</p> </section> <section id="installer"> <h3><a class="toc-backref" href="#installer" role="doc-backlink">installer</a></h3> <p><a class="reference external" href="https://github.com/pypa/installer">installer</a> does not create a <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file explicitly. Nevertheless, as per the <a class="reference external" href="https://packaging.python.org/en/latest/specifications/recording-installed-packages/#recording-installed-packages" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Recording Installed Projects</span></a> specification, installer allows passing the <code class="docutils literal notranslate"><span class="pre">additional_metadata</span></code> argument to create a file in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory - see <a class="reference external" href="https://github.com/pypa/installer/blob/f89b5d93a643ef5e9858a6e3f450c83a57bbe1f1/src/installer/_core.py#L67">the source code</a>. 
To avoid any backwards compatibility issues, any library or tool using installer must not request creating the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file using the mentioned <code class="docutils literal notranslate"><span class="pre">additional_metadata</span></code> argument.</p> </section> <section id="poetry"> <h3><a class="toc-backref" href="#poetry" role="doc-backlink">Poetry</a></h3> <p>The installation logic in <a class="reference external" href="https://python-poetry.org/">Poetry</a> depends on the <code class="docutils literal notranslate"><span class="pre">installer.modern-installer</span></code> configuration option (<a class="reference external" href="https://python-poetry.org/docs/configuration#installermodern-installation">see docs</a>).</p> <p>For cases when the <code class="docutils literal notranslate"><span class="pre">installer.modern-installer</span></code> configuration option is set to <code class="docutils literal notranslate"><span class="pre">false</span></code>, Poetry uses <a class="reference external" href="https://github.com/python-poetry/poetry/blob/2b15ce10f02b0c6347fe2f12ae902488edeaaf7c/src/poetry/installation/executor.py#L543-L544">pip for installing Python package distributions</a>.</p> <p>On the other hand, when <code class="docutils literal notranslate"><span class="pre">installer.modern-installer</span></code> configuration option is set to <code class="docutils literal notranslate"><span class="pre">true</span></code>, Poetry uses <a class="reference external" href="https://github.com/python-poetry/poetry/blob/2b15ce10f02b0c6347fe2f12ae902488edeaaf7c/src/poetry/installation/wheel_installer.py#L99-L109">installer to install Python package distributions</a>. 
As can be seen from the linked sources, no additional metadata file named <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> is passed that would cause compatibility issues with this PEP.</p> </section> <section id="conda"> <h3><a class="toc-backref" href="#conda" role="doc-backlink">Conda</a></h3> <p><a class="reference external" href="https://docs.conda.io/">Conda</a> does not create any <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file <a class="reference external" href="https://github.com/conda/conda/blob/86e83925e17c68233ac659633bdc4d76b05a245a/conda/common/pkg_formats/python.py#L370-L390">when Python package distributions are installed</a>.</p> </section> <section id="hatch"> <h3><a class="toc-backref" href="#hatch" role="doc-backlink">Hatch</a></h3> <p><a class="reference external" href="https://hatch.pypa.io/">Hatch</a> uses pip <a class="reference external" href="https://github.com/pypa/hatch/blob/dd6e9545a355a0b5b58e065b489c1ef087e3bcaf/src/hatch/env/system.py#L28-L29">to install project dependencies</a>.</p> </section> <section id="micropipenv"> <h3><a class="toc-backref" href="#micropipenv" role="doc-backlink">micropipenv</a></h3> <p>As <a class="reference external" href="https://github.com/thoth-station/micropipenv">micropipenv</a> is a wrapper on top of pip, it uses pip to install Python distributions, both for <a class="reference external" href="https://github.com/thoth-station/micropipenv/blob/8176862ec96df23e152938659d6f45645246e398/micropipenv.py#L393">lock files</a> and <a class="reference external" href="https://github.com/thoth-station/micropipenv/blob/8176862ec96df23e152938659d6f45645246e398/micropipenv.py#L977">for requirements files</a>.</p> </section> <section id="thamos"> <h3><a class="toc-backref" href="#thamos" role="doc-backlink">Thamos</a></h3> <p><a class="reference external" 
href="https://github.com/thoth-station/thamos/">Thamos</a> uses micropipenv <a class="reference external" href="https://github.com/thoth-station/thamos/blob/234351025c77cfe28b0df07f7ee017469b57d3f4/thamos/lib.py#L1290">to install Python package distributions</a>, hence any findings for micropipenv apply to Thamos.</p> </section> <section id="pdm"> <h3><a class="toc-backref" href="#pdm" role="doc-backlink">PDM</a></h3> <p><a class="reference external" href="https://pdm.fming.dev/">PDM</a> uses installer <a class="reference external" href="https://github.com/pdm-project/pdm/blob/d39a8e5b36c37093ea31e666d0e55fe21b38c16b/src/pdm/installers/installers.py#L241">to install binary distributions</a>. The only additional metadata file it may create in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory is <a class="reference external" href="https://github.com/pdm-project/pdm/blob/d39a8e5b36c37093ea31e666d0e55fe21b38c16b/src/pdm/installers/installers.py#L197">the REFER_TO file</a>.</p> </section> <section id="uv"> <h3><a class="toc-backref" href="#uv" role="doc-backlink">uv</a></h3> <p><a class="reference external" href="https://github.com/astral-sh/uv/">uv</a> is written in Rust and uses its <a class="reference external" href="https://github.com/astral-sh/uv/blob/2b9a4f673e829eb622881233bd11c2380a33efcb/crates/install-wheel-rs/src/linker.rs#L38">own installation logic when installing wheels</a>. 
It does not create any <a class="reference external" href="https://github.com/astral-sh/uv/blob/2b9a4f673e829eb622881233bd11c2380a33efcb/crates/install-wheel-rs/src/wheel.rs#L725">additional files</a> in the <code class="docutils literal notranslate"><span class="pre">.dist-info</span></code> directory that would collide with the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> file naming.</p> </section> </section> <section id="acknowledgements"> <h2><a class="toc-backref" href="#acknowledgements" role="doc-backlink">Acknowledgements</a></h2> <p>Thanks to Dustin Ingram, Brett Cannon, and Paul Moore for the initial discussion in which this idea originated.</p> <p>Thanks to Donald Stufft, Ofek Lev, and Trishank Kuppusamy for early feedback and support in working on this PEP.</p> <p>Thanks to Gregory P. Smith, Stéphane Bidoul, and C.A.M. Gerlach for reviewing this PEP and providing valuable suggestions.</p> <p>Thanks to Seth Michael Larson for providing valuable suggestions and for the proposed pip-sbom prototype.</p> <p>Thanks to Stéphane Bidoul and Chris Jerdonek for <a class="pep reference internal" href="../pep-0610/" title="PEP 610 – Recording the Direct URL Origin of installed distributions">PEP 610</a>, and the related <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url/#direct-url" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Recording the Direct URL Origin of installed distributions</span></a> and <a class="reference external" href="https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#direct-url-data-structure" title="(in Python Packaging User Guide)"><span class="xref std std-ref">Direct URL Data Structure</span></a> specifications.</p> <p>Thanks to Frost Ming for raising a possible concern around storing the index URL in the <code class="docutils literal notranslate"><span class="pre">provenance_url.json</span></code> 
file and initial PEP 710 support in PDM.</p> <p>Last, but not least, thanks to Donald Stufft for sponsoring this PEP.</p> </section> <section id="copyright"> <h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2> <p>This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.</p> </section> </section> <hr class="docutils" /> <p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0710.rst">https://github.com/python/peps/blob/main/peps/pep-0710.rst</a></p> <p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0710.rst">2025-02-01 08:55:40 GMT</a></p> </article> <nav id="pep-sidebar"> <h2>Contents</h2> <ul> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#motivation">Motivation</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#specification">Specification</a></li> <li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a><ul> <li><a class="reference internal" href="#presence-of-provenance-url-json-in-installers-and-libraries">Presence of provenance_url.json in installers and libraries</a></li> <li><a class="reference internal" href="#compatibility-with-direct-url-json">Compatibility with direct_url.json</a></li> </ul> </li> <li><a class="reference internal" href="#security-implications">Security Implications</a></li> <li><a class="reference internal" href="#how-to-teach-this">How to Teach This</a></li> <li><a class="reference internal" href="#examples">Examples</a><ul> <li><a class="reference internal" href="#examples-of-a-valid-provenance-url-json">Examples of a valid provenance_url.json</a></li> <li><a class="reference internal" href="#examples-of-an-invalid-provenance-url-json">Examples of an invalid 
provenance_url.json</a></li> <li><a class="reference internal" href="#example-pip-commands-and-their-effect-on-provenance-url-json-and-direct-url-json">Example pip commands and their effect on provenance_url.json and direct_url.json</a></li> </ul> </li> <li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li> <li><a class="reference internal" href="#rejected-ideas">Rejected Ideas</a><ul> <li><a class="reference internal" href="#naming-the-file-direct-url-json-instead-of-provenance-url-json">Naming the file direct_url.json instead of provenance_url.json</a></li> <li><a class="reference internal" href="#deprecating-direct-url-json-and-using-only-provenance-url-json">Deprecating direct_url.json and using only provenance_url.json</a></li> <li><a class="reference internal" href="#keeping-the-hash-key-in-the-archive-info-dictionary">Keeping the hash key in the archive_info dictionary</a></li> <li><a class="reference internal" href="#allowing-no-hashes-stated">Allowing no hashes stated</a></li> <li><a class="reference internal" href="#making-the-hashes-key-optional">Making the hashes key optional</a></li> <li><a class="reference internal" href="#storing-index-url">Storing index URL</a></li> </ul> </li> <li><a class="reference internal" href="#open-issues">Open Issues</a><ul> <li><a class="reference internal" href="#availability-of-the-provenance-url-json-file-in-conda">Availability of the provenance_url.json file in Conda</a></li> <li><a class="reference internal" href="#using-provenance-url-json-in-downstream-installers">Using provenance_url.json in downstream installers</a></li> </ul> </li> <li><a class="reference internal" href="#appendix-survey-of-installers-and-libraries">Appendix: Survey of installers and libraries</a><ul> <li><a class="reference internal" href="#pip">pip</a></li> <li><a class="reference internal" href="#distlib">distlib</a></li> <li><a class="reference internal" href="#pipenv">Pipenv</a></li> <li><a 
class="reference internal" href="#installer">installer</a></li> <li><a class="reference internal" href="#poetry">Poetry</a></li> <li><a class="reference internal" href="#conda">Conda</a></li> <li><a class="reference internal" href="#hatch">Hatch</a></li> <li><a class="reference internal" href="#micropipenv">micropipenv</a></li> <li><a class="reference internal" href="#thamos">Thamos</a></li> <li><a class="reference internal" href="#pdm">PDM</a></li> <li><a class="reference internal" href="#uv">uv</a></li> </ul> </li> <li><a class="reference internal" href="#acknowledgements">Acknowledgements</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> <br> <a id="source" href="https://github.com/python/peps/blob/main/peps/pep-0710.rst">Page Source (GitHub)</a> </nav> </section> <script src="../_static/colour_scheme.js"></script> <script src="../_static/wrap_tables.js"></script> <script src="../_static/sticky_banner.js"></script> </body> </html>
