<!doctype html> <html class="no-js" lang="en" data-content_root="../../"> <head><meta charset="utf-8"/> <meta name="viewport" content="width=device-width,initial-scale=1"/> <meta name="color-scheme" content="light dark"><meta name="viewport" content="width=device-width, initial-scale=1" /> <link rel="index" title="Index" href="../../genindex/" /><link rel="search" title="Search" href="../../search/" /><link rel="next" title="Discussions" href="../../discussions/" /><link rel="prev" title="Tool recommendations" href="../tool-recommendations/" /> <link rel="shortcut icon" href="../../_static/py.png"/><!-- Generated with Sphinx 7.2.6 and Furo 2023.09.10 --> <title>Analyzing PyPI package downloads - Python Packaging User Guide</title> <link rel="stylesheet" type="text/css" href="../../_static/pygments.css?v=a746c00c" /> <link rel="stylesheet" type="text/css" href="../../_static/styles/furo.css?v=135e06be" /> <link rel="stylesheet" type="text/css" href="../../_static/tabs.css?v=4c969af8" /> <link rel="stylesheet" type="text/css" href="../../_static/copybutton.css?v=76b2166b" /> <link rel="stylesheet" type="text/css" href="../../_static/styles/furo-extensions.css?v=36a5483c" /> <style> body { --color-code-background: #f8f8f8; --color-code-foreground: black; } @media not print { body[data-theme="dark"] { --color-code-background: #202020; --color-code-foreground: #d0d0d0; } @media (prefers-color-scheme: dark) { body:not([data-theme="light"]) { --color-code-background: #202020; --color-code-foreground: #d0d0d0; } } } </style><script async type="text/javascript" src="/_/static/javascript/readthedocs-addons.js"></script><meta name="readthedocs-project-slug" content="python-packaging-user-guide" /><meta name="readthedocs-version-slug" content="latest" /><meta name="readthedocs-resolver-filename" content="/guides/analyzing-pypi-package-downloads/" /><meta name="readthedocs-http-status" content="200" /></head> <body> <script> document.body.dataset.theme = 
localStorage.getItem("theme") || "auto"; </script> <svg xmlns="http://www.w3.org/2000/svg" style="display: none;"> <symbol id="svg-toc" viewBox="0 0 24 24"> <title>Contents</title> <svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 1024 1024"> <path d="M408 442h480c4.4 0 8-3.6 8-8v-56c0-4.4-3.6-8-8-8H408c-4.4 0-8 3.6-8 8v56c0 4.4 3.6 8 8 8zm-8 204c0 4.4 3.6 8 8 8h480c4.4 0 8-3.6 8-8v-56c0-4.4-3.6-8-8-8H408c-4.4 0-8 3.6-8 8v56zm504-486H120c-4.4 0-8 3.6-8 8v56c0 4.4 3.6 8 8 8h784c4.4 0 8-3.6 8-8v-56c0-4.4-3.6-8-8-8zm0 632H120c-4.4 0-8 3.6-8 8v56c0 4.4 3.6 8 8 8h784c4.4 0 8-3.6 8-8v-56c0-4.4-3.6-8-8-8zM115.4 518.9L271.7 642c5.8 4.6 14.4.5 14.4-6.9V388.9c0-7.4-8.5-11.5-14.4-6.9L115.4 505.1a8.74 8.74 0 0 0 0 13.8z"/> </svg> </symbol> <symbol id="svg-menu" viewBox="0 0 24 24"> <title>Menu</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather-menu"> <line x1="3" y1="12" x2="21" y2="12"></line> <line x1="3" y1="6" x2="21" y2="6"></line> <line x1="3" y1="18" x2="21" y2="18"></line> </svg> </symbol> <symbol id="svg-arrow-right" viewBox="0 0 24 24"> <title>Expand</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather-chevron-right"> <polyline points="9 18 15 12 9 6"></polyline> </svg> </symbol> <symbol id="svg-sun" viewBox="0 0 24 24"> <title>Light mode</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" class="feather-sun"> <circle cx="12" cy="12" r="5"></circle> <line x1="12" y1="1" x2="12" y2="3"></line> <line x1="12" y1="21" x2="12" y2="23"></line> <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line> <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line> <line x1="1" y1="12" 
x2="3" y2="12"></line> <line x1="21" y1="12" x2="23" y2="12"></line> <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line> <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line> </svg> </symbol> <symbol id="svg-moon" viewBox="0 0 24 24"> <title>Dark mode</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" class="icon-tabler-moon"> <path stroke="none" d="M0 0h24v24H0z" fill="none" /> <path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z" /> </svg> </symbol> <symbol id="svg-sun-half" viewBox="0 0 24 24"> <title>Auto light/dark mode</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" class="icon-tabler-shadow"> <path stroke="none" d="M0 0h24v24H0z" fill="none"/> <circle cx="12" cy="12" r="9" /> <path d="M13 12h5" /> <path d="M13 15h4" /> <path d="M13 18h1" /> <path d="M13 9h4" /> <path d="M13 6h1" /> </svg> </symbol> </svg> <input type="checkbox" class="sidebar-toggle" name="__navigation" id="__navigation"> <input type="checkbox" class="sidebar-toggle" name="__toc" id="__toc"> <label class="overlay sidebar-overlay" for="__navigation"> <div class="visually-hidden">Hide navigation sidebar</div> </label> <label class="overlay toc-overlay" for="__toc"> <div class="visually-hidden">Hide table of contents sidebar</div> </label> <div class="page"> <header class="mobile-header"> <div class="header-left"> <label class="nav-overlay-icon" for="__navigation"> <div class="visually-hidden">Toggle site navigation sidebar</div> <i class="icon"><svg><use href="#svg-menu"></use></svg></i> </label> </div> <div class="header-center"> <a href="../../"><div class="brand">Python Packaging User Guide</div></a> </div> <div class="header-right"> <div class="theme-toggle-container theme-toggle-header"> <button 
class="theme-toggle"> <div class="visually-hidden">Toggle Light / Dark / Auto color theme</div> <svg class="theme-icon-when-auto"><use href="#svg-sun-half"></use></svg> <svg class="theme-icon-when-dark"><use href="#svg-moon"></use></svg> <svg class="theme-icon-when-light"><use href="#svg-sun"></use></svg> </button> </div> <label class="toc-overlay-icon toc-header-icon" for="__toc"> <div class="visually-hidden">Toggle table of contents sidebar</div> <i class="icon"><svg><use href="#svg-toc"></use></svg></i> </label> </div> </header> <aside class="sidebar-drawer"> <div class="sidebar-container"> <div class="sidebar-sticky"><a class="sidebar-brand" href="../../"> <span class="sidebar-brand-text">Python Packaging User Guide</span> </a><form class="sidebar-search-container" method="get" action="../../search/" role="search"> <input class="sidebar-search" placeholder="Search" name="q" aria-label="Search"> <input type="hidden" name="check_keywords" value="yes"> <input type="hidden" name="area" value="default"> </form> <div id="searchbox"></div><div class="sidebar-scroll"><div class="sidebar-tree"> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../../overview/">Overview of Python Packaging</a></li> <li class="toctree-l1"><a class="reference internal" href="../../flow/">The Packaging Flow</a></li> <li class="toctree-l1 has-children"><a class="reference internal" href="../../tutorials/">Tutorials</a><input class="toctree-checkbox" id="toctree-checkbox-1" name="toctree-checkbox-1" role="switch" type="checkbox"/><label for="toctree-checkbox-1"><div class="visually-hidden">Toggle navigation of Tutorials</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l2"><a class="reference internal" href="../../tutorials/installing-packages/">Installing Packages</a></li> <li class="toctree-l2"><a class="reference internal" href="../../tutorials/managing-dependencies/">Managing Application 
Dependencies</a></li> <li class="toctree-l2"><a class="reference internal" href="../../tutorials/packaging-projects/">Packaging Python Projects</a></li> </ul> </li> <li class="toctree-l1 current has-children"><a class="reference internal" href="../">Guides</a><input checked="" class="toctree-checkbox" id="toctree-checkbox-2" name="toctree-checkbox-2" role="switch" type="checkbox"/><label for="toctree-checkbox-2"><div class="visually-hidden">Toggle navigation of Guides</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul class="current"> <li class="toctree-l2 has-children"><a class="reference internal" href="../section-install/">Installation</a><input class="toctree-checkbox" id="toctree-checkbox-3" name="toctree-checkbox-3" role="switch" type="checkbox"/><label for="toctree-checkbox-3"><div class="visually-hidden">Toggle navigation of Installation</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../installing-using-pip-and-virtual-environments/">Install packages in a virtual environment using pip and venv</a></li> <li class="toctree-l3"><a class="reference internal" href="../installing-using-virtualenv/">Installing packages using virtualenv</a></li> <li class="toctree-l3"><a class="reference internal" href="../installing-stand-alone-command-line-tools/">Installing stand alone command line tools</a></li> <li class="toctree-l3"><a class="reference internal" href="../installing-using-linux-tools/">Installing pip/setuptools/wheel with Linux Package Managers</a></li> <li class="toctree-l3"><a class="reference internal" href="../installing-scientific-packages/">Installing scientific packages</a></li> </ul> </li> <li class="toctree-l2 has-children"><a class="reference internal" href="../section-build-and-publish/">Building and Publishing</a><input class="toctree-checkbox" id="toctree-checkbox-4" name="toctree-checkbox-4" role="switch" 
type="checkbox"/><label for="toctree-checkbox-4"><div class="visually-hidden">Toggle navigation of Building and Publishing</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../writing-pyproject-toml/">Writing your <code class="docutils literal notranslate"><span class="pre">pyproject.toml</span></code></a></li> <li class="toctree-l3"><a class="reference internal" href="../distributing-packages-using-setuptools/">Packaging and distributing projects</a></li> <li class="toctree-l3"><a class="reference internal" href="../dropping-older-python-versions/">Dropping support for older Python versions</a></li> <li class="toctree-l3"><a class="reference internal" href="../packaging-binary-extensions/">Packaging binary extensions</a></li> <li class="toctree-l3"><a class="reference internal" href="../packaging-namespace-packages/">Packaging namespace packages</a></li> <li class="toctree-l3"><a class="reference internal" href="../creating-command-line-tools/">Creating and packaging command-line tools</a></li> <li class="toctree-l3"><a class="reference internal" href="../creating-and-discovering-plugins/">Creating and discovering plugins</a></li> <li class="toctree-l3"><a class="reference internal" href="../using-testpypi/">Using TestPyPI</a></li> <li class="toctree-l3"><a class="reference internal" href="../making-a-pypi-friendly-readme/">Making a PyPI-friendly README</a></li> <li class="toctree-l3"><a class="reference internal" href="../publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/">Publishing package distribution releases using GitHub Actions CI/CD workflows</a></li> <li class="toctree-l3"><a class="reference internal" href="../modernize-setup-py-project/">How to modernize a <code class="docutils literal notranslate"><span class="pre">setup.py</span></code> based project?</a></li> </ul> </li> <li class="toctree-l2 has-children"><a class="reference 
internal" href="../section-hosting/">Hosting</a><input class="toctree-checkbox" id="toctree-checkbox-5" name="toctree-checkbox-5" role="switch" type="checkbox"/><label for="toctree-checkbox-5"><div class="visually-hidden">Toggle navigation of Hosting</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../index-mirrors-and-caches/">Package index mirrors and caches</a></li> <li class="toctree-l3"><a class="reference internal" href="../hosting-your-own-index/">Hosting your own simple repository</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../tool-recommendations/">Tool recommendations</a></li> <li class="toctree-l2 current current-page"><a class="current reference internal" href="#">Analyzing PyPI package downloads</a></li> </ul> </li> <li class="toctree-l1 has-children"><a class="reference internal" href="../../discussions/">Discussions</a><input class="toctree-checkbox" id="toctree-checkbox-6" name="toctree-checkbox-6" role="switch" type="checkbox"/><label for="toctree-checkbox-6"><div class="visually-hidden">Toggle navigation of Discussions</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l2"><a class="reference internal" href="../../discussions/versioning/">Versioning</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/deploying-python-applications/">Deploying Python applications</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/pip-vs-easy-install/">pip vs easy_install</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/install-requires-vs-requirements/">install_requires vs requirements files</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/distribution-package-vs-import-package/">Distribution package vs. 
import package</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/package-formats/">Package Formats</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/src-layout-vs-flat-layout/">src layout vs flat layout</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/setup-py-deprecated/">Is <code class="docutils literal notranslate"><span class="pre">setup.py</span></code> deprecated?</a></li> <li class="toctree-l2"><a class="reference internal" href="../../discussions/single-source-version/">Single-sourcing the Project Version</a></li> </ul> </li> <li class="toctree-l1 has-children"><a class="reference internal" href="../../specifications/">PyPA specifications</a><input class="toctree-checkbox" id="toctree-checkbox-7" name="toctree-checkbox-7" role="switch" type="checkbox"/><label for="toctree-checkbox-7"><div class="visually-hidden">Toggle navigation of PyPA specifications</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l2 has-children"><a class="reference internal" href="../../specifications/section-distribution-metadata/">Package Distribution Metadata</a><input class="toctree-checkbox" id="toctree-checkbox-8" name="toctree-checkbox-8" role="switch" type="checkbox"/><label for="toctree-checkbox-8"><div class="visually-hidden">Toggle navigation of Package Distribution Metadata</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../../specifications/name-normalization/">Names and normalization</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/core-metadata/">Core metadata specifications</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/version-specifiers/">Version specifiers</a></li> <li class="toctree-l3"><a class="reference internal" 
href="../../specifications/dependency-specifiers/">Dependency specifiers</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/pyproject-toml/"><code class="docutils literal notranslate"><span class="pre">pyproject.toml</span></code> specification</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/inline-script-metadata/">Inline script metadata</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/platform-compatibility-tags/">Platform compatibility tags</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/well-known-project-urls/">Well-known Project URLs in Metadata</a></li> </ul> </li> <li class="toctree-l2 has-children"><a class="reference internal" href="../../specifications/section-installation-metadata/">Package Installation Metadata</a><input class="toctree-checkbox" id="toctree-checkbox-9" name="toctree-checkbox-9" role="switch" type="checkbox"/><label for="toctree-checkbox-9"><div class="visually-hidden">Toggle navigation of Package Installation Metadata</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../../specifications/recording-installed-packages/">Recording installed projects</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/entry-points/">Entry points specification</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/direct-url/">Recording the Direct URL Origin of installed distributions</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/direct-url-data-structure/">Direct URL Data Structure</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/virtual-environments/">Python Virtual Environments</a></li> <li class="toctree-l3"><a class="reference internal" 
href="../../specifications/externally-managed-environments/">Externally Managed Environments</a></li> </ul> </li> <li class="toctree-l2 has-children"><a class="reference internal" href="../../specifications/section-distribution-formats/">Package Distribution File Formats</a><input class="toctree-checkbox" id="toctree-checkbox-10" name="toctree-checkbox-10" role="switch" type="checkbox"/><label for="toctree-checkbox-10"><div class="visually-hidden">Toggle navigation of Package Distribution File Formats</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../../specifications/source-distribution-format/">Source distribution format</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/binary-distribution-format/">Binary distribution format</a></li> </ul> </li> <li class="toctree-l2 has-children"><a class="reference internal" href="../../specifications/section-package-indices/">Package Index Interfaces</a><input class="toctree-checkbox" id="toctree-checkbox-11" name="toctree-checkbox-11" role="switch" type="checkbox"/><label for="toctree-checkbox-11"><div class="visually-hidden">Toggle navigation of Package Index Interfaces</div><i class="icon"><svg><use href="#svg-arrow-right"></use></svg></i></label><ul> <li class="toctree-l3"><a class="reference internal" href="../../specifications/pypirc/">The <code class="file docutils literal notranslate"><span class="pre">.pypirc</span></code> file</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/simple-repository-api/">Simple repository API</a></li> <li class="toctree-l3"><a class="reference internal" href="../../specifications/index-hosted-attestations/">Index hosted attestations</a></li> </ul> </li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../../key_projects/">Project Summaries</a></li> <li class="toctree-l1"><a class="reference 
internal" href="../../glossary/">Glossary</a></li> <li class="toctree-l1"><a class="reference internal" href="../../support/">How to Get Support</a></li> <li class="toctree-l1"><a class="reference internal" href="../../contribute/">Contribute to this guide</a></li> <li class="toctree-l1"><a class="reference internal" href="../../news/">News</a></li> </ul> </div> </div> </div> </div> </aside> <div class="main"> <div class="content"> <div class="article-container"> <a href="#" class="back-to-top muted-link"> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"> <path d="M13 20h-2V8l-5.5 5.5-1.42-1.42L12 4.16l7.92 7.92-1.42 1.42L13 8v12z"></path> </svg> <span>Back to top</span> </a> <div class="content-icon-container"> <div class="theme-toggle-container theme-toggle-content"> <button class="theme-toggle"> <div class="visually-hidden">Toggle Light / Dark / Auto color theme</div> <svg class="theme-icon-when-auto"><use href="#svg-sun-half"></use></svg> <svg class="theme-icon-when-dark"><use href="#svg-moon"></use></svg> <svg class="theme-icon-when-light"><use href="#svg-sun"></use></svg> </button> </div> <label class="toc-overlay-icon toc-content-icon" for="__toc"> <div class="visually-hidden">Toggle table of contents sidebar</div> <i class="icon"><svg><use href="#svg-toc"></use></svg></i> </label> </div> <article role="main"> <section id="analyzing-pypi-package-downloads"> <span id="id1"></span><h1>Analyzing PyPI package downloads<a class="headerlink" href="#analyzing-pypi-package-downloads" title="Link to this heading">#</a></h1> <p>This section covers how to use the public PyPI download statistics dataset to learn more about downloads of a package (or packages) hosted on PyPI. 
For example, you can use it to discover the distribution of Python versions used to download a package.</p>
<section id="background">
<h2>Background<a class="headerlink" href="#background" title="Link to this heading">#</a></h2>
<p>PyPI does not display download statistics for a number of reasons: <a class="footnote-reference brackets" href="#id4" id="id2" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<ul class="simple">
<li><p><strong>Inefficient to make work with a Content Distribution Network (CDN):</strong> Download statistics change constantly. Including them in project pages, which are heavily cached, would require invalidating the cache more often and would reduce its overall effectiveness.</p></li>
<li><p><strong>Highly inaccurate:</strong> Several factors prevent download counts from being accurate, including:</p>
<ul>
<li><p><code class="docutils literal notranslate"><span class="pre">pip</span></code>’s download cache (lowers download counts)</p></li>
<li><p>Internal or unofficial mirrors (can either raise or lower download counts)</p></li>
<li><p>Packages not hosted on PyPI (for comparison’s sake)</p></li>
<li><p>Unofficial scripts or attempts at download count inflation (raises download counts)</p></li>
<li><p>Known historical data quality issues (lowers download counts)</p></li>
</ul>
</li>
<li><p><strong>Not particularly useful:</strong> Just because a project has been downloaded a lot doesn’t mean it’s good; similarly, just because a project hasn’t been downloaded a lot doesn’t mean it’s bad!</p></li>
</ul>
<p>In short, because the value of download statistics is low for various reasons, and the tradeoffs required to display them are high, doing so has not been an effective use of limited resources.</p>
</section>
<section id="public-dataset">
<h2>Public dataset<a class="headerlink" href="#public-dataset" title="Link to this heading">#</a></h2>
<p>As an alternative, the <a class="reference external" href="https://github.com/pypa/linehaul-cloud-function/">Linehaul project</a> streams download logs from PyPI to <a class="reference external" href="https://cloud.google.com/bigquery">Google BigQuery</a> <a class="footnote-reference brackets" href="#id5" id="id3" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>, where they are stored as a public dataset.</p>
<section id="getting-set-up">
<h3>Getting set up<a class="headerlink" href="#getting-set-up" title="Link to this heading">#</a></h3>
<p>To use <a class="reference external" href="https://cloud.google.com/bigquery">Google BigQuery</a> to query the <a class="reference external" href="https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=pypi&page=dataset">public PyPI download statistics dataset</a>, you’ll need a Google account and a Google Cloud Platform project with the BigQuery API enabled. You can query up to 1 TB of data per month <a class="reference external" href="https://cloud.google.com/blog/products/data-analytics/query-without-a-credit-card-introducing-bigquery-sandbox">using the BigQuery free tier without a credit card</a>.</p>
<ul class="simple">
<li><p>Navigate to the <a class="reference external" href="https://console.cloud.google.com/bigquery">BigQuery web UI</a>.</p></li>
<li><p>Create a new project.</p></li>
<li><p>Enable the <a class="reference external" href="https://console.developers.google.com/apis/library/bigquery-json.googleapis.com">BigQuery API</a>.</p></li>
</ul>
<p>For more detailed instructions on how to get started with BigQuery, check out the <a class="reference external" href="https://cloud.google.com/bigquery/docs/quickstarts/quickstart-web-ui">BigQuery quickstart guide</a>.</p>
</section>
<section id="data-schema">
<h3>Data schema<a class="headerlink" href="#data-schema" title="Link to this heading">#</a></h3>
<p>Linehaul writes an entry in the <code class="docutils literal notranslate"><span 
class="pre">bigquery-public-data.pypi.file_downloads</span></code> table for each download. The table contains information about what file was downloaded and how it was downloaded. Some useful columns from the <a class="reference external" href="https://console.cloud.google.com/bigquery?pli=1&p=bigquery-public-data&d=pypi&t=file_downloads&page=table">table schema</a> include:</p> <div class="table-wrapper docutils container"> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head"><p>Column</p></th> <th class="head"><p>Description</p></th> <th class="head"><p>Examples</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p>timestamp</p></td> <td><p>Date and time</p></td> <td><p><code class="docutils literal notranslate"><span class="pre">2020-03-09</span> <span class="pre">00:33:03</span> <span class="pre">UTC</span></code></p></td> </tr> <tr class="row-odd"><td><p>file.project</p></td> <td><p>Project name</p></td> <td><p><code class="docutils literal notranslate"><span class="pre">pipenv</span></code>, <code class="docutils literal notranslate"><span class="pre">nose</span></code></p></td> </tr> <tr class="row-even"><td><p>file.version</p></td> <td><p>Package version</p></td> <td><p><code class="docutils literal notranslate"><span class="pre">0.1.6</span></code>, <code class="docutils literal notranslate"><span class="pre">1.4.2</span></code></p></td> </tr> <tr class="row-odd"><td><p>details.installer.name</p></td> <td><p>Installer</p></td> <td><p>pip, <a class="reference internal" href="../../key_projects/#bandersnatch"><span class="std std-ref">bandersnatch</span></a></p></td> </tr> <tr class="row-even"><td><p>details.python</p></td> <td><p>Python version</p></td> <td><p><code class="docutils literal notranslate"><span class="pre">2.7.12</span></code>, <code class="docutils literal notranslate"><span class="pre">3.6.4</span></code></p></td> </tr> </tbody> </table> </div> </section> <section id="useful-queries"> <h3>Useful 
queries<a class="headerlink" href="#useful-queries" title="Link to this heading">#</a></h3> <p>Run queries in the <a class="reference external" href="https://console.cloud.google.com/bigquery">BigQuery web UI</a> by clicking the “Compose query” button.</p> <p>Note that the rows are stored in a partitioned table, which helps limit the cost of queries. These example queries analyze downloads from recent history by filtering on the <code class="docutils literal notranslate"><span class="pre">timestamp</span></code> column.</p> <section id="counting-package-downloads"> <h4>Counting package downloads<a class="headerlink" href="#counting-package-downloads" title="Link to this heading">#</a></h4> <p>The following query counts the total number of downloads for the project “pytest”.</p> <div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="o">#</span><span class="n">standardSQL</span> <span class="k">SELECT</span><span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">num_downloads</span> <span class="k">FROM</span><span class="w"> </span><span class="o">`</span><span class="n">bigquery</span><span class="o">-</span><span class="k">public</span><span class="o">-</span><span class="k">data</span><span class="p">.</span><span class="n">pypi</span><span class="p">.</span><span class="n">file_downloads</span><span class="o">`</span> <span class="k">WHERE</span><span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">project</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'pytest'</span> <span class="w"> </span><span class="c1">-- Only query the last 30 days of history</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="nb">DATE</span><span class="p">(</span><span 
class="k">timestamp</span><span class="p">)</span> <span class="w"> </span><span class="k">BETWEEN</span><span class="w"> </span><span class="n">DATE_SUB</span><span class="p">(</span><span class="k">CURRENT_DATE</span><span class="p">(),</span><span class="w"> </span><span class="nb">INTERVAL</span><span class="w"> </span><span class="mi">30</span><span class="w"> </span><span class="k">DAY</span><span class="p">)</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="k">CURRENT_DATE</span><span class="p">()</span> </pre></div> </div> <div class="table-wrapper docutils container"> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head"><p>num_downloads</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p>26190085</p></td> </tr> </tbody> </table> </div> <p>To count downloads from pip only, filter on the <code class="docutils literal notranslate"><span class="pre">details.installer.name</span></code> column.</p> <div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="o">#</span><span class="n">standardSQL</span> <span class="k">SELECT</span><span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">num_downloads</span> <span class="k">FROM</span><span class="w"> </span><span class="o">`</span><span class="n">bigquery</span><span class="o">-</span><span class="k">public</span><span class="o">-</span><span class="k">data</span><span class="p">.</span><span class="n">pypi</span><span class="p">.</span><span class="n">file_downloads</span><span class="o">`</span> <span class="k">WHERE</span><span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">project</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'pytest'</span> <span 
class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">details</span><span class="p">.</span><span class="n">installer</span><span class="p">.</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'pip'</span> <span class="w"> </span><span class="c1">-- Only query the last 30 days of history</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="nb">DATE</span><span class="p">(</span><span class="k">timestamp</span><span class="p">)</span> <span class="w"> </span><span class="k">BETWEEN</span><span class="w"> </span><span class="n">DATE_SUB</span><span class="p">(</span><span class="k">CURRENT_DATE</span><span class="p">(),</span><span class="w"> </span><span class="nb">INTERVAL</span><span class="w"> </span><span class="mi">30</span><span class="w"> </span><span class="k">DAY</span><span class="p">)</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="k">CURRENT_DATE</span><span class="p">()</span> </pre></div> </div> <div class="table-wrapper docutils container"> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head"><p>num_downloads</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p>24334215</p></td> </tr> </tbody> </table> </div> </section> <section id="package-downloads-over-time"> <h4>Package downloads over time<a class="headerlink" href="#package-downloads-over-time" title="Link to this heading">#</a></h4> <p>To group downloads by month, apply the <code class="docutils literal notranslate"><span class="pre">DATE_TRUNC</span></code> function to the date of the <code class="docutils literal notranslate"><span class="pre">timestamp</span></code> column.
Filtering on this column also reduces the amount of data the query processes, and therefore its cost.</p> <div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="o">#</span><span class="n">standardSQL</span> <span class="k">SELECT</span> <span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">num_downloads</span><span class="p">,</span> <span class="w"> </span><span class="n">DATE_TRUNC</span><span class="p">(</span><span class="nb">DATE</span><span class="p">(</span><span class="k">timestamp</span><span class="p">),</span><span class="w"> </span><span class="k">MONTH</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="o">`</span><span class="k">month</span><span class="o">`</span> <span class="k">FROM</span><span class="w"> </span><span class="o">`</span><span class="n">bigquery</span><span class="o">-</span><span class="k">public</span><span class="o">-</span><span class="k">data</span><span class="p">.</span><span class="n">pypi</span><span class="p">.</span><span class="n">file_downloads</span><span class="o">`</span> <span class="k">WHERE</span> <span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">project</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'pytest'</span> <span class="w"> </span><span class="c1">-- Only query the last 6 months of history</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="nb">DATE</span><span class="p">(</span><span class="k">timestamp</span><span class="p">)</span> <span class="w"> </span><span class="k">BETWEEN</span><span class="w"> </span><span class="n">DATE_TRUNC</span><span class="p">(</span><span class="n">DATE_SUB</span><span class="p">(</span><span class="k">CURRENT_DATE</span><span
class="p">(),</span><span class="w"> </span><span class="nb">INTERVAL</span><span class="w"> </span><span class="mi">6</span><span class="w"> </span><span class="k">MONTH</span><span class="p">),</span><span class="w"> </span><span class="k">MONTH</span><span class="p">)</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="k">CURRENT_DATE</span><span class="p">()</span> <span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="o">`</span><span class="k">month</span><span class="o">`</span> <span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="o">`</span><span class="k">month</span><span class="o">`</span><span class="w"> </span><span class="k">DESC</span> </pre></div> </div> <div class="table-wrapper docutils container"> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head"><p>num_downloads</p></th> <th class="head"><p>month</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p>1956741</p></td> <td><p>2018-01-01</p></td> </tr> <tr class="row-odd"><td><p>2344692</p></td> <td><p>2017-12-01</p></td> </tr> <tr class="row-even"><td><p>1730398</p></td> <td><p>2017-11-01</p></td> </tr> <tr class="row-odd"><td><p>2047310</p></td> <td><p>2017-10-01</p></td> </tr> <tr class="row-even"><td><p>1744443</p></td> <td><p>2017-09-01</p></td> </tr> <tr class="row-odd"><td><p>1916952</p></td> <td><p>2017-08-01</p></td> </tr> </tbody> </table> </div> </section> <section id="python-versions-over-time"> <h4>Python versions over time<a class="headerlink" href="#python-versions-over-time" title="Link to this heading">#</a></h4> <p>Extract the Python version from the <code class="docutils literal notranslate"><span class="pre">details.python</span></code> column. 
Warning: This query processes over 500 GB of data.</p> <div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="o">#</span><span class="n">standardSQL</span> <span class="k">SELECT</span> <span class="w"> </span><span class="n">REGEXP_EXTRACT</span><span class="p">(</span><span class="n">details</span><span class="p">.</span><span class="n">python</span><span class="p">,</span><span class="w"> </span><span class="n">r</span><span class="ss">"[0-9]+\.[0-9]+"</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">python_version</span><span class="p">,</span> <span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">num_downloads</span><span class="p">,</span> <span class="k">FROM</span><span class="w"> </span><span class="o">`</span><span class="n">bigquery</span><span class="o">-</span><span class="k">public</span><span class="o">-</span><span class="k">data</span><span class="p">.</span><span class="n">pypi</span><span class="p">.</span><span class="n">file_downloads</span><span class="o">`</span> <span class="k">WHERE</span> <span class="w"> </span><span class="c1">-- Only query the last 6 months of history</span> <span class="w"> </span><span class="nb">DATE</span><span class="p">(</span><span class="k">timestamp</span><span class="p">)</span> <span class="w"> </span><span class="k">BETWEEN</span><span class="w"> </span><span class="n">DATE_TRUNC</span><span class="p">(</span><span class="n">DATE_SUB</span><span class="p">(</span><span class="k">CURRENT_DATE</span><span class="p">(),</span><span class="w"> </span><span class="nb">INTERVAL</span><span class="w"> </span><span class="mi">6</span><span class="w"> </span><span class="k">MONTH</span><span class="p">),</span><span class="w"> </span><span 
class="k">MONTH</span><span class="p">)</span> <span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="k">CURRENT_DATE</span><span class="p">()</span> <span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="o">`</span><span class="n">python_version</span><span class="o">`</span> <span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="o">`</span><span class="n">num_downloads</span><span class="o">`</span><span class="w"> </span><span class="k">DESC</span> </pre></div> </div> <div class="table-wrapper docutils container"> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head"><p>python_version</p></th> <th class="head"><p>num_downloads</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p>3.7</p></td> <td><p>18051328726</p></td> </tr> <tr class="row-odd"><td><p>3.6</p></td> <td><p>9635067203</p></td> </tr> <tr class="row-even"><td><p>3.8</p></td> <td><p>7781904681</p></td> </tr> <tr class="row-odd"><td><p>2.7</p></td> <td><p>6381252241</p></td> </tr> <tr class="row-even"><td><p>null</p></td> <td><p>2026630299</p></td> </tr> <tr class="row-odd"><td><p>3.5</p></td> <td><p>1894153540</p></td> </tr> </tbody> </table> </div> </section> <section id="getting-absolute-links-to-artifacts"> <h4>Getting absolute links to artifacts<a class="headerlink" href="#getting-absolute-links-to-artifacts" title="Link to this heading">#</a></h4> <p>It’s sometimes helpful to be able to get the absolute links to download artifacts from PyPI based on their hashes, e.g. if a particular project or release has been deleted from PyPI.
The metadata table includes the <code class="docutils literal notranslate"><span class="pre">path</span></code> column, which includes the hash and artifact filename.</p> <div class="admonition note"> <p class="admonition-title">Note</p> <p>The URL generated here is not guaranteed to be stable, but currently aligns with the URL where PyPI artifacts are hosted.</p> </div> <div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="k">SELECT</span> <span class="w"> </span><span class="n">CONCAT</span><span class="p">(</span><span class="s1">'https://files.pythonhosted.org/packages'</span><span class="p">,</span><span class="w"> </span><span class="n">path</span><span class="p">)</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">url</span> <span class="k">FROM</span> <span class="w"> </span><span class="o">`</span><span class="n">bigquery</span><span class="o">-</span><span class="k">public</span><span class="o">-</span><span class="k">data</span><span class="p">.</span><span class="n">pypi</span><span class="p">.</span><span class="n">distribution_metadata</span><span class="o">`</span> <span class="k">WHERE</span> <span class="w"> </span><span class="n">filename</span><span class="w"> </span><span class="k">LIKE</span><span class="w"> </span><span class="s1">'sampleproject%'</span> </pre></div> </div> <div class="table-wrapper docutils container"> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head"><p>url</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p><a class="reference external" href="https://files.pythonhosted.org/packages/eb/45/79be82bdeafcecb9dca474cad4003e32ef8e4a0dec6abbd4145ccb02abe1/sampleproject-1.2.0.tar.gz">https://files.pythonhosted.org/packages/eb/45/79be82bdeafcecb9dca474cad4003e32ef8e4a0dec6abbd4145ccb02abe1/sampleproject-1.2.0.tar.gz</a></p></td> </tr> <tr class="row-odd"><td><p><a class="reference external" 
href="https://files.pythonhosted.org/packages/56/0a/178e8bbb585ec5b13af42dae48b1d7425d6575b3ff9b02e5ec475e38e1d6/sampleproject_nomura-1.2.0-py2.py3-none-any.whl">https://files.pythonhosted.org/packages/56/0a/178e8bbb585ec5b13af42dae48b1d7425d6575b3ff9b02e5ec475e38e1d6/sampleproject_nomura-1.2.0-py2.py3-none-any.whl</a></p></td> </tr> <tr class="row-even"><td><p><a class="reference external" href="https://files.pythonhosted.org/packages/63/88/3200eeaf22571f18d2c41e288862502e33365ccbdc12b892db23f51f8e70/sampleproject_nomura-1.2.0.tar.gz">https://files.pythonhosted.org/packages/63/88/3200eeaf22571f18d2c41e288862502e33365ccbdc12b892db23f51f8e70/sampleproject_nomura-1.2.0.tar.gz</a></p></td> </tr> <tr class="row-odd"><td><p><a class="reference external" href="https://files.pythonhosted.org/packages/21/e9/2743311822e71c0756394b6c5ab15cb64ca66c78c6c6a5cd872c9ed33154/sampleproject_doubleyoung18-1.3.0-py2.py3-none-any.whl">https://files.pythonhosted.org/packages/21/e9/2743311822e71c0756394b6c5ab15cb64ca66c78c6c6a5cd872c9ed33154/sampleproject_doubleyoung18-1.3.0-py2.py3-none-any.whl</a></p></td> </tr> <tr class="row-even"><td><p><a class="reference external" href="https://files.pythonhosted.org/packages/6f/5b/2f3fe94e1c02816fe23c7ceee5292fb186912929e1972eee7fb729fa27af/sampleproject-1.3.1.tar.gz">https://files.pythonhosted.org/packages/6f/5b/2f3fe94e1c02816fe23c7ceee5292fb186912929e1972eee7fb729fa27af/sampleproject-1.3.1.tar.gz</a></p></td> </tr> </tbody> </table> </div> </section> </section> </section> <section id="caveats"> <h2>Caveats<a class="headerlink" href="#caveats" title="Link to this heading">#</a></h2> <p>In addition to the caveats listed in the background above, Linehaul suffered from a bug which caused it to significantly under-report download statistics prior to July 26, 2018. Downloads before this date are proportionally accurate (e.g. the percentage of Python 2 vs. 
Python 3 downloads) but total numbers are lower than actual by an order of magnitude.</p> </section> <section id="additional-tools"> <h2>Additional tools<a class="headerlink" href="#additional-tools" title="Link to this heading">#</a></h2> <p>Besides using the BigQuery console, there are some additional tools which may be useful when analyzing download statistics.</p> <section id="google-cloud-bigquery"> <h3><code class="docutils literal notranslate"><span class="pre">google-cloud-bigquery</span></code><a class="headerlink" href="#google-cloud-bigquery" title="Link to this heading">#</a></h3> <p>You can also access the public PyPI download statistics dataset programmatically via the BigQuery API and the <a class="reference external" href="https://cloud.google.com/bigquery/docs/reference/libraries">google-cloud-bigquery</a> project, the official Python client library for BigQuery.</p> <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">google.cloud</span> <span class="kn">import</span> <span class="n">bigquery</span> <span class="c1"># Note: depending on where this code is being run, you may require</span> <span class="c1"># additional authentication. 
See:</span> <span class="c1"># https://cloud.google.com/bigquery/docs/authentication/</span> <span class="n">client</span> <span class="o">=</span> <span class="n">bigquery</span><span class="o">.</span><span class="n">Client</span><span class="p">()</span> <span class="n">query_job</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">query</span><span class="p">(</span><span class="s2">"""</span> <span class="s2">SELECT COUNT(*) AS num_downloads</span> <span class="s2">FROM `bigquery-public-data.pypi.file_downloads`</span> <span class="s2">WHERE file.project = 'pytest'</span> <span class="s2"> -- Only query the last 30 days of history</span> <span class="s2"> AND DATE(timestamp)</span> <span class="s2"> BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)</span> <span class="s2"> AND CURRENT_DATE()"""</span><span class="p">)</span> <span class="n">results</span> <span class="o">=</span> <span class="n">query_job</span><span class="o">.</span><span class="n">result</span><span class="p">()</span> <span class="c1"># Waits for job to complete.</span> <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">results</span><span class="p">:</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"</span><span class="si">{}</span><span class="s2"> downloads"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">row</span><span class="o">.</span><span class="n">num_downloads</span><span class="p">))</span> </pre></div> </div> </section> <section id="pypinfo"> <h3><code class="docutils literal notranslate"><span class="pre">pypinfo</span></code><a class="headerlink" href="#pypinfo" title="Link to this heading">#</a></h3> <p><a class="reference external" href="https://github.com/ofek/pypinfo">pypinfo</a> is a command-line tool which provides access to the dataset and can generate several useful queries. 
For example, you can query the total number of downloads for a package with the command <code class="docutils literal notranslate"><span class="pre">pypinfo</span> <span class="pre">package_name</span></code>.</p> <p>Install <a class="reference external" href="https://github.com/ofek/pypinfo">pypinfo</a> using pip.</p> <div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python3<span class="w"> </span>-m<span class="w"> </span>pip<span class="w"> </span>install<span class="w"> </span>pypinfo </pre></div> </div> <p>Usage:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>pypinfo<span class="w"> </span>requests <span class="go">Served from cache: False</span> <span class="go">Data processed: 6.87 GiB</span> <span class="go">Data billed: 6.87 GiB</span> <span class="go">Estimated cost: $0.04</span> <span class="go">| download_count |</span> <span class="go">| -------------- |</span> <span class="go">| 9,316,415 |</span> </pre></div> </div> </section> <section id="pandas-gbq"> <h3><code class="docutils literal notranslate"><span class="pre">pandas-gbq</span></code><a class="headerlink" href="#pandas-gbq" title="Link to this heading">#</a></h3> <p>The <a class="reference external" href="https://pandas-gbq.readthedocs.io/en/latest/">pandas-gbq</a> project allows for accessing query results via <a class="reference external" href="https://pandas.pydata.org/">Pandas</a>.</p> </section> </section> <section id="references"> <h2>References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2> <aside class="footnote-list brackets"> <aside class="footnote brackets" id="id4" role="doc-footnote"> <span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id2">1</a><span class="fn-bracket">]</span></span> <p><a class="reference external" href="https://mail.python.org/pipermail/distutils-sig/2013-May/020855.html">PyPI Download Counts
deprecation email</a></p> </aside> <aside class="footnote brackets" id="id5" role="doc-footnote"> <span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id3">2</a><span class="fn-bracket">]</span></span> <p><a class="reference external" href="https://mail.python.org/pipermail/distutils-sig/2016-May/028986.html">PyPI BigQuery dataset announcement email</a></p> </aside> </aside> </section> </section> </article> </div> <footer> <div class="bottom-of-page"> <div class="left-details"> <div class="copyright"> Copyright © 2013–2020, PyPA </div> Made with <a href="https://www.sphinx-doc.org/">Sphinx</a> and <a class="muted-link" href="https://pradyunsg.me">@pradyunsg</a>'s <a href="https://github.com/pradyunsg/furo">Furo</a> <div class="last-updated"> Last updated on Nov 25, 2024</div> </div> <div class="right-details"> </div> </div> </footer> </div> </div> </div><script src="../../_static/documentation_options.js?v=187304be"></script> <script src="../../_static/doctools.js?v=888ff710"></script> <script src="../../_static/sphinx_highlight.js?v=dc90522c"></script> <script src="../../_static/scripts/furo.js?v=32e29ea5"></script> <script src="../../_static/tabs.js?v=3ee01567"></script> <script src="../../_static/clipboard.min.js?v=a7894cd8"></script> <script src="../../_static/copybutton.js?v=cb5fb026"></script> <script data-domain="packaging.python.org" defer="defer" src="https://plausible.io/js/script.js"></script> </body> </html>