CINXE.COM
PEP 583 – A Concurrency Memory Model for Python | peps.python.org
<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="color-scheme" content="light dark"> <title>PEP 583 – A Concurrency Memory Model for Python | peps.python.org</title> <link rel="shortcut icon" href="../_static/py.png"> <link rel="canonical" href="https://peps.python.org/pep-0583/"> <link rel="stylesheet" href="../_static/style.css" type="text/css"> <link rel="stylesheet" href="../_static/mq.css" type="text/css"> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light"> <link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark"> <link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss"> <meta property="og:title" content='PEP 583 – A Concurrency Memory Model for Python | peps.python.org'> <meta property="og:description" content="This PEP describes how Python programs may behave in the presence of concurrent reads and writes to shared variables from multiple threads. We use a happens before relation to define when variable accesses are ordered or concurrent. Nearly all programs..."> <meta property="og:type" content="website"> <meta property="og:url" content="https://peps.python.org/pep-0583/"> <meta property="og:site_name" content="Python Enhancement Proposals (PEPs)"> <meta property="og:image" content="https://peps.python.org/_static/og-image.png"> <meta property="og:image:alt" content="Python PEPs"> <meta property="og:image:width" content="200"> <meta property="og:image:height" content="200"> <meta name="description" content="This PEP describes how Python programs may behave in the presence of concurrent reads and writes to shared variables from multiple threads. We use a happens before relation to define when variable accesses are ordered or concurrent. Nearly all programs..."> <meta name="theme-color" content="#3776ab"> </head> <body> <svg xmlns="http://www.w3.org/2000/svg" style="display: none;"> <symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all"> <title>Following system colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="9"></circle> <path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path> </svg> </symbol> <symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all"> <title>Selected dark colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <path stroke="none" d="M0 0h24v24H0z" fill="none"></path> <path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path> </svg> </symbol> <symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all"> <title>Selected light colour scheme</title> <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> <circle cx="12" cy="12" r="5"></circle> <line x1="12" y1="1" x2="12" y2="3"></line> <line x1="12" y1="21" x2="12" y2="23"></line> <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line> <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line> <line x1="1" y1="12" x2="3" y2="12"></line> <line x1="21" y1="12" x2="23" y2="12"></line> <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line> <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line> </svg> </symbol> </svg> <script> document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto" </script> <section id="pep-page-section"> <header> <h1>Python Enhancement Proposals</h1> <ul class="breadcrumbs"> <li><a href="https://www.python.org/" title="The Python Programming Language">Python</a> » </li> <li><a href="../pep-0000/">PEP Index</a> » </li> <li>PEP 583</li> </ul> <button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())"> <svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg> <svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg> <span class="visually-hidden">Toggle light / dark / auto colour theme</span> </button> </header> <article> <section id="pep-content"> <h1 class="page-title">PEP 583 – A Concurrency Memory Model for Python</h1> <dl class="rfc2822 field-list simple"> <dt class="field-odd">Author<span class="colon">:</span></dt> <dd class="field-odd">Jeffrey Yasskin <jyasskin at google.com></dd> <dt class="field-even">Status<span class="colon">:</span></dt> <dd class="field-even"><abbr title="Removed from consideration by sponsor or authors">Withdrawn</abbr></dd> <dt class="field-odd">Type<span class="colon">:</span></dt> <dd class="field-odd"><abbr title="Non-normative PEP containing background, guidelines or other information relevant to the Python ecosystem">Informational</abbr></dd> <dt class="field-even">Created<span class="colon">:</span></dt> <dd class="field-even">22-Mar-2008</dd> <dt class="field-odd">Post-History<span class="colon">:</span></dt> <dd class="field-odd"><p></p></dd> </dl> <hr class="docutils" /> <section id="contents"> <details><summary>Table of Contents</summary><ul class="simple"> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#a-couple-definitions">A couple definitions</a></li> <li><a class="reference internal" href="#two-simple-memory-models">Two simple memory models</a><ul> <li><a class="reference internal" href="#sequential-consistency">Sequential Consistency</a></li> <li><a class="reference internal" href="#happens-before-consistency">Happens-before consistency</a><ul> <li><a class="reference internal" href="#an-example">An example</a></li> </ul> </li> </ul> </li> <li><a class="reference internal" href="#surprising-behaviors-with-races">Surprising behaviors with races</a><ul> <li><a class="reference internal" href="#zombie-values">Zombie values</a></li> <li><a class="reference internal" href="#inconsistent-orderings">Inconsistent Orderings</a></li> <li><a class="reference internal" href="#a-happens-before-race-that-s-not-a-sequentially-consistent-race">A happens-before race that’s not a sequentially-consistent race</a></li> <li><a class="reference internal" href="#self-justifying-values">Self-justifying values</a></li> <li><a class="reference internal" href="#uninitialized-values-direct">Uninitialized values (direct)</a></li> <li><a class="reference internal" href="#uninitialized-values-flag">Uninitialized values (flag)</a></li> <li><a class="reference internal" href="#inconsistent-guarantees-from-relying-on-data-dependencies">Inconsistent guarantees from relying on data dependencies</a></li> </ul> </li> <li><a class="reference internal" href="#the-rules-for-python">The rules for Python</a><ul> <li><a class="reference internal" href="#data-race-free-programs-are-sequentially-consistent">Data-race-free programs are sequentially consistent</a></li> <li><a class="reference internal" href="#no-security-holes-from-out-of-thin-air-reads">No security holes from out-of-thin-air reads</a></li> <li><a class="reference internal" href="#restrict-reorderings-instead-of-defining-happens-before">Restrict reorderings instead of defining happens-before</a></li> <li><a class="reference internal" href="#atomic-unordered-assignments">Atomic, unordered assignments</a></li> <li><a class="reference internal" href="#two-tiers-of-guarantees">Two tiers of guarantees</a></li> <li><a class="reference internal" href="#id4">Sequential Consistency</a></li> <li><a class="reference internal" href="#adapt-the-x86-model">Adapt the x86 model</a></li> <li><a class="reference internal" href="#upgrading-or-downgrading-to-an-alternate-model">Upgrading or downgrading to an alternate model</a></li> </ul> </li> <li><a class="reference internal" href="#implementation-details">Implementation Details</a><ul> <li><a class="reference internal" href="#cpython">CPython</a></li> <li><a class="reference internal" href="#id5">PyPy</a></li> <li><a class="reference internal" href="#id6">Jython</a></li> <li><a class="reference internal" href="#id7">IronPython</a></li> </ul> </li> <li><a class="reference internal" href="#references">References</a></li> <li><a class="reference internal" href="#acknowledgements">Acknowledgements</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> </details></section> <section id="abstract"> <h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2> <p>This PEP describes how Python programs may behave in the presence of concurrent reads and writes to shared variables from multiple threads. We use a <em>happens before</em> relation to define when variable accesses are ordered or concurrent. Nearly all programs should simply use locks to guard their shared variables, and this PEP highlights some of the strange things that can happen when they don’t, but programmers often assume that it’s ok to do “simple” things without locking, and it’s somewhat unpythonic to let the language surprise them. Unfortunately, avoiding surprise often conflicts with making Python run quickly, so this PEP tries to find a good tradeoff between the two.</p> </section> <section id="rationale"> <h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2> <p>So far, we have 4 major Python implementations – CPython, <a class="reference external" href="http://www.jython.org/">Jython</a>, <a class="reference external" href="http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython">IronPython</a>, and <a class="reference external" href="http://codespeak.net/pypy/dist/pypy/doc/home.html">PyPy</a> – as well as lots of minor ones. Some of these already run on platforms that do aggressive optimizations. In general, these optimizations are invisible within a single thread of execution, but they can be visible to other threads executing concurrently. CPython currently uses a <a class="reference external" href="http://en.wikipedia.org/wiki/Global_Interpreter_Lock">GIL</a> to ensure that other threads see the results they expect, but this limits it to a single processor. Jython and IronPython run on Java’s or .NET’s threading system respectively, which allows them to take advantage of more cores but can also show surprising values to other threads.</p> <p>So that threaded Python programs continue to be portable between implementations, implementers and library authors need to agree on some ground rules.</p> </section> <section id="a-couple-definitions"> <h2><a class="toc-backref" href="#a-couple-definitions" role="doc-backlink">A couple definitions</a></h2> <dl class="simple"> <dt>Variable</dt><dd>A name that refers to an object. Variables are generally introduced by assigning to them, and may be destroyed by passing them to <code class="docutils literal notranslate"><span class="pre">del</span></code>. Variables are fundamentally mutable, while objects may not be. There are several varieties of variables: module variables (often called “globals” when accessed from within the module), class variables, instance variables (also known as fields), and local variables. All of these can be shared between threads (the local variables if they’re saved into a closure). The object in which the variables are scoped notionally has a <code class="docutils literal notranslate"><span class="pre">dict</span></code> whose keys are the variables’ names.</dd> <dt>Object</dt><dd>A collection of instance variables (a.k.a. fields) and methods. At least, that’ll do for this PEP.</dd> <dt>Program Order</dt><dd>The order that actions (reads and writes) happen within a thread, which is very similar to the order they appear in the text.</dd> <dt>Conflicting actions</dt><dd>Two actions on the same variable, at least one of which is a write.</dd> <dt>Data race</dt><dd>A situation in which two conflicting actions happen at the same time. “The same time” is defined by the memory model.</dd> </dl> </section> <section id="two-simple-memory-models"> <h2><a class="toc-backref" href="#two-simple-memory-models" role="doc-backlink">Two simple memory models</a></h2> <p>Before talking about the details of data races and the surprising behaviors they produce, I’ll present two simple memory models. The first is probably too strong for Python, and the second is probably too weak.</p> <section id="sequential-consistency"> <h3><a class="toc-backref" href="#sequential-consistency" role="doc-backlink">Sequential Consistency</a></h3> <p>In a sequentially-consistent concurrent execution, actions appear to happen in a global total order with each read of a particular variable seeing the value written by the last write that affected that variable. The total order for actions must be consistent with the program order. A program has a data race on a given input when one of its sequentially consistent executions puts two conflicting actions next to each other.</p> <p>This is the easiest memory model for humans to understand, although it doesn’t eliminate all confusion, since operations can be split in odd places.</p> </section> <section id="happens-before-consistency"> <h3><a class="toc-backref" href="#happens-before-consistency" role="doc-backlink">Happens-before consistency</a></h3> <p>The program contains a collection of <em>synchronization actions</em>, which in Python currently include lock acquires and releases and thread starts and joins. Synchronization actions happen in a global total order that is consistent with the program order (they don’t <em>have</em> to happen in a total order, but it simplifies the description of the model). A lock release <em>synchronizes with</em> all later acquires of the same lock. Similarly, given <code class="docutils literal notranslate"><span class="pre">t</span> <span class="pre">=</span> <span class="pre">threading.Thread(target=worker)</span></code>:</p> <ul class="simple"> <li>A call to <code class="docutils literal notranslate"><span class="pre">t.start()</span></code> synchronizes with the first statement in <code class="docutils literal notranslate"><span class="pre">worker()</span></code>.</li> <li>The return from <code class="docutils literal notranslate"><span class="pre">worker()</span></code> synchronizes with the return from <code class="docutils literal notranslate"><span class="pre">t.join()</span></code>.</li> <li>If the return from <code class="docutils literal notranslate"><span class="pre">t.start()</span></code> happens before (see below) a call to <code class="docutils literal notranslate"><span class="pre">t.isAlive()</span></code> that returns <code class="docutils literal notranslate"><span class="pre">False</span></code>, the return from <code class="docutils literal notranslate"><span class="pre">worker()</span></code> synchronizes with that call.</li> </ul> <p>We call the source of the synchronizes-with edge a <em>release</em> operation on the relevant variable, and we call the target an <em>acquire</em> operation.</p> <p>The <em>happens before</em> order is the transitive closure of the program order with the synchronizes-with edges. That is, action <em>A</em> happens before action <em>B</em> if:</p> <ul class="simple"> <li>A falls before B in the program order (which means they run in the same thread)</li> <li>A synchronizes with B</li> <li>You can get to B by following happens-before edges from A.</li> </ul> <p>An execution of a program is happens-before consistent if each read <em>R</em> sees the value of a write <em>W</em> to the same variable such that:</p> <ul class="simple"> <li><em>R</em> does not happen before <em>W</em>, and</li> <li>There is no other write <em>V</em> that overwrote <em>W</em> before <em>R</em> got a chance to see it. (That is, it can’t be the case that <em>W</em> happens before <em>V</em> happens before <em>R</em>.)</li> </ul> <p>You have a data race if two conflicting actions aren’t related by happens-before.</p> <section id="an-example"> <h4><a class="toc-backref" href="#an-example" role="doc-backlink">An example</a></h4> <p>Let’s use the rules from the happens-before model to prove that the following program prints “[7]”:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Queue</span><span class="p">:</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">l</span> <span class="o">=</span> <span class="p">[]</span> <span class="bp">self</span><span class="o">.</span><span class="n">cond</span> <span class="o">=</span> <span class="n">threading</span><span class="o">.</span><span class="n">Condition</span><span class="p">()</span> <span class="k">def</span> <span class="nf">get</span><span class="p">():</span> <span class="k">with</span> <span class="bp">self</span><span class="o">.</span><span class="n">cond</span><span class="p">:</span> <span class="k">while</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">l</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">cond</span><span class="o">.</span><span class="n">wait</span><span class="p">()</span> <span class="n">ret</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">l</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="bp">self</span><span class="o">.</span><span class="n">l</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">l</span><span class="p">[</span><span class="mi">1</span><span class="p">:]</span> <span class="k">return</span> <span class="n">ret</span> <span class="k">def</span> <span class="nf">put</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="k">with</span> <span class="bp">self</span><span class="o">.</span><span class="n">cond</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">l</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="bp">self</span><span class="o">.</span><span class="n">cond</span><span class="o">.</span><span class="n">notify</span><span class="p">()</span> <span class="n">myqueue</span> <span class="o">=</span> <span class="n">Queue</span><span class="p">()</span> <span class="k">def</span> <span class="nf">worker1</span><span class="p">():</span> <span class="n">x</span> <span class="o">=</span> <span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="n">myqueue</span><span class="o">.</span><span class="n">put</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">def</span> <span class="nf">worker2</span><span class="p">():</span> <span class="n">y</span> <span class="o">=</span> <span class="n">myqueue</span><span class="o">.</span><span class="n">get</span><span class="p">()</span> <span class="nb">print</span> <span class="n">y</span> <span class="n">thread1</span> <span class="o">=</span> <span class="n">threading</span><span class="o">.</span><span class="n">Thread</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="n">worker1</span><span class="p">)</span> <span class="n">thread2</span> <span class="o">=</span> <span class="n">threading</span><span class="o">.</span><span class="n">Thread</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="n">worker2</span><span class="p">)</span> <span class="n">thread2</span><span class="o">.</span><span class="n">start</span><span class="p">()</span> <span class="n">thread1</span><span class="o">.</span><span class="n">start</span><span class="p">()</span> </pre></div> </div> <ol class="arabic simple"> <li>Because <code class="docutils literal notranslate"><span class="pre">myqueue</span></code> is initialized in the main thread before <code class="docutils literal notranslate"><span class="pre">thread1</span></code> or <code class="docutils literal notranslate"><span class="pre">thread2</span></code> is started, that initialization happens before <code class="docutils literal notranslate"><span class="pre">worker1</span></code> and <code class="docutils literal notranslate"><span class="pre">worker2</span></code> begin running, so there’s no way for either to raise a NameError, and both <code class="docutils literal notranslate"><span class="pre">myqueue.l</span></code> and <code class="docutils literal notranslate"><span class="pre">myqueue.cond</span></code> are set to their final objects.</li> <li>The initialization of <code class="docutils literal notranslate"><span class="pre">x</span></code> in <code class="docutils literal notranslate"><span class="pre">worker1</span></code> happens before it calls <code class="docutils literal notranslate"><span class="pre">myqueue.put()</span></code>, which happens before it calls <code class="docutils literal notranslate"><span class="pre">myqueue.l.append(x)</span></code>, which happens before the call to <code class="docutils literal notranslate"><span class="pre">myqueue.cond.release()</span></code>, all because they run in the same thread.</li> <li>In <code class="docutils literal notranslate"><span class="pre">worker2</span></code>, <code class="docutils literal notranslate"><span class="pre">myqueue.cond</span></code> will be released and re-acquired until <code class="docutils literal notranslate"><span class="pre">myqueue.l</span></code> contains a value (<code class="docutils literal notranslate"><span class="pre">x</span></code>). The call to <code class="docutils literal notranslate"><span class="pre">myqueue.cond.release()</span></code> in <code class="docutils literal notranslate"><span class="pre">worker1</span></code> happens before that last call to <code class="docutils literal notranslate"><span class="pre">myqueue.cond.acquire()</span></code> in <code class="docutils literal notranslate"><span class="pre">worker2</span></code>.</li> <li>That last call to <code class="docutils literal notranslate"><span class="pre">myqueue.cond.acquire()</span></code> happens before <code class="docutils literal notranslate"><span class="pre">myqueue.get()</span></code> reads <code class="docutils literal notranslate"><span class="pre">myqueue.l</span></code>, which happens before <code class="docutils literal notranslate"><span class="pre">myqueue.get()</span></code> returns, which happens before <code class="docutils literal notranslate"><span class="pre">print</span> <span class="pre">y</span></code>, again all because they run in the same thread.</li> <li>Because happens-before is transitive, the list initially stored in <code class="docutils literal notranslate"><span class="pre">x</span></code> in thread1 is initialized before it is printed in thread2.</li> </ol> <p>Usually, we wouldn’t need to look all the way into a thread-safe queue’s implementation in order to prove that uses were safe. Its interface would specify that puts happen before gets, and we’d reason directly from that.</p> </section> </section> </section> <section id="surprising-behaviors-with-races"> <span id="pep-583-hazards"></span><h2><a class="toc-backref" href="#surprising-behaviors-with-races" role="doc-backlink">Surprising behaviors with races</a></h2> <p>Lots of strange things can happen when code has data races. It’s easy to avoid all of these problems by just protecting shared variables with locks. This is not a complete list of race hazards; it’s just a collection that seem relevant to Python.</p> <p>In all of these examples, variables starting with <code class="docutils literal notranslate"><span class="pre">r</span></code> are local variables, and other variables are shared between threads.</p> <section id="zombie-values"> <h3><a class="toc-backref" href="#zombie-values" role="doc-backlink">Zombie values</a></h3> <p>This example comes from the <a class="reference external" href="http://java.sun.com/docs/books/jls/third_edition/html/memory.html">Java memory model</a>:</p> <blockquote> <div>Initially <code class="docutils literal notranslate"><span class="pre">p</span> <span class="pre">is</span> <span class="pre">q</span></code> and <code class="docutils literal notranslate"><span class="pre">p.x</span> <span class="pre">==</span> <span class="pre">0</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = p</td> <td>r6 = p</td> </tr> <tr class="row-odd"><td>r2 = r1.x</td> <td>r6.x = 3</td> </tr> <tr class="row-even"><td>r3 = q</td> <td></td> </tr> <tr class="row-odd"><td>r4 = r3.x</td> <td></td> </tr> <tr class="row-even"><td>r5 = r1.x</td> <td></td> </tr> </tbody> </table> <p>Can produce <code class="docutils literal notranslate"><span class="pre">r2</span> <span class="pre">==</span> <span class="pre">r5</span> <span class="pre">==</span> <span class="pre">0</span></code> but <code class="docutils literal notranslate"><span class="pre">r4</span> <span class="pre">==</span> <span class="pre">3</span></code>, proving that <code class="docutils literal notranslate"><span class="pre">p.x</span></code> went from 0 to 3 and back to 0.</p> </div></blockquote> <p>A good compiler would like to optimize out the redundant load of <code class="docutils literal notranslate"><span class="pre">p.x</span></code> in initializing <code class="docutils literal notranslate"><span class="pre">r5</span></code> by just re-using the value already loaded into <code class="docutils literal notranslate"><span class="pre">r2</span></code>. We get the strange result if thread 1 sees memory in this order:</p> <blockquote> <div><table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Evaluation</th> <th class="head">Computes</th> <th class="head">Why</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = p</td> <td></td> <td></td> </tr> <tr class="row-odd"><td>r2 = r1.x</td> <td>r2 == 0</td> <td></td> </tr> <tr class="row-even"><td>r3 = q</td> <td>r3 is p</td> <td></td> </tr> <tr class="row-odd"><td>p.x = 3</td> <td></td> <td>Side-effect of thread 2</td> </tr> <tr class="row-even"><td>r4 = r3.x</td> <td>r4 == 3</td> <td></td> </tr> <tr class="row-odd"><td>r5 = r2</td> <td>r5 == 0</td> <td>Optimized from r5 = r1.x because r2 == r1.x.</td> </tr> </tbody> </table> </div></blockquote> </section> <section id="inconsistent-orderings"> <h3><a class="toc-backref" href="#inconsistent-orderings" role="doc-backlink">Inconsistent Orderings</a></h3> <p>From <a class="reference external" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2177.html">N2177: Sequential Consistency for Atomics</a>, and also known as Independent Read of Independent Write (IRIW).</p> <blockquote> <div>Initially, <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">==</span> <span class="pre">b</span> <span class="pre">==</span> <span class="pre">0</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> <th class="head">Thread 3</th> <th class="head">Thread 4</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = a</td> <td>r3 = b</td> <td>a = 1</td> <td>b = 1</td> </tr> <tr class="row-odd"><td>r2 = b</td> <td>r4 = a</td> <td></td> <td></td> </tr> </tbody> </table> <p>We may get <code class="docutils literal notranslate"><span class="pre">r1</span> <span class="pre">==</span> <span class="pre">r3</span> <span class="pre">==</span> <span class="pre">1</span></code> and <code class="docutils literal notranslate"><span class="pre">r2</span> <span class="pre">==</span> <span class="pre">r4</span> <span class="pre">==</span> <span class="pre">0</span></code>, proving both that <code class="docutils literal notranslate"><span class="pre">a</span></code> was written before <code class="docutils literal notranslate"><span class="pre">b</span></code> (thread 1’s data), and that <code class="docutils literal notranslate"><span class="pre">b</span></code> was written before <code class="docutils literal notranslate"><span class="pre">a</span></code> (thread 2’s data). See <a class="reference external" href="http://en.wikipedia.org/wiki/Relativity_of_simultaneity">Special Relativity</a> for a real-world example.</p> </div></blockquote> <p>This can happen if thread 1 and thread 3 are running on processors that are close to each other, but far away from the processors that threads 2 and 4 are running on and the writes are not being transmitted all the way across the machine before becoming visible to nearby threads.</p> <p>Neither acquire/release semantics nor explicit memory barriers can help with this. Making the orders consistent without locking requires detailed knowledge of the architecture’s memory model, but Java requires it for volatiles so we could use documentation aimed at its implementers.</p> </section> <section id="a-happens-before-race-that-s-not-a-sequentially-consistent-race"> <h3><a class="toc-backref" href="#a-happens-before-race-that-s-not-a-sequentially-consistent-race" role="doc-backlink">A happens-before race that’s not a sequentially-consistent race</a></h3> <p>From the POPL paper about the Java memory model [#JMM-popl].</p> <blockquote> <div>Initially, <code class="docutils literal notranslate"><span class="pre">x</span> <span class="pre">==</span> <span class="pre">y</span> <span class="pre">==</span> <span class="pre">0</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = x</td> <td>r2 = y</td> </tr> <tr class="row-odd"><td>if r1 != 0:</td> <td>if r2 != 0:</td> </tr> <tr class="row-even"><td>y = 42</td> <td>x = 42</td> </tr> </tbody> </table> <p>Can <code class="docutils literal notranslate"><span class="pre">r1</span> <span class="pre">==</span> <span class="pre">r2</span> <span class="pre">==</span> <span class="pre">42</span></code>???</p> </div></blockquote> <p>In a sequentially-consistent execution, there’s no way to get an adjacent read and write to the same variable, so the program should be considered correctly synchronized (albeit fragile), and should only produce <code class="docutils literal notranslate"><span class="pre">r1</span> <span class="pre">==</span> <span class="pre">r2</span> <span class="pre">==</span> <span class="pre">0</span></code>. However, the following execution is happens-before consistent:</p> <blockquote> <div><table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Statement</th> <th class="head">Value</th> <th class="head">Thread</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = x</td> <td>42</td> <td>1</td> </tr> <tr class="row-odd"><td>if r1 != 0:</td> <td>true</td> <td>1</td> </tr> <tr class="row-even"><td>y = 42</td> <td></td> <td>1</td> </tr> <tr class="row-odd"><td>r2 = y</td> <td>42</td> <td>2</td> </tr> <tr class="row-even"><td>if r2 != 0:</td> <td>true</td> <td>2</td> </tr> <tr class="row-odd"><td>x = 42</td> <td></td> <td>2</td> </tr> </tbody> </table> </div></blockquote> <p>WTF, you are asking yourself. Because there were no inter-thread happens-before edges in the original program, the read of x in thread 1 can see any of the writes from thread 2, even if they only happened because the read saw them. There <em>are</em> data races in the happens-before model.</p> <p>We don’t want to allow this, so the happens-before model isn’t enough for Python. One rule we could add to happens-before that would prevent this execution is:</p> <blockquote> <div>If there are no data races in any sequentially-consistent execution of a program, the program should have sequentially consistent semantics.</div></blockquote> <p>Java gets this rule as a theorem, but Python may not want all of the machinery you need to prove it.</p> </section> <section id="self-justifying-values"> <h3><a class="toc-backref" href="#self-justifying-values" role="doc-backlink">Self-justifying values</a></h3> <p>Also from the POPL paper about the Java memory model [#JMM-popl].</p> <blockquote> <div>Initially, <code class="docutils literal notranslate"><span class="pre">x</span> <span class="pre">==</span> <span class="pre">y</span> <span class="pre">==</span> <span class="pre">0</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = x</td> <td>r2 = y</td> </tr> <tr class="row-odd"><td>y = r1</td> <td>x = r2</td> </tr> </tbody> </table> <p>Can <code class="docutils literal notranslate"><span class="pre">x</span> <span class="pre">==</span> <span class="pre">y</span> <span class="pre">==</span> <span class="pre">42</span></code>???</p> </div></blockquote> <p>In a sequentially consistent execution, no. In a happens-before consistent execution, yes: The read of x in thread 1 is allowed to see the value written in thread 2 because there are no happens-before relations between the threads. This could happen if the compiler or processor transforms the code into:</p> <blockquote> <div><table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>y = 42</td> <td>r2 = y</td> </tr> <tr class="row-odd"><td>r1 = x</td> <td>x = r2</td> </tr> <tr class="row-even"><td>if r1 != 42:</td> <td></td> </tr> <tr class="row-odd"><td>y = r1</td> <td></td> </tr> </tbody> </table> </div></blockquote> <p>It can produce a security hole if the speculated value is a secret object, or points to the memory that an object used to occupy. Java cares a lot about such security holes, but Python may not.</p> </section> <section id="uninitialized-values-direct"> <span id="uninitialized-values"></span><h3><a class="toc-backref" href="#uninitialized-values-direct" role="doc-backlink">Uninitialized values (direct)</a></h3> <p>From several classic double-checked locking examples.</p> <blockquote> <div>Initially, <code class="docutils literal notranslate"><span class="pre">d</span> <span class="pre">==</span> <span class="pre">None</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>while not d: pass</td> <td>d = [3, 4]</td> </tr> <tr class="row-odd"><td>assert d[1] == 4</td> <td></td> </tr> </tbody> </table> <p>This could raise an IndexError, fail the assertion, or, without some care in the implementation, cause a crash or other undefined behavior.</p> </div></blockquote> <p>Thread 2 may actually be implemented as:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">r1</span> <span class="o">=</span> <span class="nb">list</span><span class="p">()</span> <span class="n">r1</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span> <span class="n">r1</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span> <span class="n">d</span> <span class="o">=</span> <span class="n">r1</span> </pre></div> </div> <p>Because the assignment to d and the item assignments are independent, the compiler and processor may optimize that to:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">r1</span> <span class="o">=</span> <span class="nb">list</span><span class="p">()</span> <span class="n">d</span> <span class="o">=</span> <span class="n">r1</span> <span class="n">r1</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span> <span class="n">r1</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span> </pre></div> </div> <p>Which is obviously incorrect and explains the IndexError. If we then look deeper into the implementation of <code class="docutils literal notranslate"><span class="pre">r1.append(3)</span></code>, we may find that it and <code class="docutils literal notranslate"><span class="pre">d[1]</span></code> cannot run concurrently without causing their own race conditions. In CPython (without the GIL), those race conditions would produce undefined behavior.</p> <p>There’s also a subtle issue on the reading side that can cause the value of d[1] to be out of date. Somewhere in the implementation of <code class="docutils literal notranslate"><span class="pre">list</span></code>, it stores its contents as an array in memory. This array may happen to be in thread 1’s cache. If thread 1’s processor reloads <code class="docutils literal notranslate"><span class="pre">d</span></code> from main memory without reloading the memory that ought to contain the values 3 and 4, it could see stale values instead. As far as I know, this can only actually happen on Alphas and maybe Itaniums, and we probably have to prevent it anyway to avoid crashes.</p> </section> <section id="uninitialized-values-flag"> <h3><a class="toc-backref" href="#uninitialized-values-flag" role="doc-backlink">Uninitialized values (flag)</a></h3> <p>From several more double-checked locking examples.</p> <blockquote> <div>Initially, <code class="docutils literal notranslate"><span class="pre">d</span> <span class="pre">==</span> <span class="pre">dict()</span></code> and <code class="docutils literal notranslate"><span class="pre">initialized</span> <span class="pre">==</span> <span class="pre">False</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>while not initialized: pass</td> <td>d[‘a’] = 3</td> </tr> <tr class="row-odd"><td>r1 = d[‘a’]</td> <td>initialized = True</td> </tr> <tr class="row-even"><td>r2 = r1 == 3</td> <td></td> </tr> <tr class="row-odd"><td>assert r2</td> <td></td> </tr> </tbody> </table> <p>This could raise a KeyError, fail the assertion, or, without some care in the implementation, cause a crash or other undefined behavior.</p> </div></blockquote> <p>Because <code class="docutils literal notranslate"><span class="pre">d</span></code> and <code class="docutils literal notranslate"><span class="pre">initialized</span></code> are independent (except in the programmer’s mind), the compiler and processor can rearrange these almost arbitrarily, except that thread 1’s assertion has to stay after the loop.</p> </section> <section id="inconsistent-guarantees-from-relying-on-data-dependencies"> <h3><a class="toc-backref" href="#inconsistent-guarantees-from-relying-on-data-dependencies" role="doc-backlink">Inconsistent guarantees from relying on data dependencies</a></h3> <p>This is a problem with Java <code class="docutils literal notranslate"><span class="pre">final</span></code> variables and the proposed <a class="reference external" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2556.html">data-dependency ordering</a> in C++0x.</p> <blockquote> <div>First execute:<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">g</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">def</span> <span class="nf">Init</span><span class="p">():</span> <span class="n">g</span><span class="o">.</span><span class="n">extend</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">])</span> <span class="k">return</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span> <span class="n">h</span> <span class="o">=</span> <span class="kc">None</span> </pre></div> </div> <p>Then in two threads:</p> <table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>while not h: pass</td> <td>r1 = Init()</td> </tr> <tr class="row-odd"><td>assert h == [1,2,3]</td> <td>freeze(r1)</td> </tr> <tr class="row-even"><td>assert h == g</td> <td>h = r1</td> </tr> </tbody> </table> <p>If h has semantics similar to a Java <code class="docutils literal notranslate"><span class="pre">final</span></code> variable (except for being write-once), then even though the first assertion is guaranteed to succeed, the second could fail.</p> </div></blockquote> <p>Data-dependent guarantees like those <code class="docutils literal notranslate"><span class="pre">final</span></code> provides only work if the access is through the final variable. It’s not even safe to access the same object through a different route. Unfortunately, because of how processors work, final’s guarantees are only cheap when they’re weak.</p> </section> </section> <section id="the-rules-for-python"> <h2><a class="toc-backref" href="#the-rules-for-python" role="doc-backlink">The rules for Python</a></h2> <p>The first rule is that Python interpreters can’t crash due to race conditions in user code. For CPython, this means that race conditions can’t make it down into C. For Jython, it means that NullPointerExceptions can’t escape the interpreter.</p> <p>Presumably we also want a model at least as strong as happens-before consistency because it lets us write a simple description of how concurrent queues and thread launching and joining work.</p> <p>Other rules are more debatable, so I’ll present each one with pros and cons.</p> <section id="data-race-free-programs-are-sequentially-consistent"> <h3><a class="toc-backref" href="#data-race-free-programs-are-sequentially-consistent" role="doc-backlink">Data-race-free programs are sequentially consistent</a></h3> <p>We’d like programmers to be able to reason about their programs as if they were sequentially consistent. Since it’s hard to tell whether you’ve written a happens-before race, we only want to require programmers to prevent sequential races. The Java model does this through a complicated definition of causality, but if we don’t want to include that, we can just assert this property directly.</p> </section> <section id="no-security-holes-from-out-of-thin-air-reads"> <h3><a class="toc-backref" href="#no-security-holes-from-out-of-thin-air-reads" role="doc-backlink">No security holes from out-of-thin-air reads</a></h3> <p>If the program produces a self-justifying value, it could expose access to an object that the user would rather the program not see. Again, Java’s model handles this with the causality definition. We might be able to prevent these security problems by banning speculative writes to shared variables, but I don’t have a proof of that, and Python may not need those security guarantees anyway.</p> </section> <section id="restrict-reorderings-instead-of-defining-happens-before"> <h3><a class="toc-backref" href="#restrict-reorderings-instead-of-defining-happens-before" role="doc-backlink">Restrict reorderings instead of defining happens-before</a></h3> <p>The .NET [#CLR-msdn] and x86 [#x86-model] memory models are based on defining which reorderings compilers may allow. I think that it’s easier to program to a happens-before model than to reason about all of the possible reorderings of a program, and it’s easier to insert enough happens-before edges to make a program correct, than to insert enough memory fences to do the same thing. So, although we could layer some reordering restrictions on top of the happens-before base, I don’t think Python’s memory model should be entirely reordering restrictions.</p> </section> <section id="atomic-unordered-assignments"> <h3><a class="toc-backref" href="#atomic-unordered-assignments" role="doc-backlink">Atomic, unordered assignments</a></h3> <p>Assignments of primitive types are already atomic. If you assign <code class="docutils literal notranslate"><span class="pre">3<<72</span> <span class="pre">+</span> <span class="pre">5</span></code> to a variable, no thread can see only part of the value. Jeremy Manson suggested that we extend this to all objects. This allows compilers to reorder operations to optimize them, without allowing some of the more confusing <a class="reference internal" href="#uninitialized-values">uninitialized values</a>. The basic idea here is that when you assign a shared variable, readers can’t see any changes made to the new value before the assignment, or to the old value after the assignment. So, if we have a program like:</p> <blockquote> <div>Initially, <code class="docutils literal notranslate"><span class="pre">(d.a,</span> <span class="pre">d.b)</span> <span class="pre">==</span> <span class="pre">(1,</span> <span class="pre">2)</span></code>, and <code class="docutils literal notranslate"><span class="pre">(e.c,</span> <span class="pre">e.d)</span> <span class="pre">==</span> <span class="pre">(3,</span> <span class="pre">4)</span></code>. We also have <code class="docutils literal notranslate"><span class="pre">class</span> <span class="pre">Obj(object):</span> <span class="pre">pass</span></code>.<table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = Obj()</td> <td>r3 = d</td> </tr> <tr class="row-odd"><td>r1.a = 3</td> <td>r4, r5 = r3.a, r3.b</td> </tr> <tr class="row-even"><td>r1.b = 4</td> <td>r6 = e</td> </tr> <tr class="row-odd"><td>d = r1</td> <td>r7, r8 = r6.c, r6.d</td> </tr> <tr class="row-even"><td>r2 = Obj()</td> <td></td> </tr> <tr class="row-odd"><td>r2.c = 6</td> <td></td> </tr> <tr class="row-even"><td>r2.d = 7</td> <td></td> </tr> <tr class="row-odd"><td>e = r2</td> <td></td> </tr> </tbody> </table> <p><code class="docutils literal notranslate"><span class="pre">(r4,</span> <span class="pre">r5)</span></code> can be <code class="docutils literal notranslate"><span class="pre">(1,</span> <span class="pre">2)</span></code> or <code class="docutils literal notranslate"><span class="pre">(3,</span> <span class="pre">4)</span></code> but nothing else, and <code class="docutils literal notranslate"><span class="pre">(r7,</span> <span class="pre">r8)</span></code> can be either <code class="docutils literal notranslate"><span class="pre">(3,</span> <span class="pre">4)</span></code> or <code class="docutils literal notranslate"><span class="pre">(6,</span> <span class="pre">7)</span></code> but nothing else. Unlike if writes were releases and reads were acquires, it’s legal for thread 2 to see <code class="docutils literal notranslate"><span class="pre">(e.c,</span> <span class="pre">e.d)</span> <span class="pre">==</span> <span class="pre">(6,</span> <span class="pre">7)</span> <span class="pre">and</span> <span class="pre">(d.a,</span> <span class="pre">d.b)</span> <span class="pre">==</span> <span class="pre">(1,</span> <span class="pre">2)</span></code> (out of order).</p> </div></blockquote> <p>This allows the compiler a lot of flexibility to optimize without allowing users to see some strange values. However, because it relies on data dependencies, it introduces some surprises of its own. For example, the compiler could freely optimize the above example to:</p> <blockquote> <div><table class="docutils align-default"> <thead> <tr class="row-odd"><th class="head">Thread 1</th> <th class="head">Thread 2</th> </tr> </thead> <tbody> <tr class="row-even"><td>r1 = Obj()</td> <td>r3 = d</td> </tr> <tr class="row-odd"><td>r2 = Obj()</td> <td>r6 = e</td> </tr> <tr class="row-even"><td>r1.a = 3</td> <td>r4, r7 = r3.a, r6.c</td> </tr> <tr class="row-odd"><td>r2.c = 6</td> <td>r5, r8 = r3.b, r6.d</td> </tr> <tr class="row-even"><td>r2.d = 7</td> <td></td> </tr> <tr class="row-odd"><td>e = r2</td> <td></td> </tr> <tr class="row-even"><td>r1.b = 4</td> <td></td> </tr> <tr class="row-odd"><td>d = r1</td> <td></td> </tr> </tbody> </table> </div></blockquote> <p>As long as it didn’t let the initialization of <code class="docutils literal notranslate"><span class="pre">e</span></code> move above any of the initializations of members of <code class="docutils literal notranslate"><span class="pre">r2</span></code>, and similarly for <code class="docutils literal notranslate"><span class="pre">d</span></code> and <code class="docutils literal notranslate"><span class="pre">r1</span></code>.</p> <p>This also helps to ground happens-before consistency. To see the problem, imagine that the user unsafely publishes a reference to an object as soon as she gets it. The model needs to constrain what values can be read through that reference. Java says that every field is initialized to 0 before anyone sees the object for the first time, but Python would have trouble defining “every field”. If instead we say that assignments to shared variables have to see a value at least as up to date as when the assignment happened, then we don’t run into any trouble with early publication.</p> </section> <section id="two-tiers-of-guarantees"> <h3><a class="toc-backref" href="#two-tiers-of-guarantees" role="doc-backlink">Two tiers of guarantees</a></h3> <p>Most other languages with any guarantees for unlocked variables distinguish between ordinary variables and volatile/atomic variables. They provide many more guarantees for the volatile ones. Python can’t easily do this because we don’t declare variables. This may or may not matter, since python locks aren’t significantly more expensive than ordinary python code. If we want to get those tiers back, we could:</p> <ol class="arabic simple"> <li>Introduce a set of atomic types similar to Java’s <a class="footnote-reference brackets" href="#java-atomics" id="id1">[5]</a> or C++’s <a class="footnote-reference brackets" href="#cpp-atomics" id="id2">[6]</a>. Unfortunately, we couldn’t assign to them with <code class="docutils literal notranslate"><span class="pre">=</span></code>.</li> <li>Without requiring variable declarations, we could also specify that <em>all</em> of the fields on a given object are atomic.</li> <li>Extend the <code class="docutils literal notranslate"><span class="pre">__slots__</span></code> mechanism <a class="footnote-reference brackets" href="#slots" id="id3">[7]</a> with a parallel <code class="docutils literal notranslate"><span class="pre">__volatiles__</span></code> list, and maybe a <code class="docutils literal notranslate"><span class="pre">__finals__</span></code> list.</li> </ol> </section> <section id="id4"> <h3><a class="toc-backref" href="#id4" role="doc-backlink">Sequential Consistency</a></h3> <p>We could just adopt sequential consistency for Python. This avoids all of the <a class="reference internal" href="#pep-583-hazards">hazards</a> mentioned above, but it prohibits lots of optimizations too. As far as I know, this is the current model of CPython, but if CPython learned to optimize out some variable reads, it would lose this property.</p> <p>If we adopt this, Jython’s <code class="docutils literal notranslate"><span class="pre">dict</span></code> implementation may no longer be able to use ConcurrentHashMap because that only promises to create appropriate happens-before edges, not to be sequentially consistent (although maybe the fact that Java volatiles are totally ordered carries over). Both Jython and IronPython would probably need to use <a class="reference external" href="http://java.sun.com/javase/6/docs/api/java/util/concurrent/atomic/AtomicReferenceArray.html">AtomicReferenceArray</a> or the equivalent for any <code class="docutils literal notranslate"><span class="pre">__slots__</span></code> arrays.</p> </section> <section id="adapt-the-x86-model"> <h3><a class="toc-backref" href="#adapt-the-x86-model" role="doc-backlink">Adapt the x86 model</a></h3> <p>The x86 model is:</p> <ol class="arabic simple"> <li>Loads are not reordered with other loads.</li> <li>Stores are not reordered with other stores.</li> <li>Stores are not reordered with older loads.</li> <li>Loads may be reordered with older stores to different locations but not with older stores to the same location.</li> <li>In a multiprocessor system, memory ordering obeys causality (memory ordering respects transitive visibility).</li> <li>In a multiprocessor system, stores to the same location have a total order.</li> <li>In a multiprocessor system, locked instructions have a total order.</li> <li>Loads and stores are not reordered with locked instructions.</li> </ol> <p>In acquire/release terminology, this appears to say that every store is a release and every load is an acquire. This is slightly weaker than sequential consistency, in that it allows <a class="reference internal" href="#inconsistent-orderings">inconsistent orderings</a>, but it disallows <a class="reference internal" href="#zombie-values">zombie values</a> and the compiler optimizations that produce them. We would probably want to weaken the model somehow to explicitly allow compilers to eliminate redundant variable reads. The x86 model may also be expensive to implement on other platforms, although because x86 is so common, that may not matter much.</p> </section> <section id="upgrading-or-downgrading-to-an-alternate-model"> <h3><a class="toc-backref" href="#upgrading-or-downgrading-to-an-alternate-model" role="doc-backlink">Upgrading or downgrading to an alternate model</a></h3> <p>We can adopt an initial memory model without totally restricting future implementations. If we start with a weak model and want to get stronger later, we would only have to change the implementations, not programs. Individual implementations could also guarantee a stronger memory model than the language demands, although that could hurt interoperability. On the other hand, if we start with a strong model and want to weaken it later, we can add a <code class="docutils literal notranslate"><span class="pre">from</span> <span class="pre">__future__</span> <span class="pre">import</span> <span class="pre">weak_memory</span></code> statement to declare that some modules are safe.</p> </section> </section> <section id="implementation-details"> <h2><a class="toc-backref" href="#implementation-details" role="doc-backlink">Implementation Details</a></h2> <p>The required model is weaker than any particular implementation. This section tries to document the actual guarantees each implementation provides, and should be updated as the implementations change.</p> <section id="cpython"> <h3><a class="toc-backref" href="#cpython" role="doc-backlink">CPython</a></h3> <p>Uses the GIL to guarantee that other threads don’t see funny reorderings, and does few enough optimizations that I believe it’s actually sequentially consistent at the bytecode level. Threads can switch between any two bytecodes (instead of only between statements), so two threads that concurrently execute:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span> </pre></div> </div> <p>with <code class="docutils literal notranslate"><span class="pre">i</span></code> initially <code class="docutils literal notranslate"><span class="pre">0</span></code> could easily end up with <code class="docutils literal notranslate"><span class="pre">i==1</span></code> instead of the expected <code class="docutils literal notranslate"><span class="pre">i==2</span></code>. If they execute:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">i</span> <span class="o">+=</span> <span class="mi">1</span> </pre></div> </div> <p>instead, CPython 2.6 will always give the right answer, but it’s easy to imagine another implementation in which this statement won’t be atomic.</p> </section> <section id="id5"> <h3><a class="toc-backref" href="#id5" role="doc-backlink">PyPy</a></h3> <p>Also uses a GIL, but probably does enough optimization to violate sequential consistency. I know very little about this implementation.</p> </section> <section id="id6"> <h3><a class="toc-backref" href="#id6" role="doc-backlink">Jython</a></h3> <p>Provides true concurrency under the <a class="reference external" href="http://java.sun.com/docs/books/jls/third_edition/html/memory.html">Java memory model</a> and stores all object fields (except for those in <code class="docutils literal notranslate"><span class="pre">__slots__</span></code>?) in a <a class="reference external" href="http://java.sun.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html">ConcurrentHashMap</a>, which provides fairly strong ordering guarantees. Local variables in a function may have fewer guarantees, which would become visible if they were captured into a closure that was then passed to another thread.</p> </section> <section id="id7"> <h3><a class="toc-backref" href="#id7" role="doc-backlink">IronPython</a></h3> <p>Provides true concurrency under the CLR memory model, which probably protects it from <a class="reference internal" href="#uninitialized-values">uninitialized values</a>. IronPython uses a locked map to store object fields, providing at least as many guarantees as Jython.</p> </section> </section> <section id="references"> <h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2> <aside class="footnote-list brackets"> <aside class="footnote brackets" id="jmm-popl" role="doc-footnote"> <dt class="label" id="jmm-popl">[1]</dt> <dd>The Java Memory Model, by Jeremy Manson, Bill Pugh, and Sarita Adve (<a class="reference external" href="http://www.cs.umd.edu/users/jmanson/java/journal.pdf">http://www.cs.umd.edu/users/jmanson/java/journal.pdf</a>). This paper is an excellent introduction to memory models in general and has lots of examples of compiler/processor optimizations and the strange program behaviors they can produce.</aside> <aside class="footnote brackets" id="cpp0x-memory-model" role="doc-footnote"> <dt class="label" id="cpp0x-memory-model">[2]</dt> <dd>N2480: A Less Formal Explanation of the Proposed C++ Concurrency Memory Model, Hans Boehm (<a class="reference external" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2480.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2480.html</a>)</aside> <aside class="footnote brackets" id="clr-msdn" role="doc-footnote"> <dt class="label" id="clr-msdn">[3]</dt> <dd>Memory Models: Understand the Impact of Low-Lock Techniques in Multithreaded Apps, Vance Morrison (<a class="reference external" href="http://msdn2.microsoft.com/en-us/magazine/cc163715.aspx">http://msdn2.microsoft.com/en-us/magazine/cc163715.aspx</a>)</aside> <aside class="footnote brackets" id="x86-model" role="doc-footnote"> <dt class="label" id="x86-model">[4]</dt> <dd>Intel(R) 64 Architecture Memory Ordering White Paper (<a class="reference external" href="http://www.intel.com/products/processor/manuals/318147.pdf">http://www.intel.com/products/processor/manuals/318147.pdf</a>)</aside> <aside class="footnote brackets" id="java-atomics" role="doc-footnote"> <dt class="label" id="java-atomics">[<a href="#id1">5</a>]</dt> <dd>Package java.util.concurrent.atomic (<a class="reference external" href="http://java.sun.com/javase/6/docs/api/java/util/concurrent/atomic/package-summary.html">http://java.sun.com/javase/6/docs/api/java/util/concurrent/atomic/package-summary.html</a>)</aside> <aside class="footnote brackets" id="cpp-atomics" role="doc-footnote"> <dt class="label" id="cpp-atomics">[<a href="#id2">6</a>]</dt> <dd>C++ Atomic Types and Operations, Hans Boehm and Lawrence Crowl (<a class="reference external" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2427.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2427.html</a>)</aside> <aside class="footnote brackets" id="slots" role="doc-footnote"> <dt class="label" id="slots">[<a href="#id3">7</a>]</dt> <dd>__slots__ (<a class="reference external" href="http://docs.python.org/ref/slots.html">http://docs.python.org/ref/slots.html</a>)</aside> <aside class="footnote brackets" id="id8" role="doc-footnote"> <dt class="label" id="id8">[8]</dt> <dd>Alternatives to SC, a thread on the cpp-threads mailing list, which includes lots of good examples. (<a class="reference external" href="http://www.decadentplace.org.uk/pipermail/cpp-threads/2007-January/001287.html">http://www.decadentplace.org.uk/pipermail/cpp-threads/2007-January/001287.html</a>)</aside> <aside class="footnote brackets" id="safethread" role="doc-footnote"> <dt class="label" id="safethread">[9]</dt> <dd>python-safethread, a patch by Adam Olsen for CPython that removes the GIL and statically guarantees that all objects shared between threads are consistently locked. (<a class="reference external" href="http://code.google.com/p/python-safethread/">http://code.google.com/p/python-safethread/</a>)</aside> </aside> </section> <section id="acknowledgements"> <h2><a class="toc-backref" href="#acknowledgements" role="doc-backlink">Acknowledgements</a></h2> <p>Thanks to Jeremy Manson and Alex Martelli for detailed discussions on what this PEP should look like.</p> </section> <section id="copyright"> <h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2> <p>This document has been placed in the public domain.</p> </section> </section> <hr class="docutils" /> <p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0583.rst">https://github.com/python/peps/blob/main/peps/pep-0583.rst</a></p> <p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0583.rst">2023-09-09 17:39:29 GMT</a></p> </article> <nav id="pep-sidebar"> <h2>Contents</h2> <ul> <li><a class="reference internal" href="#abstract">Abstract</a></li> <li><a class="reference internal" href="#rationale">Rationale</a></li> <li><a class="reference internal" href="#a-couple-definitions">A couple definitions</a></li> <li><a class="reference internal" href="#two-simple-memory-models">Two simple memory models</a><ul> <li><a class="reference internal" href="#sequential-consistency">Sequential Consistency</a></li> <li><a class="reference internal" href="#happens-before-consistency">Happens-before consistency</a><ul> <li><a class="reference internal" href="#an-example">An example</a></li> </ul> </li> </ul> </li> <li><a class="reference internal" href="#surprising-behaviors-with-races">Surprising behaviors with races</a><ul> <li><a class="reference internal" href="#zombie-values">Zombie values</a></li> <li><a class="reference internal" href="#inconsistent-orderings">Inconsistent Orderings</a></li> <li><a class="reference internal" href="#a-happens-before-race-that-s-not-a-sequentially-consistent-race">A happens-before race that’s not a sequentially-consistent race</a></li> <li><a class="reference internal" href="#self-justifying-values">Self-justifying values</a></li> <li><a class="reference internal" href="#uninitialized-values-direct">Uninitialized values (direct)</a></li> <li><a class="reference internal" href="#uninitialized-values-flag">Uninitialized values (flag)</a></li> <li><a class="reference internal" href="#inconsistent-guarantees-from-relying-on-data-dependencies">Inconsistent guarantees from relying on data dependencies</a></li> </ul> </li> <li><a class="reference internal" href="#the-rules-for-python">The rules for Python</a><ul> <li><a class="reference internal" href="#data-race-free-programs-are-sequentially-consistent">Data-race-free programs are sequentially consistent</a></li> <li><a class="reference internal" href="#no-security-holes-from-out-of-thin-air-reads">No security holes from out-of-thin-air reads</a></li> <li><a class="reference internal" href="#restrict-reorderings-instead-of-defining-happens-before">Restrict reorderings instead of defining happens-before</a></li> <li><a class="reference internal" href="#atomic-unordered-assignments">Atomic, unordered assignments</a></li> <li><a class="reference internal" href="#two-tiers-of-guarantees">Two tiers of guarantees</a></li> <li><a class="reference internal" href="#id4">Sequential Consistency</a></li> <li><a class="reference internal" href="#adapt-the-x86-model">Adapt the x86 model</a></li> <li><a class="reference internal" href="#upgrading-or-downgrading-to-an-alternate-model">Upgrading or downgrading to an alternate model</a></li> </ul> </li> <li><a class="reference internal" href="#implementation-details">Implementation Details</a><ul> <li><a class="reference internal" href="#cpython">CPython</a></li> <li><a class="reference internal" href="#id5">PyPy</a></li> <li><a class="reference internal" href="#id6">Jython</a></li> <li><a class="reference internal" href="#id7">IronPython</a></li> </ul> </li> <li><a class="reference internal" href="#references">References</a></li> <li><a class="reference internal" href="#acknowledgements">Acknowledgements</a></li> <li><a class="reference internal" href="#copyright">Copyright</a></li> </ul> <br> <a id="source" href="https://github.com/python/peps/blob/main/peps/pep-0583.rst">Page Source (GitHub)</a> </nav> </section> <script src="../_static/colour_scheme.js"></script> <script src="../_static/wrap_tables.js"></script> <script src="../_static/sticky_banner.js"></script> </body> </html>