<!DOCTYPE html> <!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> <!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> <head> <meta charset="utf-8"> <meta content="Learn about the ACID transaction guarantees between reads and writes provided by Delta Lake." name="description" /> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Concurrency control — Delta Lake Documentation</title> <link rel="canonical" href="https://docs.delta.io/latest/concurrency-control.html"> <link rel="shortcut icon" href="_static/favicon.ico"/> <link href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:300,400,600" rel="stylesheet"> <link rel="stylesheet" href="_static/css/theme.css" type="text/css" /> <link rel="stylesheet" href="_static/css/custom.css" type="text/css" /> <link rel="stylesheet" href="_static/css/algolia.css" type="text/css" /> <link rel="index" title="Index" href="genindex.html"/> <link rel="search" title="Search" href="search.html"/> <link rel="top" title="Delta Lake Documentation" href="index.html"/> <link rel="up" title="Apache Spark connector" href="delta-spark.html"/> <link rel="next" title="Migration guide" href="/porting.html"/> <link rel="prev" title="Read Delta Sharing Tables" href="/delta-sharing.html"/> <script src="_static/js/modernizr.min.js"></script> </head> <body class="wy-body-for-nav" role="document"> <nav class="wy-nav-top header" role="navigation" aria-label="top navigation"> <ul> <li class="menu-toggle"> <i data-toggle="wy-nav-top" class="wy-nav-top-menu-button db-icon db-icon-menu pull-left"></i> <a href="index.html" class="wy-nav-top-logo"><img src="_static/delta-lake-logo.png" alt="Delta Lake" /></a> <span class="version">3.3.1</span> </li> </ul> </nav> <page> <nav data-toggle="wy-nav-shift" class="wy-nav-side relative"> <div class="wy-side-scroll"> <div class="wy-side-nav-search"> <div role="search"> <input id="algolia-search" type="text" name="q" placeholder="Search" /> </div> 
</div> <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation"> <a href="index.html" class="main-navigation-home"><img src="_static/icons/nav-home.svg"> Delta Lake</a> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="delta-intro.html">Introduction</a></li> <li class="toctree-l1 current"><a class="reference internal" href="delta-spark.html">Apache Spark connector</a><ul class="current"> <li class="toctree-l2"><a class="reference internal" href="quick-start.html">Quickstart</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-batch.html">Table batch reads and writes</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-streaming.html">Table streaming reads and writes</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-update.html">Table deletes, updates, and merges</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-change-data-feed.html">Change data feed</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-utility.html">Table utility commands</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-constraints.html">Constraints</a></li> <li class="toctree-l2"><a class="reference internal" href="versioning.html">How does Delta Lake manage feature compatibility?</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-default-columns.html">Delta default column values</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-column-mapping.html">Delta column mapping</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-clustering.html">Use liquid clustering for Delta tables</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-deletion-vectors.html">What are deletion vectors?</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-drop-feature.html">Drop Delta table features</a></li> 
<li class="toctree-l2"><a class="reference internal" href="delta-row-tracking.html">Use row tracking for Delta tables</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-storage.html">Storage configuration</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-type-widening.html">Delta type widening</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-uniform.html">Universal Format (UniForm)</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-sharing.html">Read Delta Sharing Tables</a></li> <li class="toctree-l2 current"><a class="current reference internal" href="#">Concurrency control</a><ul> <li class="toctree-l3"><a class="reference internal" href="#optimistic-concurrency-control">Optimistic concurrency control</a></li> <li class="toctree-l3"><a class="reference internal" href="#write-conflicts">Write conflicts</a></li> <li class="toctree-l3"><a class="reference internal" href="#avoid-conflicts-using-partitioning-and-disjoint-command-conditions">Avoid conflicts using partitioning and disjoint command conditions</a></li> <li class="toctree-l3"><a class="reference internal" href="#conflict-exceptions">Conflict exceptions</a><ul> <li class="toctree-l4"><a class="reference internal" href="#concurrentappendexception">ConcurrentAppendException</a></li> <li class="toctree-l4"><a class="reference internal" href="#concurrentdeletereadexception">ConcurrentDeleteReadException</a></li> <li class="toctree-l4"><a class="reference internal" href="#concurrentdeletedeleteexception">ConcurrentDeleteDeleteException</a></li> <li class="toctree-l4"><a class="reference internal" href="#metadatachangedexception">MetadataChangedException</a></li> <li class="toctree-l4"><a class="reference internal" href="#concurrenttransactionexception">ConcurrentTransactionException</a></li> <li class="toctree-l4"><a class="reference internal" href="#protocolchangedexception">ProtocolChangedException</a></li> </ul> 
</li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="porting.html">Migration guide</a></li> <li class="toctree-l2"><a class="reference internal" href="best-practices.html">Best practices</a></li> <li class="toctree-l2"><a class="reference internal" href="delta-faq.html">Frequently asked questions (FAQ)</a></li> <li class="toctree-l2"><a class="reference internal" href="optimizations-oss.html">Optimizations</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="delta-trino-integration.html">Trino connector</a></li> <li class="toctree-l1"><a class="reference internal" href="delta-presto-integration.html">Presto connector</a></li> <li class="toctree-l1"><a class="reference internal" href="redshift-spectrum-integration.html">AWS Redshift Spectrum connector</a></li> <li class="toctree-l1"><a class="reference internal" href="snowflake-integration.html">Snowflake connector</a></li> <li class="toctree-l1"><a class="reference internal" href="bigquery-integration.html">Google BigQuery connector</a></li> <li class="toctree-l1"><a class="reference internal" href="flink-integration.html">Apache Flink connector</a></li> <li class="toctree-l1"><a class="reference internal" href="delta-more-connectors.html">Other connectors</a></li> <li class="toctree-l1"><a class="reference internal" href="delta-kernel.html">Delta Kernel</a></li> <li class="toctree-l1"><a class="reference internal" href="delta-standalone.html">Delta Standalone (deprecated)</a></li> <li class="toctree-l1"><a class="reference internal" href="delta-apidoc.html">Delta Lake APIs</a></li> <li class="toctree-l1"><a class="reference internal" href="releases.html">Releases</a></li> <li class="toctree-l1"><a class="reference internal" href="delta-resources.html">Delta Lake resources</a></li> <li class="toctree-l1"><a class="reference internal" href="table-properties.html">Delta table properties reference</a></li> </ul> </div> <div role="contentinfo"> <p class="build_info"> 
Updated Mar 27, 2025 </p> <p> <a id='feedbacklink' href="https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md" target="_blank">Contribute</a> </p> </div> </div> </nav> <main class="wy-grid-for-nav"> <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"> <div class="wy-nav-content"> <div class="rst-content"> <div role="navigation" aria-label="breadcrumbs navigation"> <ul class="wy-breadcrumbs"> <li><a href="index.html">Documentation</a> <span class="db-icon db-icon-chevron-right"></span></li> <li><a href="delta-spark.html">Apache Spark connector</a> <span class="db-icon db-icon-chevron-right"></span></li> <li>Concurrency control</li> <li class="wy-breadcrumbs-aside"> <a href="https://github.com/delta-io/delta" class="fa fa-github"> Delta Lake GitHub repo</a> </li> </ul> </div> <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> <div itemprop="articleBody"> <div class="section" id="concurrency-control"> <h1>Concurrency control<a class="headerlink" href="#concurrency-control" title="Permalink to this headline"> </a></h1> <p>Delta Lake provides ACID transaction guarantees between reads and writes. 
This means that:</p> <ul class="simple"> <li><p>For supported <a class="reference internal" href="delta-storage.html"><span class="doc">storage systems</span></a>, multiple writers across multiple clusters can simultaneously modify a table partition. Each writer sees a consistent snapshot view of the table, and the writes are committed in a serial order.</p></li> <li><p>Readers continue to see a consistent snapshot view of the table that the Apache Spark job started with, even when the table is modified during the job.</p></li> </ul> <div class="contents local topic" id="in-this-article"> <p class="topic-title first">In this article:</p> <ul class="simple"> <li><p><a class="reference internal" href="#optimistic-concurrency-control" id="id1">Optimistic concurrency control</a></p></li> <li><p><a class="reference internal" href="#write-conflicts" id="id2">Write conflicts</a></p></li> <li><p><a class="reference internal" href="#avoid-conflicts-using-partitioning-and-disjoint-command-conditions" id="id3">Avoid conflicts using partitioning and disjoint command conditions</a></p></li> <li><p><a class="reference internal" href="#conflict-exceptions" id="id4">Conflict exceptions</a></p></li> </ul> </div> <div class="section" id="optimistic-concurrency-control"> <span id="-optimistic-concurrency-control"></span><h2><a class="toc-backref" href="#id1">Optimistic concurrency control</a><a class="headerlink" href="#optimistic-concurrency-control" title="Permalink to this headline"> </a></h2> <p>Delta Lake uses <a class="reference external" href="https://en.wikipedia.org/wiki/Optimistic_concurrency_control">optimistic concurrency control</a> to provide transactional guarantees between writes. 
Under this mechanism, writes operate in three stages:</p> <ol class="arabic simple"> <li><p><strong>Read</strong>: Reads (if needed) the latest available version of the table to identify which files need to be modified (that is, rewritten).</p></li> <li><p><strong>Write</strong>: Stages all the changes by writing new data files.</p></li> <li><p><strong>Validate and commit</strong>: Before committing the changes, checks whether the proposed changes conflict with any other changes that may have been concurrently committed since the snapshot that was read. If there are no conflicts, all the staged changes are committed as a new versioned snapshot, and the write operation succeeds. However, if there are conflicts, the write operation fails with a concurrent modification exception rather than corrupting the table, as would happen with the same write operation on a plain Parquet table.</p></li> </ol> </div> <div class="section" id="write-conflicts"> <span id="-write-conflicts"></span><h2><a class="toc-backref" href="#id2">Write conflicts</a><a class="headerlink" href="#write-conflicts" title="Permalink to this headline"> </a></h2> <p>The following table describes which pairs of write operations can conflict. 
Compaction refers to a <a class="reference internal" href="best-practices.html#-compact-files"><span class="std std-ref">file compaction operation</span></a> written with the option <code class="docutils literal notranslate"><span class="pre">dataChange</span></code> set to <code class="docutils literal notranslate"><span class="pre">false</span></code>.</p> <table class="docutils align-center"> <colgroup> <col style="width: 25%" /> <col style="width: 25%" /> <col style="width: 25%" /> <col style="width: 25%" /> </colgroup> <thead> <tr class="row-odd"><th class="head"></th> <th class="head"><p>INSERT</p></th> <th class="head"><p>UPDATE, DELETE, MERGE INTO</p></th> <th class="head"><p>COMPACTION</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p><strong>INSERT</strong></p></td> <td><p>Cannot conflict</p></td> <td></td> <td></td> </tr> <tr class="row-odd"><td><p><strong>UPDATE, DELETE, MERGE INTO</strong></p></td> <td><p>Can conflict</p></td> <td><p>Can conflict</p></td> <td></td> </tr> <tr class="row-even"><td><p><strong>COMPACTION</strong></p></td> <td><p>Cannot conflict</p></td> <td><p>Can conflict</p></td> <td><p>Can conflict</p></td> </tr> </tbody> </table> </div> <div class="section" id="avoid-conflicts-using-partitioning-and-disjoint-command-conditions"> <span id="-avoid-conflicts-using-partitioning-and-disjoint-command-conditions"></span><h2><a class="toc-backref" href="#id3">Avoid conflicts using partitioning and disjoint command conditions</a><a class="headerlink" href="#avoid-conflicts-using-partitioning-and-disjoint-command-conditions" title="Permalink to this headline"> </a></h2> <p>In all cases marked “can conflict”, whether the two operations will conflict depends on whether they operate on the same set of files. You can make the two sets of files disjoint by partitioning the table by the same columns as those used in the conditions of the operations. 
For example, the two commands <code class="docutils literal notranslate"><span class="pre">UPDATE</span> <span class="pre">table</span> <span class="pre">WHERE</span> <span class="pre">date</span> <span class="pre">&gt;</span> <span class="pre">'2010-01-01'</span> <span class="pre">...</span></code> and <code class="docutils literal notranslate"><span class="pre">DELETE</span> <span class="pre">table</span> <span class="pre">WHERE</span> <span class="pre">date</span> <span class="pre">&lt;</span> <span class="pre">'2010-01-01'</span></code> will conflict if the table is not partitioned by date, as both can attempt to modify the same set of files. Partitioning the table by <code class="docutils literal notranslate"><span class="pre">date</span></code> will avoid the conflict. Hence, partitioning a table according to the conditions commonly used in commands can reduce conflicts significantly. However, partitioning a table by a column that has high cardinality can lead to other performance issues due to the large number of subdirectories.</p> </div> <div class="section" id="conflict-exceptions"> <span id="-conflict-exceptions"></span><h2><a class="toc-backref" href="#id4">Conflict exceptions</a><a class="headerlink" href="#conflict-exceptions" title="Permalink to this headline"> </a></h2> <p>When a transaction conflict occurs, you will observe one of the following exceptions:</p> <div class="contents local topic" id="contents"> <ul class="simple"> <li><p><a class="reference internal" href="#concurrentappendexception" id="id5">ConcurrentAppendException</a></p></li> <li><p><a class="reference internal" href="#concurrentdeletereadexception" id="id6">ConcurrentDeleteReadException</a></p></li> <li><p><a class="reference internal" href="#concurrentdeletedeleteexception" id="id7">ConcurrentDeleteDeleteException</a></p></li> <li><p><a class="reference internal" href="#metadatachangedexception" id="id8">MetadataChangedException</a></p></li> <li><p><a class="reference internal" 
href="#concurrenttransactionexception" id="id9">ConcurrentTransactionException</a></p></li> <li><p><a class="reference internal" href="#protocolchangedexception" id="id10">ProtocolChangedException</a></p></li> </ul> </div> <div class="section" id="concurrentappendexception"> <span id="-concurrentappendexception"></span><h3><a class="toc-backref" href="#id5">ConcurrentAppendException</a><a class="headerlink" href="#concurrentappendexception" title="Permalink to this headline"> </a></h3> <p>This exception occurs when a concurrent operation adds files in the same partition (or anywhere in an unpartitioned table) that your operation reads. The file additions can be caused by <code class="docutils literal notranslate"><span class="pre">INSERT</span></code>, <code class="docutils literal notranslate"><span class="pre">DELETE</span></code>, <code class="docutils literal notranslate"><span class="pre">UPDATE</span></code>, or <code class="docutils literal notranslate"><span class="pre">MERGE</span></code> operations.</p> <p>This exception is often thrown during concurrent <code class="docutils literal notranslate"><span class="pre">DELETE</span></code>, <code class="docutils literal notranslate"><span class="pre">UPDATE</span></code>, or <code class="docutils literal notranslate"><span class="pre">MERGE</span></code> operations. While the concurrent operations may be physically updating different partition directories, one of them may read the same partition that the other one concurrently updates, thus causing a conflict. You can avoid this by making the separation explicit in the operation condition. 
Consider the following example.</p> <div class="highlight-scala notranslate"><div class="highlight"><pre><span></span><span class="c1">// Target 'deltaTable' is partitioned by date and country</span>
<span class="n">deltaTable</span><span class="p">.</span><span class="n">as</span><span class="p">(</span><span class="s">"t"</span><span class="p">).</span><span class="n">merge</span><span class="p">(</span>
<span class="w">  </span><span class="n">source</span><span class="p">.</span><span class="n">as</span><span class="p">(</span><span class="s">"s"</span><span class="p">),</span>
<span class="w">  </span><span class="s">"s.user_id = t.user_id AND s.date = t.date AND s.country = t.country"</span><span class="p">)</span>
<span class="w">  </span><span class="p">.</span><span class="n">whenMatched</span><span class="p">().</span><span class="n">updateAll</span><span class="p">()</span>
<span class="w">  </span><span class="p">.</span><span class="n">whenNotMatched</span><span class="p">().</span><span class="n">insertAll</span><span class="p">()</span>
<span class="w">  </span><span class="p">.</span><span class="n">execute</span><span class="p">()</span>
</pre></div> </div> <p>Suppose you run the above code concurrently for different dates or countries. Since each job is working on an independent partition of the target Delta table, you don’t expect any conflicts. However, the condition is not explicit enough: the merge can scan the entire table and therefore conflict with concurrent operations updating any other partition. 
Instead, you can rewrite your statement to add the specific date and country to the merge condition, as shown in the following example.</p> <div class="highlight-scala notranslate"><div class="highlight"><pre><span></span><span class="c1">// Target 'deltaTable' is partitioned by date and country</span>
<span class="c1">// &lt;date&gt; and &lt;country&gt; are placeholders for the partition values this job writes to</span>
<span class="n">deltaTable</span><span class="p">.</span><span class="n">as</span><span class="p">(</span><span class="s">"t"</span><span class="p">).</span><span class="n">merge</span><span class="p">(</span>
<span class="w">  </span><span class="n">source</span><span class="p">.</span><span class="n">as</span><span class="p">(</span><span class="s">"s"</span><span class="p">),</span>
<span class="w">  </span><span class="s">"s.user_id = t.user_id AND s.date = t.date AND s.country = t.country AND t.date = '"</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="o">&lt;</span><span class="n">date</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">"' AND t.country = '"</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="o">&lt;</span><span class="n">country</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">"'"</span><span class="p">)</span>
<span class="w">  </span><span class="p">.</span><span class="n">whenMatched</span><span class="p">().</span><span class="n">updateAll</span><span class="p">()</span>
<span class="w">  </span><span class="p">.</span><span class="n">whenNotMatched</span><span class="p">().</span><span class="n">insertAll</span><span class="p">()</span>
<span class="w">  </span><span class="p">.</span><span class="n">execute</span><span class="p">()</span>
</pre></div> </div> <p>This operation is now safe to run concurrently on different dates and countries.</p> </div> <div class="section" id="concurrentdeletereadexception"> <span 
id="-concurrentdeletereadexception"></span><h3><a class="toc-backref" href="#id6">ConcurrentDeleteReadException</a><a class="headerlink" href="#concurrentdeletereadexception" title="Permalink to this headline"> </a></h3> <p>This exception occurs when a concurrent operation deleted a file that your operation read. Common causes are a <code class="docutils literal notranslate"><span class="pre">DELETE</span></code>, <code class="docutils literal notranslate"><span class="pre">UPDATE</span></code>, or <code class="docutils literal notranslate"><span class="pre">MERGE</span></code> operation that rewrites files.</p> </div> <div class="section" id="concurrentdeletedeleteexception"> <span id="-concurrentdeletedeleteexception"></span><h3><a class="toc-backref" href="#id7">ConcurrentDeleteDeleteException</a><a class="headerlink" href="#concurrentdeletedeleteexception" title="Permalink to this headline"> </a></h3> <p>This exception occurs when a concurrent operation deleted a file that your operation also deletes. This could be caused by two concurrent compaction operations rewriting the same files.</p> </div> <div class="section" id="metadatachangedexception"> <span id="-metadatachangedexception"></span><h3><a class="toc-backref" href="#id8">MetadataChangedException</a><a class="headerlink" href="#metadatachangedexception" title="Permalink to this headline"> </a></h3> <p>This exception occurs when a concurrent transaction updates the metadata of a Delta table. 
Common causes are <code class="docutils literal notranslate"><span class="pre">ALTER</span> <span class="pre">TABLE</span></code> operations or writes to your Delta table that update the schema of the table.</p> </div> <div class="section" id="concurrenttransactionexception"> <span id="-concurrenttransactionexception"></span><h3><a class="toc-backref" href="#id9">ConcurrentTransactionException</a><a class="headerlink" href="#concurrenttransactionexception" title="Permalink to this headline"> </a></h3> <p>This exception occurs when a streaming query using the same checkpoint location is started multiple times concurrently and the instances try to write to the Delta table at the same time. You should never have two streaming queries use the same checkpoint location and run at the same time.</p> </div> <div class="section" id="protocolchangedexception"> <span id="-protocolchangedexception"></span><h3><a class="toc-backref" href="#id10">ProtocolChangedException</a><a class="headerlink" href="#protocolchangedexception" title="Permalink to this headline"> </a></h3> <p>This exception can occur in the following cases:</p> <ul class="simple"> <li><p>When your Delta table is upgraded to a new protocol version. 
For future operations to succeed you may need to upgrade your Delta Lake version.</p></li> <li><p>When multiple writers are creating or replacing a table at the same time.</p></li> <li><p>When multiple writers are writing to an empty path at the same time.</p></li> </ul> </div> </div> </div> </div> </div> <footer> <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation"> <a href="delta-sharing.html" class="btn btn-neutral" title="Read Delta Sharing Tables" accesskey="p"><span class="db-icon db-icon-chevron-left"></span> Previous</a> <a href="porting.html" class="btn btn-neutral" title="Migration guide" accesskey="n">Next <span class="db-icon db-icon-chevron-right"></span></a> </div> <hr/> <div role="contentinfo"> </div> </footer> </div> </div> </section> </main> </page> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT:'./', VERSION:'3.3.1', COLLAPSE_INDEX:false, FILE_SUFFIX:'.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="_static/jquery.js"></script> <script type="text/javascript" src="_static/underscore.js"></script> <script type="text/javascript" src="_static/doctools.js"></script> <script type="text/javascript" src="_static/language_data.js"></script> <script type="text/javascript" src="_static/js/clipboard.min.js"></script> <script type="text/javascript" src="_static/js/jquery.waypoints.min.js"></script> <script type="text/javascript">var CLIPPY_SVG_PATH = "_static/clippy.svg";</script> <script type="text/javascript" src="_static/js/custom.js"></script> <script type="text/javascript"> jQuery(function () { SphinxRtdTheme.StickyNav.enable(); }); </script> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-138952006-1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'UA-138952006-1'); </script> <!-- Algolia Search --> <script type="text/javascript"> var algoliaConfigs = { key: 
'65b6bc89fa5c7427e1c3b7996d1b3735', index: 'delta_io', }; </script> <div id="algolia-wrapper"></div> <script src="_static/js/tether.min.js"></script> <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.js"></script> </body> </html>