<!doctype html><html lang=en-us><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"><title>Ownership-based Buffer Deallocation - MLIR</title><meta name=description content="Multi-Level IR Compiler Framework"><meta name=generator content="Hugo 0.119.0"><link href=https://mlir.llvm.org/index.xml rel=alternate type=application/rss+xml><link rel=canonical href=https://mlir.llvm.org/docs/OwnershipBasedBufferDeallocation/><link rel=stylesheet href=https://mlir.llvm.org/css/theme.css><script src=https://use.fontawesome.com/releases/v5.0.6/js/all.js></script> <link rel=stylesheet href=https://mlir.llvm.org/css/chroma.min.css><script src=https://cdn.jsdelivr.net/npm/jquery@3.3.1/dist/jquery.min.js></script> <script src=https://cdn.jsdelivr.net/npm/jquery.easing@1.4.1/jquery.easing.min.js></script> <script src=https://mlir.llvm.org/js/bundle.js></script> <script type=text/javascript src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script> <script type=text/x-mathjax-config> MathJax.Hub.Config({ tex2jax: { inlineMath: [['$', '$'] ], displayMath: [ ['$$','$$'], ["\\[","\\]"] ] } }); </script><link rel=apple-touch-icon sizes=180x180 href="/apple-touch-icon.png?v=1"><link rel=icon type=image/png sizes=32x32 href="/favicon-32x32.png?v=1"><link rel=icon type=image/png sizes=16x16 href="/favicon-16x16.png?v=1"><link rel=manifest href="/site.webmanifest?v=1"><link rel=mask-icon href="/safari-pinned-tab.svg?v=1" color=#3775e0><link rel="shortcut icon" href="/favicon.ico?v=1"><meta name=msapplication-TileColor content="#2d89ef"><meta name=theme-color content="#ffffff"><link rel=icon href=/favicon.svg type=image/svg+xml sizes=any><style>:root{}</style></head><body><div class=container><header><h1><div><img src=https://mlir.llvm.org//mlir-logo.png width=40px align=absmiddle> MLIR</div></h1><p 
class=description>Multi-Level IR Compiler Framework</p></header><div class=global-menu><nav><ul><li class=parent><a href>Community<i class="fas fa-angle-right"></i></a><ul class=sub-menu><li class=child><a href=https://llvm.discourse.group/c/mlir/31>Forums</a></li><li class=child><a href=https://discord.gg/xS7Z362>Chat</a></li></ul></li><li><a href=/getting_started/Debugging/>Debugging Tips</a></li><li><a href=/getting_started/Faq/>FAQ</a></li><li class=parent><a href=https://github.com/llvm/llvm-project/tree/main/mlir>Source<i class="fas fa-angle-right"></i></a><ul class=sub-menu><li class=child><a href=/doxygen/>Doxygen</a></li><li class=child><a href=https://github.com/llvm/llvm-project/tree/main/mlir>GitHub</a></li></ul></li><li><a href="https://bugs.llvm.org/buglist.cgi?bug_status=__open__&list_id=177877&order=changeddate%20DESC%2Cpriority%2Cbug_severity&product=MLIR&query_format=specific">Bugs</a></li><li><a href=https://github.com/llvm/mlir-www/tree/main/website/static/LogoAssets>Logo Assets</a></li><li><a href=https://www.youtube.com/MLIRCompiler>Youtube Channel</a></li></ul></nav></div><div class=content-container><main><h1>Ownership-based Buffer Deallocation</h1><p><nav id=TableOfContents><ul><li><a href=#function-boundary-abi>Function boundary ABI</a></li><li><a href=#inserting-bufferizationdealloc-operations>Inserting <code>bufferization.dealloc</code> operations</a></li><li><a href=#supported-interfaces>Supported interfaces</a></li><li><a href=#limitations>Limitations</a></li><li><a href=#example>Example</a></li><li><a href=#buffer-deallocation-simplification-pass>Buffer Deallocation Simplification Pass</a></li><li><a href=#lower-deallocations-pass>Lower Deallocations Pass</a><ul><li><a href=#generic-lowering>Generic Lowering</a></li><li><a href=#specialized-lowerings>Specialized Lowerings</a></li></ul></li></ul></nav><p>One-Shot Bufferize does not deallocate any buffers that it allocates. 
After running One-Shot Bufferize, the resulting IR may have a number of <code>memref.alloc</code> ops, but no <code>memref.dealloc</code> ops. Buffer deallocation is delegated to the <code>-ownership-based-buffer-deallocation</code> pass. This pass supersedes the now deprecated <code>-buffer-deallocation</code> pass, which does not work well with One-Shot Bufferize.</p><p>At a high level, buffers are “owned” by a basic block. Ownership materializes as an <code>i1</code> SSA value and can be thought of as “responsibility to deallocate”. It is conceptually similar to <code>std::unique_ptr</code> in C++.</p><p>There are a few additional preprocessing and postprocessing passes that should be run together with the ownership-based buffer deallocation pass. The recommended compilation pipeline is as follows:</p><pre tabindex=0><code>one-shot-bufferize
       |          it's recommended to perform all bufferization here at the latest,
       |       <- any allocations inserted after this point have to be handled
       V          manually
expand-realloc
       V
ownership-based-buffer-deallocation
       V
  canonicalize <- mostly for scf.if simplifications
       V
buffer-deallocation-simplification
       V       <- from this point onwards no tensor values are allowed
lower-deallocations
       V
      CSE
       V
  canonicalize
</code></pre><p>The entire deallocation pipeline (excluding <code>-one-shot-bufferize</code>) is exposed as <code>-buffer-deallocation-pipeline</code>.</p><p>The ownership-based buffer deallocation pass processes operations implementing <code>FunctionOpInterface</code> one-by-one without analysing the call-graph. This means that there have to be <a href=#function-boundary-abi>some rules</a> on how MemRefs are handled when being passed from one function to another. 
The rest of the pass revolves heavily around the <code>bufferization.dealloc</code> operation, which is inserted at the end of each basic block with appropriate operands and should be optimized using the Buffer Deallocation Simplification pass (<code>--buffer-deallocation-simplification</code>) and the regular canonicalizer (<code>--canonicalize</code>). Lowering the result of the <code>-ownership-based-buffer-deallocation</code> pass directly using <code>--convert-bufferization-to-memref</code> without optimizing it first is not recommended as it will lead to very inefficient code (the runtime cost of <code>bufferization.dealloc</code> is <code>O(|memrefs|^2+|memrefs|*|retained|)</code>).</p><h2 id=function-boundary-abi>Function boundary ABI <a class=headline-hash href=#function-boundary-abi>¶</a></h2><p>The Buffer Deallocation pass operates on the level of operations implementing the <code>FunctionOpInterface</code>. Such operations can take MemRefs as arguments, but also return them. To ensure compatibility among all functions (including external ones), some rules have to be enforced:</p><ul><li>When a MemRef is passed as a function argument, ownership is never acquired. It is always the caller’s responsibility to deallocate such MemRefs.</li><li>Returning a MemRef from a function always passes ownership to the caller, i.e., it is also the caller’s responsibility to deallocate MemRefs returned from a called function.</li><li>A function must not return a MemRef with the same allocated base buffer as one of its arguments (in this case a copy has to be created). Note that in this context two subviews of the same buffer that don’t overlap are also considered to alias.</li></ul><p>For external functions (e.g., library functions written externally in C), the externally provided implementation has to adhere to these rules; the buffer deallocation pass simply assumes that it does. 
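</p><p>To make the third rule concrete, the following hand-written sketch contrasts a function that returns its own argument with a variant that returns a copy. It only illustrates the rule and is not the literal output of the pass; the function names <code>@forward</code> and <code>@forward_copy</code> and the static shape <code>4xi8</code> are hypothetical choices made for this illustration.</p><pre tabindex=0><code class=language-mlir data-lang=mlir>// Does not conform to the ABI as written: the returned MemRef has the same
// allocated base buffer as the argument, so the caller would become
// responsible for deallocating one buffer through two different values.
func.func @forward(%arg: memref<4xi8>) -> memref<4xi8> {
  return %arg : memref<4xi8>
}

// Conforms to the ABI: the result is a separate buffer whose ownership passes
// to the caller, while the caller stays responsible for %arg.
func.func @forward_copy(%arg: memref<4xi8>) -> memref<4xi8> {
  %copy = bufferization.clone %arg : memref<4xi8> to memref<4xi8>
  return %copy : memref<4xi8>
}
</code></pre><p>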
Functions on which the deallocation pass is applied and for which the implementation is accessible are modified by the pass such that the ABI is respected (i.e., buffer copies are inserted when necessary).</p><h2 id=inserting-bufferizationdealloc-operations>Inserting <code>bufferization.dealloc</code> operations <a class=headline-hash href=#inserting-bufferizationdealloc-operations>¶</a></h2><p><code>bufferization.dealloc</code> and ownership indicators are the main abstractions in the ownership-based buffer deallocation pass. <code>bufferization.dealloc</code> deallocates all given buffers if the respective ownership indicator is set and there is no aliasing buffer in the retain list.</p><p><img src=/includes/img/bufferization_dealloc_op.svg alt=branch_example_pre_move></p><p><code>bufferization.dealloc</code> operations are unconditionally inserted at the end of each basic block (just before the terminator). The majority of the pass is about finding the correct operands for this operation. There are three variadic operand lists to be populated: the first contains all MemRef values that may need to be deallocated, the second contains their associated ownership values (of <code>i1</code> type), and the third contains MemRef values that are still needed at a later point and should thus not be deallocated (e.g., yielded or returned buffers).</p><p><code>bufferization.dealloc</code> allows us to deal with any kind of aliasing behavior: it lowers to runtime aliasing checks when not enough information can be collected statically. When enough aliasing information is statically available, operands or the entire op may fold away.</p><p><strong>Ownerships</strong></p><p>To populate these operand lists, we use the concept of ownership indicators for MemRefs, which materialize as an <code>i1</code> value for any SSA value of <code>memref</code> type, indicating whether the basic block in which it was materialized has ownership of this MemRef. 
Ideally, this is a constant <code>true</code> or <code>false</code>, but it might also be a non-constant SSA value. To keep track of those ownership values without immediately materializing them (which might require insertion of <code>bufferization.clone</code> operations or operations checking for aliasing at runtime at positions where we don’t actually need a materialized value), we use the <code>Ownership</code> class. This class represents the ownership in three states forming a lattice on a partial order:</p><pre tabindex=0><code>forall X in SSA values. uninitialized < unique(X) < unknown
forall X, Y in SSA values.
  unique(X) == unique(Y) iff X and Y always evaluate to the same value
  unique(X) != unique(Y) otherwise
</code></pre><p>Intuitively, the states have the following meaning:</p><ul><li>Uninitialized: the ownership is not initialized yet. This is the default state; once an operation has been processed, the ownership of all its results with MemRef type should no longer be uninitialized.</li><li>Unique: there is a specific SSA value that can be queried to check ownership without materializing any additional IR.</li><li>Unknown: no specific SSA value is available without materializing additional IR. Typically this is because two ownerships in ‘Unique’ state would have to be merged manually. For example, the result of an <code>arith.select</code> has the ownership of either the then or the else case depending on the condition value; inserting another <code>arith.select</code> for the ownership values can perform the merge and provide a ‘Unique’ ownership for the result. In the general case, however, this ‘Unknown’ state has to be assigned.</li></ul><p>Implied by the above partial order, the pass combines two ownerships in the following way:</p><table><thead><tr><th style=text-align:left>Ownership 1</th><th style=text-align:left>Ownership 2</th><th style=text-align:left>Combined Ownership</th></tr></thead><tbody><tr><td style=text-align:left>uninitialized</td><td style=text-align:left>uninitialized</td><td style=text-align:left>uninitialized</td></tr><tr><td style=text-align:left>unique(X)</td><td style=text-align:left>uninitialized</td><td style=text-align:left>unique(X)</td></tr><tr><td style=text-align:left>unique(X)</td><td style=text-align:left>unique(X)</td><td style=text-align:left>unique(X)</td></tr><tr><td style=text-align:left>unique(X)</td><td style=text-align:left>unique(Y)</td><td style=text-align:left>unknown</td></tr><tr><td style=text-align:left>unknown</td><td style=text-align:left>unique(X)</td><td style=text-align:left>unknown</td></tr><tr><td style=text-align:left>unknown</td><td style=text-align:left>uninitialized</td><td style=text-align:left>unknown</td></tr><tr><td style=text-align:left colspan=3>+ symmetric cases</td></tr></tbody></table><p><strong>Collecting the list of MemRefs that potentially need to be deallocated</strong></p><p>For a given block, the list of MemRefs that potentially need to be deallocated at the end of that block is computed by keeping track of all values for which the block potentially takes over ownership. This includes MemRefs provided as basic block arguments and MemRefs reported by the interface handlers for operations like <code>memref.alloc</code> and <code>func.call</code>, and it also takes liveness information into account in regions with multiple basic blocks. 
More concretely, it is computed by taking the MemRefs in the ‘in’ set of the liveness analysis of the current basic block B, adding the MemRef block arguments and the set of MemRefs allocated in B itself (determined by the interface handlers), and then removing the set of MemRefs deallocated in B (also determined by the interface handlers).</p><p>Note that we don’t have to take the intersection of the liveness ‘in’ set with the ‘out’ set of the predecessor block because a value that is in the ‘in’ set must be defined in an ancestor block that dominates all direct predecessors and thus the ‘in’ set of this block is a subset of the ‘out’ set of each predecessor.</p><pre tabindex=0><code>memrefs = filter((liveIn(block) U allocated(block) U arguments(block)) \ deallocated(block), isMemRef) </code></pre><p>The list of conditions for the second variadic operand list of <code>bufferization.dealloc</code> is computed by querying the stored ownership value for each of the MemRefs collected as described above. The ownership state is updated by the interface handlers while processing the basic block.</p><p><strong>Collecting the list of MemRefs to retain</strong></p><p>Given a basic block B, the list of MemRefs that have to be retained can be different for each successor block S. 
For the two basic blocks B and S and the values passed via block arguments to the destination block S, we compute the list of MemRefs that have to be retained in B by taking the MemRefs in the successor operand list of the terminator and the MemRefs in the ‘out’ set of the liveness analysis for B intersected with the ‘in’ set of the destination block S.</p><p>This list of retained values makes sure that we cannot run into use-after-free situations even if no aliasing information is present at compile-time.</p><pre tabindex=0><code>toRetain = filter(successorOperands + (liveOut(fromBlock) intersect liveIn(toBlock)), isMemRef) </code></pre><h2 id=supported-interfaces>Supported interfaces <a class=headline-hash href=#supported-interfaces>¶</a></h2><p>The pass uses liveness analysis and a few interfaces:</p><ul><li><code>FunctionOpInterface</code></li><li><code>CallOpInterface</code></li><li><code>MemoryEffectOpInterface</code></li><li><code>RegionBranchOpInterface</code></li><li><code>RegionBranchTerminatorOpInterface</code></li></ul><p>Due to insufficient information provided by the interfaces, it also special-cases on the <code>cf.cond_br</code> operation and makes some assumptions about operations implementing the <code>RegionBranchOpInterface</code> at the moment, but improving the interfaces would allow us to remove those dependencies in the future.</p><h2 id=limitations>Limitations <a class=headline-hash href=#limitations>¶</a></h2><p>The Buffer Deallocation pass has some requirements and limitations on the input IR. These are checked at the beginning of the pass and errors are emitted accordingly:</p><ul><li>The set of interfaces the pass operates on must be implemented (correctly). 
E.g., if an operation with a nested region is present but does not implement the <code>RegionBranchOpInterface</code>, an error is emitted because the pass cannot know the semantics of the nested region (and does not make any default assumptions about it).</li><li>No explicit control-flow loops are present. Currently, only loops using structured control flow are supported. However, this limitation could be lifted in the future.</li><li>Deallocation operations should not be present already. The pass should handle them correctly (at least in most cases), but this is not supported yet due to insufficient testing.</li><li>Terminators must implement either <code>RegionBranchTerminatorOpInterface</code> or <code>BranchOpInterface</code>, but not both. Terminators with more than one successor are not supported (except <code>cf.cond_br</code>). This is not a fundamental limitation, but there is no use-case justifying the more complex implementation at the moment.</li></ul><h2 id=example>Example <a class=headline-hash href=#example>¶</a></h2><p>The following example contains a few interesting cases:</p><ul><li>Basic block arguments are modified to also pass along the ownership indicator, but not for entry blocks, where the function boundary ABI is applied instead.</li><li>The result of <code>arith.select</code> initially has ‘Unknown’ assigned as ownership, but once the <code>bufferization.dealloc</code> operation is inserted it is put in the ‘retained’ list (since it has uses in a later basic block) and thus the ‘Unknown’ ownership can be replaced with a ‘Unique’ ownership using the corresponding result of the dealloc operation.</li><li>The <code>cf.cond_br</code> operation has more than one successor, so two <code>bufferization.dealloc</code> operations have to be inserted (one for each successor). 
While they have the same list of MemRefs to deallocate (because they perform the deallocations for the same block), it must be taken into account that some MemRefs remain <em>live</em> for one branch but not the other (thus set intersection is performed on the <em>live-out</em> of the current block and the <em>live-in</em> of the target block). Also, <code>cf.cond_br</code> supports separate forwarding operands for each successor. To make sure that no MemRef is deallocated twice (because there are two <code>bufferization.dealloc</code> operations with the same MemRefs to deallocate), the condition operands are adjusted to take the branch condition into account. While a generic lowering for such terminator operations could be implemented, a specialized implementation can take all the semantics of this particular operation into account and thus generate a more efficient lowering.</li></ul><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=kt>func</span><span class=p>.</span><span class=kt>func</span> <span class=nf>@example</span><span class=p>(</span><span class=nv>%memref</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=nv>%select_cond</span><span class=p>:</span> <span class=k>i1</span><span class=p>,</span> <span class=nv>%br_cond</span><span class=p>:</span> <span class=k>i1</span><span class=p>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%alloc</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> </span></span><span class=line><span class=cl> <span class=nv>%alloca</span> <span class=p>=</span> <span 
class=kt>memref</span><span class=p>.</span>alloca<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> </span></span><span class=line><span class=cl> <span class=nv>%select</span> <span class=p>=</span> arith<span class=p>.</span>select <span class=nv>%select_cond</span><span class=p>,</span> <span class=nv>%alloc</span><span class=p>,</span> <span class=nv>%alloca</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> </span></span><span class=line><span class=cl> cf<span class=p>.</span>cond_br <span class=nv>%br_cond</span><span class=p>,</span> <span class=nl>^bb1</span><span class=p>(</span><span class=nv>%alloc</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>),</span> <span class=nl>^bb1</span><span class=p>(</span><span class=nv>%memref</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl><span class=nl>^bb1</span><span class=p>(</span><span class=nv>%bbarg</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>):</span> </span></span><span class=line><span class=cl> test<span class=p>.</span>copy<span class=p>(</span><span class=nv>%bbarg</span><span class=p>,</span> <span class=nv>%select</span><span class=p>)</span> <span class=p>:</span> <span class=p>(</span><span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span 
class=cl> <span class=kt>return</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><p>After running <code>--ownership-based-buffer-deallocation</code>, it looks as follows:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=c>// Function boundary ABI: ownership of `%memref` will never be acquired. </span></span></span><span class=line><span class=cl><span class=c></span><span class=kt>func</span><span class=p>.</span><span class=kt>func</span> <span class=nf>@example</span><span class=p>(</span><span class=nv>%memref</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=nv>%select_cond</span><span class=p>:</span> <span class=k>i1</span><span class=p>,</span> <span class=nv>%br_cond</span><span class=p>:</span> <span class=k>i1</span><span class=p>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%false</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> false </span></span><span class=line><span class=cl> <span class=nv>%true</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> true </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// The ownership of a MemRef defined by the `memref.alloc` operation is always </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// assigned to be 'true'. 
</span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%alloc</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// The ownership of a MemRef defined by the `memref.alloca` operation is </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// always assigned to be 'false'. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%alloca</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloca<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// The ownership of %select will be the join of the ownership of %alloc and </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// the ownership of %alloca, i.e., of %true and %false. Because the pass does </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// not know about the semantics of the `arith.select` operation (unless a </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// custom handler is implemented), the ownership join will be 'Unknown'. 
If </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// the materialized ownership indicator of %select is needed, either a clone </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// has to be created for which %true is assigned as ownership or the result </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// of a `bufferization.dealloc` where %select is in the retain list has to be </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// used. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%select</span> <span class=p>=</span> arith<span class=p>.</span>select <span class=nv>%select_cond</span><span class=p>,</span> <span class=nv>%alloc</span><span class=p>,</span> <span class=nv>%alloca</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// We use `memref.extract_strided_metadata` to get the base memref since it is </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// not allowed to pass arbitrary memrefs to `memref.dealloc`. This property is </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// already enforced for `bufferization.dealloc` </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%base_buffer_memref</span><span class=p>,</span> <span class=nl>... 
=</span> <span class=kt>memref</span><span class=p>.</span>extract_strided_metadata <span class=nv>%memref</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> <span class=p>-></span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%base_buffer_alloc</span><span class=p>,</span> <span class=nl>... =</span> <span class=kt>memref</span><span class=p>.</span>extract_strided_metadata <span class=nv>%alloc</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> <span class=p>-></span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%base_buffer_alloca</span><span class=p>,</span> <span class=nl>... 
=</span> <span class=kt>memref</span><span class=p>.</span>extract_strided_metadata <span class=nv>%alloca</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> <span class=p>-></span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// The deallocation conditions need to be adjusted to incorporate the branch </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// condition. In this example, this requires only a single negation, but might </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// also require multiple arith.andi operations. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%not_br_cond</span> <span class=p>=</span> arith<span class=p>.</span>xori <span class=nv>%true</span><span class=p>,</span> <span class=nv>%br_cond</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// There are two dealloc operations inserted in this basic block, one per </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// successor. Both have the same list of MemRefs to deallocate and the </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// conditions only differ by the branch condition conjunct. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// Note, however, that the retained list differs. 
Here, both contain the </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// %select value because it is used in both successors (since it's the same </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// block), but the value passed via block argument differs (%memref vs. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// %alloc). </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%10</span><span class=p>:</span><span class=nl>2 =</span> bufferization<span class=p>.</span>dealloc </span></span><span class=line><span class=cl> <span class=p>(</span><span class=nv>%base_buffer_memref</span><span class=p>,</span> <span class=nv>%base_buffer_alloc</span><span class=p>,</span> <span class=nv>%base_buffer_alloca</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl> if <span class=p>(</span><span class=nv>%false</span><span class=p>,</span> <span class=nv>%br_cond</span><span class=p>,</span> <span class=nv>%false</span><span class=p>)</span> </span></span><span class=line><span class=cl> retain <span class=p>(</span><span class=nv>%alloc</span><span class=p>,</span> <span class=nv>%select</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span 
class=nv>%11</span><span class=p>:</span><span class=nl>2 =</span> bufferization<span class=p>.</span>dealloc </span></span><span class=line><span class=cl> <span class=p>(</span><span class=nv>%base_buffer_memref</span><span class=p>,</span> <span class=nv>%base_buffer_alloc</span><span class=p>,</span> <span class=nv>%base_buffer_alloca</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl> if <span class=p>(</span><span class=nv>%false</span><span class=p>,</span> <span class=nv>%not_br_cond</span><span class=p>,</span> <span class=nv>%false</span><span class=p>)</span> </span></span><span class=line><span class=cl> retain <span class=p>(</span><span class=nv>%memref</span><span class=p>,</span> <span class=nv>%select</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Because %select is used in ^bb1 without passing it via block argument, we </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// need to update its ownership value here by merging the ownership values </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// returned by the dealloc operations. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%new_ownership</span> <span class=p>=</span> arith<span 
class=p>.</span>select <span class=nv>%br_cond</span><span class=p>,</span> <span class=nv>%10#1</span><span class=p>,</span> <span class=nv>%11#1</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// The terminator is modified to pass along the ownership indicator values </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// with each MemRef value. </span></span></span><span class=line><span class=cl><span class=c></span> cf<span class=p>.</span>cond_br <span class=nv>%br_cond</span><span class=p>,</span> <span class=nl>^bb1</span><span class=p>(</span><span class=nv>%alloc</span><span class=p>,</span> <span class=nv>%10#0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=k>i1</span><span class=p>),</span> </span></span><span class=line><span class=cl> <span class=nl>^bb1</span><span class=p>(</span><span class=nv>%memref</span><span class=p>,</span> <span class=nv>%11#0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=k>i1</span><span class=p>)</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl><span class=c>// All non-entry basic blocks are modified to have an additional i1 argument for </span></span></span><span class=line><span class=cl><span class=c>// each MemRef value in the argument list. 
</span></span></span><span class=line><span class=cl><span class=c></span><span class=nl>^bb1</span><span class=p>(</span><span class=nv>%13</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=nv>%14</span><span class=p>:</span> <span class=k>i1</span><span class=p>):</span> <span class=c>// 2 preds: ^bb0, ^bb0 </span></span></span><span class=line><span class=cl><span class=c></span> test<span class=p>.</span>copy<span class=p>(</span><span class=nv>%13</span><span class=p>,</span> <span class=nv>%select</span><span class=p>)</span> <span class=p>:</span> <span class=p>(</span><span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=nv>%base_buffer_13</span><span class=p>,</span> <span class=nl>... =</span> <span class=kt>memref</span><span class=p>.</span>extract_strided_metadata <span class=nv>%13</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> <span class=p>-></span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%base_buffer_select</span><span class=p>,</span> <span class=nl>... 
=</span> <span class=kt>memref</span><span class=p>.</span>extract_strided_metadata <span class=nv>%select</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i8</span><span class=p>></span> <span class=p>-></span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span><span class=p>,</span> <span class=k>index</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Here, we don't have a retained list, because the block has no successors </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// and the return has no operands. </span></span></span><span class=line><span class=cl><span class=c></span> bufferization<span class=p>.</span>dealloc <span class=p>(</span><span class=nv>%base_buffer_13</span><span class=p>,</span> <span class=nv>%base_buffer_select</span> </span></span><span class=line><span class=cl> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=k>i8</span><span class=p>>)</span> </span></span><span class=line><span class=cl> if <span class=p>(</span><span class=nv>%14</span><span class=p>,</span> <span class=nv>%new_ownership</span><span class=p>)</span> </span></span><span class=line><span class=cl> <span class=kt>return</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><h2 id=buffer-deallocation-simplification-pass>Buffer Deallocation Simplification Pass <a class=headline-hash href=#buffer-deallocation-simplification-pass>¶</a></h2><p>The <a href=#bufferizationdealloc-bufferizationdeallocop>semantics of the <code>bufferization.dealloc</code> 
operation</a> provide many optimization opportunities, which can be conveniently expressed as patterns for the greedy pattern rewriter. Some of those patterns need additional analyses, such as one that determines whether two MemRef values must, may, or never originate from the same buffer allocation. These patterns are collected in the Buffer Deallocation Simplification pass, while patterns that don’t need additional analyses are registered as part of the regular canonicalizer pass. This pass is best run after <code>--ownership-based-buffer-deallocation</code> followed by <code>--canonicalize</code>.</p><p>The pass applies patterns for the following simplifications:</p><ul><li>Remove a MemRef from the ‘retained’ list when it is guaranteed not to alias any value in the ‘memref’ operand list. This avoids an additional aliasing check with the removed value.</li><li>Split off a value in the ‘memref’ list into a new <code>bufferization.dealloc</code> operation containing only this value in its ‘memref’ list when the value is guaranteed not to alias any other value in the ‘memref’ list. 
This avoids at least one aliasing check at runtime and enables a more efficient lowering for this new <code>bufferization.dealloc</code> operation.</li><li>Remove a value from the ‘memref’ operand list when it is guaranteed to alias at least one value in the ‘retained’ list and has no merely possible (“may”) alias among the other values in the ‘retained’ list.</li></ul><h2 id=lower-deallocations-pass>Lower Deallocations Pass <a class=headline-hash href=#lower-deallocations-pass>¶</a></h2><p>The <code>-lower-deallocations</code> pass transforms all <code>bufferization.dealloc</code> operations to <code>memref.dealloc</code> operations and may also insert operations from the <code>scf</code>, <code>func</code>, and <code>arith</code> dialects to make deallocations conditional and to check at runtime whether two MemRef values come from the same allocation (when the <code>buffer-deallocation-simplification</code> pass wasn’t able to determine it statically).</p><p>The same lowering of the <code>bufferization.dealloc</code> operation is also part of the <code>-convert-bufferization-to-memref</code> conversion pass, which also lowers all the other operations of the bufferization dialect.</p><p>We distinguish multiple cases in this lowering pass to provide an overall more efficient lowering. In the general case, a library function is created to avoid quadratic code size explosion (relative to the number of operands of the dealloc operation). The specialized lowerings aim to avoid this library function because it requires allocating auxiliary MemRefs of index values.</p><h3 id=generic-lowering>Generic Lowering <a class=headline-hash href=#generic-lowering>¶</a></h3><p>A library function is generated to avoid code-size blow-up. At a high level, the base-memref of each operand is extracted as an index value, stored into specifically allocated MemRefs, and passed to the library function, which then determines whether they come from the same original allocation. 
This information is needed to avoid double-free situations and to correctly retain the MemRef values in the <code>retained</code> list.</p><p><strong>Dealloc Operation Lowering</strong></p><p>This lowering supports all features the dealloc operation has to offer. It computes the base pointer of each memref (as an index), stores it in a new memref helper structure, and passes it to the helper function generated in <code>buildDeallocationLibraryFunction</code>. The results are stored in two lists (represented as MemRefs) of booleans passed as arguments. The first list stores whether the corresponding MemRef should be deallocated; the second list stores the ownership of the retained values, which can be used to replace the result values of the <code>bufferization.dealloc</code> operation.</p><p>Example:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=nv>%0</span><span class=p>:</span><span class=nl>2 =</span> bufferization<span class=p>.</span>dealloc <span class=p>(</span><span class=nv>%m0</span><span class=p>,</span> <span class=nv>%m1</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>5x</span><span class=k>f32</span><span class=p>>)</span> </span></span><span class=line><span class=cl> if <span class=p>(</span><span class=nv>%cond0</span><span class=p>,</span> <span class=nv>%cond1</span><span class=p>)</span> </span></span><span class=line><span class=cl> retain <span class=p>(</span><span class=nv>%r0</span><span class=p>,</span> <span class=nv>%r1</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>1x</span><span class=k>f32</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span 
class=p>>)</span> </span></span></code></pre></div><p>lowers to (simplified):</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=nv>%c0</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>0</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl><span class=nv>%c1</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>1</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl><span class=nv>%dealloc_base_pointer_list</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>index</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%cond_list</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%retain_base_pointer_list</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>index</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%m0_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%m0</span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>store <span class=nv>%m0_base_pointer</span><span class=p>,</span> 
<span class=nv>%dealloc_base_pointer_list</span><span class=p>[</span><span class=nv>%c0</span><span class=p>]</span> </span></span><span class=line><span class=cl><span class=nv>%m1_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%m1</span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>store <span class=nv>%m1_base_pointer</span><span class=p>,</span> <span class=nv>%dealloc_base_pointer_list</span><span class=p>[</span><span class=nv>%c1</span><span class=p>]</span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>store <span class=nv>%cond0</span><span class=p>,</span> <span class=nv>%cond_list</span><span class=p>[</span><span class=nv>%c0</span><span class=p>]</span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>store <span class=nv>%cond1</span><span class=p>,</span> <span class=nv>%cond_list</span><span class=p>[</span><span class=nv>%c1</span><span class=p>]</span> </span></span><span class=line><span class=cl><span class=nv>%r0_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%r0</span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>store <span class=nv>%r0_base_pointer</span><span class=p>,</span> <span class=nv>%retain_base_pointer_list</span><span class=p>[</span><span class=nv>%c0</span><span class=p>]</span> </span></span><span class=line><span class=cl><span class=nv>%r1_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%r1</span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>store <span class=nv>%r1_base_pointer</span><span 
class=p>,</span> <span class=nv>%retain_base_pointer_list</span><span class=p>[</span><span class=nv>%c1</span><span class=p>]</span> </span></span><span class=line><span class=cl><span class=nv>%dyn_dealloc_base_pointer_list</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>cast <span class=nv>%dealloc_base_pointer_list</span> <span class=p>:</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>index</span><span class=p>></span> to <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>index</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%dyn_cond_list</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>cast <span class=nv>%cond_list</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> to <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%dyn_retain_base_pointer_list</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>cast <span class=nv>%retain_base_pointer_list</span> <span class=p>:</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>index</span><span class=p>></span> to <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>index</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%dealloc_cond_out</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span 
class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%ownership_out</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%dyn_dealloc_cond_out</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>cast <span class=nv>%dealloc_cond_out</span> <span class=p>:</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> to <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%dyn_ownership_out</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>cast <span class=nv>%ownership_out</span> <span class=p>:</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> to <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl>call <span class=nf>@dealloc_helper</span><span class=p>(</span><span class=nv>%dyn_dealloc_base_pointer_list</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_retain_base_pointer_list</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_cond_list</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_dealloc_cond_out</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_ownership_out</span><span 
class=p>)</span> <span class=p>:</span> <span class=p>(...)</span> </span></span><span class=line><span class=cl><span class=nv>%m0_dealloc_cond</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_dealloc_cond_out</span><span class=p>[</span><span class=nv>%c0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl>scf<span class=p>.</span>if <span class=nv>%m0_dealloc_cond</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%m0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span><span class=line><span class=cl><span class=nv>%m1_dealloc_cond</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_dealloc_cond_out</span><span class=p>[</span><span class=nv>%c1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl>scf<span class=p>.</span>if <span class=nv>%m1_dealloc_cond</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%m1</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>5x</span><span class=k>f32</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span><span class=line><span class=cl><span class=nv>%r0_ownership</span> <span class=p>=</span> <span class=kt>memref</span><span 
class=p>.</span>load <span class=nv>%dyn_ownership_out</span><span class=p>[</span><span class=nv>%c0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=nv>%r1_ownership</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_ownership_out</span><span class=p>[</span><span class=nv>%c1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%dealloc_base_pointer_list</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>index</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%retain_base_pointer_list</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>index</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%cond_list</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%dealloc_cond_out</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%ownership_out</span> <span 
class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=c>// replace %0#0 with %r0_ownership </span></span></span><span class=line><span class=cl><span class=c>// replace %0#1 with %r1_ownership </span></span></span></code></pre></div><p><strong>Library function</strong></p><p>A library function is built per compilation unit. It can be called at <code>bufferization.dealloc</code> sites to determine whether two MemRefs come from the same allocation and what their new ownerships are.</p><p>The generated function takes two MemRefs of indices and three MemRefs of booleans as arguments:</p><ul><li>The first argument A should contain the result of the extract_aligned_pointer_as_index operation applied to the MemRefs to be deallocated.</li><li>The second argument B should contain the result of the extract_aligned_pointer_as_index operation applied to the MemRefs to be retained.</li><li>The third argument C should contain the conditions as passed directly to the deallocation operation.</li><li>The fourth argument D is used to pass results to the caller. Those represent the condition under which the MemRef at the corresponding position in A should be deallocated.</li><li>The fifth argument E is used to pass results to the caller. 
It provides the ownership value corresponding to the MemRef at the same position in B.</li></ul><p>This helper function is meant to be called once for each <code>bufferization.dealloc</code> operation to determine which MemRefs need deallocation and the new ownership indicators for the retained values; it does not perform the deallocation itself.</p><p>Generated code:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=kt>func</span><span class=p>.</span><span class=kt>func</span> <span class=nf>@dealloc_helper</span><span class=p>(</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_dealloc_base_pointer_list</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>index</span><span class=p>>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_retain_base_pointer_list</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>index</span><span class=p>>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_cond_list</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_dealloc_cond_out</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>>,</span> </span></span><span class=line><span class=cl> <span class=nv>%dyn_ownership_out</span><span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%c0</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>0</span> 
<span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%c1</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>1</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%true</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> true </span></span><span class=line><span class=cl> <span class=nv>%false</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> false </span></span><span class=line><span class=cl> <span class=nv>%num_dealloc_memrefs</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>dim <span class=nv>%dyn_dealloc_base_pointer_list</span><span class=p>,</span> <span class=nv>%c0</span> </span></span><span class=line><span class=cl> <span class=nv>%num_retain_memrefs</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>dim <span class=nv>%dyn_retain_base_pointer_list</span><span class=p>,</span> <span class=nv>%c0</span> </span></span><span class=line><span class=cl> <span class=c>// Zero initialize result buffer. 
</span></span></span><span class=line><span class=cl><span class=c></span> scf<span class=p>.</span>for <span class=nv>%i</span> <span class=p>=</span> <span class=nv>%c0</span> to <span class=nv>%num_retain_memrefs</span> step <span class=nv>%c1</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>store <span class=nv>%false</span><span class=p>,</span> <span class=nv>%dyn_ownership_out</span><span class=p>[</span><span class=nv>%i</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> scf<span class=p>.</span>for <span class=nv>%i</span> <span class=p>=</span> <span class=nv>%c0</span> to <span class=nv>%num_dealloc_memrefs</span> step <span class=nv>%c1</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%dealloc_bp</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_dealloc_base_pointer_list</span><span class=p>[</span><span class=nv>%i</span><span class=p>]</span> </span></span><span class=line><span class=cl> <span class=nv>%cond</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_cond_list</span><span class=p>[</span><span class=nv>%i</span><span class=p>]</span> </span></span><span class=line><span class=cl> <span class=c>// Check for aliasing with retained memrefs. 
</span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%does_not_alias_retained</span> <span class=p>=</span> scf<span class=p>.</span>for <span class=nv>%j</span> <span class=p>=</span> <span class=nv>%c0</span> to <span class=nv>%num_retain_memrefs</span> </span></span><span class=line><span class=cl> step <span class=nv>%c1</span> iter_args<span class=p>(</span><span class=nv>%does_not_alias_aggregated</span> <span class=p>=</span> <span class=nv>%true</span><span class=p>)</span> <span class=p>-></span> <span class=p>(</span><span class=k>i1</span><span class=p>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%retain_bp</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_retain_base_pointer_list</span><span class=p>[</span><span class=nv>%j</span><span class=p>]</span> </span></span><span class=line><span class=cl> <span class=nv>%does_alias</span> <span class=p>=</span> arith<span class=p>.</span>cmpi eq<span class=p>,</span> <span class=nv>%retain_bp</span><span class=p>,</span> <span class=nv>%dealloc_bp</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl> scf<span class=p>.</span>if <span class=nv>%does_alias</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%curr_ownership</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_ownership_out</span><span class=p>[</span><span class=nv>%j</span><span class=p>]</span> </span></span><span class=line><span class=cl> <span class=nv>%updated_ownership</span> <span class=p>=</span> arith<span class=p>.</span>ori <span class=nv>%curr_ownership</span><span class=p>,</span> <span class=nv>%cond</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span 
class=p>.</span>store <span class=nv>%updated_ownership</span><span class=p>,</span> <span class=nv>%dyn_ownership_out</span><span class=p>[</span><span class=nv>%j</span><span class=p>]</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=nv>%does_not_alias</span> <span class=p>=</span> arith<span class=p>.</span>cmpi ne<span class=p>,</span> <span class=nv>%retain_bp</span><span class=p>,</span> <span class=nv>%dealloc_bp</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%updated_aggregate</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%does_not_alias_aggregated</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=nv>%does_not_alias</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> scf<span class=p>.</span>yield <span class=nv>%updated_aggregate</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=c>// Check for aliasing with dealloc memrefs in the list before the </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// current one, i.e., </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// `fix i, forall j < i: check_aliasing(%dyn_dealloc_base_pointer[j], </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// %dyn_dealloc_base_pointer[i])` </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%does_not_alias_any</span> <span class=p>=</span> scf<span class=p>.</span>for <span class=nv>%j</span> <span class=p>=</span> <span class=nv>%c0</span> to <span class=nv>%i</span> step <span class=nv>%c1</span> </span></span><span 
class=line><span class=cl> iter_args<span class=p>(</span><span class=nv>%does_not_alias_agg</span> <span class=p>=</span> <span class=nv>%does_not_alias_retained</span><span class=p>)</span> <span class=p>-></span> <span class=p>(</span><span class=k>i1</span><span class=p>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%prev_dealloc_bp</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>load <span class=nv>%dyn_dealloc_base_pointer_list</span><span class=p>[</span><span class=nv>%j</span><span class=p>]</span> </span></span><span class=line><span class=cl> <span class=nv>%does_not_alias</span> <span class=p>=</span> arith<span class=p>.</span>cmpi ne<span class=p>,</span> <span class=nv>%prev_dealloc_bp</span><span class=p>,</span> <span class=nv>%dealloc_bp</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl> <span class=nv>%updated_alias_agg</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%does_not_alias_agg</span><span class=p>,</span> <span class=nv>%does_not_alias</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> scf<span class=p>.</span>yield <span class=nv>%updated_alias_agg</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=nv>%dealloc_cond</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%does_not_alias_any</span><span class=p>,</span> <span class=nv>%cond</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>store <span class=nv>%dealloc_cond</span><span class=p>,</span> <span class=nv>%dyn_dealloc_cond_out</span><span class=p>[</span><span class=nv>%i</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>?x</span><span 
class=k>i1</span><span class=p>></span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=kt>return</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><h3 id=specialized-lowerings>Specialized Lowerings <a class=headline-hash href=#specialized-lowerings>¶</a></h3><p>Two specialized lowerings currently cover common cases. They avoid calling the library function, and with it the associated memory loads, stores, and function-call overhead:</p><p><strong>One memref, no retained</strong></p><p>Lowers the simple case of a single MemRef and no retained values. Ideally, static analysis provides enough information for the <code>buffer-deallocation-simplification</code> pass to split dealloc operations into this simple form as often as possible before this pass runs.</p><p>Example:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl>bufferization<span class=p>.</span>dealloc <span class=p>(</span><span class=nv>%arg0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>>)</span> if <span class=p>(</span><span class=nv>%arg1</span><span class=p>)</span> </span></span></code></pre></div><p>is lowered to</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl>scf<span class=p>.</span>if <span class=nv>%arg1</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%arg0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>></span> </span></span><span class=line><span class=cl><span class=p>}</span> 
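</span></span><span class=line><span class=cl><span class=c>// Illustrative sketch only (an assumption, not output of this pass): if %arg1</span>
</span></span><span class=line><span class=cl><span class=c>// is defined by `arith.constant true`, canonicalization is expected to fold</span>
</span></span><span class=line><span class=cl><span class=c>// the conditional above to the bare deallocation:</span>
</span></span><span class=line><span class=cl><span class=c>//   memref.dealloc %arg0 : memref&lt;2xf32&gt;</span>
</span></span><span class=line><span class=cl><span class=c>// If %arg1 is a constant false instead, the `scf.if` is expected to fold away</span>
</span></span><span class=line><span class=cl><span class=c>// entirely, leaving no deallocation.</span>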
</span></span></code></pre></div><p>In most cases, the branch condition is a constant <code>true</code> or <code>false</code>, so the canonicalizer pass can optimize the branch away entirely.</p><p><strong>One memref, arbitrarily many retained</strong></p><p>A special-case lowering for a deallocation operation with exactly one MemRef but an arbitrary number of retained values. The size of the code produced by this lowering is linear in the number of retained values.</p><p>Example:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=nv>%0</span><span class=p>:</span><span class=nl>2 =</span> bufferization<span class=p>.</span>dealloc <span class=p>(</span><span class=nv>%m</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>>)</span> if <span class=p>(</span><span class=nv>%cond</span><span class=p>)</span> </span></span><span class=line><span class=cl> retain <span class=p>(</span><span class=nv>%r0</span><span class=p>,</span> <span class=nv>%r1</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>1x</span><span class=k>f32</span><span class=p>>,</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>>)</span> </span></span><span class=line><span class=cl><span class=kt>return</span> <span class=nv>%0#0</span><span class=p>,</span> <span class=nv>%0#1</span> <span class=p>:</span> <span class=k>i1</span><span class=p>,</span> <span class=k>i1</span> </span></span></code></pre></div><p>is lowered to</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=nv>%m_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%m</span> 
</span></span><span class=line><span class=cl><span class=nv>%r0_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%r0</span> </span></span><span class=line><span class=cl><span class=nv>%r0_does_not_alias</span> <span class=p>=</span> arith<span class=p>.</span>cmpi ne<span class=p>,</span> <span class=nv>%m_base_pointer</span><span class=p>,</span> <span class=nv>%r0_base_pointer</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl><span class=nv>%r1_base_pointer</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>extract_aligned_pointer_as_index <span class=nv>%r1</span> </span></span><span class=line><span class=cl><span class=nv>%r1_does_not_alias</span> <span class=p>=</span> arith<span class=p>.</span>cmpi ne<span class=p>,</span> <span class=nv>%m_base_pointer</span><span class=p>,</span> <span class=nv>%r1_base_pointer</span> <span class=p>:</span> <span class=k>index</span> </span></span><span class=line><span class=cl><span class=nv>%not_retained</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%r0_does_not_alias</span><span class=p>,</span> <span class=nv>%r1_does_not_alias</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl><span class=nv>%should_dealloc</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%not_retained</span><span class=p>,</span> <span class=nv>%cond</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl>scf<span class=p>.</span>if <span class=nv>%should_dealloc</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%m</span> <span class=p>:</span> <span class=kt>memref</span><span class=p><</span><span class=m>2x</span><span class=k>f32</span><span class=p>></span> </span></span><span class=line><span class=cl><span 
class=p>}</span> </span></span><span class=line><span class=cl><span class=nv>%true</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> true </span></span><span class=line><span class=cl><span class=nv>%r0_does_alias</span> <span class=p>=</span> arith<span class=p>.</span>xori <span class=nv>%r0_does_not_alias</span><span class=p>,</span> <span class=nv>%true</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl><span class=nv>%r0_ownership</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%r0_does_alias</span><span class=p>,</span> <span class=nv>%cond</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl><span class=nv>%r1_does_alias</span> <span class=p>=</span> arith<span class=p>.</span>xori <span class=nv>%r1_does_not_alias</span><span class=p>,</span> <span class=nv>%true</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl><span class=nv>%r1_ownership</span> <span class=p>=</span> arith<span class=p>.</span>andi <span class=nv>%r1_does_alias</span><span class=p>,</span> <span class=nv>%cond</span> <span class=p>:</span> <span class=k>i1</span> </span></span><span class=line><span class=cl><span class=kt>return</span> <span class=nv>%r0_ownership</span><span class=p>,</span> <span class=nv>%r1_ownership</span> <span class=p>:</span> <span class=k>i1</span><span class=p>,</span> <span class=k>i1</span> </span></span></code></pre></div></main></div></div></body></html>