<!doctype html><html lang=en-us><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"><title>Chapter 5: Partial Lowering to Lower-Level Dialects for Optimization - MLIR</title><meta name=description content="Multi-Level IR Compiler Framework"><meta name=generator content="Hugo 0.119.0"><link href=https://mlir.llvm.org/index.xml rel=alternate type=application/rss+xml><link rel=canonical href=https://mlir.llvm.org/docs/Tutorials/Toy/Ch-5/><link rel=stylesheet href=https://mlir.llvm.org/css/theme.css><script src=https://use.fontawesome.com/releases/v5.0.6/js/all.js></script> <link rel=stylesheet href=https://mlir.llvm.org/css/chroma.min.css><script src=https://cdn.jsdelivr.net/npm/jquery@3.3.1/dist/jquery.min.js></script> <script src=https://cdn.jsdelivr.net/npm/jquery.easing@1.4.1/jquery.easing.min.js></script> <script src=https://mlir.llvm.org/js/bundle.js></script> <script type=text/javascript src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script> <script type=text/x-mathjax-config> MathJax.Hub.Config({ tex2jax: { inlineMath: [['$', '$'] ], displayMath: [ ['$$','$$'], ["\\[","\\]"] ] } }); </script><link rel=apple-touch-icon sizes=180x180 href="/apple-touch-icon.png?v=1"><link rel=icon type=image/png sizes=32x32 href="/favicon-32x32.png?v=1"><link rel=icon type=image/png sizes=16x16 href="/favicon-16x16.png?v=1"><link rel=manifest href="/site.webmanifest?v=1"><link rel=mask-icon href="/safari-pinned-tab.svg?v=1" color=#3775e0><link rel="shortcut icon" href="/favicon.ico?v=1"><meta name=msapplication-TileColor content="#2d89ef"><meta name=theme-color content="#ffffff"><link rel=icon href=/favicon.svg type=image/svg+xml sizes=any><style>:root{}</style></head><body><div class=container><header><h1><div><img src=https://mlir.llvm.org//mlir-logo.png width=40px align=absmiddle> MLIR</div></h1><p 
class=description>Multi-Level IR Compiler Framework</p></header><div class=global-menu><nav><ul><li class=parent><a href>Community<i class="fas fa-angle-right"></i></a><ul class=sub-menu><li class=child><a href=https://llvm.discourse.group/c/mlir/31>Forums</a></li><li class=child><a href=https://discord.gg/xS7Z362>Chat</a></li></ul></li><li><a href=/getting_started/Debugging/>Debugging Tips</a></li><li><a href=/getting_started/Faq/>FAQ</a></li><li class=parent><a href=https://github.com/llvm/llvm-project/tree/main/mlir>Source<i class="fas fa-angle-right"></i></a><ul class=sub-menu><li class=child><a href=/doxygen/>Doxygen</a></li><li class=child><a href=https://github.com/llvm/llvm-project/tree/main/mlir>GitHub</a></li></ul></li><li><a href="https://bugs.llvm.org/buglist.cgi?bug_status=__open__&amp;list_id=177877&amp;order=changeddate%20DESC%2Cpriority%2Cbug_severity&amp;product=MLIR&amp;query_format=specific">Bugs</a></li><li><a href=https://github.com/llvm/mlir-www/tree/main/website/static/LogoAssets>Logo Assets</a></li><li><a href=https://www.youtube.com/MLIRCompiler>Youtube Channel</a></li></ul></nav></div><div class=content-container><main><h1>Chapter 5: Partial Lowering to Lower-Level Dialects for Optimization</h1><p><nav id=TableOfContents><ul><li><a href=#conversion-target>Conversion Target</a></li><li><a href=#conversion-patterns>Conversion Patterns</a></li><li><a href=#partial-lowering>Partial Lowering</a><ul><li><a href=#design-considerations-with-partial-lowering>Design Considerations With Partial Lowering</a></li></ul></li><li><a href=#complete-toy-example>Complete Toy Example</a></li><li><a href=#taking-advantage-of-affine-optimization>Taking Advantage of Affine Optimization</a></li></ul></nav><p>At this point, we are eager to generate actual code and see our Toy language take life. We will use LLVM to generate code, but just showing the LLVM builder interface here wouldn&rsquo;t be very exciting. 
Instead, we will show how to perform progressive lowering through a mix of dialects coexisting in the same function.</p><p>To make it more interesting, in this chapter we will consider that we want to reuse existing optimizations implemented in a dialect optimizing affine transformations: <code>Affine</code>. This dialect is tailored to the computation-heavy part of the program and is limited: it doesn&rsquo;t support representing our <code>toy.print</code> builtin, for instance, nor should it! Instead, we can target <code>Affine</code> for the computation-heavy part of Toy, and in the <a href=/docs/Tutorials/Toy/Ch-6/>next chapter</a> directly target the <code>LLVM IR</code> dialect for lowering <code>print</code>. As part of this lowering, we will be lowering from the <a href=/docs/Dialects/Builtin/#rankedtensortype>TensorType</a> that <code>Toy</code> operates on to the <a href=/docs/Dialects/Builtin/#memreftype>MemRefType</a> that is indexed via an affine loop-nest. Tensors represent an abstract value-typed sequence of data, meaning that they don&rsquo;t live in any memory. MemRefs, on the other hand, represent lower-level buffer access, as they are concrete references to a region of memory.</p><h1 id=dialect-conversions>Dialect Conversions</h1><p>MLIR has many different dialects, so it is important to have a unified framework for <a href=/getting_started/Glossary/#conversion>converting</a> between them. This is where the <code>DialectConversion</code> framework comes into play. This framework allows for transforming a set of <em>illegal</em> operations to a set of <em>legal</em> ones. To use this framework, we need to provide two things (and an optional third):</p><ul><li><p>A <a href=/docs/DialectConversion/#conversion-target>Conversion Target</a></p><ul><li>This is the formal specification of what operations or dialects are legal for the conversion. 
Operations that aren&rsquo;t legal will require rewrite patterns to perform <a href=/getting_started/Glossary/#legalization>legalization</a>.</li></ul></li><li><p>A set of <a href=/docs/DialectConversion/#rewrite-pattern-specification>Rewrite Patterns</a></p><ul><li>This is the set of <a href=/docs/Tutorials/QuickstartRewrites/>patterns</a> used to convert <em>illegal</em> operations into a set of zero or more <em>legal</em> ones.</li></ul></li><li><p>Optionally, a <a href=/docs/DialectConversion/#type-conversion>Type Converter</a>.</p><ul><li>If provided, this is used to convert the types of block arguments. We won&rsquo;t be needing this for our conversion.</li></ul></li></ul><h2 id=conversion-target>Conversion Target&nbsp;<a class=headline-hash href=#conversion-target>¶</a></h2><p>For our purposes, we want to convert the compute-intensive <code>Toy</code> operations into a combination of operations from the <code>Affine</code>, <code>Arith</code>, <code>Func</code>, and <code>MemRef</code> dialects for further optimization. To start off the lowering, we first define our conversion target:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-c++ data-lang=c++><span class=line><span class=cl><span class=kt>void</span> <span class=n>ToyToAffineLoweringPass</span><span class=o>::</span><span class=n>runOnOperation</span><span class=p>()</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=c1>// The first thing to define is the conversion target. This will define the </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// final target for this lowering. 
</span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>mlir</span><span class=o>::</span><span class=n>ConversionTarget</span> <span class=n>target</span><span class=p>(</span><span class=n>getContext</span><span class=p>());</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>// We define the specific operations, or dialects, that are legal targets for </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// this lowering. In our case, we are lowering to a combination of the </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// `Affine`, `Arith`, `Func`, and `MemRef` dialects. </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>target</span><span class=p>.</span><span class=n>addLegalDialect</span><span class=o>&lt;</span><span class=n>affine</span><span class=o>::</span><span class=n>AffineDialect</span><span class=p>,</span> <span class=n>arith</span><span class=o>::</span><span class=n>ArithDialect</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=n>func</span><span class=o>::</span><span class=n>FuncDialect</span><span class=p>,</span> <span class=n>memref</span><span class=o>::</span><span class=n>MemRefDialect</span><span class=o>&gt;</span><span class=p>();</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>// We also define the Toy dialect as Illegal so that the conversion will fail </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// if any of these operations are *not* converted. 
Given that we actually want </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// a partial lowering, we explicitly mark the Toy operations that we don&#39;t want </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// to lower, `toy.print`, as *legal*. `toy.print` will still need its operands </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// to be updated though (as we convert from TensorType to MemRefType), so we </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// only treat it as `legal` if its operands are legal. </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>target</span><span class=p>.</span><span class=n>addIllegalDialect</span><span class=o>&lt;</span><span class=n>ToyDialect</span><span class=o>&gt;</span><span class=p>();</span> </span></span><span class=line><span class=cl> <span class=n>target</span><span class=p>.</span><span class=n>addDynamicallyLegalOp</span><span class=o>&lt;</span><span class=n>toy</span><span class=o>::</span><span class=n>PrintOp</span><span class=o>&gt;</span><span class=p>([](</span><span class=n>toy</span><span class=o>::</span><span class=n>PrintOp</span> <span class=n>op</span><span class=p>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=k>return</span> <span class=n>llvm</span><span class=o>::</span><span class=n>none_of</span><span class=p>(</span><span class=n>op</span><span class=o>-&gt;</span><span class=n>getOperandTypes</span><span class=p>(),</span> </span></span><span class=line><span class=cl> <span class=p>[](</span><span class=n>Type</span> <span class=n>type</span><span class=p>)</span> <span class=p>{</span> <span class=k>return</span> <span class=n>type</span><span class=p>.</span><span class=n>isa</span><span class=o>&lt;</span><span class=n>TensorType</span><span
class=o>&gt;</span><span class=p>();</span> <span class=p>});</span> </span></span><span class=line><span class=cl> <span class=p>});</span> </span></span><span class=line><span class=cl> <span class=p>...</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><p>Above, we first set the toy dialect to illegal, and then the print operation as legal. We could have done this the other way around. Individual operations always take precedence over the (more generic) dialect definitions, so the order doesn&rsquo;t matter. See <code>ConversionTarget::getOpInfo</code> for the details.</p><h2 id=conversion-patterns>Conversion Patterns&nbsp;<a class=headline-hash href=#conversion-patterns>¶</a></h2><p>After the conversion target has been defined, we can define how to convert the <em>illegal</em> operations into <em>legal</em> ones. Similarly to the canonicalization framework introduced in <a href=/docs/Tutorials/Toy/Ch-3/>chapter 3</a>, the <a href=/docs/DialectConversion/><code>DialectConversion</code> framework</a> also uses <a href=/docs/Tutorials/QuickstartRewrites/>RewritePatterns</a> to perform the conversion logic. These patterns may be the <code>RewritePatterns</code> seen before or a new type of pattern specific to the conversion framework <code>ConversionPattern</code>. <code>ConversionPatterns</code> are different from traditional <code>RewritePatterns</code> in that they accept an additional <code>operands</code> parameter containing operands that have been remapped/replaced. This is used when dealing with type conversions, as the pattern will want to operate on values of the new type but match against the old. For our lowering, this invariant will be useful as it translates from the <a href=/docs/Dialects/Builtin/#rankedtensortype>TensorType</a> currently being operated on to the <a href=/docs/Dialects/Builtin/#memreftype>MemRefType</a>. 
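</p><p>For intuition (this sketch is ours, not part of the tutorial&rsquo;s code), the same data looks quite different on each side of the conversion: a tensor value has no backing storage, while its memref counterpart must be explicitly allocated and indexed:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir>// Value semantics: an abstract 3x2 sequence with no associated memory.
%t = toy.transpose(%0 : tensor&lt;2x3xf64&gt;) to tensor&lt;3x2xf64&gt;

// Buffer semantics after lowering: storage is allocated explicitly, and
// individual elements are read through affine loads.
%m = memref.alloc() : memref&lt;3x2xf64&gt;
%v = affine.load %m[0, 0] : memref&lt;3x2xf64&gt;
</code></pre></div><p>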
Let&rsquo;s look at a snippet of lowering the <code>toy.transpose</code> operation:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-c++ data-lang=c++><span class=line><span class=cl><span class=c1>/// Lower the `toy.transpose` operation to an affine loop nest. </span></span></span><span class=line><span class=cl><span class=c1></span><span class=k>struct</span> <span class=nc>TransposeOpLowering</span> <span class=o>:</span> <span class=k>public</span> <span class=n>mlir</span><span class=o>::</span><span class=n>ConversionPattern</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=n>TransposeOpLowering</span><span class=p>(</span><span class=n>mlir</span><span class=o>::</span><span class=n>MLIRContext</span> <span class=o>*</span><span class=n>ctx</span><span class=p>)</span> </span></span><span class=line><span class=cl> <span class=o>:</span> <span class=n>mlir</span><span class=o>::</span><span class=n>ConversionPattern</span><span class=p>(</span><span class=n>TransposeOp</span><span class=o>::</span><span class=n>getOperationName</span><span class=p>(),</span> <span class=mi>1</span><span class=p>,</span> <span class=n>ctx</span><span class=p>)</span> <span class=p>{}</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>/// Match and rewrite the given `toy.transpose` operation, with the given </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>/// operands that have been remapped from `tensor&lt;...&gt;` to `memref&lt;...&gt;`. 
</span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>llvm</span><span class=o>::</span><span class=n>LogicalResult</span> </span></span><span class=line><span class=cl> <span class=n>matchAndRewrite</span><span class=p>(</span><span class=n>mlir</span><span class=o>::</span><span class=n>Operation</span> <span class=o>*</span><span class=n>op</span><span class=p>,</span> <span class=n>ArrayRef</span><span class=o>&lt;</span><span class=n>mlir</span><span class=o>::</span><span class=n>Value</span><span class=o>&gt;</span> <span class=n>operands</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=n>mlir</span><span class=o>::</span><span class=n>ConversionPatternRewriter</span> <span class=o>&amp;</span><span class=n>rewriter</span><span class=p>)</span> <span class=k>const</span> <span class=k>final</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=k>auto</span> <span class=n>loc</span> <span class=o>=</span> <span class=n>op</span><span class=o>-&gt;</span><span class=n>getLoc</span><span class=p>();</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>// Call to a helper function that will lower the current operation to a set </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// of affine loops. We provide a functor that operates on the remapped </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// operands, as well as the loop induction variables for the inner most </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// loop body. 
</span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>lowerOpToLoops</span><span class=p>(</span> </span></span><span class=line><span class=cl> <span class=n>op</span><span class=p>,</span> <span class=n>operands</span><span class=p>,</span> <span class=n>rewriter</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=p>[</span><span class=n>loc</span><span class=p>](</span><span class=n>mlir</span><span class=o>::</span><span class=n>PatternRewriter</span> <span class=o>&amp;</span><span class=n>rewriter</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=n>ArrayRef</span><span class=o>&lt;</span><span class=n>mlir</span><span class=o>::</span><span class=n>Value</span><span class=o>&gt;</span> <span class=n>memRefOperands</span><span class=p>,</span> </span></span><span class=line><span class=cl> <span class=n>ArrayRef</span><span class=o>&lt;</span><span class=n>mlir</span><span class=o>::</span><span class=n>Value</span><span class=o>&gt;</span> <span class=n>loopIvs</span><span class=p>)</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=c1>// Generate an adaptor for the remapped operands of the TransposeOp. </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// This allows for using the nice named accessors that are generated </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// by the ODS. This adaptor is automatically provided by the ODS </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// framework. 
</span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>TransposeOpAdaptor</span> <span class=nf>transposeAdaptor</span><span class=p>(</span><span class=n>memRefOperands</span><span class=p>);</span> </span></span><span class=line><span class=cl> <span class=n>mlir</span><span class=o>::</span><span class=n>Value</span> <span class=n>input</span> <span class=o>=</span> <span class=n>transposeAdaptor</span><span class=p>.</span><span class=n>input</span><span class=p>();</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>// Transpose the elements by generating a load from the reverse </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// indices. </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>SmallVector</span><span class=o>&lt;</span><span class=n>mlir</span><span class=o>::</span><span class=n>Value</span><span class=p>,</span> <span class=mi>2</span><span class=o>&gt;</span> <span class=n>reverseIvs</span><span class=p>(</span><span class=n>llvm</span><span class=o>::</span><span class=n>reverse</span><span class=p>(</span><span class=n>loopIvs</span><span class=p>));</span> </span></span><span class=line><span class=cl> <span class=k>return</span> <span class=n>rewriter</span><span class=p>.</span><span class=n>create</span><span class=o>&lt;</span><span class=n>mlir</span><span class=o>::</span><span class=n>AffineLoadOp</span><span class=o>&gt;</span><span class=p>(</span><span class=n>loc</span><span class=p>,</span> <span class=n>input</span><span class=p>,</span> <span class=n>reverseIvs</span><span class=p>);</span> </span></span><span class=line><span class=cl> <span class=p>});</span> </span></span><span class=line><span class=cl> <span class=k>return</span> <span class=nf>success</span><span class=p>();</span> </span></span><span class=line><span class=cl> <span 
class=p>}</span> </span></span><span class=line><span class=cl><span class=p>};</span> </span></span></code></pre></div><p>Now we can prepare the list of patterns to use during the lowering process:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-c++ data-lang=c++><span class=line><span class=cl><span class=kt>void</span> <span class=n>ToyToAffineLoweringPass</span><span class=o>::</span><span class=n>runOnOperation</span><span class=p>()</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=p>...</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>// Now that the conversion target has been defined, we just need to provide </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// the set of patterns that will lower the Toy operations. </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=n>mlir</span><span class=o>::</span><span class=n>RewritePatternSet</span> <span class=n>patterns</span><span class=p>(</span><span class=o>&amp;</span><span class=n>getContext</span><span class=p>());</span> </span></span><span class=line><span class=cl> <span class=n>patterns</span><span class=p>.</span><span class=n>add</span><span class=o>&lt;</span><span class=p>...,</span> <span class=n>TransposeOpLowering</span><span class=o>&gt;</span><span class=p>(</span><span class=o>&amp;</span><span class=n>getContext</span><span class=p>());</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=p>...</span> </span></span></code></pre></div><h2 id=partial-lowering>Partial Lowering&nbsp;<a class=headline-hash href=#partial-lowering>¶</a></h2><p>Once the patterns have been defined, we can perform the actual lowering. 
The <code>DialectConversion</code> framework provides several different modes of lowering, but, for our purposes, we will perform a partial lowering, as we will not convert <code>toy.print</code> at this time.</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-c++ data-lang=c++><span class=line><span class=cl><span class=kt>void</span> <span class=n>ToyToAffineLoweringPass</span><span class=o>::</span><span class=n>runOnOperation</span><span class=p>()</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=p>...</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c1>// With the target and rewrite patterns defined, we can now attempt the </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// conversion. The conversion will signal failure if any of our *illegal* </span></span></span><span class=line><span class=cl><span class=c1></span> <span class=c1>// operations were not converted successfully. 
</span></span></span><span class=line><span class=cl><span class=c1></span> <span class=k>if</span> <span class=p>(</span><span class=n>mlir</span><span class=o>::</span><span class=n>failed</span><span class=p>(</span><span class=n>mlir</span><span class=o>::</span><span class=n>applyPartialConversion</span><span class=p>(</span><span class=n>getOperation</span><span class=p>(),</span> <span class=n>target</span><span class=p>,</span> <span class=n>patterns</span><span class=p>)))</span> </span></span><span class=line><span class=cl> <span class=n>signalPassFailure</span><span class=p>();</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><h3 id=design-considerations-with-partial-lowering>Design Considerations With Partial Lowering&nbsp;<a class=headline-hash href=#design-considerations-with-partial-lowering>¶</a></h3><p>Before diving into the result of our lowering, this is a good time to discuss potential design considerations when it comes to partial lowering. In our lowering, we transform from a value-type, TensorType, to an allocated (buffer-like) type, MemRefType. However, given that we do not lower the <code>toy.print</code> operation, we need to temporarily bridge these two worlds. There are many ways to go about this, each with their own tradeoffs:</p><ul><li><p>Generate <code>load</code> operations from the buffer</p><p>One option is to generate <code>load</code> operations from the buffer type to materialize an instance of the value type. This allows for the definition of the <code>toy.print</code> operation to remain unchanged. 
The downside to this approach is that the optimizations on the <code>affine</code> dialect are limited, because the <code>load</code> will actually involve a full copy that is only visible <em>after</em> our optimizations have been performed.</p></li><li><p>Generate a new version of <code>toy.print</code> that operates on the lowered type</p><p>Another option would be to have another, lowered, variant of <code>toy.print</code> that operates on the lowered type. The benefit of this option is that there is no hidden, unnecessary copy to the optimizer. The downside is that another operation definition is needed that may duplicate many aspects of the first. Defining a base class in <a href=/docs/DefiningDialects/Operations/>ODS</a> may simplify this, but you still need to treat these operations separately.</p></li><li><p>Update <code>toy.print</code> to allow for operating on the lowered type</p><p>A third option is to update the current definition of <code>toy.print</code> to allow for operating on the lowered type. The benefit of this approach is that it is simple, does not introduce an additional hidden copy, and does not require another operation definition. The downside to this option is that it requires mixing abstraction levels in the <code>Toy</code> dialect.</p></li></ul><p>For the sake of simplicity, we will use the third option for this lowering. 
This involves updating the type constraints on the PrintOp in the operation definition file:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-tablegen data-lang=tablegen><span class=line><span class=cl><span class=k>def</span> <span class=nv>PrintOp</span> <span class=p>:</span> <span class=nv>Toy_Op</span><span class=p>&lt;</span><span class=s>&#34;print&#34;</span><span class=p>&gt;</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=p>...</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// The print operation takes an input tensor to print. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// We also allow a F64MemRef to enable interop during partial lowering. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=k>let</span> <span class=nv>arguments</span> <span class=p>=</span> <span class=p>(</span><span class=nv>ins</span> <span class=nv>AnyTypeOf</span><span class=p>&lt;[</span><span class=nv>F64Tensor</span><span class=p>,</span> <span class=nv>F64MemRef</span><span class=p>]&gt;:</span><span class=nv>$input</span><span class=p>);</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><h2 id=complete-toy-example>Complete Toy Example&nbsp;<a class=headline-hash href=#complete-toy-example>¶</a></h2><p>Let&rsquo;s take a concrete example:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl>toy<span class=p>.</span><span class=kt>func</span> <span class=nf>@main</span><span class=p>()</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%0</span> <span class=p>=</span> toy<span class=p>.</span><span class=kt>constant</span> dense<span class=p>&lt;[[</span><span class=m>1.000000e+00</span><span 
class=p>,</span> <span class=m>2.000000e+00</span><span class=p>,</span> <span class=m>3.000000e+00</span><span class=p>],</span> <span class=p>[</span><span class=m>4.000000e+00</span><span class=p>,</span> <span class=m>5.000000e+00</span><span class=p>,</span> <span class=m>6.000000e+00</span><span class=p>]]&gt;</span> <span class=p>:</span> <span class=kt>tensor</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%2</span> <span class=p>=</span> toy<span class=p>.</span>transpose<span class=p>(</span><span class=nv>%0</span> <span class=p>:</span> <span class=kt>tensor</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;)</span> to <span class=kt>tensor</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%3</span> <span class=p>=</span> toy<span class=p>.</span>mul <span class=nv>%2</span><span class=p>,</span> <span class=nv>%2</span> <span class=p>:</span> <span class=kt>tensor</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> toy<span class=p>.</span>print <span class=nv>%3</span> <span class=p>:</span> <span class=kt>tensor</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> toy<span class=p>.</span><span class=kt>return</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><p>With affine lowering added to our pipeline, we can now generate:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=kt>func</span><span class=p>.</span><span 
class=kt>func</span> <span class=nf>@main</span><span class=p>()</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%cst</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>1.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_0</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>2.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_1</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>3.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_2</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>4.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_3</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>5.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_4</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>6.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Allocating buffers for the inputs and outputs. 
</span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%0</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%1</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%2</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Initialize the input buffer with the constant values. 
</span></span></span><span class=line><span class=cl><span class=c></span> affine<span class=p>.</span>store <span class=nv>%cst</span><span class=p>,</span> <span class=nv>%2</span><span class=p>[</span><span class=m>0</span><span class=p>,</span> <span class=m>0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_0</span><span class=p>,</span> <span class=nv>%2</span><span class=p>[</span><span class=m>0</span><span class=p>,</span> <span class=m>1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_1</span><span class=p>,</span> <span class=nv>%2</span><span class=p>[</span><span class=m>0</span><span class=p>,</span> <span class=m>2</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_2</span><span class=p>,</span> <span class=nv>%2</span><span class=p>[</span><span class=m>1</span><span class=p>,</span> <span class=m>0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_3</span><span class=p>,</span> <span class=nv>%2</span><span class=p>[</span><span class=m>1</span><span class=p>,</span> <span class=m>1</span><span class=p>]</span> <span class=p>:</span> <span 
class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_4</span><span class=p>,</span> <span class=nv>%2</span><span class=p>[</span><span class=m>1</span><span class=p>,</span> <span class=m>2</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Load the transpose value from the input buffer and store it into the </span></span></span><span class=line><span class=cl><span class=c></span> <span class=c>// next input buffer. </span></span></span><span class=line><span class=cl><span class=c></span> affine<span class=p>.</span>for <span class=nv>%arg0</span> <span class=p>=</span> <span class=m>0</span> to <span class=m>3</span> <span class=p>{</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>for <span class=nv>%arg1</span> <span class=p>=</span> <span class=m>0</span> to <span class=m>2</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%3</span> <span class=p>=</span> affine<span class=p>.</span>load <span class=nv>%2</span><span class=p>[</span><span class=nv>%arg1</span><span class=p>,</span> <span class=nv>%arg0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%3</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=nv>%arg0</span><span class=p>,</span> <span class=nv>%arg1</span><span class=p>]</span> <span class=p>:</span> <span 
class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Multiply and store into the output buffer. </span></span></span><span class=line><span class=cl><span class=c></span> affine<span class=p>.</span>for <span class=nv>%arg0</span> <span class=p>=</span> <span class=m>0</span> to <span class=m>3</span> <span class=p>{</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>for <span class=nv>%arg1</span> <span class=p>=</span> <span class=m>0</span> to <span class=m>2</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%3</span> <span class=p>=</span> affine<span class=p>.</span>load <span class=nv>%1</span><span class=p>[</span><span class=nv>%arg0</span><span class=p>,</span> <span class=nv>%arg1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%4</span> <span class=p>=</span> affine<span class=p>.</span>load <span class=nv>%1</span><span class=p>[</span><span class=nv>%arg0</span><span class=p>,</span> <span class=nv>%arg1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%5</span> <span class=p>=</span> arith<span class=p>.</span>mulf <span class=nv>%3</span><span class=p>,</span> <span class=nv>%4</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> 
affine<span class=p>.</span>store <span class=nv>%5</span><span class=p>,</span> <span class=nv>%0</span><span class=p>[</span><span class=nv>%arg0</span><span class=p>,</span> <span class=nv>%arg1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Print the value held by the buffer. </span></span></span><span class=line><span class=cl><span class=c></span> toy<span class=p>.</span>print <span class=nv>%0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%2</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%1</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>return</span> </span></span><span class=line><span class=cl><span class=p>}</span> </span></span></code></pre></div><h2 
id=taking-advantage-of-affine-optimization>Taking Advantage of Affine Optimization&nbsp;<a class=headline-hash href=#taking-advantage-of-affine-optimization>¶</a></h2><p>Our naive lowering is correct, but it leaves a lot to be desired with regard to efficiency. For example, the lowering of <code>toy.mul</code> has generated some redundant loads. Let&rsquo;s look at how adding a few existing optimizations to the pipeline can help clean this up. Adding the <code>LoopFusion</code> and <code>AffineScalarReplacement</code> passes to the pipeline gives the following result:</p><div class=highlight><pre tabindex=0 class=chroma><code class=language-mlir data-lang=mlir><span class=line><span class=cl><span class=kt>func</span><span class=p>.</span><span class=kt>func</span> <span class=nf>@main</span><span class=p>()</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=nv>%cst</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>1.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_0</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>2.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_1</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>3.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_2</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>4.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_3</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>5.000000e+00</span> 
<span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> <span class=nv>%cst_4</span> <span class=p>=</span> arith<span class=p>.</span><span class=kt>constant</span> <span class=m>6.000000e+00</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Allocating buffers for the inputs and outputs. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%0</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=nv>%1</span> <span class=p>=</span> <span class=kt>memref</span><span class=p>.</span>alloc<span class=p>()</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Initialize the input buffer with the constant values. 
</span></span></span><span class=line><span class=cl><span class=c></span> affine<span class=p>.</span>store <span class=nv>%cst</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=m>0</span><span class=p>,</span> <span class=m>0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_0</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=m>0</span><span class=p>,</span> <span class=m>1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_1</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=m>0</span><span class=p>,</span> <span class=m>2</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_2</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=m>1</span><span class=p>,</span> <span class=m>0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_3</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=m>1</span><span class=p>,</span> <span class=m>1</span><span class=p>]</span> <span class=p>:</span> <span 
class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%cst_4</span><span class=p>,</span> <span class=nv>%1</span><span class=p>[</span><span class=m>1</span><span class=p>,</span> <span class=m>2</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> affine<span class=p>.</span>for <span class=nv>%arg0</span> <span class=p>=</span> <span class=m>0</span> to <span class=m>3</span> <span class=p>{</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>for <span class=nv>%arg1</span> <span class=p>=</span> <span class=m>0</span> to <span class=m>2</span> <span class=p>{</span> </span></span><span class=line><span class=cl> <span class=c>// Load the transpose value from the input buffer. </span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%2</span> <span class=p>=</span> affine<span class=p>.</span>load <span class=nv>%1</span><span class=p>[</span><span class=nv>%arg1</span><span class=p>,</span> <span class=nv>%arg0</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Multiply and store into the output buffer. 
</span></span></span><span class=line><span class=cl><span class=c></span> <span class=nv>%3</span> <span class=p>=</span> arith<span class=p>.</span>mulf <span class=nv>%2</span><span class=p>,</span> <span class=nv>%2</span> <span class=p>:</span> <span class=k>f64</span> </span></span><span class=line><span class=cl> affine<span class=p>.</span>store <span class=nv>%3</span><span class=p>,</span> <span class=nv>%0</span><span class=p>[</span><span class=nv>%arg0</span><span class=p>,</span> <span class=nv>%arg1</span><span class=p>]</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> <span class=p>}</span> </span></span><span class=line><span class=cl> </span></span><span class=line><span class=cl> <span class=c>// Print the value held by the buffer. </span></span></span><span class=line><span class=cl><span class=c></span> toy<span class=p>.</span>print <span class=nv>%0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%1</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>2x3x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>memref</span><span class=p>.</span>dealloc <span class=nv>%0</span> <span class=p>:</span> <span class=kt>memref</span><span class=p>&lt;</span><span class=m>3x2x</span><span class=k>f64</span><span class=p>&gt;</span> </span></span><span class=line><span class=cl> <span class=kt>return</span> </span></span><span class=line><span class=cl><span class=p>}</span> 
</span></span></code></pre></div><p>Here, we can see that a redundant allocation was removed, the two loop nests were fused, and some unnecessary <code>load</code>s were removed. You can build <code>toyc-ch5</code> and try it yourself: <code>toyc-ch5 test/Examples/Toy/Ch5/affine-lowering.mlir -emit=mlir-affine</code>. We can also verify the effect of our optimizations by adding <code>-opt</code>.</p><p>In this chapter we explored some aspects of partial lowering, with the intent to optimize. In the <a href=/docs/Tutorials/Toy/Ch-6/>next chapter</a> we will continue the discussion about dialect conversion by targeting LLVM for code generation.</p><nav class=pagination><a class="nav nav-prev" href=https://mlir.llvm.org/docs/Tutorials/Toy/Ch-4/ title="Chapter 4: Enabling Generic Transformation with Interfaces"><i class="fas fa-arrow-left" aria-hidden=true></i> Prev - Chapter 4: Enabling Generic Transformation with Interfaces</a> <a class="nav nav-next" href=https://mlir.llvm.org/docs/Tutorials/Toy/Ch-6/ title="Chapter 6: Lowering to LLVM and CodeGeneration">Next - Chapter 6: Lowering to LLVM and CodeGeneration <i class="fas fa-arrow-right" aria-hidden=true></i></a></nav><footer><p class=powered>Powered by <a href=https://gohugo.io>Hugo</a>. Theme by <a href=https://themes.gohugo.io/hugo-theme-techdoc/>TechDoc</a>. 
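</p></footer><p>To make the effect of the fusion and load elimination concrete, here is a small illustrative model in plain Python (not MLIR API; the function names <code>naive</code> and <code>fused</code> are invented for illustration). The naive lowering materializes the transpose into an intermediate buffer and reloads it in a second loop nest; the fused form loads each transposed element once and squares it, with no intermediate buffer. Both produce the same result:</p>

```python
# Illustrative model of the lowered Toy program (plain Python, not MLIR).
# naive(): two loop nests plus an intermediate buffer, as in the naive lowering.
# fused(): one loop nest with a single load per element, as after the
# LoopFusion and AffineScalarReplacement passes.

def naive(a):
    rows, cols = len(a), len(a[0])            # a is 2x3
    t = [[0.0] * rows for _ in range(cols)]   # intermediate 3x2 buffer
    for i in range(cols):                     # transpose loop nest
        for j in range(rows):
            t[i][j] = a[j][i]
    out = [[0.0] * rows for _ in range(cols)]
    for i in range(cols):                     # multiply loop nest
        for j in range(rows):
            out[i][j] = t[i][j] * t[i][j]     # two loads of the same value
    return out

def fused(a):
    rows, cols = len(a), len(a[0])
    out = [[0.0] * rows for _ in range(cols)]
    for i in range(cols):                     # single fused loop nest
        for j in range(rows):
            v = a[j][i]                       # one load of the transposed value
            out[i][j] = v * v                 # no intermediate buffer
    return out

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
assert naive(a) == fused(a) == [[1.0, 16.0], [4.0, 25.0], [9.0, 36.0]]
```

<footer><p class=powered>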
Designed by <a href=https://github.com/thingsym/hugo-theme-techdoc>Thingsym</a>.</p></footer></main></div></div></body></html>
