<!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml" xmlns:og="http://ogp.me/ns#" xmlns:fb="https://www.facebook.com/2008/fbml"> <head> <title>How to JIT - an introduction - Eli Bendersky's website</title> <!-- Using the latest rendering mode for IE --> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <link href="https://eli.thegreenplace.net/favicon.ico" rel="icon"> <!-- Bootstrap --> <link rel="stylesheet" href="https://eli.thegreenplace.net/theme/css/bootstrap.min.css" type="text/css"/> <link href="https://eli.thegreenplace.net/theme/css/font-awesome.min.css" rel="stylesheet"> <link href="https://eli.thegreenplace.net/theme/css/pygments/vs.css" rel="stylesheet"> <link rel="stylesheet" href="https://eli.thegreenplace.net/theme/css/style.css" type="text/css"/> <link href="https://eli.thegreenplace.net/feeds/all.atom.xml" type="application/atom+xml" rel="alternate" title="Eli Bendersky's website ATOM Feed"/> </head> <body> <div class="navbar navbar-default navbar-fixed-top" role="navigation"> <div class="container"> <div class="navbar-header"> <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-ex1-collapse"> <span class="sr-only">Toggle navigation</span> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button> <a href="https://eli.thegreenplace.net/" class="navbar-brand"> <img src="https://eli.thegreenplace.net/images/logosmall.png" width="32" height="32"/> Eli Bendersky's website </a> </div> <div class="collapse navbar-collapse navbar-ex1-collapse"> <ul class="nav navbar-nav navbar-right"> <li> <a href="https://eli.thegreenplace.net/pages/about"> <i class="fa fa-question"></i> <span class="icon-label">About</span> </a> </li> <li> <a href="https://eli.thegreenplace.net/pages/projects"> <i class="fa fa-github"></i> <span class="icon-label">Projects</span> </a> </li> <li> <a 
href="https://eli.thegreenplace.net/archives/all"> <i class="fa fa-th-list"></i> <span class="icon-label">Archives</span> </a> </li> </ul> </div> <!-- /.navbar-collapse --> </div> </div> <!-- /.navbar --> <div class="container"> <div class="row"> <section id="content"> <article> <header class="page-header"> <h1> <a href="https://eli.thegreenplace.net/2013/11/05/how-to-jit-an-introduction" rel="bookmark" title="Permalink to How to JIT - an introduction"> How to JIT - an introduction </a> </h1> </header> <div class="entry-content"> <div class="panel"> <div class="panel-body"> <footer class="post-info"> <span class="published"> <i class="fa fa-calendar"></i> <time> November 05, 2013 at 05:59</time> </span> <span class="label label-default">Tags</span> <a href="https://eli.thegreenplace.net/tag/c-c">C & C++</a> , <a href="https://eli.thegreenplace.net/tag/code-generation">Code generation</a> , <a href="https://eli.thegreenplace.net/tag/compilation">Compilation</a> </footer><!-- /.post-info --> </div> </div> <p>When I wrote the <a class="reference external" href="https://eli.thegreenplace.net/2013/10/17/getting-started-with-libjit-part-1/">introductory article for libjit</a>, I aimed it at programmers who know what JITs are, at least to some extent. I did mention what a JIT is, but only very briefly. The purpose of this article is to provide a better introductory overview of JITing, with code samples that don't rely on any libraries.</p> <div class="section" id="defining-jit"> <h3>Defining JIT</h3> <p>JIT is simply an acronym for "Just In Time". That, in itself, doesn't help much - the term is quite cryptic and seems to have little to do with programming. First, let's define what "a JIT" actually refers to. 
I find the following way to think about this useful:</p> <blockquote> Whenever a program, while running, creates and runs some new executable code which was not part of the program when it was stored on disk, it’s a JIT.</blockquote>
<p>What about the historical usage of the term "JIT", though? Luckily, John Aycock from the University of Calgary has written a very interesting paper named "A Brief History of Just-In-Time" (google it, PDFs are available online) looking at JIT techniques from a historical point of view. According to Aycock's paper, the first mention of code generation and execution during program runtime is apparent as early as McCarthy's LISP paper from 1960. In later work, such as Thompson's 1968 regex paper, it was even more apparent (regexes are compiled into machine code and executed on the fly).</p>
<p>The term JIT was first brought into use in computing literature by James Gosling for Java. Aycock mentions that Gosling borrowed the term from the domain of <a class="reference external" href="http://en.wikipedia.org/wiki/Just_in_time_%28business%29">manufacturing</a> and started using it in the early 1990s.</p>
<p>This is as far as I'll go into history here. Read the Aycock paper if you're interested in more details. Let's now see what the definition quoted above means in practice.</p> </div>
<div class="section" id="jit-create-machine-code-then-run-it"> <h3>JIT - create machine code, then run it</h3>
<p>I think that JIT technology is easier to explain when divided into two distinct phases:</p>
<ul class="simple"> <li>Phase 1: create <a class="reference external" href="http://en.wikipedia.org/wiki/Machine_code">machine code</a> at program run-time.</li> <li>Phase 2: execute that machine code, also at program run-time.</li> </ul>
<p>Phase 1 is where 99% of the challenges of JITing are. But it's also the less mystical part of the process, because this is exactly what a compiler does. 
Well-known compilers like <tt class="docutils literal">gcc</tt> and <tt class="docutils literal">clang</tt> translate C/C++ source code into machine code. The machine code is emitted into an output stream, but it could very well be just kept in memory (and in fact, both <tt class="docutils literal">gcc</tt> and <tt class="docutils literal">clang/llvm</tt> have building blocks for keeping the code in memory for JIT execution). Phase 2 is what I want to focus on in this article.</p> </div>
<div class="section" id="running-dynamically-generated-code"> <h3>Running dynamically-generated code</h3>
<p>Modern operating systems are picky about what they allow a program to do at runtime. The wild-west days of the past came to an end with the advent of <a class="reference external" href="http://en.wikipedia.org/wiki/Protected_mode">protected mode</a>, which allows an OS to restrict chunks of virtual memory with various permissions. So in "normal" code, you can create new data dynamically on the heap, but <a class="reference external" href="http://en.wikipedia.org/wiki/Executable_space_protection">you can't just run stuff from the heap</a> without asking the OS to explicitly allow it.</p>
<p>At this point I hope it's obvious that machine code is just data - a stream of bytes. So, this:</p>
<div class="highlight" style="background: #ffffff"><pre style="line-height: 125%"><span style="color: #00007f; font-weight: bold">unsigned</span> <span style="color: #00007f; font-weight: bold">char</span> code[] = {<span style="color: #007f7f">0x48</span>, <span style="color: #007f7f">0x89</span>, <span style="color: #007f7f">0xf8</span>};
</pre></div>
<p>Really depends on the eye of the beholder. To some, it's just some data that could represent anything. To others, it's the binary encoding of real, valid x86-64 machine code:</p>
<div class="highlight" style="background: #ffffff"><pre style="line-height: 125%">mov %rdi, %rax
</pre></div>
<p>So getting machine code into memory is easy. 
But how to make it runnable, and then run it?</p> </div>
<div class="section" id="let-s-see-some-code"> <h3>Let's see some code</h3>
<p>The rest of this article contains code samples for a POSIX-compliant Unix OS (specifically Linux). On other OSes (like Windows) the code would be different in the details, but not in spirit. All modern OSes have convenient APIs to implement the same thing.</p>
<p>Without further ado, here's how we dynamically create a function in memory and execute it. The function is intentionally very simple, implementing this C code:</p>
<div class="highlight" style="background: #ffffff"><pre style="line-height: 125%"><span style="color: #00007f; font-weight: bold">long</span> <span style="color: #00007f">add4</span>(<span style="color: #00007f; font-weight: bold">long</span> num) {
  <span style="color: #00007f; font-weight: bold">return</span> num + <span style="color: #007f7f">4</span>;
}
</pre></div>
<p>Here's a first try (the full code with a Makefile is available in <a class="reference external" href="https://github.com/eliben/libjit-samples">this repo</a>):</p>
<div class="highlight" style="background: #ffffff"><pre style="line-height: 125%"><span style="color: #007f00">#include &lt;stdio.h&gt;</span>
<span style="color: #007f00">#include &lt;stdlib.h&gt;</span>
<span style="color: #007f00">#include &lt;string.h&gt;</span>
<span style="color: #007f00">#include &lt;sys/mman.h&gt;</span>

<span style="color: #007f00">// Allocates RWX memory of given size and returns a pointer to it. On failure,</span>
<span style="color: #007f00">// prints out the error and returns NULL.</span>
<span style="color: #00007f; font-weight: bold">void</span>* <span style="color: #00007f">alloc_executable_memory</span>(<span style="color: #00007f; font-weight: bold">size_t</span> size) {
  <span style="color: #00007f; font-weight: bold">void</span>* ptr = mmap(<span style="color: #007f7f">0</span>, size,
                   PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -<span style="color: #007f7f">1</span>, <span style="color: #007f7f">0</span>);
  <span style="color: #00007f; font-weight: bold">if</span> (ptr == MAP_FAILED) {
    perror(<span style="color: #7f007f">"mmap"</span>);
    <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">NULL</span>;
  }
  <span style="color: #00007f; font-weight: bold">return</span> ptr;
}

<span style="color: #00007f; font-weight: bold">void</span> <span style="color: #00007f">emit_code_into_memory</span>(<span style="color: #00007f; font-weight: bold">unsigned</span> <span style="color: #00007f; font-weight: bold">char</span>* m) {
  <span style="color: #00007f; font-weight: bold">unsigned</span> <span style="color: #00007f; font-weight: bold">char</span> code[] = {
    <span style="color: #007f7f">0x48</span>, <span style="color: #007f7f">0x89</span>, <span style="color: #007f7f">0xf8</span>,         <span style="color: #007f00">// mov %rdi, %rax</span>
    <span style="color: #007f7f">0x48</span>, <span style="color: #007f7f">0x83</span>, <span style="color: #007f7f">0xc0</span>, <span style="color: #007f7f">0x04</span>,   <span style="color: #007f00">// add $4, %rax</span>
    <span style="color: #007f7f">0xc3</span>                      <span style="color: #007f00">// ret</span>
  };
  memcpy(m, code, <span style="color: #00007f; font-weight: bold">sizeof</span>(code));
}

<span style="color: #00007f; font-weight: bold">const</span> <span style="color: #00007f; font-weight: bold">size_t</span> SIZE = <span style="color: #007f7f">1024</span>;
<span style="color: #00007f; font-weight: bold">typedef</span> <span style="color: #00007f; font-weight: bold">long</span> (*JittedFunc)(<span style="color: #00007f; font-weight: bold">long</span>);

<span style="color: #007f00">// Allocates RWX memory directly.</span>
<span style="color: #00007f; font-weight: bold">void</span> <span style="color: #00007f">run_from_rwx</span>() {
  <span style="color: #00007f; font-weight: bold">void</span>* m = alloc_executable_memory(SIZE);
  emit_code_into_memory(m);

  JittedFunc func = m;
  <span style="color: #00007f; font-weight: bold">long</span> result = func(<span style="color: #007f7f">2</span>);
  printf(<span style="color: #7f007f">"result = %ld\n"</span>, result);
}
</pre></div>
<p>The three main steps performed by this code are:</p>
<ol class="arabic simple"> <li>Use <tt class="docutils literal">mmap</tt> to allocate a readable, writable and executable chunk of memory on the heap.</li> <li>Copy the machine code implementing <tt class="docutils literal">add4</tt> into this chunk.</li> <li>Execute code from this chunk by casting it to a function pointer and calling through it.</li> </ol>
<p>Note that step 3 can only happen because the memory chunk containing the machine code is <em>executable</em>. Without setting the right permission, that call would result in a runtime error from the OS (most likely a segmentation fault). This would happen if, for example, we allocated <tt class="docutils literal">m</tt> with a regular call to <tt class="docutils literal">malloc</tt>, which allocates readable and writable, but not executable memory.</p> </div>
<div class="section" id="digression-heap-malloc-and-mmap"> <h3>Digression - heap, malloc and mmap</h3>
<p>Diligent readers may have noticed a half-slip I made in the previous section, by referring to memory returned from <tt class="docutils literal">mmap</tt> as "heap memory". 
Very strictly speaking, "heap" is a name that designates the memory used by <tt class="docutils literal">malloc</tt>, <tt class="docutils literal">free</tt> et al. to manage runtime-allocated memory, as opposed to "stack" which is managed implicitly by the compiler.</p>
<p>That said, it's not so simple :-) While traditionally (i.e. a long time ago) <tt class="docutils literal">malloc</tt> only used one source for its memory (the <tt class="docutils literal">sbrk</tt> system call), these days most malloc implementations use <tt class="docutils literal">mmap</tt> in many cases. The details differ between OSes and implementations, but often <tt class="docutils literal">mmap</tt> is used for the large chunks and <tt class="docutils literal">sbrk</tt> for the small chunks. The tradeoffs have to do with the relative efficiency of the two methods of requesting more memory from the OS.</p>
<p>So calling memory provided by <tt class="docutils literal">mmap</tt> "heap memory" is not a mistake, IMHO, and that's what I intend to keep on doing.</p> </div>
<div class="section" id="caring-more-about-security"> <h3>Caring more about security</h3>
<p>The code shown above has a problem - it's a security hole. The reason is the RWX (Readable, Writable, eXecutable) chunk of memory it allocates - a paradise for attacks and exploits. So let's be a bit more responsible about it. Here's some slightly modified code:</p>
<div class="highlight" style="background: #ffffff"><pre style="line-height: 125%"><span style="color: #007f00">// Allocates RW memory of given size and returns a pointer to it. On failure,</span>
<span style="color: #007f00">// prints out the error and returns NULL. Unlike malloc, the memory is allocated</span>
<span style="color: #007f00">// on a page boundary so it's suitable for calling mprotect.</span>
<span style="color: #00007f; font-weight: bold">void</span>* <span style="color: #00007f">alloc_writable_memory</span>(<span style="color: #00007f; font-weight: bold">size_t</span> size) {
  <span style="color: #00007f; font-weight: bold">void</span>* ptr = mmap(<span style="color: #007f7f">0</span>, size,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -<span style="color: #007f7f">1</span>, <span style="color: #007f7f">0</span>);
  <span style="color: #00007f; font-weight: bold">if</span> (ptr == MAP_FAILED) {
    perror(<span style="color: #7f007f">"mmap"</span>);
    <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">NULL</span>;
  }
  <span style="color: #00007f; font-weight: bold">return</span> ptr;
}

<span style="color: #007f00">// Sets an RX permission on the given memory, which must be page-aligned. Returns</span>
<span style="color: #007f00">// 0 on success. On failure, prints out the error and returns -1.</span>
<span style="color: #00007f; font-weight: bold">int</span> <span style="color: #00007f">make_memory_executable</span>(<span style="color: #00007f; font-weight: bold">void</span>* m, <span style="color: #00007f; font-weight: bold">size_t</span> size) {
  <span style="color: #00007f; font-weight: bold">if</span> (mprotect(m, size, PROT_READ | PROT_EXEC) == -<span style="color: #007f7f">1</span>) {
    perror(<span style="color: #7f007f">"mprotect"</span>);
    <span style="color: #00007f; font-weight: bold">return</span> -<span style="color: #007f7f">1</span>;
  }
  <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #007f7f">0</span>;
}

<span style="color: #007f00">// Allocates RW memory, emits the code into it and sets it to RX before</span>
<span style="color: #007f00">// executing.</span>
<span style="color: #00007f; font-weight: bold">void</span> <span style="color: #00007f">emit_to_rw_run_from_rx</span>() {
  <span style="color: #00007f; font-weight: bold">void</span>* m = alloc_writable_memory(SIZE);
  emit_code_into_memory(m);
  make_memory_executable(m, SIZE);

  JittedFunc func = m;
  <span style="color: #00007f; font-weight: bold">long</span> result = func(<span style="color: #007f7f">2</span>);
  printf(<span style="color: #7f007f">"result = %ld\n"</span>, result);
}
</pre></div>
<p>It's equivalent to the earlier snippet in all respects except one: the memory is first allocated with RW permissions (just like a normal <tt class="docutils literal">malloc</tt> would do). This is all we really need to write our machine code into it. When the code is there, we use <tt class="docutils literal">mprotect</tt> to change the chunk's permission from RW to RX, making it executable but <em>no longer writable</em>. 
So the effect is the same, but at no point in the execution of our program is the chunk both writable and executable, which is good from a security point of view.</p> </div>
<div class="section" id="what-about-malloc"> <h3>What about malloc?</h3>
<p>Could we use <tt class="docutils literal">malloc</tt> instead of <tt class="docutils literal">mmap</tt> for allocating the chunk in the previous snippet? After all, RW memory is exactly what <tt class="docutils literal">malloc</tt> provides. Yes, we could. However, it's more trouble than it's worth, really. The reason is that protection bits can only be set on virtual memory page boundaries. Therefore, had we used <tt class="docutils literal">malloc</tt> we'd have to manually ensure that the allocation is aligned at a page boundary. Otherwise, <tt class="docutils literal">mprotect</tt> could have unwanted effects, ranging from failing outright to enabling or disabling permissions on more memory than actually required. <tt class="docutils literal">mmap</tt> takes care of this for us by only allocating at page boundaries (because <tt class="docutils literal">mmap</tt>, by design, maps whole pages).</p> </div>
<div class="section" id="tying-loose-ends"> <h3>Tying loose ends</h3>
<p>This article started with a high-level overview of what we mean when we say JIT, and ended with hands-on code snippets that show how to dynamically emit machine code into memory and execute it.</p>
<p>The technique shown here is pretty much how real JIT engines (e.g. LLVM and libjit) emit and run executable machine code from memory. What remains is just a "simple" matter of synthesizing that machine code from something else.</p>
<p>LLVM has a full compiler available, so it can actually translate C and C++ code (through LLVM IR) to machine code at runtime, and then execute it. libjit picks the ball up at a much lower level - it can serve as a backend for a compiler. 
In fact, <a class="reference external" href="https://eli.thegreenplace.net/2013/10/17/getting-started-with-libjit-part-1/">my introductory article on libjit</a> already demonstrates how to emit and run non-trivial code with libjit. But JITing is a more general concept. Emitting code at run-time can be done for <a class="reference external" href="http://pyevolve.sourceforge.net/wordpress/?p=914">data structures</a>, <a class="reference external" href="http://sljit.sourceforge.net/">regular expressions</a> and even <a class="reference external" href="http://luajit.org/ext_ffi.html">accessing C from language VMs</a>. Digging in my blog's archives helped me find a mention of some <a class="reference external" href="https://eli.thegreenplace.net/2005/09/04/cool-hack-creating-custom-subroutines-on-the-fly-in-perl/">JITing I did 8 years ago</a>. That was Perl code generating more Perl code at run-time (from an XML description of a serialization format), but the idea is the same.</p>
<p>This is why I felt that splitting the JITing concept into two phases is important. For phase 2 (which was explained in this article), the implementation is relatively obvious and uses well-defined OS APIs. For phase 1, the possibilities are endless and what you do ultimately depends on the application you're developing.</p> </div> </div> <!-- /.entry-content --> <hr/>
<div class="dotted-links"> <p class="align-center"> For comments, please send me <a href="mailto:eliben@gmail.com"><i class="fa fa-envelope-o"></i> an email</a>. 
</p> </div> </article> </section> </div> </div> <footer> <div class="container"> <hr> <div class="row"> <div class="col-xs-10"> © 2003-2025 Eli Bendersky </div> <div class="col-xs-2"><p class="pull-right"><i class="fa fa-arrow-up"></i> <a href="#">Back to top</a></p></div> </div> </div> </footer> <script src="//code.jquery.com/jquery-2.2.4.min.js"></script> <!-- Include all compiled plugins (below), or include individual files as needed --> <script src="https://eli.thegreenplace.net/theme/js/bootstrap.min.js"></script> <!-- Using goatcounter to count visitors. The count.js script is vendored in. --> <script data-goatcounter="https://stats.thegreenplace.net/count" async src="https://eli.thegreenplace.net/theme/js/count.js"></script> </body> </html>