LKML: Nicolas Pitre: Re: [patch 2/3] mutex subsystem: fastpath inlining
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>LKML: Nicolas Pitre: Re: [patch 2/3] mutex subsystem: fastpath inlining</title><link href="/css/message.css" rel="stylesheet" type="text/css" /><link href="/css/wrap.css" rel="alternate stylesheet" type="text/css" title="wrap" /><link href="/css/nowrap.css" rel="stylesheet" type="text/css" title="nowrap" /><link href="/favicon.ico" rel="shortcut icon" /><script src="/js/simple-calendar.js" type="text/javascript"></script><script src="/js/styleswitcher.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="lkml.org : last 100 messages" href="/rss.php" /><link rel="alternate" type="application/rss+xml" title="lkml.org : last messages by Nicolas Pitre" href="/groupie.php?aid=152" /><!--Matomo--><script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["setDoNotTrack", true]); _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="//m.lkml.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '1']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script><!--End Matomo Code--></head><body onload="es.jasper.simpleCalendar.init();" itemscope="itemscope" itemtype="http://schema.org/BlogPosting"><table border="0" cellpadding="0" cellspacing="0"><tr><td width="180" align="center"><a href="/"><img style="border:0;width:135px;height:32px" src="/images/toprowlk.gif" alt="lkml.org" /></a></td><td width="32">&#160;</td><td class="nb"><div><a class="nb" href="/lkml"> 
[lkml]</a> &#160; <a class="nb" href="/lkml/2005"> [2005]</a> &#160; <a class="nb" href="/lkml/2005/12"> [Dec]</a> &#160; <a class="nb" href="/lkml/2005/12/27"> [27]</a> &#160; <a class="nb" href="/lkml/last100"> [last100]</a> &#160; <a href="/rss.php"><img src="/images/rss-or.gif" border="0" alt="RSS Feed" /></a></div><div>Views: <a href="#" class="nowrap" onclick="setActiveStyleSheet('wrap');return false;">[wrap]</a><a href="#" class="wrap" onclick="setActiveStyleSheet('nowrap');return false;">[no wrap]</a> &#160; <a class="nb" href="/lkml/mheaders/2005/12/27/133" onclick="this.href='/lkml/headers'+'/2005/12/27/133';">[headers]</a>&#160; <a href="/lkml/bounce/2005/12/27/133">[forward]</a>&#160; </div></td><td width="32">&#160;</td></tr><tr><td valign="top"><div class="es-jasper-simpleCalendar" baseurl="/lkml/"></div><div class="threadlist">Messages in this thread</div><ul class="threadlist"><li class="root"><a href="/lkml/2005/12/23/78">First message in thread</a></li><li><a href="/lkml/2005/12/26/84">Nicolas Pitre</a><ul><li><a href="/lkml/2005/12/27/50">Ingo Molnar</a><ul><li class="origin"><a href="/lkml/2005/12/28/19">Nicolas Pitre</a><ul><li><a href="/lkml/2005/12/28/19">Ingo Molnar</a><ul><li><a href="/lkml/2005/12/28/211">Nicolas Pitre</a></li></ul></li></ul></li></ul></li></ul></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Tue, 27 Dec 2005 16:59:20 -0500 (EST)</td></tr><tr><td class="lp">From</td><td class="rp" itemprop="author">Nicolas Pitre &lt;&gt;</td></tr><tr><td class="lp">Subject</td><td class="rp" itemprop="name">Re: [patch 2/3] mutex subsystem: fastpath inlining</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">On Tue, 27 Dec 2005, Ingo Molnar wrote:<br /><br />> <br />> * Nicolas Pitre &lt;nico@cam.org&gt; wrote:<br />> <br />> > Some 
architectures, notably ARM for instance, might benefit from <br />> > inlining the mutex fast paths. [...]<br />> <br />> what is the effect on text size? Could you post the before- and <br />> after-patch vmlinux 'size kernel/test.o' output in the nondebug case, <br />> with Arjan's latest 'convert a couple of semaphore users to mutexes' <br />> patch applied? [make sure you've got enough of those users compiled in, <br />> so that the inlining cost is truly measured. Perhaps also do <br />> before/after 'size' output of a few affected .o files, without mixing <br />> kernel/mutex.o into it, like vmlinux does.]<br /><br />Theory should be convincing enough. First of all, the semaphore <br />fast paths are currently always inlined, on all architectures I've <br />looked at. On ARM, a down() fast path always looks like this:<br /><br /> mrs ip, cpsr<br /> orr lr, ip, #128<br /> msr cpsr_c, lr<br /> ldr lr, [r0]<br /> subs lr, lr, #1<br /> str lr, [r0]<br /> msr cpsr_c, ip<br /> movmi ip, r0<br /> blmi __down_failed<br /><br />So our starting point for comparison is 9 instructions for every down() <br />occurrence in the kernel. Same thing for up(). Every instruction is <br />invariably 4 bytes, so that is 36 bytes per call site.<br /><br />Now let's look at the typical mutex_lock():<br /><br /> mov r4, #0<br /> swp r3, r4, [r0]<br /> cmp r3, #1<br /> blne __mutex_lock_noinline<br /><br />This is 4 instructions (16 bytes). Furthermore, the first "mov r4, #0" can often <br />be eliminated when gcc can CSE the constant 0 from another <br />register. We're talking about 3 instructions (12 bytes) then, down from 9!<br /><br />We therefore save between 20 and 24 bytes of kernel .text for every <br />down() and every up() simply by going with mutexes.<br /><br />Now if mutex_lock and mutex_unlock were not inlined, the above 3 or <br />4 instructions would become one or two per call site, which is still a <br />gain in space, though not as large as the one provided by the move <br />from semaphores to mutexes. 
It would however be more costly in terms of <br />cycles, since a function prologue and epilogue are somewhat costly on ARM, <br />especially with frame pointers enabled (I'll let RMK elaborate on his <br />reasons for not disabling them).<br /><br />And for mutex_lock_interruptible(), the inlined fastpath is not bigger <br />than the non-inlined one, considering that the return value has to be <br />tested (the test is done twice in the non-inlined case: once inside the <br />function, and once outside of it) while the inlined version needs only <br />one test. They are therefore equivalent in terms of space.<br /><br /><br />Nicolas<br />-<br />To unsubscribe from this list: send the line "unsubscribe linux-kernel" in<br />the body of a message to majordomo@vger.kernel.org<br />More majordomo info at <a href="http://vger.kernel.org/majordomo-info.html">http://vger.kernel.org/majordomo-info.html</a><br />Please read the FAQ at <a href="http://www.tux.org/lkml/">http://www.tux.org/lkml/</a><br /><br /></pre></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerr.gif" width="32" height="32" alt="\" /></td></tr><tr><td align="right" valign="bottom"> &#160; </td></tr><tr><td align="right" valign="bottom">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerl.gif" width="32" height="32" alt="\" /></td><td class="c">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerr.gif" width="32" height="32" alt="/" /></td></tr><tr><td align="right" valign="top" colspan="2"> &#160; </td><td class="lm">Last update: 2005-12-27 23:01 &#160;&#160; [from the cache]<br />©2003-2020 <a href="http://blog.jasper.es/"><span itemprop="editor">Jasper Spaans</span></a>|hosted at <a href="https://www.digitalocean.com/?refcode=9a8e99d24cf9">Digital Ocean</a> and my Meterkast|<a href="http://blog.jasper.es/categories.html#lkml-ref">Read the blog</a></td><td>&#160;</td></tr></table><script language="javascript" 
src="/js/styleswitcher.js" type="text/javascript"></script></body></html>