<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>LKML: Andrew Morton: Re: [patch 0/9] mutex subsystem, -V4</title><link href="/css/message.css" rel="stylesheet" type="text/css" /><link href="/css/wrap.css" rel="alternate stylesheet" type="text/css" title="wrap" /><link href="/css/nowrap.css" rel="stylesheet" type="text/css" title="nowrap" /><link href="/favicon.ico" rel="shortcut icon" /><script src="/js/simple-calendar.js" type="text/javascript"></script><script src="/js/styleswitcher.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="lkml.org : last 100 messages" href="/rss.php" /><link rel="alternate" type="application/rss+xml" title="lkml.org : last messages by Andrew Morton" href="/groupie.php?aid=28391" /><!--Matomo--><script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["setDoNotTrack", true]); _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="//m.lkml.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '1']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script><!--End Matomo Code--></head><body onload="es.jasper.simpleCalendar.init();" itemscope="itemscope" itemtype="http://schema.org/BlogPosting"><table border="0" cellpadding="0" cellspacing="0"><tr><td width="180" align="center"><a href="/"><img style="border:0;width:135px;height:32px" src="/images/toprowlk.gif" alt="lkml.org" /></a></td><td width="32">&nbsp;</td><td class="nb"><div><a class="nb" href="/lkml"> [lkml]</a> &nbsp; 
<a class="nb" href="/lkml/2005"> [2005]</a> &nbsp; <a class="nb" href="/lkml/2005/12"> [Dec]</a> &nbsp; <a class="nb" href="/lkml/2005/12/22"> [22]</a> &nbsp; <a class="nb" href="/lkml/last100"> [last100]</a> &nbsp; <a href="/rss.php"><img src="/images/rss-or.gif" border="0" alt="RSS Feed" /></a></div><div>Views: <a href="#" class="nowrap" onclick="setActiveStyleSheet('wrap');return false;">[wrap]</a><a href="#" class="wrap" onclick="setActiveStyleSheet('nowrap');return false;">[no wrap]</a> &nbsp; <a class="nb" href="/lkml/mheaders/2005/12/22/99" onclick="this.href='/lkml/headers'+'/2005/12/22/99';">[headers]</a>&nbsp; <a href="/lkml/bounce/2005/12/22/99">[forward]</a>&nbsp; </div></td><td width="32">&nbsp;</td></tr><tr><td valign="top"><div class="es-jasper-simpleCalendar" baseurl="/lkml/"></div><div class="threadlist">Messages in this thread</div><ul class="threadlist"><li class="root"><a href="/lkml/2005/12/22/69">First message in thread</a></li><li><a href="/lkml/2005/12/22/81">Andrew Morton</a><ul><li><a href="/lkml/2005/12/22/88">Ingo Molnar</a><ul><li class="origin"><a href="/lkml/2005/12/22/104">Andrew Morton</a><ul><li><a href="/lkml/2005/12/22/104">Arjan van de Ven</a><ul><li><a href="/lkml/2005/12/22/113">Andrew Morton</a></li></ul></li><li><a href="/lkml/2005/12/22/132">Nicolas Pitre</a></li><li><a href="/lkml/2005/12/22/270">Paul Mackerras</a></li></ul></li></ul></li></ul></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Thu, 22 Dec 2005 05:07:01 -0800</td></tr><tr><td class="lp">From</td><td class="rp" itemprop="author">Andrew Morton <></td></tr><tr><td class="lp">Subject</td><td class="rp" itemprop="name">Re: [patch 0/9] mutex subsystem, -V4</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">Ingo Molnar <mingo@elte.hu> wrote:<br 
/>><br />> ... <br />><br />> here are the top 10 reasons of why i think the generic mutex code should <br />> be considered for upstream integration:<br /><br />Appropriate for the changelog, please.<br /><br />> - 'struct mutex' is smaller: on x86, 'struct semaphore' is 20 bytes, <br />> 'struct mutex' is 16 bytes. A smaller structure size means less RAM <br />> footprint, and better CPU-cache utilization.<br /><br />Because of the .sleepers thing. Perhaps a revamped semaphore wouldn't need<br />that field?<br /><br />> - tighter code. On x86 i get the following .text sizes when <br />> switching all mutex-alike semaphores in the kernel to the mutex <br />> subsystem:<br />> <br />>    text    data     bss     dec     hex filename<br />> 3280380  868188  396860 4545428  455b94 vmlinux-semaphore<br />> 3255329  865296  396732 4517357  44eded vmlinux-mutex<br />> <br />> that's 25051 bytes of code saved, or a 0.76% win - off the hottest <br />> codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%) <br />> Smaller code means better icache footprint, which is one of the <br />> major optimization goals in the Linux kernel currently.<br /><br />Why is the mutex-using kernel any smaller? IOW: from where do these<br />savings come?<br /><br />> - the mutex subsystem is faster and has superior scalability for <br />> contended workloads. On an 8-way x86 system, running a mutex-based <br />> kernel and testing creat+unlink+close (of separate, per-task files)<br />> in /tmp with 16 parallel tasks, the average number of ops/sec is:<br />> <br />> Semaphores:                        Mutexes:<br />> <br />> $ ./test-mutex V 16 10             $ ./test-mutex V 16 10<br />> 8 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.<br />> checking VFS performance.          checking VFS performance.<br />> avg loops/sec:      34713          avg loops/sec:      84153<br />> CPU utilization:    63%            CPU utilization:    22%<br />> <br />> i.e. 
in this workload, the mutex based kernel was 2.4 times faster <br />> than the semaphore based kernel, _and_ it also had 2.8 times less CPU <br />> utilization. (In terms of 'ops per CPU cycle', the semaphore kernel <br />> performed 551 ops/sec per 1% of CPU time used, while the mutex kernel <br />> performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times <br />> more efficient.)<br />> <br />> the scalability difference is visible even on a 2-way P4 HT box:<br />> <br />> Semaphores:                        Mutexes:<br />> <br />> $ ./test-mutex V 16 10             $ ./test-mutex V 16 10<br />> 4 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.<br />> checking VFS performance.          checking VFS performance.<br />> avg loops/sec:      127659         avg loops/sec:      181082<br />> CPU utilization:    100%           CPU utilization:    34%<br />> <br />> (the straight performance advantage of mutexes is 41%, the per-cycle <br />> efficiency of mutexes is 4.1 times better.)<br /><br />Why is the mutex-using kernel more scalable?<br /><br />Can semaphores be tuned to offer the same scalability improvements?<br /><br />> ...<br />><br />> - the per-call-site inlining cost of the slowpath is cheaper and <br />> smaller than that of semaphores, by one instruction, because the <br />> mutex trampoline code does not do a "lea %0,%%eax" that the semaphore <br />> code does before calling __down_failed. The mutex subsystem uses out <br />> of line code currently so this makes only a small difference in .text<br />> size, but in case we want to inline mutexes, they will be cheaper <br />> than semaphores.<br /><br />Cannot the semaphore code be improved in the same manner?<br /><br />> - No wholesale or dangerous migration path. The migration to mutexes is <br />> fundamentally opt-in, safe and easy: multiple type-based and .config <br />> based migration helpers are provided to make the migration to mutexes <br />> easy. 
Migration is as finegrained as it gets, so robustness of the <br />> kernel or out-of-tree code should not be jeopardized at any stage. <br />> The migration helpers can be eliminated once migration is completed, <br />> once the kernel has been separated into 'mutex users' and 'semaphore <br />> users'. Out-of-tree code automatically defaults to semaphore <br />> semantics, mutexes are not forced upon anyone, at any stage of the <br />> migration.<br /><br />IOW: a complete PITA, sorry.<br /><br />> - 'struct mutex' semantics are well-defined and are enforced if<br />> CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have <br />> virtually no debugging code or instrumentation. The mutex subsystem <br />> checks and enforces the following rules:<br />> <br />> * - only one task can hold the mutex at a time<br />> * - only the owner can unlock the mutex<br />> * - multiple unlocks are not permitted<br />> * - recursive locking is not permitted<br />> * - a mutex object must be initialized via the API<br />> * - a mutex object must not be initialized via memset or copying<br />> * - task may not exit with mutex held<br />> * - memory areas where held locks reside must not be freed<br /><br />Pretty much all of that could be added to semaphores-when-used-as-mutexes. 
<br />Without introducing a whole new locking mechanism.<br /><br />> furthermore, there are also convenience features in the debugging <br />> code:<br />> <br />> * - uses symbolic names of mutexes, whenever they are printed in debug output<br />> * - point-of-acquire tracking, symbolic lookup of function names<br />> * - list of all locks held in the system, printout of them<br />> * - owner tracking<br />> * - detects self-recursing locks and prints out all relevant info<br />> * - detects multi-task circular deadlocks and prints out all affected<br />> * locks and tasks (and only those tasks)<br /><br />Ditto, I expect.<br /><br />> - the generic mutex subsystem is also one more step towards enabling <br />> the fully preemptable -rt kernel. Ok, this shouldnt bother the <br />> upstream kernel too much at the moment, but it's a personal driving<br />> force for me nevertheless ;-)<br /><br />Actually that's the only compelling reason I've yet seen, sorry.<br /><br />What is it about these mutexes which is useful to RT and why cannot that<br />feature be provided by semaphores?<br /><br /><br />Ingo, there appear to be quite a few straw-man arguments here. 
You're<br />comparing a subsystem (semaphores) which obviously could do with a lot of<br />fixing and enhancing with something new which has had a lot of recent<br />feature work put into it.<br /><br />I'd prefer to see mutexes compared with semaphores after you've put as much<br />work into improving semaphores as you have into developing mutexes.<br />-<br />To unsubscribe from this list: send the line "unsubscribe linux-kernel" in<br />the body of a message to majordomo@vger.kernel.org<br />More majordomo info at <a href="http://vger.kernel.org/majordomo-info.html">http://vger.kernel.org/majordomo-info.html</a><br />Please read the FAQ at <a href="http://www.tux.org/lkml/">http://www.tux.org/lkml/</a><br /><br /></pre></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerr.gif" width="32" height="32" alt="\" /></td></tr><tr><td align="right" valign="bottom"> &nbsp; </td></tr><tr><td align="right" valign="bottom">&nbsp;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerl.gif" width="32" height="32" alt="\" /></td><td class="c">&nbsp;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerr.gif" width="32" height="32" alt="/" /></td></tr><tr><td align="right" valign="top" colspan="2"> &nbsp; </td><td class="lm">Last update: 2005-12-22 14:11 &nbsp;&nbsp; [from the cache]<br />©2003-2020 <a href="http://blog.jasper.es/"><span itemprop="editor">Jasper Spaans</span></a>|hosted at <a href="https://www.digitalocean.com/?refcode=9a8e99d24cf9">Digital Ocean</a> and my Meterkast|<a href="http://blog.jasper.es/categories.html#lkml-ref">Read the blog</a></td><td>&nbsp;</td></tr></table><script language="javascript" src="/js/styleswitcher.js" type="text/javascript"></script></body></html>