CINXE.COM
LKML: David Howells: Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>LKML: David Howells: Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation</title><link href="/css/message.css" rel="stylesheet" type="text/css" /><link href="/css/wrap.css" rel="alternate stylesheet" type="text/css" title="wrap" /><link href="/css/nowrap.css" rel="stylesheet" type="text/css" title="nowrap" /><link href="/favicon.ico" rel="shortcut icon" /><script src="/js/simple-calendar.js" type="text/javascript"></script><script src="/js/styleswitcher.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="lkml.org : last 100 messages" href="/rss.php" /><link rel="alternate" type="application/rss+xml" title="lkml.org : last messages by David Howells" href="/groupie.php?aid=345" /><!--Matomo--><script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["setDoNotTrack", true]); _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="//m.lkml.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '1']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script><!--End Matomo Code--></head><body onload="es.jasper.simpleCalendar.init();" itemscope="itemscope" itemtype="http://schema.org/BlogPosting"><table border="0" cellpadding="0" cellspacing="0"><tr><td width="180" align="center"><a href="/"><img style="border:0;width:135px;height:32px" src="/images/toprowlk.gif" alt="lkml.org" /></a></td><td width="32">聽</td><td class="nb"><div><a class="nb" href="/lkml"> [lkml]</a> 聽 <a class="nb" href="/lkml/2005"> [2005]</a> 聽 <a class="nb" href="/lkml/2005/12"> [Dec]</a> 聽 <a class="nb" href="/lkml/2005/12/16"> [16]</a> 聽 <a class="nb" href="/lkml/last100"> [last100]</a> 聽 <a href="/rss.php"><img src="/images/rss-or.gif" border="0" alt="RSS Feed" /></a></div><div>Views: <a href="#" class="nowrap" onclick="setActiveStyleSheet('wrap');return false;">[wrap]</a><a href="#" class="wrap" onclick="setActiveStyleSheet('nowrap');return false;">[no wrap]</a> 聽 <a class="nb" href="/lkml/mheaders/2005/12/16/43" onclick="this.href='/lkml/headers'+'/2005/12/16/43';">[headers]</a>聽 <a href="/lkml/bounce/2005/12/16/43">[forward]</a>聽 </div></td><td width="32">聽</td></tr><tr><td valign="top"><div class="es-jasper-simpleCalendar" baseurl="/lkml/"></div><div class="threadlist">Messages in this thread</div><ul class="threadlist"><li class="root"><a href="/lkml/2005/12/12/244">First message in thread</a></li><li><a href="/lkml/2005/12/14/110">David Howells</a><ul><li><a href="/lkml/2005/12/14/116">Jakub Jelinek</a></li><li><a href="/lkml/2005/12/15/427">Nick Piggin</a><ul><li class="origin"><a href="/lkml/2005/12/16/45">David Howells</a><ul><li><a href="/lkml/2005/12/16/45">David Howells</a><ul><li><a href="/lkml/2005/12/16/159">Linus Torvalds</a></li></ul></li><li><a href="/lkml/2005/12/16/88">Nick Piggin</a><ul><li><a href="/lkml/2005/12/16/95">Russell King</a></li><li><a href="/lkml/2005/12/17/39">Nikita Danilov</a></li></ul></li><li><a href="/lkml/2005/12/16/156">Linus Torvalds</a></li></ul></li></ul></li></ul></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">From</td><td class="rp" itemprop="author">David Howells <></td></tr><tr><td class="lp">Subject</td><td class="rp" itemprop="name">Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation</td></tr><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Fri, 16 Dec 2005 11:02:22 +0000</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">Nick Piggin <nickpiggin@yahoo.com.au> wrote:<br /><br />> > (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64,<br />> > sparc.<br />> > (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.<br />> > <br />> <br />> cmpxchg is basically exactly equivalent to a store-conditional, so 2 and 3<br />> are the same level.<br /><br />No, they're not. LL/SC is more flexible than CMPXCHG because under some<br />circumstances, you can get away without doing the SC, and because sometimes<br />you can do one LL/SC in lieu of two CMPXCHG's because LL/SC allows you to<br />retrieve the value, consider it and then modify it if you want to. With<br />CMPXCHG you have to anticipate, and so you're more likely to get it wrong.<br /><br />> I don't know why you don't implement a "good" default implementation with<br />> atomic_cmpxchg.<br /><br />Because it wouldn't be a good default. I'm thinking the best default is simply<br />to wrap a counting semaphore. Where overriding this really matters is class 1<br />CPUs that don't have CMPXCHG, LL/SC, or in the x86 case, LOCK INC/DEC.<br /><br />I've had a play with x86, and on there CMPXCHG, XCHG and XADD give worse<br />performance than INC/DEC for some reason. I assume this is something to do<br />with how the PPro CPU optimises itself. On PPro CPUs at least, counting<br />semaphores really are the most efficient way. CMPXCHG, whilst it ought to be<br />better, really isn't.<br /><br />One thing I have noticed, though, is that the counting semaphore tends to be<br />quite uneven in its distribution across threads in a situation where a lot of<br />threads are all trying to thrash the semaphore at the same time:<br /><br /> insmod /tmp/synchro-test.ko v=1 do_sched=1 sm=20 ism=1<br /><br />gives:<br /><br /> SEM: 2% 1% 2% 5% 4% 4% 3% 11% 2% 33% 1% 1% 5% 2% 2% 1% 2% 3% 3% 4%<br /><br />on a dual 200MHz PPro.<br /><br />Whereas my mutexes are much more even:<br /><br /> MTX: 5% 5% 4% 4% 4% 5% 5% 5% 5% 4% 4% 4% 4% 4% 6% 5% 4% 4% 4% 6%<br /><br />(See attached module).<br /><br />Now, I don't think that this situation is very likely to crop up in ordinary<br />use, but it seems odd.<br /><br />David<br /><br />/* synchro-test.c: run some threads to test the synchronisation primitives<br /> *<br /> * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.<br /> * Written by David Howells (dhowells@redhat.com)<br /> *<br /> * This program is free software; you can redistribute it and/or<br /> * modify it under the terms of the GNU General Public License<br /> * as published by the Free Software Foundation; either version<br /> * 2 of the License, or (at your option) any later version.<br /> *<br /> * run as something like:<br /> *<br /> * insmod synchro-test.ko rd=2 wr=2<br /> * insmod synchro-test.ko mx=1<br /> * insmod synchro-test.ko sm=2 ism=1<br /> * insmod synchro-test.ko sm=2 ism=2<br /> */<br /><br />#include <linux/config.h><br />#include <linux/module.h><br />#include <linux/poll.h><br />#include <linux/moduleparam.h><br />#include <linux/stat.h><br />#include <linux/init.h><br />#include <asm/atomic.h><br />#include <linux/personality.h><br />#include <linux/smp_lock.h><br />#include <linux/delay.h><br />#include <linux/timer.h><br />#include <linux/completion.h><br />#include <linux/mutex.h><br /><br />#define VALIDATE_OPERATORS 0<br /><br />static int nummx = 0;<br />static int numsm = 0, seminit = 4;<br />static int numrd = 0, numwr = 0, numdg = 0;<br />static int elapse = 5, load = 0, do_sched = 0;<br />static int verbose = 0;<br /><br />MODULE_AUTHOR("David Howells");<br />MODULE_DESCRIPTION("Synchronisation primitive test demo");<br />MODULE_LICENSE("GPL");<br /><br />module_param_named(v, verbose, int, 0);<br />MODULE_PARM_DESC(verbose, "Verbosity");<br /><br />module_param_named(mx, nummx, int, 0);<br />MODULE_PARM_DESC(nummx, "Number of mutex threads");<br /><br />module_param_named(sm, numsm, int, 0);<br />MODULE_PARM_DESC(numsm, "Number of semaphore threads");<br /><br />module_param_named(ism, seminit, int, 0);<br />MODULE_PARM_DESC(seminit, "Initial semaphore value");<br /><br />module_param_named(rd, numrd, int, 0);<br />MODULE_PARM_DESC(numrd, "Number of reader threads");<br /><br />module_param_named(wr, numwr, int, 0);<br />MODULE_PARM_DESC(numwr, "Number of writer threads");<br /><br />module_param_named(dg, numdg, int, 0);<br />MODULE_PARM_DESC(numdg, "Number of downgrader threads");<br /><br />module_param(elapse, int, 0);<br />MODULE_PARM_DESC(elapse, "Number of seconds to run for");<br /><br />module_param(load, int, 0);<br />MODULE_PARM_DESC(load, "Length of load in uS");<br /><br />module_param(do_sched, int, 0);<br />MODULE_PARM_DESC(do_sched, "True if each thread should schedule regularly");<br /><br />/* the semaphores under test */<br />static struct mutex ____cacheline_aligned mutex;<br />static struct semaphore ____cacheline_aligned sem;<br />static struct rw_semaphore ____cacheline_aligned rwsem;<br /><br />static atomic_t ____cacheline_aligned do_stuff = ATOMIC_INIT(0);<br /><br />#if VALIDATE_OPERATORS<br />static atomic_t ____cacheline_aligned mutexes = ATOMIC_INIT(0);<br />static atomic_t ____cacheline_aligned semaphores = ATOMIC_INIT(0);<br />static atomic_t ____cacheline_aligned readers = ATOMIC_INIT(0);<br />static atomic_t ____cacheline_aligned writers = ATOMIC_INIT(0);<br />#endif<br /><br />static unsigned int ____cacheline_aligned mutexes_taken[20];<br />static unsigned int ____cacheline_aligned semaphores_taken[20];<br />static unsigned int ____cacheline_aligned reads_taken[20];<br />static unsigned int ____cacheline_aligned writes_taken[20];<br />static unsigned int ____cacheline_aligned downgrades_taken[20];<br /><br />static struct completion ____cacheline_aligned mx_comp[20];<br />static struct completion ____cacheline_aligned sm_comp[20];<br />static struct completion ____cacheline_aligned rd_comp[20];<br />static struct completion ____cacheline_aligned wr_comp[20];<br />static struct completion ____cacheline_aligned dg_comp[20];<br /><br />static struct timer_list ____cacheline_aligned timer;<br /><br />#define ACCOUNT(var, N) var##_taken[N]++;<br /><br />#if VALIDATE_OPERATORS<br />#define TRACK(var, dir) atomic_##dir(&(var))<br /><br />#define CHECK(var, cond, val) \<br />do { \<br /> int x = atomic_read(&(var)); \<br /> if (unlikely(!(x cond (val)))) \<br /> printk("check [%s %s %d, == %d] failed in %s\n", \<br /> #var, #cond, (val), x, __func__); \<br />} while (0)<br /><br />#else<br />#define TRACK(var, dir) do {} while(0)<br />#define CHECK(var, cond, val) do {} while(0)<br />#endif<br /><br />static inline void do_mutex_lock(unsigned int N)<br />{<br /> mutex_lock(&mutex);<br /><br /> ACCOUNT(mutexes, N);<br /> TRACK(mutexes, inc);<br /> CHECK(mutexes, ==, 1);<br />}<br /><br />static inline void do_mutex_unlock(unsigned int N)<br />{<br /> CHECK(mutexes, ==, 1);<br /> TRACK(mutexes, dec);<br /><br /> mutex_unlock(&mutex);<br />}<br /><br />static inline void do_down(unsigned int N)<br />{<br /> CHECK(mutexes, <, seminit);<br /><br /> down(&sem);<br /><br /> ACCOUNT(semaphores, N);<br /> TRACK(semaphores, inc);<br />}<br /><br />static inline void do_up(unsigned int N)<br />{<br /> CHECK(semaphores, >, 0);<br /> TRACK(semaphores, dec);<br /><br /> up(&sem);<br />}<br /><br />static inline void do_down_read(unsigned int N)<br />{<br /> down_read(&rwsem);<br /><br /> ACCOUNT(reads, N);<br /> TRACK(readers, inc);<br /> CHECK(readers, >, 0);<br /> CHECK(writers, ==, 0);<br />}<br /><br />static inline void do_up_read(unsigned int N)<br />{<br /> CHECK(readers, >, 0);<br /> CHECK(writers, ==, 0);<br /> TRACK(readers, dec);<br /><br /> up_read(&rwsem);<br />}<br /><br />static inline void do_down_write(unsigned int N)<br />{<br /> down_write(&rwsem);<br /><br /> ACCOUNT(writes, N);<br /> TRACK(writers, inc);<br /> CHECK(writers, ==, 1);<br /> CHECK(readers, ==, 0);<br />}<br /><br />static inline void do_up_write(unsigned int N)<br />{<br /> CHECK(writers, ==, 1);<br /> CHECK(readers, ==, 0);<br /> TRACK(writers, dec);<br /><br /> up_write(&rwsem);<br />}<br /><br />static inline void do_downgrade_write(unsigned int N)<br />{<br /> CHECK(writers, ==, 1);<br /> CHECK(readers, ==, 0);<br /> TRACK(writers, dec);<br /> TRACK(readers, inc);<br /><br /> downgrade_write(&rwsem);<br /><br /> ACCOUNT(downgrades, N);<br />}<br /><br />static inline void sched(void)<br />{<br /> if (do_sched)<br /> schedule();<br />}<br /><br />int mutexer(void *arg)<br />{<br /> unsigned int N = (unsigned long) arg;<br /><br /> daemonize("Mutex%u", N);<br /><br /> while (atomic_read(&do_stuff)) {<br /> do_mutex_lock(N);<br /> if (load)<br /> udelay(load);<br /> do_mutex_unlock(N);<br /> sched();<br /> }<br /><br /> if (verbose >= 2)<br /> printk("%s: done\n", current->comm);<br /> complete_and_exit(&mx_comp[N], 0);<br />}<br /><br />int semaphorer(void *arg)<br />{<br /> unsigned int N = (unsigned long) arg;<br /><br /> daemonize("Sem%u", N);<br /><br /> while (atomic_read(&do_stuff)) {<br /> do_down(N);<br /> if (load)<br /> udelay(load);<br /> do_up(N);<br /> sched();<br /> }<br /><br /> if (verbose >= 2)<br /> printk("%s: done\n", current->comm);<br /> complete_and_exit(&sm_comp[N], 0);<br />}<br /><br />int reader(void *arg)<br />{<br /> unsigned int N = (unsigned long) arg;<br /><br /> daemonize("Read%u", N);<br /><br /> while (atomic_read(&do_stuff)) {<br /> do_down_read(N);<br />#ifdef LOAD_TEST<br /> if (load)<br /> udelay(load);<br />#endif<br /> do_up_read(N);<br /> sched();<br /> }<br /><br /> if (verbose >= 2)<br /> printk("%s: done\n", current->comm);<br /> complete_and_exit(&rd_comp[N], 0);<br />}<br /><br />int writer(void *arg)<br />{<br /> unsigned int N = (unsigned long) arg;<br /><br /> daemonize("Write%u", N);<br /><br /> while (atomic_read(&do_stuff)) {<br /> do_down_write(N);<br />#ifdef LOAD_TEST<br /> if (load)<br /> udelay(load);<br />#endif<br /> do_up_write(N);<br /> sched();<br /> }<br /><br /> if (verbose >= 2)<br /> printk("%s: done\n", current->comm);<br /> complete_and_exit(&wr_comp[N], 0);<br />}<br /><br />int downgrader(void *arg)<br />{<br /> unsigned int N = (unsigned long) arg;<br /><br /> daemonize("Down%u", N);<br /><br /> while (atomic_read(&do_stuff)) {<br /> do_down_write(N);<br />#ifdef LOAD_TEST<br /> if (load)<br /> udelay(load);<br />#endif<br /> do_downgrade_write(N);<br />#ifdef LOAD_TEST<br /> if (load)<br /> udelay(load);<br />#endif<br /> do_up_read(N);<br /> sched();<br /> }<br /><br /> if (verbose >= 2)<br /> printk("%s: done\n", current->comm);<br /> complete_and_exit(&dg_comp[N], 0);<br />}<br /><br />static void stop_test(unsigned long dummy)<br />{<br /> atomic_set(&do_stuff, 0);<br />}<br /><br />static unsigned int total(const char *what, unsigned int counts[], int num)<br />{<br /> unsigned int tot = 0, max = 0, min = UINT_MAX, zeros = 0, cnt;<br /> int loop;<br /><br /> for (loop = 0; loop < num; loop++) {<br /> cnt = counts[loop];<br /><br /> if (cnt == 0) {<br /> zeros++;<br /> min = 0;<br /> continue;<br /> }<br /><br /> tot += cnt;<br /> if (tot > max)<br /> max = tot;<br /> if (tot < min)<br /> min = tot;<br /> }<br /><br /> if (verbose && tot > 0) {<br /> printk("%s:", what);<br /><br /> for (loop = 0; loop < num; loop++) {<br /> cnt = counts[loop];<br /><br /> if (cnt == 0)<br /> printk(" zzz");<br /> else<br /> printk(" %d%%", cnt * 100 / tot);<br /> }<br /><br /> printk("\n");<br /> }<br /><br /> return tot;<br />}<br /><br />/*****************************************************************************/<br />/*<br /> *<br /> */<br />static int __init do_tests(void)<br />{<br /> unsigned long loop;<br /> unsigned int mutex_total, sem_total, rd_total, wr_total, dg_total;<br /><br /> if (nummx < 0 || nummx > 20 ||<br /> numsm < 0 || numsm > 20 ||<br /> numrd < 0 || numrd > 20 ||<br /> numwr < 0 || numwr > 20 ||<br /> numdg < 0 || numdg > 20 ||<br /> seminit < 1 ||<br /> elapse < 1<br /> ) {<br /> printk("Parameter out of range\n");<br /> return -ERANGE;<br /> }<br /><br /> if ((nummx | numsm | numrd | numwr | numdg) == 0) {<br /> printk("Nothing to do\n");<br /> return -EINVAL;<br /> }<br /><br /> if (verbose)<br /> printk("\nStarting synchronisation primitive tests...\n");<br /><br /> mutex_init(&mutex);<br /> sema_init(&sem, seminit);<br /> init_rwsem(&rwsem);<br /> atomic_set(&do_stuff, 1);<br /><br /> /* kick off all the children */<br /> for (loop = 0; loop < 20; loop++) {<br /> if (loop < nummx) {<br /> init_completion(&mx_comp[loop]);<br /> kernel_thread(mutexer, (void *) loop, 0);<br /> }<br /><br /> if (loop < numsm) {<br /> init_completion(&sm_comp[loop]);<br /> kernel_thread(semaphorer, (void *) loop, 0);<br /> }<br /><br /> if (loop < numrd) {<br /> init_completion(&rd_comp[loop]);<br /> kernel_thread(reader, (void *) loop, 0);<br /> }<br /><br /> if (loop < numwr) {<br /> init_completion(&wr_comp[loop]);<br /> kernel_thread(writer, (void *) loop, 0);<br /> }<br /><br /> if (loop < numdg) {<br /> init_completion(&dg_comp[loop]);<br /> kernel_thread(downgrader, (void *) loop, 0);<br /> }<br /> }<br /><br /> /* set a stop timer */<br /> init_timer(&timer);<br /> timer.function = stop_test;<br /> timer.expires = jiffies + elapse * HZ;<br /> add_timer(&timer);<br /><br /> /* now wait until it's all done */<br /> for (loop = 0; loop < nummx; loop++)<br /> wait_for_completion(&mx_comp[loop]);<br /><br /> for (loop = 0; loop < numsm; loop++)<br /> wait_for_completion(&sm_comp[loop]);<br /><br /> for (loop = 0; loop < numrd; loop++)<br /> wait_for_completion(&rd_comp[loop]);<br /><br /> for (loop = 0; loop < numwr; loop++)<br /> wait_for_completion(&wr_comp[loop]);<br /><br /> for (loop = 0; loop < numdg; loop++)<br /> wait_for_completion(&dg_comp[loop]);<br /><br /> atomic_set(&do_stuff, 0);<br /> del_timer(&timer);<br /><br /> if (mutex_is_locked(&mutex))<br /> printk(KERN_ERR "Mutex is still locked!\n");<br /><br /> /* count up */<br /> mutex_total = total("MTX", mutexes_taken, nummx);<br /> sem_total = total("SEM", semaphores_taken, numsm);<br /> rd_total = total("RD ", reads_taken, numrd);<br /> wr_total = total("WR ", writes_taken, numwr);<br /> dg_total = total("DG ", downgrades_taken, numdg);<br /><br /> /* print the results */<br /> if (verbose) {<br /> printk("mutexes taken: %u\n", mutex_total);<br /> printk("semaphores taken: %u\n", sem_total);<br /> printk("reads taken: %u\n", rd_total);<br /> printk("writes taken: %u\n", wr_total);<br /> printk("downgrades taken: %u\n", dg_total);<br /> }<br /> else {<br /> printk("%3d %3d %3d %3d %3d %c %3d %9u %9u %9u %9u %9u\n",<br /> nummx, numsm, numrd, numwr, numdg,<br /> do_sched ? 's' : '-',<br /> load,<br /> mutex_total,<br /> sem_total,<br /> rd_total,<br /> wr_total,<br /> dg_total);<br /> }<br /><br /> /* tell insmod to discard the module */<br /> if (verbose)<br /> printk("Tests complete\n");<br /> return -ENOANO;<br /><br />} /* end do_tests() */<br /><br />module_init(do_tests);<br />-<br />To unsubscribe from this list: send the line "unsubscribe linux-kernel" in<br />the body of a message to majordomo@vger.kernel.org<br />More majordomo info at <a href="http://vger.kernel.org/majordomo-info.html">http://vger.kernel.org/majordomo-info.html</a><br />Please read the FAQ at <a href="http://www.tux.org/lkml/">http://www.tux.org/lkml/</a><br /><br /></pre></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerr.gif" width="32" height="32" alt="\" /></td></tr><tr><td align="right" valign="bottom"> 聽 </td></tr><tr><td align="right" valign="bottom">聽</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerl.gif" width="32" height="32" alt="\" /></td><td class="c">聽</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerr.gif" width="32" height="32" alt="/" /></td></tr><tr><td align="right" valign="top" colspan="2"> 聽 </td><td class="lm">Last update: 2005-12-16 12:05 聽聽 [from the cache]<br />漏2003-2020 <a href="http://blog.jasper.es/"><span itemprop="editor">Jasper Spaans</span></a>|hosted at <a href="https://www.digitalocean.com/?refcode=9a8e99d24cf9">Digital Ocean</a> and my Meterkast|<a href="http://blog.jasper.es/categories.html#lkml-ref">Read the blog</a></td><td>聽</td></tr></table><script language="javascript" src="/js/styleswitcher.js" type="text/javascript"></script></body></html>