
LKML: Kenny Simpson: Re: RAID controller safety

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>LKML: Kenny Simpson: Re: RAID controller safety</title><link href="/css/message.css" rel="stylesheet" type="text/css" /><link href="/css/wrap.css" rel="alternate stylesheet" type="text/css" title="wrap" /><link href="/css/nowrap.css" rel="stylesheet" type="text/css" title="nowrap" /><link href="/favicon.ico" rel="shortcut icon" /><script src="/js/simple-calendar.js" type="text/javascript"></script><script src="/js/styleswitcher.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="lkml.org : last 100 messages" href="/rss.php" /><link rel="alternate" type="application/rss+xml" title="lkml.org : last messages by Kenny Simpson" href="/groupie.php?aid=9089" /><!--Matomo--><script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["setDoNotTrack", true]); _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="//m.lkml.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '1']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script><!--End Matomo Code--></head><body onload="es.jasper.simpleCalendar.init();" itemscope="itemscope" itemtype="http://schema.org/BlogPosting"><table border="0" cellpadding="0" cellspacing="0"><tr><td width="180" align="center"><a href="/"><img style="border:0;width:135px;height:32px" src="/images/toprowlk.gif" alt="lkml.org" /></a></td><td width="32">&#160;</td><td class="nb"><div><a class="nb" href="/lkml"> [lkml]</a> &#160; <a 
class="nb" href="/lkml/2005"> [2005]</a> &#160; <a class="nb" href="/lkml/2005/12"> [Dec]</a> &#160; <a class="nb" href="/lkml/2005/12/30"> [30]</a> &#160; <a class="nb" href="/lkml/last100"> [last100]</a> &#160; <a href="/rss.php"><img src="/images/rss-or.gif" border="0" alt="RSS Feed" /></a></div><div>Views: <a href="#" class="nowrap" onclick="setActiveStyleSheet('wrap');return false;">[wrap]</a><a href="#" class="wrap" onclick="setActiveStyleSheet('nowrap');return false;">[no wrap]</a> &#160; <a class="nb" href="/lkml/mheaders/2005/12/30/234" onclick="this.href='/lkml/headers'+'/2005/12/30/234';">[headers]</a>&#160; <a href="/lkml/bounce/2005/12/30/234">[forward]</a>&#160; </div></td><td width="32">&#160;</td></tr><tr><td valign="top"><div class="es-jasper-simpleCalendar" baseurl="/lkml/"></div><div class="threadlist">Messages in this thread</div><ul class="threadlist"><li class="root"><a href="/lkml/2005/12/29/114">First message in thread</a></li><li><a href="/lkml/2005/12/30/109">Kenny Simpson</a><ul><li><a href="/lkml/2005/12/30/117">Bernd Eckenfels</a></li><li><a href="/lkml/2005/12/30/209">Alan Cox</a><ul><li><a href="/lkml/2005/12/30/233">Kenny Simpson</a></li><li class="origin"><a href="">Kenny Simpson</a></li><li><a href="/lkml/2005/12/31/8">Kenny Simpson</a></li><li><a href="/lkml/2005/12/31/16">Kenny Simpson</a></li></ul></li></ul></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Fri, 30 Dec 2005 19:25:06 -0800 (PST)</td></tr><tr><td class="lp">From</td><td class="rp" itemprop="author">Kenny Simpson &lt;&gt;</td></tr><tr><td class="lp">Subject</td><td class="rp" itemprop="name">Re: RAID controller safety</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">--- Alan Cox &lt;alan&#64;lxorguk.ukuu.org.uk&gt; wrote:<br /><br />&gt; On Gwe, 
2005-12-30 at 10:58 -0800, Kenny Simpson wrote:<br />&gt; &gt; So all writes would be treated as synchronous in the write-through case (no battery),<br />&gt; &gt; making fsync a no-op?<br />&gt; <br />&gt; fsync is never a no-op. fsync ensures material the OS is caching hits<br />&gt; disk drivers/disks. Barriers or write through on the disk driver ensure<br />&gt; that it hits the media.<br />&gt; <br />&gt; The two are independent<br />&gt; <br /><br />Ok, the light is slowly coming on for me...<br /><br />Let's see if I get it:<br />fsync, according to POSIX, will flush all pending writes<br /><a href="http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html">http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html</a><br />Linux, according to the man page, takes this a little further and says that the data is on stable<br />storage.<br /><br />Stable storage for a battery-backed RAID controller means its battery-backed cache. Stable<br />storage for a RAID controller w/o battery means that the data is on disk. I2O controllers are<br />told the caching requirement for each write in the command.<br /><br />To force data to the platter, a disk must be sent a specific command (which some<br />drives ignore), or the write cache must be disabled (write-through mode). The specific command<br />varies depending on SCSI vs. SATA vs. TCQ, etc.<br /><br />Ignoring O_DIRECT, Linux writes out data from the page cache. Data gets written out when the OS<br />decides (high memory pressure, timers expiring, etc.), or when a program requests it (fsync).<br /><br />For Linux to make fsync durable, it must not only send the data to the controller, but must<br />also tell the controller to force the data to stable storage, and then wait for the controller<br />to report the writes as completed.<br /><br />A battery-backed I2O controller only needs to be sent the writes as write-back. 
A<br />non-battery-backed I2O controller needs to be sent these writes as write-through.<br />In both cases, the controller should set the drives themselves to be write-through.<br /><br />To match the POSIX behavior, the onus on the OS is just to push the data out to the driver. To<br />get the further reliability, the driver, controller (and firmware), and drives (and firmware) must<br />all function as advertised. If any one of these fails, the reliability is lost. Linux can at most<br />hope to control the driver.<br /><br />Ok, with all that out of the way...<br /><br />Are there any known drivers that do not correctly pass on barriers/flush/sync to their<br />controllers?<br /><br />In my observations, and from what others have told me in private emails, the I2O driver is such a<br />driver, at least for my non-battery-backed controller (Adaptec 2015S). I read in the comments of<br />the I2O driver that it should set the write-cache flag for writes to write-through for<br />non-battery-backed controllers, but I don't observe that setting via blktool, and basic<br />write/fsync benchmarks run too fast for the drives I have (4x 10kRPM in RAID-10).<br />Also, from reading the source, I only see the write-cache flag being set to write-back. I see no<br />test for controller properties, or anything else that would modify this setting (except for the<br />ioctl).<br />Of course, I could be misreading the I2O spec: it may be up to the controller to know whether<br />it has a battery, in which case the controller is responsible for doing the right thing and the<br />flag in the I2O driver is irrelevant.<br /><br />Thanks for your patience,<br />-Kenny<br /><br />-<br />To unsubscribe from this list: send the line "unsubscribe linux-kernel" in<br />the body of a message to majordomo&#64;vger.kernel.org<br />More majordomo info at <a href="http://vger.kernel.org/majordomo-info.html">http://vger.kernel.org/majordomo-info.html</a><br />Please read the FAQ at <a href="http://www.tux.org/lkml/">http://www.tux.org/lkml/</a><br /><br /></pre></td>
<td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerr.gif" width="32" height="32" alt="\" /></td></tr><tr><td align="right" valign="bottom"> &#160; </td></tr><tr><td align="right" valign="bottom">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerl.gif" width="32" height="32" alt="\" /></td><td class="c">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerr.gif" width="32" height="32" alt="/" /></td></tr><tr><td align="right" valign="top" colspan="2"> &#160; </td><td class="lm">Last update: 2005-12-31 04:31 &#160;&#160; [from the cache]<br />&#169;2003-2020 <a href="http://blog.jasper.es/"><span itemprop="editor">Jasper Spaans</span></a>|hosted at <a href="https://www.digitalocean.com/?refcode=9a8e99d24cf9">Digital Ocean</a> and my Meterkast|<a href="http://blog.jasper.es/categories.html#lkml-ref">Read the blog</a></td><td>&#160;</td></tr></table><script language="javascript" src="/js/styleswitcher.js" type="text/javascript"></script></body></html>
