
LKML: Rutger Nijlunsing: Proposal for shell-patch-format [was: Re: more git updates..]

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>LKML: Rutger Nijlunsing: Proposal for shell-patch-format [was: Re: more git updates..]</title><link href="/css/message.css" rel="stylesheet" type="text/css" /><link href="/css/wrap.css" rel="alternate stylesheet" type="text/css" title="wrap" /><link href="/css/nowrap.css" rel="stylesheet" type="text/css" title="nowrap" /><link href="/favicon.ico" rel="shortcut icon" /><script src="/js/simple-calendar.js" type="text/javascript"></script><script src="/js/styleswitcher.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="lkml.org : last 100 messages" href="/rss.php" /><link rel="alternate" type="application/rss+xml" title="lkml.org : last messages by Rutger Nijlunsing" href="/groupie.php?aid=27567" /><!--Matomo--><script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["setDoNotTrack", true]); _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="//m.lkml.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '1']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script><!--End Matomo Code--></head><body onload="es.jasper.simpleCalendar.init();" itemscope="itemscope" itemtype="http://schema.org/BlogPosting"><table border="0" cellpadding="0" cellspacing="0"><tr><td width="180" align="center"><a href="/"><img style="border:0;width:135px;height:32px" src="/images/toprowlk.gif" alt="lkml.org" /></a></td><td width="32">&#160;</td><td class="nb"><div><a 
class="nb" href="/lkml"> [lkml]</a> &#160; <a class="nb" href="/lkml/2005"> [2005]</a> &#160; <a class="nb" href="/lkml/2005/4"> [Apr]</a> &#160; <a class="nb" href="/lkml/2005/4/10"> [10]</a> &#160; <a class="nb" href="/lkml/last100"> [last100]</a> &#160; <a href="/rss.php"><img src="/images/rss-or.gif" border="0" alt="RSS Feed" /></a></div><div>Views: <a href="#" class="nowrap" onclick="setActiveStyleSheet('wrap');return false;">[wrap]</a><a href="#" class="wrap" onclick="setActiveStyleSheet('nowrap');return false;">[no wrap]</a> &#160; <a class="nb" href="/lkml/mheaders/2005/4/10/37" onclick="this.href='/lkml/headers'+'/2005/4/10/37';">[headers]</a>&#160; <a href="/lkml/bounce/2005/4/10/37">[forward]</a>&#160; </div></td><td width="32">&#160;</td></tr><tr><td valign="top"><div class="es-jasper-simpleCalendar" baseurl="/lkml/"></div><div class="threadlist">Messages in this thread</div><ul class="threadlist"><li class="root"><a href="/lkml/2005/4/9/103">First message in thread</a></li><li><a href="/lkml/2005/4/9/109">Linus Torvalds</a><ul><li><a href="/lkml/2005/4/9/110">Linus Torvalds</a><ul><li><a href="/lkml/2005/4/9/136">Linus Torvalds</a><ul><li><a href="/lkml/2005/4/9/155">Petr Baudis</a><ul><li><a href="/lkml/2005/4/10/72">Petr Baudis</a></li></ul></li><li><a href="/lkml/2005/4/10/25">Christopher Li</a><ul><li><a href="/lkml/2005/4/10/41">Ralph Corderoy</a></li><li><a href="/lkml/2005/4/10/125">Paul Jackson</a></li><li><a href="/lkml/2005/4/11/103">(H. 
Peter Anvin)</a></li></ul></li><li><a href="/lkml/2005/4/11/74">Ingo Molnar</a><ul><li><a href="/lkml/2005/4/11/110">Paul Jackson</a></li></ul></li><li><a href="/lkml/2005/4/12/32">David Eger</a><ul><li><a href="/lkml/2005/4/12/67">Petr Baudis</a></li></ul></li></ul></li></ul></li><li><a href="/lkml/2005/4/9/150">Paul Jackson</a><ul><li><a href="/lkml/2005/4/9/152">Paul Jackson</a></li></ul></li><li><a href="/lkml/2005/4/9/151">Paul Jackson</a></li><li><a href="/lkml/2005/4/10/8">Junio C Hamano</a><ul><li><a href="/lkml/2005/4/10/13">Christopher Li</a><ul><li><a href="/lkml/2005/4/10/18">Junio C Hamano</a><ul><li><a href="/lkml/2005/4/10/23">Petr Baudis</a></li><li><a href="/lkml/2005/4/10/27">Christopher Li</a></li></ul></li><li><a href="/lkml/2005/4/10/21">Wichert Akkerman</a></li><li><a href="/lkml/2005/4/10/22">Petr Baudis</a><ul><li><a href="/lkml/2005/4/10/28">Christopher Li</a></li></ul></li></ul></li><li class="origin"><a href="">Rutger Nijlunsing</a></li><li><a href="/lkml/2005/4/10/70">Linus Torvalds</a><ul><li><a href="/lkml/2005/4/10/76">Rutger Nijlunsing</a></li><li><a href="/lkml/2005/4/10/118">Paul Jackson</a><ul><li><a href="/lkml/2005/4/10/142">Linus Torvalds</a></li></ul></li></ul></li></ul></li><li><a href="/lkml/2005/4/10/44"> tony.luck&#64;intel ...</a><ul><li><a href="/lkml/2005/4/10/71">Linus Torvalds</a><ul><li><a href="/lkml/2005/4/12/306">Helge Hafting</a></li></ul></li><li><a href="/lkml/2005/4/10/103">Paul Jackson</a><ul><li><a href="/lkml/2005/4/10/165">Bernd Eckenfels</a><ul><li><a href="/lkml/2005/4/11/50">Anton Altaparmakov</a></li></ul></li></ul></li></ul></li></ul></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Sun, 10 Apr 2005 13:21:39 +0200</td></tr><tr><td class="lp">From</td><td class="rp" 
itemprop="author">Rutger Nijlunsing &lt;&gt;</td></tr><tr><td class="lp">Subject</td><td class="rp" itemprop="name">Proposal for shell-patch-format [was: Re: more git updates..]</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">On Sun, Apr 10, 2005 at 12:51:59AM -0700, Junio C Hamano wrote:<br />&gt; Listing the file paths and their sigs included in a tree to make<br />&gt; a snapshot of a tree state sounds fine, and diffing two trees by<br />&gt; looking at the sigs between two such files sounds fine as well.<br />&gt; <br />&gt; But I am wondering what your plans are to handle renames---or<br />&gt; does git already represent them?<br /><br />git doesn't represent transitions (or deltas), but only state. So it's<br />not (much) more than a .tar file from a version-management perspective;<br />the only difference being that a git-tree has a comment field and a<br />predecessor-reference, which are currently not used in determining the<br />'patch' between two trees.<br /><br />Deltas are derived by comparing different versions and<br />reverse-engineering the differences which got us<br />from version A to version B.<br /><br />Deltas are currently described as patch(1)es. Patches don't have the<br />concept of 'renaming', so even after determining that file X has been<br />renamed to Y, we have no container for this fact. A patch(1) only<br />contains local-file-edits: substitute lines by other lines.<br /><br />Deltas are not needed to follow a tree; deltas are useful for merging<br />branches of versions, and for reviewing purposes. This is comparable<br />to using tar for version-management: it is very common to tar up<br />the current version of your project weekly as a poor man's version<br />management for one-person one-project.<br /><br />So what is needed is a way to represent deltas which can contain more<br />than only traditional patches. 
I would propose a simple format: <br />a shell script in a fixed format.<br /><br />Shell-patch format in EBNF:<br /> &lt;shellpatch&gt; ::= ( &lt;comment&gt;? &lt;command&gt;* )*<br /> &lt;comment&gt; ::= &lt;commentline&gt;+<br /> The comment contains the text describing the function of the<br /> patch following it.<br /> &lt;commentline&gt; ::= "# " &lt;text&gt;<br /> &lt;command&gt; ::=<br /> "mv " &lt;pathname&gt; " " &lt;pathname&gt; "\n" |<br /> "cp " &lt;filename&gt; " " &lt;filename&gt; "\n" |<br /> "chmod " &lt;mode&gt; " " &lt;pathname&gt; "\n" |<br /> "patch &lt;&lt;__UNIQUE_STRING__\n" &lt;patch&gt; "__UNIQUE_STRING__\n"<br /> (where UNIQUE_STRING must not be contained in patch)<br /> &lt;filename&gt; ::= &lt;pathname&gt;<br /> (but pointing to a file)<br /> &lt;pathname&gt; ::= a pathname relative to '.';<br /> escaping special characters the shell-way;<br /> may not contain '..'.<br /><br />Example:<br /> # Rename file b to a1, and change a line.<br /> mv b a1<br /> patch &lt;&lt;__END__<br /> *** a1 Sun Apr 10 11:43:37 2005<br /> --- a2 Sun Apr 10 11:43:41 2005<br /> ***************<br /> *** 1,4 ****<br /> 1<br /> 2<br /> ! from<br /> 3<br /> --- 1,4 ----<br /> 1<br /> 2<br /> ! to<br /> 3<br /> __END__<br /><br />Advantages:<br /> - ASCII!<br /> - a shell-patch is executable without extra tooling<br /> - a shell-patch is readable and therefore reviewable<br /> - a shell-patch is forward-compatible: a shell-patch acts<br /> like a patch (since patch(1) ignores garbage around patch :),<br /> but not backwards-compatible.<br /> - extensible<br /> - the heavy-lifting is done by 'patch'<br />Disadvantages:<br /> - no deltas for binary files<br /><br />Open issues:<br /> - &lt;comment&gt; could be made more structured; maybe containing fields<br /> like Subject:, Author:, Signed-By:, certificates, ...<br /> (BitKeeper seems to be using "# " &lt;field&gt; ":" &lt;value&gt; "\n" lines)<br /> - patch(1) doesn't know any directories. 
Should shell-patch<br /> know directories? This implies commands working on directories too<br /> (like directory renaming, mode changing, ...). Otherwise directories<br /> are implicit (a file in a directory implies the existence of that<br /> directory). Also implies mkdir and rmdir as shell-patch commands.<br /> - extra commands might be useful to conserve more state(changes):<br /> ln -s -- for symbolic links;<br /> ln -- for hard links;<br /> chown -- for ownership;<br /> chattr -- for storing extended attributes<br /> touch -- for setting timestamps (probably creation time only,<br /> since mtime is something git relies on)<br /> ...and for the really adventurous:<br /> sed 's,&lt;fromstring&gt;,&lt;tostring&gt;,' -- for substitutions<br /> (this is something darcs supports, but which I think is too<br /> bothersome to use since it is difficult to reverse-engineer<br /> from two random trees)<br />Why a fixed format at all?<br /> - This way, the executable shell-patch can be proven to be<br /> harmless to the machine: 'rm -rf /' is a valid shell-script,<br /> but not a valid shell-patch (since 'rm' is not a valid command,<br /> random flags like '-rf' are not supported, and '/' is an absolute<br /> pathname).<br /> - A fixed format enables tooling to support such a patch format;<br /> for example creating the reverse-patch, merging patches (yeah,<br /> 'cat' also merges patches...).<br /><br />...what has this to do with git? Not much and everything, depending<br />on how you look at it. 'git' is 'tar', and 'shell-patch' is 'patch';<br />both orthogonal concepts but very usable in combination. We'll look at<br />getting from two git trees to a shell-patch.<br /><br />Diffing the trees would not only look at the file and per file at the<br />hashes, but also the other way around: which hash values are used more<br />than once. 
For files with the same hash value, compare the contents<br />(and rest of attributes); this is needed since the mapping from file<br />contents to sha1 is one-way. When the contents are the same, the<br />shell-patch-command to generate is obviously a 'cp'.<br /><br />For example, we have got two trees in git (pathname -&gt; hash value):<br /> tree1/file1 -&gt; 1234<br /> tree1/file2 -&gt; 4567<br />and<br /> tree2/file1 -&gt; 3456<br /> tree2/file3 -&gt; 4567<br /> tree2/file4 -&gt; 4567<br /><br />...this could generate the shell-patch:<br /><br /> # Comments-go-here<br /> mv tree2/file2 tree2/file3<br /> cp tree2/file3 tree2/file4<br /> patch tree1/file1 &lt;&lt;__FILE_PATCH__<br /> (patch-goes-here)<br /> __FILE_PATCH__<br /><br />...by an algorithm which starts by determining all renames, then all<br />copies, and finally all patches.<br /><br />Comments?<br /><br /><br />-- <br />Rutger Nijlunsing ---------------------- linux-kernel at tux.tmfweb.nl<br />never attribute to a conspiracy which can be explained by incompetence<br />----------------------------------------------------------------------<br />-<br />To unsubscribe from this list: send the line "unsubscribe linux-kernel" in<br />the body of a message to majordomo&#64;vger.kernel.org<br />More majordomo info at <a href="http://vger.kernel.org/majordomo-info.html">http://vger.kernel.org/majordomo-info.html</a><br />Please read the FAQ at <a href="http://www.tux.org/lkml/">http://www.tux.org/lkml/</a><br /><br /></pre></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerr.gif" width="32" height="32" alt="\" /></td></tr><tr><td align="right" valign="bottom"> &#160; </td></tr><tr><td align="right" valign="bottom">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerl.gif" width="32" height="32" alt="\" /></td><td class="c">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerr.gif" width="32" height="32" alt="/" 
/></td></tr><tr><td align="right" valign="top" colspan="2"> &#160; </td><td class="lm">Last update: 2005-04-10 13:25 &#160;&#160; [from the cache]<br />&#169;2003-2020 <a href="http://blog.jasper.es/"><span itemprop="editor">Jasper Spaans</span></a>|hosted at <a href="https://www.digitalocean.com/?refcode=9a8e99d24cf9">Digital Ocean</a> and my Meterkast|<a href="http://blog.jasper.es/categories.html#lkml-ref">Read the blog</a></td><td>&#160;</td></tr></table><script language="javascript" src="/js/styleswitcher.js" type="text/javascript"></script></body></html>
