
LKML: Roland Dreier: [PATCH 03/13] [RFC] ipath copy routines

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>LKML: Roland Dreier: [PATCH 03/13] [RFC] ipath copy routines</title><link href="/css/message.css" rel="stylesheet" type="text/css" /><link href="/css/wrap.css" rel="alternate stylesheet" type="text/css" title="wrap" /><link href="/css/nowrap.css" rel="stylesheet" type="text/css" title="nowrap" /><link href="/favicon.ico" rel="shortcut icon" /><script src="/js/simple-calendar.js" type="text/javascript"></script><script src="/js/styleswitcher.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="lkml.org : last 100 messages" href="/rss.php" /><link rel="alternate" type="application/rss+xml" title="lkml.org : last messages by Roland Dreier" href="/groupie.php?aid=3215" /><!--Matomo--><script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["setDoNotTrack", true]); _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="//m.lkml.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '1']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script><!--End Matomo Code--></head><body onload="es.jasper.simpleCalendar.init();" itemscope="itemscope" itemtype="http://schema.org/BlogPosting"><table border="0" cellpadding="0" cellspacing="0"><tr><td width="180" align="center"><a href="/"><img style="border:0;width:135px;height:32px" src="/images/toprowlk.gif" alt="lkml.org" /></a></td><td width="32">&#160;</td><td class="nb"><div><a class="nb" href="/lkml"> [lkml]</a> 
&#160; <a class="nb" href="/lkml/2005"> [2005]</a> &#160; <a class="nb" href="/lkml/2005/12"> [Dec]</a> &#160; <a class="nb" href="/lkml/2005/12/16"> [16]</a> &#160; <a class="nb" href="/lkml/last100"> [last100]</a> &#160; <a href="/rss.php"><img src="/images/rss-or.gif" border="0" alt="RSS Feed" /></a></div><div>Views: <a href="#" class="nowrap" onclick="setActiveStyleSheet('wrap');return false;">[wrap]</a><a href="#" class="wrap" onclick="setActiveStyleSheet('nowrap');return false;">[no wrap]</a> &#160; <a class="nb" href="/lkml/mheaders/2005/12/16/291" onclick="this.href='/lkml/headers'+'/2005/12/16/291';">[headers]</a>&#160; <a href="/lkml/bounce/2005/12/16/291">[forward]</a>&#160; </div></td><td width="32">&#160;</td></tr><tr><td valign="top"><div class="es-jasper-simpleCalendar" baseurl="/lkml/"></div><div class="threadlist">Messages in this thread</div><ul class="threadlist"><li class="root"><a href="/lkml/2005/12/16/290">First message in thread</a></li><li><a href="/lkml/2005/12/16/293">Roland Dreier</a><ul><li><a href="/lkml/2005/12/16/289">Roland Dreier</a><ul><li class="origin"><a href="/lkml/2005/12/16/303">Roland Dreier</a><ul><li><a href="/lkml/2005/12/16/303">Roland Dreier</a><ul><li><a href="/lkml/2005/12/16/301">Roland Dreier</a></li><li><a href="/lkml/2005/12/17/72">Andrew Morton</a></li></ul></li><li><a href="/lkml/2005/12/17/25">Pekka Enberg</a><ul><li><a href="/lkml/2005/12/17/92">Robert Walsh</a></li></ul></li><li><a href="/lkml/2005/12/17/31">Christoph Hellwig</a></li><li><a href="/lkml/2005/12/17/78">Andrew Morton</a><ul><li><a href="/lkml/2005/12/17/103">Robert Walsh</a></li></ul></li></ul></li></ul></li><li><a href="/lkml/2005/12/17/23">Pekka Enberg</a><ul><li><a href="/lkml/2005/12/17/97">Robert Walsh</a></li></ul></li><li><a href="/lkml/2005/12/17/29">Christoph Hellwig</a><ul><li><a href="/lkml/2005/12/17/96">(Eric W. Biederman)</a><ul><li><a href="/lkml/2005/12/17/129">Andi Kleen</a><ul><li><a href="/lkml/2005/12/18/34">(Eric W. 
Biederman)</a></li></ul></li></ul></li><li><a href="/lkml/2005/12/17/98">Robert Walsh</a><ul><li><a href="/lkml/2005/12/17/100">Arjan van de Ven</a><ul><li><a href="/lkml/2005/12/17/104">Robert Walsh</a></li></ul></li></ul></li></ul></li><li><a href="/lkml/2005/12/17/77">Andrew Morton</a><ul><li><a href="/lkml/2005/12/17/102">Robert Walsh</a><ul><li><a href="/lkml/2005/12/17/126">Andrew Morton</a></li></ul></li><li><a href="/lkml/2005/12/19/207">Robert Walsh</a></li></ul></li></ul></li></ul><div class="threadlist">Patch in this message</div><ul class="threadlist"><li><a href="/lkml/diff/2005/12/16/291/1">Get diff 1</a></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">Subject</td><td class="rp" itemprop="name">[PATCH 03/13] [RFC] ipath copy routines</td></tr><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Fri, 16 Dec 2005 15:48:54 -0800</td></tr><tr><td class="lp">From</td><td class="rp" itemprop="author">Roland Dreier &lt;&gt;</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">Copy routines for ipath driver<br /><br />---<br /><br /> drivers/infiniband/hw/ipath/ipath_copy.c | 666 ++++++++++++++++++++++++++<br /> drivers/infiniband/hw/ipath/ipath_dwordcpy.S | 62 ++<br /> 2 files changed, 728 insertions(+), 0 deletions(-)<br /> create mode 100644 drivers/infiniband/hw/ipath/ipath_copy.c<br /> create mode 100644 drivers/infiniband/hw/ipath/ipath_dwordcpy.S<br /><br />99f636a78e0d759ab663a7abb29e6a71b32a552d<br />diff --git a/drivers/infiniband/hw/ipath/ipath_copy.c b/drivers/infiniband/hw/ipath/ipath_copy.c<br />new file mode 100644<br />index 0000000..26211ad<br />--- /dev/null<br />+++ b/drivers/infiniband/hw/ipath/ipath_copy.c<br />&#64;&#64; -0,0 +1,666 &#64;&#64;<br />+/*<br />+ * Copyright (c) 2003, 2004, 2005. PathScale, Inc. 
All rights reserved.<br />+ *<br />+ * This software is available to you under a choice of one of two<br />+ * licenses. You may choose to be licensed under the terms of the GNU<br />+ * General Public License (GPL) Version 2, available from the file<br />+ * COPYING in the main directory of this source tree, or the<br />+ * OpenIB.org BSD license below:<br />+ *<br />+ * Redistribution and use in source and binary forms, with or<br />+ * without modification, are permitted provided that the following<br />+ * conditions are met:<br />+ *<br />+ * - Redistributions of source code must retain the above<br />+ * copyright notice, this list of conditions and the following<br />+ * disclaimer.<br />+ *<br />+ * - Redistributions in binary form must reproduce the above<br />+ * copyright notice, this list of conditions and the following<br />+ * disclaimer in the documentation and/or other materials<br />+ * provided with the distribution.<br />+ *<br />+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,<br />+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF<br />+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND<br />+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS<br />+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN<br />+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN<br />+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE<br />+ * SOFTWARE.<br />+ *<br />+ * Patent licenses, if any, provided herein do not apply to<br />+ * combinations of this program with other software, or any other<br />+ * product whatsoever.<br />+ *<br />+ * $Id: ipath_copy.c 4365 2005-12-10 00:04:16Z rjwalsh $<br />+ */<br />+<br />+/*<br />+ * This file provides support for doing sk_buff buffer swapping between<br />+ * the low level driver eager buffers, and the network layer. 
It's part<br />+ * of the core driver, rather than the ether driver, because it relies<br />+ * on variables and functions in the core driver. It exports a single<br />+ * entry point for use in the ipath_ether module.<br />+ */<br />+<br />+#include &lt;linux/kernel.h&gt;<br />+#include &lt;linux/errno.h&gt;<br />+#include &lt;linux/types.h&gt;<br />+#include &lt;asm/io.h&gt;<br />+#include &lt;asm/byteorder.h&gt;<br />+#include &lt;asm/bitops.h&gt;<br />+#include &lt;linux/skbuff.h&gt;<br />+#include &lt;linux/netdevice.h&gt;<br />+<br />+#include &lt;linux/crc32.h&gt; /* we can generate our own crc's for testing */<br />+<br />+#include "ipath_kernel.h"<br />+#include "ips_common.h"<br />+#include "ipath_layer.h"<br />+<br />+#define TRUE 1<br />+#define FALSE 0<br />+<br />+/*<br />+ * Allocate a PIO send buffer, initialize the header and copy it out.<br />+ */<br />+static int layer_send_getpiobuf(struct copy_data_s *cdp)<br />+{<br />+ int whichpb;<br />+ uint32_t device = cdp-&gt;device;<br />+ uint32_t extra_bytes;<br />+ uint32_t len, nwords;<br />+ uint32_t *piobuf;<br />+<br />+ whichpb = ipath_getpiobuf(device);<br />+ if (whichpb &lt; 0) {<br />+ cdp-&gt;error = whichpb;<br />+ return whichpb;<br />+ }<br />+<br />+ /*<br />+ * Compute the max amount of data that can fit into a PIO buffer.<br />+ * buffer size - header size - trigger qword length &amp; flags - CRC<br />+ */<br />+ len = devdata[device].ipath_ibmaxlen -<br />+ sizeof(ether_header_typ) - 8 - (SIZE_OF_CRC &lt;&lt; 2);<br />+ if (len &gt; (cdp-&gt;len + cdp-&gt;extra))<br />+ len = (cdp-&gt;len + cdp-&gt;extra);<br />+ /* Compute word aligment (i.e., (len &amp; 3) ? 
4 - (len &amp; 3) : 0) */<br />+ extra_bytes = (4 - len) &amp; 3;<br />+ nwords = (sizeof(ether_header_typ) + len + extra_bytes) &gt;&gt; 2;<br />+ cdp-&gt;hdr-&gt;lrh[2] = htons(nwords + SIZE_OF_CRC);<br />+ cdp-&gt;hdr-&gt;bth[0] = htonl((OPCODE_ITH4X &lt;&lt; 24) + (extra_bytes &lt;&lt; 20) +<br />+ IPS_DEFAULT_P_KEY);<br />+ cdp-&gt;hdr-&gt;sub_opcode = OPCODE_ENCAP;<br />+<br />+ cdp-&gt;hdr-&gt;bth[2] = 0;<br />+ /* Generate an interrupt on the receive side for the last fragment. */<br />+ cdp-&gt;hdr-&gt;iph.pkt_flags = ((cdp-&gt;len+cdp-&gt;extra) == len) ? INFINIPATH_KPF_INTR : 0;<br />+ cdp-&gt;hdr-&gt;iph.chksum =<br />+ (uint16_t) IPS_LRH_BTH +<br />+ (uint16_t) (nwords + SIZE_OF_CRC) -<br />+ (uint16_t) ((cdp-&gt;hdr-&gt;iph.ver_port_tid_offset &gt;&gt; 16) &amp; 0xFFFF) -<br />+ (uint16_t) (cdp-&gt;hdr-&gt;iph.ver_port_tid_offset &amp; 0xFFFF) -<br />+ (uint16_t) cdp-&gt;hdr-&gt;iph.pkt_flags;<br />+<br />+ piobuf = (uint32_t *) (((char *)(devdata[device].ipath_kregbase)) +<br />+ devdata[device].ipath_piobufbase +<br />+ whichpb * devdata[device].ipath_palign);<br />+ _IPATH_VDBG("send %d (%x %x %x %x %x %x %x)\n", nwords,<br />+ cdp-&gt;hdr-&gt;lrh[0],<br />+ cdp-&gt;hdr-&gt;lrh[1],<br />+ cdp-&gt;hdr-&gt;lrh[2],<br />+ cdp-&gt;hdr-&gt;lrh[3],<br />+ cdp-&gt;hdr-&gt;bth[0], cdp-&gt;hdr-&gt;bth[1], cdp-&gt;hdr-&gt;bth[2]);<br />+ /*<br />+ * Write len to control qword, no flags.<br />+ * +1 is for the qword padding of pbc.<br />+ */<br />+ *((uint64_t *) piobuf) = (uint64_t) (nwords + 1);<br />+ piobuf += 2;<br />+ ipath_dwordcpy(piobuf, (uint32_t *) cdp-&gt;hdr,<br />+ sizeof(ether_header_typ) &gt;&gt; 2);<br />+ cdp-&gt;csum_pio = &amp;((ether_header_typ *) piobuf)-&gt;csum;<br />+ cdp-&gt;to = piobuf + (sizeof(ether_header_typ) &gt;&gt; 2);<br />+ cdp-&gt;flen = nwords - (sizeof(ether_header_typ) &gt;&gt; 2);<br />+ cdp-&gt;hdr-&gt;frag_num++;<br />+ return 0;<br />+}<br />+<br />+/*<br />+ * Copy data out of one or a chain of sk_buffs, into the 
PIO buffer.<br />+ * Fragment an sk_buff into multiple IB packets if the amount of data is<br />+ * more than a single eager send.<br />+ * Offset and len are in bytes.<br />+ * Note that this function is recursive!<br />+ */<br />+static void copy_bits(const struct sk_buff *skb, unsigned int offset,<br />+ unsigned int len, struct copy_data_s *cdp)<br />+{<br />+ unsigned int start = skb_headlen(skb);<br />+ unsigned int i, copy;<br />+ uint32_t n;<br />+ u8 *p;<br />+<br />+ /* Copy header. */<br />+ if ((int)(copy = start - offset) &gt; 0) {<br />+ if (copy &gt; len)<br />+ copy = len;<br />+ p = skb-&gt;data + offset;<br />+ offset += copy;<br />+ len -= copy;<br />+ /*<br />+ * If the alignment buffer is not empty, fill it and write<br />+ * it out.<br />+ */<br />+ if (cdp-&gt;extra) {<br />+ if (cdp-&gt;extra == 4)<br />+ goto extra_copy_bits_done;<br />+<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+<br />+ if (++cdp-&gt;extra == 4) {<br />+extra_copy_bits_done:<br />+ if (cdp-&gt;flen == 0<br />+ &amp;&amp; layer_send_getpiobuf(cdp) &lt; 0)<br />+ return;<br />+ *cdp-&gt;to++ = cdp-&gt;u.w;<br />+ cdp-&gt;extra = 0;<br />+ cdp-&gt;flen -= 1;<br />+ break;<br />+ }<br />+ }<br />+ }<br />+ while (copy &gt;= 4) {<br />+ if (cdp-&gt;flen == 0 &amp;&amp; layer_send_getpiobuf(cdp) &lt; 0)<br />+ return;<br />+ n = copy &gt;&gt; 2;<br />+ if (n &gt; cdp-&gt;flen)<br />+ n = cdp-&gt;flen;<br />+ ipath_dwordcpy(cdp-&gt;to, (uint32_t *) p, n);<br />+ cdp-&gt;to += n;<br />+ cdp-&gt;flen -= n;<br />+ n &lt;&lt;= 2;<br />+ p += n;<br />+ cdp-&gt;offset += n;<br />+ cdp-&gt;len -= n;<br />+ copy -= n;<br />+ }<br />+ /*<br />+ * Either cdp-&gt;extra is zero or copy is zero which means that<br />+ * the loop here can't cause the alignment buffer to fill up.<br />+ */<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra++] = *p++;<br />+ copy--;<br />+ 
cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+<br />+ }<br />+ if (len == 0)<br />+ return;<br />+ }<br />+<br />+ for (i = 0; i &lt; skb_shinfo(skb)-&gt;nr_frags; i++) {<br />+ skb_frag_t *frag = &amp;skb_shinfo(skb)-&gt;frags[i];<br />+ unsigned int end;<br />+<br />+ end = start + frag-&gt;size;<br />+ if ((int)(copy = end - offset) &gt; 0) {<br />+ u8 *vaddr;<br />+<br />+ if (copy &gt; len)<br />+ copy = len;<br />+ vaddr = kmap_skb_frag(frag);<br />+ p = vaddr + frag-&gt;page_offset + offset - start;<br />+ offset += copy;<br />+ len -= copy;<br />+ /*<br />+ * If the alignment buffer is not empty, fill<br />+ * it and write it out.<br />+ */<br />+ if (cdp-&gt;extra) {<br />+ if (cdp-&gt;extra == 4)<br />+ goto extra1_copy_bits_done;<br />+<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+<br />+ if (++cdp-&gt;extra == 4) {<br />+extra1_copy_bits_done:<br />+ if (cdp-&gt;flen == 0<br />+ &amp;&amp; layer_send_getpiobuf(cdp)<br />+ &lt; 0)<br />+ return;<br />+ *cdp-&gt;to++ = cdp-&gt;u.w;<br />+ cdp-&gt;extra = 0;<br />+ cdp-&gt;flen -= 1;<br />+ break;<br />+ }<br />+ }<br />+ }<br />+ while (copy &gt;= 4) {<br />+ if (cdp-&gt;flen == 0<br />+ &amp;&amp; layer_send_getpiobuf(cdp) &lt; 0)<br />+ return;<br />+ n = copy &gt;&gt; 2;<br />+ if (n &gt; cdp-&gt;flen)<br />+ n = cdp-&gt;flen;<br />+ ipath_dwordcpy(cdp-&gt;to, (uint32_t *) p, n);<br />+ cdp-&gt;to += n;<br />+ cdp-&gt;flen -= n;<br />+ n &lt;&lt;= 2;<br />+ p += n;<br />+ cdp-&gt;offset += n;<br />+ cdp-&gt;len -= n;<br />+ copy -= n;<br />+ }<br />+ /*<br />+ * Either cdp-&gt;extra is zero or copy is zero<br />+ * which means that the loop here can't cause<br />+ * the alignment buffer to fill up.<br />+ */<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra++] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+ }<br />+ kunmap_skb_frag(vaddr);<br />+<br />+ if (len == 0)<br 
/>+ return;<br />+ }<br />+ start = end;<br />+ }<br />+<br />+ if (skb_shinfo(skb)-&gt;frag_list) {<br />+ struct sk_buff *list = skb_shinfo(skb)-&gt;frag_list;<br />+<br />+ for (; list; list = list-&gt;next) {<br />+ unsigned int end;<br />+<br />+ end = start + list-&gt;len;<br />+ if ((int)(copy = end - offset) &gt; 0) {<br />+ if (copy &gt; len)<br />+ copy = len;<br />+ copy_bits(list, offset - start, copy, cdp);<br />+ if (cdp-&gt;error || (len -= copy) == 0)<br />+ return;<br />+ }<br />+ start = end;<br />+ }<br />+ }<br />+ if (len)<br />+ cdp-&gt;error = -EFAULT;<br />+}<br />+<br />+/*<br />+ * Copy data out of one or a chain of sk_buffs, into the PIO buffer, generating<br />+ * the checksum as we go.<br />+ * Fragment an sk_buff into multiple IB packets if the amount of data is<br />+ * more than a single eager send.<br />+ * Offset and len are in bytes.<br />+ * Note that this function is recursive!<br />+ */<br />+static void copy_and_csum_bits(const struct sk_buff *skb, unsigned int offset,<br />+ unsigned int len, struct copy_data_s *cdp)<br />+{<br />+ unsigned int start = skb_headlen(skb);<br />+ unsigned int i, copy;<br />+ unsigned int csum2;<br />+ uint32_t n;<br />+ u8 *p;<br />+<br />+ /* Copy header. 
*/<br />+ if ((int)(copy = start - offset) &gt; 0) {<br />+ if (copy &gt; len)<br />+ copy = len;<br />+ p = skb-&gt;data + offset;<br />+ offset += copy;<br />+ len -= copy;<br />+ if (!cdp-&gt;checksum_calc) {<br />+ cdp-&gt;checksum_calc = TRUE;<br />+<br />+ csum2 = csum_partial(p, copy, 0);<br />+ cdp-&gt;csum = csum_block_add(cdp-&gt;csum, csum2, cdp-&gt;pos);<br />+ cdp-&gt;pos += copy;<br />+ }<br />+ /*<br />+ * If the alignment buffer is not empty, fill it and<br />+ * write it out.<br />+ */<br />+ if (cdp-&gt;extra) {<br />+ if (cdp-&gt;extra == 4)<br />+ goto extra_copy_and_csum_bits_done;<br />+<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+ if (++cdp-&gt;extra == 4) {<br />+extra_copy_and_csum_bits_done:<br />+ if (cdp-&gt;flen == 0<br />+ &amp;&amp; layer_send_getpiobuf(cdp) &lt; 0)<br />+ return;<br />+ /*<br />+ * write the checksum before<br />+ * the last PIO write.<br />+ */<br />+ if (cdp-&gt;flen == 1) {<br />+ *cdp-&gt;csum_pio =<br />+ csum_fold(cdp-&gt;csum);<br />+ mb();<br />+ }<br />+ *cdp-&gt;to++ = cdp-&gt;u.w;<br />+ cdp-&gt;extra = 0;<br />+ cdp-&gt;flen -= 1;<br />+ break;<br />+ }<br />+ }<br />+ }<br />+<br />+ while (copy &gt;= 4) {<br />+ if (cdp-&gt;flen == 0 &amp;&amp; layer_send_getpiobuf(cdp) &lt; 0)<br />+ return;<br />+<br />+ n = copy &gt;&gt; 2;<br />+ if (n &gt; cdp-&gt;flen)<br />+ n = cdp-&gt;flen;<br />+ /* write the checksum before the last PIO write. 
*/<br />+ if (cdp-&gt;flen == n) {<br />+ *cdp-&gt;csum_pio = csum_fold(cdp-&gt;csum);<br />+ mb();<br />+ }<br />+ ipath_dwordcpy(cdp-&gt;to, (uint32_t *) p, n);<br />+ cdp-&gt;to += n;<br />+ cdp-&gt;flen -= n;<br />+ n &lt;&lt;= 2;<br />+ p += n;<br />+ cdp-&gt;offset += n;<br />+ cdp-&gt;len -= n;<br />+ copy -= n;<br />+ }<br />+ /*<br />+ * Either cdp-&gt;extra is zero or copy is zero which means that<br />+ * the loop here can't cause the alignment buffer to fill up.<br />+ */<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra++] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+ }<br />+<br />+ cdp-&gt;checksum_calc = FALSE;<br />+<br />+ if (len == 0)<br />+ return;<br />+ }<br />+<br />+ for (i = 0; i &lt; skb_shinfo(skb)-&gt;nr_frags; i++) {<br />+ skb_frag_t *frag = &amp;skb_shinfo(skb)-&gt;frags[i];<br />+ unsigned int end;<br />+<br />+ end = start + frag-&gt;size;<br />+ if ((int)(copy = end - offset) &gt; 0) {<br />+ u8 *vaddr;<br />+<br />+ if (copy &gt; len)<br />+ copy = len;<br />+ vaddr = kmap_skb_frag(frag);<br />+ p = vaddr + frag-&gt;page_offset + offset - start;<br />+ offset += copy;<br />+ len -= copy;<br />+<br />+ if (!cdp-&gt;checksum_calc) {<br />+ cdp-&gt;checksum_calc = TRUE;<br />+<br />+ csum2 = csum_partial(p, copy, 0);<br />+ cdp-&gt;csum = csum_block_add(cdp-&gt;csum, csum2,<br />+ cdp-&gt;pos);<br />+ cdp-&gt;pos += copy;<br />+ }<br />+ /*<br />+ * If the alignment buffer is not empty, fill<br />+ * it and write it out.<br />+ */<br />+ if (cdp-&gt;extra) {<br />+ if (cdp-&gt;extra == 4)<br />+ goto extra1_copy_and_csum_bits_done;<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+<br />+ if (++cdp-&gt;extra == 4) {<br />+extra1_copy_and_csum_bits_done:<br />+ if (cdp-&gt;flen == 0<br />+ &amp;&amp; layer_send_getpiobuf(cdp)<br />+ &lt; 0) {<br />+ kunmap_skb_frag(vaddr);<br />+ return;<br />+ }<br />+ 
/*<br />+ * write the checksum<br />+ * before the last PIO<br />+ * write.<br />+ */<br />+ if (cdp-&gt;flen == 1) {<br />+ *cdp-&gt;csum_pio =<br />+ csum_fold(cdp-&gt;<br />+ csum);<br />+ mb();<br />+ }<br />+ *cdp-&gt;to++ = cdp-&gt;u.w;<br />+ cdp-&gt;extra = 0;<br />+ cdp-&gt;flen -= 1;<br />+ break;<br />+ }<br />+ }<br />+ }<br />+ while (copy &gt;= 4) {<br />+ if (cdp-&gt;flen == 0<br />+ &amp;&amp; layer_send_getpiobuf(cdp) &lt; 0) {<br />+ kunmap_skb_frag(vaddr);<br />+ return;<br />+ }<br />+ n = copy &gt;&gt; 2;<br />+ if (n &gt; cdp-&gt;flen)<br />+ n = cdp-&gt;flen;<br />+ /*<br />+ * write the checksum before the last<br />+ * PIO write.<br />+ */<br />+ if (cdp-&gt;flen == n) {<br />+ *cdp-&gt;csum_pio = csum_fold(cdp-&gt;csum);<br />+ mb();<br />+ }<br />+ ipath_dwordcpy(cdp-&gt;to, (uint32_t *) p, n);<br />+ cdp-&gt;to += n;<br />+ cdp-&gt;flen -= n;<br />+ n &lt;&lt;= 2;<br />+ p += n;<br />+ cdp-&gt;offset += n;<br />+ cdp-&gt;len -= n;<br />+ copy -= n;<br />+ }<br />+ /*<br />+ * Either cdp-&gt;extra is zero or copy is zero<br />+ * which means that the loop here can't cause<br />+ * the alignment buffer to fill up.<br />+ */<br />+ while (copy != 0) {<br />+ cdp-&gt;u.buf[cdp-&gt;extra++] = *p++;<br />+ copy--;<br />+ cdp-&gt;offset++;<br />+ cdp-&gt;len--;<br />+ }<br />+ kunmap_skb_frag(vaddr);<br />+<br />+ cdp-&gt;checksum_calc = FALSE;<br />+<br />+ if (len == 0)<br />+ return;<br />+ }<br />+ start = end;<br />+ }<br />+<br />+ if (skb_shinfo(skb)-&gt;frag_list) {<br />+ struct sk_buff *list = skb_shinfo(skb)-&gt;frag_list;<br />+<br />+ for (; list; list = list-&gt;next) {<br />+ unsigned int end;<br />+<br />+ end = start + list-&gt;len;<br />+ if ((int)(copy = end - offset) &gt; 0) {<br />+ if (copy &gt; len)<br />+ copy = len;<br />+ copy_and_csum_bits(list, offset - start, copy, cdp);<br />+ if (cdp-&gt;error || (len -= copy) == 0)<br />+ return;<br />+ offset += copy;<br />+ }<br />+ start = end;<br />+ }<br />+ }<br />+ if 
(len)<br />+ cdp-&gt;error = -EFAULT;<br />+}<br />+<br />+/*<br />+ * Note that the header should have the unchanging parts<br />+ * initialized but the rest of the header is computed as needed in<br />+ * order to break up skb data buffers larger than the hardware MTU.<br />+ * In other words, the Linux network stack MTU can be larger than the<br />+ * hardware MTU.<br />+ */<br />+int ipath_layer_send_skb(struct copy_data_s *cdata)<br />+{<br />+ int ret = 0;<br />+ uint16_t vlsllnh;<br />+ int device = cdata-&gt;device;<br />+<br />+ if (device &gt;= infinipath_max) {<br />+ _IPATH_INFO("Invalid unit %u, failing\n", device);<br />+ return -EINVAL;<br />+ }<br />+ if (!(devdata[device].ipath_flags &amp; IPATH_RCVHDRSZ_SET)) {<br />+ _IPATH_INFO("send while not open\n");<br />+ ret = -EINVAL;<br />+ } else<br />+ if ((devdata[device].ipath_flags &amp; (IPATH_LINKUNK | IPATH_LINKDOWN))<br />+ || devdata[device].ipath_lid == 0) {<br />+ /* lid check is for when sma hasn't yet configured */<br />+ ret = -ENETDOWN;<br />+ _IPATH_VDBG("send while not ready, mylid=%u, flags=0x%x\n",<br />+ devdata[device].ipath_lid,<br />+ devdata[device].ipath_flags);<br />+ }<br />+ vlsllnh = *((uint16_t *) cdata-&gt;hdr);<br />+ if (vlsllnh != htons(IPS_LRH_BTH)) {<br />+ _IPATH_DBG("Warning: lrh[0] wrong (%x, not %x); not sending\n",<br />+ vlsllnh, htons(IPS_LRH_BTH));<br />+ ret = -EINVAL;<br />+ }<br />+ if (ret)<br />+ goto done;<br />+<br />+ cdata-&gt;error = 0; /* clear last calls error */<br />+<br />+ if (cdata-&gt;skb-&gt;ip_summed == CHECKSUM_HW) {<br />+ unsigned int csstart = cdata-&gt;skb-&gt;h.raw - cdata-&gt;skb-&gt;data;<br />+<br />+ /*<br />+ * Computing the checksum is a bit tricky since if we fragment<br />+ * the packet, the fragment that should contain the checksum<br />+ * will have already been sent. 
The solution is to<br />+ * store the checksum in the header of the last fragment<br />+ * just before we write the last data word which triggers<br />+ * the last fragment to be sent. The receiver will<br />+ * check the header "tag" field, see that there is a<br />+ * checksum, and store the checksum back into the packet.<br />+ *<br />+ * Save the offset of the two byte checksum.<br />+ * Note that we have to add 2 to account for the two<br />+ * bytes of the ethernet address we stripped from the<br />+ * packet and put in the header.<br />+ */<br />+ cdata-&gt;hdr-&gt;csum_offset = csstart + cdata-&gt;skb-&gt;csum + 2;<br />+<br />+ if (cdata-&gt;offset &lt; csstart)<br />+ copy_bits(cdata-&gt;skb, cdata-&gt;offset,<br />+ csstart - cdata-&gt;offset, cdata);<br />+<br />+ if (cdata-&gt;error) {<br />+ return (cdata-&gt;error);<br />+<br />+ }<br />+<br />+ if (cdata-&gt;offset &lt; cdata-&gt;skb-&gt;len)<br />+ copy_and_csum_bits(cdata-&gt;skb, cdata-&gt;offset,<br />+ cdata-&gt;skb-&gt;len - cdata-&gt;offset,<br />+ cdata);<br />+<br />+ if (cdata-&gt;error) {<br />+ return (cdata-&gt;error);<br />+ }<br />+<br />+ if (cdata-&gt;extra) {<br />+ while (cdata-&gt;extra &lt; 4)<br />+ cdata-&gt;u.buf[cdata-&gt;extra++] = 0;<br />+ if (cdata-&gt;flen != 0<br />+ || layer_send_getpiobuf(cdata) &gt;= 0) {<br />+ /*<br />+ * write the checksum before the last<br />+ * PIO write.<br />+ */<br />+ *cdata-&gt;csum_pio = csum_fold(cdata-&gt;csum);<br />+ mb();<br />+ *cdata-&gt;to = cdata-&gt;u.w;<br />+ }<br />+ }<br />+ } else {<br />+ copy_bits(cdata-&gt;skb, cdata-&gt;offset,<br />+ cdata-&gt;skb-&gt;len - cdata-&gt;offset, cdata);<br />+<br />+ if (cdata-&gt;error) {<br />+ return (cdata-&gt;error);<br />+ }<br />+<br />+ if (cdata-&gt;extra) {<br />+ while (cdata-&gt;extra &lt; 4)<br />+ cdata-&gt;u.buf[cdata-&gt;extra++] = 0;<br />+ if (cdata-&gt;flen != 0<br />+ || layer_send_getpiobuf(cdata) &gt;= 0)<br />+ *cdata-&gt;to = cdata-&gt;u.w;<br />+ }<br />+ }<br 
/>+<br />+ if (cdata-&gt;error) {<br />+ ret = cdata-&gt;error;<br />+ if (cdata-&gt;error != -EBUSY)<br />+ /* just means no PIO buffers available */<br />+ _IPATH_UNIT_ERROR(device,<br />+ "layer_send copy_bits failed with error %d\n",<br />+ -ret);<br />+ }<br />+<br />+ ipath_stats.sps_ether_spkts++; /* another ether packet sent */<br />+<br />+done:<br />+ return ret;<br />+}<br />+<br />+EXPORT_SYMBOL(ipath_layer_send_skb);<br />diff --git a/drivers/infiniband/hw/ipath/ipath_dwordcpy.S b/drivers/infiniband/hw/ipath/ipath_dwordcpy.S<br />new file mode 100644<br />index 0000000..fdd8ec7<br />--- /dev/null<br />+++ b/drivers/infiniband/hw/ipath/ipath_dwordcpy.S<br />&#64;&#64; -0,0 +1,62 &#64;&#64;<br />+/*<br />+ * Copyright (c) 2003, 2004, 2005. PathScale, Inc. All rights reserved.<br />+ *<br />+ * This software is available to you under a choice of one of two<br />+ * licenses. You may choose to be licensed under the terms of the GNU<br />+ * General Public License (GPL) Version 2, available from the file<br />+ * COPYING in the main directory of this source tree, or the<br />+ * OpenIB.org BSD license below:<br />+ *<br />+ * Redistribution and use in source and binary forms, with or<br />+ * without modification, are permitted provided that the following<br />+ * conditions are met:<br />+ *<br />+ * - Redistributions of source code must retain the above<br />+ * copyright notice, this list of conditions and the following<br />+ * disclaimer.<br />+ *<br />+ * - Redistributions in binary form must reproduce the above<br />+ * copyright notice, this list of conditions and the following<br />+ * disclaimer in the documentation and/or other materials<br />+ * provided with the distribution.<br />+ *<br />+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,<br />+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF<br />+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND<br />+ * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS<br />+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN<br />+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN<br />+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE<br />+ * SOFTWARE.<br />+ *<br />+ * Patent licenses, if any, provided herein do not apply to<br />+ * combinations of this program with other software, or any other<br />+ * product whatsoever.<br />+ *<br />+ * $Id: ipath_dwordcpy.S 4365 2005-12-10 00:04:16Z rjwalsh $<br />+ */<br />+ <br />+/*<br />+ * ipath_dwordcpy - Copy a memory block, primarily for writing to the<br />+ * InfiniPath PIO buffers, which only support dword multiple writes, and<br />+ * thus can not use memcpy(). For this reason, we use nothing smaller than<br />+ * dword writes.<br />+ * It is also used as a fast copy routine in some places that have been<br />+ * measured to win over memcpy, and the performance delta matters.<br />+ *<br />+ * Count is number of dwords; might not be a qword multiple.<br />+*/<br />+<br />+ .globl ipath_dwordcpy<br />+/* rdi destination, rsi source, rdx count */<br />+ipath_dwordcpy:<br />+ movl %edx,%ecx<br />+ shrl $1,%ecx<br />+ andl $1,%edx <br />+ cld<br />+ rep <br />+ movsq <br />+ movl %edx,%ecx<br />+ rep<br />+ movsd<br />+ ret<br />-- <br />0.99.9n<br />-<br />To unsubscribe from this list: send the line "unsubscribe linux-kernel" in<br />the body of a message to majordomo&#64;vger.kernel.org<br />More majordomo info at <a href="http://vger.kernel.org/majordomo-info.html">http://vger.kernel.org/majordomo-info.html</a><br />Please read the FAQ at <a href="http://www.tux.org/lkml/">http://www.tux.org/lkml/</a><br /></pre></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerr.gif" width="32" height="32" alt="\" /></td></tr><tr><td align="right" valign="bottom"> &#160; </td></tr><tr><td align="right" valign="bottom">&#160;</td><td class="c" valign="bottom" 
style="padding-bottom: 0px"><img src="/images/bcornerl.gif" width="32" height="32" alt="\" /></td><td class="c">&#160;</td><td class="c" valign="bottom" style="padding-bottom: 0px"><img src="/images/bcornerr.gif" width="32" height="32" alt="/" /></td></tr><tr><td align="right" valign="top" colspan="2"> &#160; </td><td class="lm">Last update: 2005-12-17 00:53 &#160;&#160; [from the cache]<br />©2003-2020 <a href="http://blog.jasper.es/"><span itemprop="editor">Jasper Spaans</span></a>|hosted at <a href="https://www.digitalocean.com/?refcode=9a8e99d24cf9">Digital Ocean</a> and my Meterkast|<a href="http://blog.jasper.es/categories.html#lkml-ref">Read the blog</a></td><td>&#160;</td></tr></table><script language="javascript" src="/js/styleswitcher.js" type="text/javascript"></script></body></html>
