
LKML: "Paul E. McKenney": Re: [PATCH 10/13] [RFC] ipath verbs, part 1

McKenney"</a><ul><li><a href="/lkml/2005/12/18/92">Robert Walsh</a></li><li><a href="/lkml/2005/12/19/163">Ralph Campbell</a></li></ul></li></ul></li><li><a href="/lkml/2005/12/17/75">Andrew Morton</a><ul><li><a href="/lkml/2005/12/20/295">Robert Walsh</a></li></ul></li></ul></li></ul></td><td width="32" rowspan="2" class="c" valign="top"><img src="/images/icornerl.gif" width="32" height="32" alt="/" /></td><td class="c" rowspan="2" valign="top" style="padding-top: 1em"><table><tr><td><table><tr><td class="lp">Date</td><td class="rp" itemprop="datePublished">Sun, 18 Dec 2005 11:59:22 -0800</td></tr><tr><td class="lp">From</td><td class="rp" itemprop="author">"Paul E. McKenney" &lt;&gt;</td></tr><tr><td class="lp">Subject</td><td class="rp" itemprop="name">Re: [PATCH 10/13] [RFC] ipath verbs, part 1</td></tr></table></td><td></td></tr></table><pre itemprop="articleBody">On Fri, Dec 16, 2005 at 03:48:55PM -0800, Roland Dreier wrote:<br />&gt; First half of ipath verbs driver<br /><br />Some RCU-related questions interspersed. Basic question is "where is<br />the lock-free read-side traversal?"<br /><br /> Thanx, Paul<br /><br />&gt; ---<br />&gt; <br />&gt; drivers/infiniband/hw/ipath/ipath_verbs.c | 3244 +++++++++++++++++++++++++++++<br />&gt; 1 files changed, 3244 insertions(+), 0 deletions(-)<br />&gt; create mode 100644 drivers/infiniband/hw/ipath/ipath_verbs.c<br />&gt; <br />&gt; 72075ecec75f8c42e444a7d7d8ffcf340a845b96<br />&gt; diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c<br />&gt; new file mode 100644<br />&gt; index 0000000..808326e<br />&gt; --- /dev/null<br />&gt; +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c<br />&gt; &#64;&#64; -0,0 +1,3244 &#64;&#64;<br />&gt; +/*<br />&gt; + * Copyright (c) 2005. PathScale, Inc. All rights reserved.<br />&gt; + *<br />&gt; + * This software is available to you under a choice of one of two<br />&gt; + * licenses. You may choose to be licensed under the terms of the GNU<br />&gt; + * General Public License (GPL) Version 2, available from the file<br />&gt; + * COPYING in the main directory of this source tree, or the<br />&gt; + * OpenIB.org BSD license below:<br />&gt; + *<br />&gt; + * Redistribution and use in source and binary forms, with or<br />&gt; + * without modification, are permitted provided that the following<br />&gt; + * conditions are met:<br />&gt; + *<br />&gt; + * - Redistributions of source code must retain the above<br />&gt; + * copyright notice, this list of conditions and the following<br />&gt; + * disclaimer.<br />&gt; + *<br />&gt; + * - Redistributions in binary form must reproduce the above<br />&gt; + * copyright notice, this list of conditions and the following<br />&gt; + * disclaimer in the documentation and/or other materials<br />&gt; + * provided with the distribution.<br />&gt; + *<br />&gt; + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,<br />&gt; + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF<br />&gt; + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND<br />&gt; + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS<br />&gt; + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN<br />&gt; + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN<br />&gt; + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE<br />&gt; + * SOFTWARE.<br />&gt; + *<br />&gt; + * Patent licenses, if any, provided herein do not apply to<br />&gt; + * combinations of this program with other software, or any other<br />&gt; + * product whatsoever.<br />&gt; + *<br />&gt; + * $Id: ipath_verbs.c 4491 2005-12-15 22:20:31Z rjwalsh $<br />&gt; + */<br />&gt; +<br />&gt; +#include &lt;linux/config.h&gt;<br />&gt; +#include &lt;linux/version.h&gt;<br />&gt; +#include &lt;linux/pci.h&gt;<br />&gt; +#include &lt;linux/err.h&gt;<br />&gt; +#include &lt;rdma/ib_pack.h&gt;<br />&gt; +#include &lt;rdma/ib_smi.h&gt;<br />&gt; +#include &lt;rdma/ib_mad.h&gt;<br />&gt; +#include &lt;rdma/ib_user_verbs.h&gt;<br />&gt; +<br />&gt; +#include &lt;asm/uaccess.h&gt;<br />&gt; +#include &lt;asm-generic/bug.h&gt;<br />&gt; +<br />&gt; +#include "ipath_common.h"<br />&gt; +#include "ips_common.h"<br />&gt; +#include "ipath_layer.h"<br />&gt; +#include "ipath_verbs.h"<br />&gt; +<br />&gt; +/*<br />&gt; + * Compare the lower 24 bits of the two values.<br />&gt; + * Returns an integer &lt;, ==, or &gt; than zero.<br />&gt; + */<br />&gt; +static inline int cmp24(u32 a, u32 b)<br />&gt; +{<br />&gt; + return (((int) a) - ((int) b)) &lt;&lt; 8;<br />&gt; +}<br />&gt; +<br />&gt; +#define MODNAME "ib_ipath"<br />&gt; +#define DRIVER_LOAD_MSG "PathScale " MODNAME " loaded: "<br />&gt; +#define PFX MODNAME ": "<br />&gt; +<br />&gt; +<br />&gt; +/* Not static, because we don't want the compiler removing it */<br />&gt; +const char ipath_verbs_version[] = "ipath_verbs " _IPATH_IDSTR;<br />&gt; +<br />&gt; +unsigned int ib_ipath_qp_table_size = 251;<br />&gt; +module_param(ib_ipath_qp_table_size, uint, 0444);<br />&gt; +MODULE_PARM_DESC(ib_ipath_qp_table_size, "QP table size");<br />&gt; +<br />&gt; +unsigned int ib_ipath_lkey_table_size = 12;<br />&gt; +module_param(ib_ipath_lkey_table_size, uint, 0444);<br />&gt; +MODULE_PARM_DESC(ib_ipath_lkey_table_size,<br />&gt; + "LKEY table size in bits (2^n, 1 &lt;= n &lt;= 23)");<br />&gt; +<br />&gt; +unsigned int ib_ipath_debug; /* debug mask */<br />&gt; +module_param(ib_ipath_debug, uint, 0644);<br />&gt; +MODULE_PARM_DESC(ib_ipath_debug, "Verbs debug mask");<br />&gt; +<br />&gt; +<br />&gt; +static void ipath_ud_loopback(struct ipath_qp *sqp, struct ipath_sge_state *ss,<br />&gt; + u32 len, struct ib_send_wr *wr, struct ib_wc *wc);<br />&gt; +static void ipath_ruc_loopback(struct ipath_qp *sqp, struct ib_wc *wc);<br />&gt; +static int ipath_destroy_qp(struct ib_qp *ibqp);<br />&gt; +<br />&gt; +MODULE_LICENSE("GPL");<br />&gt; +MODULE_AUTHOR("PathScale &lt;infinipath-support&#64;pathscale.com&gt;");<br />&gt; +MODULE_DESCRIPTION("Pathscale InfiniPath driver");<br />&gt; +<br />&gt; +enum {<br />&gt; + IPATH_FAULT_RC_DROP_SEND_F = 1,<br />&gt; + IPATH_FAULT_RC_DROP_SEND_M,<br />&gt; + IPATH_FAULT_RC_DROP_SEND_L,<br />&gt; + IPATH_FAULT_RC_DROP_SEND_O,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_WRITE_F,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_WRITE_M,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_WRITE_L,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_WRITE_O,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_READ_RESP_F,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_READ_RESP_M,<br />&gt; + IPATH_FAULT_RC_DROP_RDMA_READ_RESP_L,<br />&gt; + 
IPATH_FAULT_RC_DROP_RDMA_READ_RESP_O,<br />&gt; + IPATH_FAULT_RC_DROP_ACK,<br />&gt; +};<br />&gt; +<br />&gt; +enum {<br />&gt; + IPATH_TRANS_INVALID = 0,<br />&gt; + IPATH_TRANS_ANY2RST,<br />&gt; + IPATH_TRANS_RST2INIT,<br />&gt; + IPATH_TRANS_INIT2INIT,<br />&gt; + IPATH_TRANS_INIT2RTR,<br />&gt; + IPATH_TRANS_RTR2RTS,<br />&gt; + IPATH_TRANS_RTS2RTS,<br />&gt; + IPATH_TRANS_SQERR2RTS,<br />&gt; + IPATH_TRANS_ANY2ERR,<br />&gt; + IPATH_TRANS_RTS2SQD, /* XXX Wait for expected ACKs &amp; signal event */<br />&gt; + IPATH_TRANS_SQD2SQD, /* error if not drained &amp; parameter change */<br />&gt; + IPATH_TRANS_SQD2RTS, /* error if not drained */<br />&gt; +};<br />&gt; +<br />&gt; +enum {<br />&gt; + IPATH_POST_SEND_OK = 0x0001,<br />&gt; + IPATH_POST_RECV_OK = 0x0002,<br />&gt; + IPATH_PROCESS_RECV_OK = 0x0004,<br />&gt; + IPATH_PROCESS_SEND_OK = 0x0008,<br />&gt; +};<br />&gt; +<br />&gt; +static int state_ops[IB_QPS_ERR + 1] = {<br />&gt; + [IB_QPS_RESET] = 0,<br />&gt; + [IB_QPS_INIT] = IPATH_POST_RECV_OK,<br />&gt; + [IB_QPS_RTR] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK,<br />&gt; + [IB_QPS_RTS] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK |<br />&gt; + IPATH_POST_SEND_OK | IPATH_PROCESS_SEND_OK,<br />&gt; + [IB_QPS_SQD] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK |<br />&gt; + IPATH_POST_SEND_OK,<br />&gt; + [IB_QPS_SQE] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK,<br />&gt; + [IB_QPS_ERR] = 0,<br />&gt; +};<br />&gt; +<br />&gt; +/*<br />&gt; + * Convert the AETH credit code into the number of credits.<br />&gt; + */<br />&gt; +static u32 credit_table[31] = {<br />&gt; + 0, /* 0 */<br />&gt; + 1, /* 1 */<br />&gt; + 2, /* 2 */<br />&gt; + 3, /* 3 */<br />&gt; + 4, /* 4 */<br />&gt; + 6, /* 5 */<br />&gt; + 8, /* 6 */<br />&gt; + 12, /* 7 */<br />&gt; + 16, /* 8 */<br />&gt; + 24, /* 9 */<br />&gt; + 32, /* A */<br />&gt; + 48, /* B */<br />&gt; + 64, /* C */<br />&gt; + 96, /* D */<br />&gt; + 128, /* E */<br />&gt; + 192, /* F */<br />&gt; + 256, /* 10 */<br />&gt; + 384, /* 11 */<br />&gt; + 512, /* 12 */<br />&gt; + 768, /* 13 */<br />&gt; + 1024, /* 14 */<br />&gt; + 1536, /* 15 */<br />&gt; + 2048, /* 16 */<br />&gt; + 3072, /* 17 */<br />&gt; + 4096, /* 18 */<br />&gt; + 6144, /* 19 */<br />&gt; + 8192, /* 1A */<br />&gt; + 12288, /* 1B */<br />&gt; + 16384, /* 1C */<br />&gt; + 24576, /* 1D */<br />&gt; + 32768 /* 1E */<br />&gt; +};<br />&gt; +<br />&gt; +/*<br />&gt; + * Convert the AETH RNR timeout code into the number of milliseconds.<br />&gt; + */<br />&gt; +static u32 rnr_table[32] = {<br />&gt; + 656, /* 0 */<br />&gt; + 1, /* 1 */<br />&gt; + 1, /* 2 */<br />&gt; + 1, /* 3 */<br />&gt; + 1, /* 4 */<br />&gt; + 1, /* 5 */<br />&gt; + 1, /* 6 */<br />&gt; + 1, /* 7 */<br />&gt; + 1, /* 8 */<br />&gt; + 1, /* 9 */<br />&gt; + 1, /* A */<br />&gt; + 1, /* B */<br />&gt; + 1, /* C */<br />&gt; + 1, /* D */<br />&gt; + 2, /* E */<br />&gt; + 2, /* F */<br />&gt; + 3, /* 10 */<br />&gt; + 4, /* 11 */<br />&gt; + 6, /* 12 */<br />&gt; + 8, /* 13 */<br />&gt; + 11, /* 14 */<br />&gt; + 16, /* 15 */<br />&gt; + 21, /* 16 */<br />&gt; + 31, /* 17 */<br />&gt; + 41, /* 18 */<br />&gt; + 62, /* 19 */<br />&gt; + 82, /* 1A */<br />&gt; + 123, /* 1B */<br />&gt; + 164, /* 1C */<br />&gt; + 246, /* 1D */<br />&gt; + 328, /* 1E */<br />&gt; + 492 /* 1F */<br />&gt; +};<br />&gt; +<br />&gt; +/*<br />&gt; + * Translate ib_wr_opcode into ib_wc_opcode.<br />&gt; + */<br />&gt; +static enum ib_wc_opcode wc_opcode[] = {<br />&gt; + [IB_WR_RDMA_WRITE] = IB_WC_RDMA_WRITE,<br />&gt; + 
[IB_WR_RDMA_WRITE_WITH_IMM] = IB_WC_RDMA_WRITE,<br />&gt; + [IB_WR_SEND] = IB_WC_SEND,<br />&gt; + [IB_WR_SEND_WITH_IMM] = IB_WC_SEND,<br />&gt; + [IB_WR_RDMA_READ] = IB_WC_RDMA_READ,<br />&gt; + [IB_WR_ATOMIC_CMP_AND_SWP] = IB_WC_COMP_SWAP,<br />&gt; + [IB_WR_ATOMIC_FETCH_AND_ADD] = IB_WC_FETCH_ADD<br />&gt; +};<br />&gt; +<br />&gt; +/*<br />&gt; + * Array of device pointers.<br />&gt; + */<br />&gt; +static uint32_t number_of_devices;<br />&gt; +static struct ipath_ibdev **ipath_devices;<br />&gt; +<br />&gt; +/*<br />&gt; + * Global table of GID to attached QPs.<br />&gt; + * The table is global to all ipath devices since a send from one QP/device<br />&gt; + * needs to be locally routed to any locally attached QPs on the same<br />&gt; + * or different device.<br />&gt; + */<br />&gt; +static struct rb_root mcast_tree;<br />&gt; +static spinlock_t mcast_lock = SPIN_LOCK_UNLOCKED;<br />&gt; +<br />&gt; +/*<br />&gt; + * Allocate a structure to link a QP to the multicast GID structure.<br />&gt; + */<br />&gt; +static struct ipath_mcast_qp *ipath_mcast_qp_alloc(struct ipath_qp *qp)<br />&gt; +{<br />&gt; + struct ipath_mcast_qp *mqp;<br />&gt; +<br />&gt; + mqp = kmalloc(sizeof(*mqp), GFP_KERNEL);<br />&gt; + if (!mqp)<br />&gt; + return NULL;<br />&gt; +<br />&gt; + mqp-&gt;qp = qp;<br />&gt; + atomic_inc(&amp;qp-&gt;refcount);<br />&gt; +<br />&gt; + return mqp;<br />&gt; +}<br />&gt; +<br />&gt; +static void ipath_mcast_qp_free(struct ipath_mcast_qp *mqp)<br />&gt; +{<br />&gt; + struct ipath_qp *qp = mqp-&gt;qp;<br />&gt; +<br />&gt; + /* Notify ipath_destroy_qp() if it is waiting. */<br />&gt; + if (atomic_dec_and_test(&amp;qp-&gt;refcount))<br />&gt; + wake_up(&amp;qp-&gt;wait);<br />&gt; +<br />&gt; + kfree(mqp);<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Allocate a structure for the multicast GID.<br />&gt; + * A list of QPs will be attached to this structure.<br />&gt; + */<br />&gt; +static struct ipath_mcast *ipath_mcast_alloc(union ib_gid *mgid)<br />&gt; +{<br />&gt; + struct ipath_mcast *mcast;<br />&gt; +<br />&gt; + mcast = kmalloc(sizeof(*mcast), GFP_KERNEL);<br />&gt; + if (!mcast)<br />&gt; + return NULL;<br />&gt; +<br />&gt; + mcast-&gt;mgid = *mgid;<br />&gt; + INIT_LIST_HEAD(&amp;mcast-&gt;qp_list);<br />&gt; + init_waitqueue_head(&amp;mcast-&gt;wait);<br />&gt; + atomic_set(&amp;mcast-&gt;refcount, 0);<br />&gt; +<br />&gt; + return mcast;<br />&gt; +}<br />&gt; +<br />&gt; +static void ipath_mcast_free(struct ipath_mcast *mcast)<br />&gt; +{<br />&gt; + struct ipath_mcast_qp *p, *tmp;<br />&gt; +<br />&gt; + list_for_each_entry_safe(p, tmp, &amp;mcast-&gt;qp_list, list)<br />&gt; + ipath_mcast_qp_free(p);<br />&gt; +<br />&gt; + kfree(mcast);<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Search the global table for the given multicast GID.<br />&gt; + * Return it or NULL if not found.<br />&gt; + * The caller is responsible for decrementing the reference count if found.<br />&gt; + */<br />&gt; +static struct ipath_mcast *ipath_mcast_find(union ib_gid *mgid)<br />&gt; +{<br />&gt; + struct rb_node *n;<br />&gt; + unsigned long flags;<br />&gt; +<br />&gt; + spin_lock_irqsave(&amp;mcast_lock, flags);<br />&gt; + n = mcast_tree.rb_node;<br />&gt; + while (n) {<br />&gt; + struct ipath_mcast *mcast;<br />&gt; + int ret;<br />&gt; +<br />&gt; + mcast = rb_entry(n, struct ipath_mcast, rb_node);<br />&gt; +<br />&gt; + ret = memcmp(mgid-&gt;raw, mcast-&gt;mgid.raw, sizeof(union ib_gid));<br />&gt; + if (ret &lt; 0)<br />&gt; + n = n-&gt;rb_left;<br 
/>&gt; + else if (ret &gt; 0)<br />&gt; + n = n-&gt;rb_right;<br />&gt; + else {<br />&gt; + atomic_inc(&amp;mcast-&gt;refcount);<br />&gt; + spin_unlock_irqrestore(&amp;mcast_lock, flags);<br />&gt; + return mcast;<br />&gt; + }<br />&gt; + }<br />&gt; + spin_unlock_irqrestore(&amp;mcast_lock, flags);<br />&gt; +<br />&gt; + return NULL;<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Insert the multicast GID into the table and<br />&gt; + * attach the QP structure.<br />&gt; + * Return zero if both were added.<br />&gt; + * Return EEXIST if the GID was already in the table but the QP was added.<br />&gt; + * Return ESRCH if the QP was already attached and neither structure was added.<br />&gt; + */<br />&gt; +static int ipath_mcast_add(struct ipath_mcast *mcast,<br />&gt; + struct ipath_mcast_qp *mqp)<br />&gt; +{<br />&gt; + struct rb_node **n = &amp;mcast_tree.rb_node;<br />&gt; + struct rb_node *pn = NULL;<br />&gt; + unsigned long flags;<br />&gt; +<br />&gt; + spin_lock_irqsave(&amp;mcast_lock, flags);<br />&gt; +<br />&gt; + while (*n) {<br />&gt; + struct ipath_mcast *tmcast;<br />&gt; + struct ipath_mcast_qp *p;<br />&gt; + int ret;<br />&gt; +<br />&gt; + pn = *n;<br />&gt; + tmcast = rb_entry(pn, struct ipath_mcast, rb_node);<br />&gt; +<br />&gt; + ret = memcmp(mcast-&gt;mgid.raw, tmcast-&gt;mgid.raw,<br />&gt; + sizeof(union ib_gid));<br />&gt; + if (ret &lt; 0) {<br />&gt; + n = &amp;pn-&gt;rb_left;<br />&gt; + continue;<br />&gt; + }<br />&gt; + if (ret &gt; 0) {<br />&gt; + n = &amp;pn-&gt;rb_right;<br />&gt; + continue;<br />&gt; + }<br />&gt; +<br />&gt; + /* Search the QP list to see if this is already there. */<br />&gt; + list_for_each_entry_rcu(p, &amp;tmcast-&gt;qp_list, list) {<br /><br />Given that we hold the global mcast_lock, how is RCU helping here?<br /><br />Is there a lock-free read-side traversal path somewhere that I am<br />missing?<br /><br />&gt; + if (p-&gt;qp == mqp-&gt;qp) {<br />&gt; + spin_unlock_irqrestore(&amp;mcast_lock, flags);<br />&gt; + return ESRCH;<br />&gt; + }<br />&gt; + }<br />&gt; + list_add_tail_rcu(&amp;mqp-&gt;list, &amp;tmcast-&gt;qp_list);<br /><br />Ditto...<br /><br />&gt; + spin_unlock_irqrestore(&amp;mcast_lock, flags);<br />&gt; + return EEXIST;<br />&gt; + }<br />&gt; +<br />&gt; + list_add_tail_rcu(&amp;mqp-&gt;list, &amp;mcast-&gt;qp_list);<br /><br />Ditto...<br /><br />&gt; + spin_unlock_irqrestore(&amp;mcast_lock, flags);<br />&gt; +<br />&gt; + atomic_inc(&amp;mcast-&gt;refcount);<br />&gt; + rb_link_node(&amp;mcast-&gt;rb_node, pn, n);<br />&gt; + rb_insert_color(&amp;mcast-&gt;rb_node, &amp;mcast_tree);<br />&gt; +<br />&gt; + spin_unlock_irqrestore(&amp;mcast_lock, flags);<br />&gt; +<br />&gt; + return 0;<br />&gt; +}<br />&gt; +<br />&gt; +static int ipath_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid,<br />&gt; + u16 lid)<br />&gt; +{<br />&gt; + struct ipath_qp *qp = to_iqp(ibqp);<br />&gt; + struct ipath_mcast *mcast;<br />&gt; + struct ipath_mcast_qp *mqp;<br />&gt; +<br />&gt; + /*<br />&gt; + * Allocate data structures since its better to do this outside of<br />&gt; + * spin locks and it will most likely be needed.<br />&gt; + */<br />&gt; + mcast = ipath_mcast_alloc(gid);<br />&gt; + if (mcast == NULL)<br />&gt; + return -ENOMEM;<br />&gt; + mqp = ipath_mcast_qp_alloc(qp);<br />&gt; + if (mqp == NULL) {<br />&gt; + ipath_mcast_free(mcast);<br />&gt; + return -ENOMEM;<br />&gt; + }<br />&gt; + switch (ipath_mcast_add(mcast, mqp)) {<br />&gt; + case ESRCH:<br />&gt; + /* Neither was used: 
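(For context, the sort of lock-free read side that would justify the
_rcu list primitives is sketched below. This is a hypothetical
fragment, not code from this patch; deliver_to_qp() is a made-up
stand-in for whatever a receive path would do with each attached QP:)

	rcu_read_lock();
	list_for_each_entry_rcu(p, &mcast->qp_list, list)
		deliver_to_qp(p->qp);	/* no mcast_lock taken here */
	rcu_read_unlock();

If no such traversal exists, plain list_for_each_entry() under
mcast_lock would do the same job without the RCU machinery.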
> + if (p->qp == mqp->qp) {
> + spin_unlock_irqrestore(&mcast_lock, flags);
> + return ESRCH;
> + }
> + }
> + list_add_tail_rcu(&mqp->list, &tmcast->qp_list);

Ditto...

> + spin_unlock_irqrestore(&mcast_lock, flags);
> + return EEXIST;
> + }
> +
> + list_add_tail_rcu(&mqp->list, &mcast->qp_list);

Ditto...

> + spin_unlock_irqrestore(&mcast_lock, flags);
> +
> + atomic_inc(&mcast->refcount);
> + rb_link_node(&mcast->rb_node, pn, n);
> + rb_insert_color(&mcast->rb_node, &mcast_tree);
> +
> + spin_unlock_irqrestore(&mcast_lock, flags);
> +
> + return 0;
> +}
> +
> +static int ipath_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid,
> + u16 lid)
> +{
> + struct ipath_qp *qp = to_iqp(ibqp);
> + struct ipath_mcast *mcast;
> + struct ipath_mcast_qp *mqp;
> +
> + /*
> + * Allocate data structures since its better to do this outside of
> + * spin locks and it will most likely be needed.
> + */
> + mcast = ipath_mcast_alloc(gid);
> + if (mcast == NULL)
> + return -ENOMEM;
> + mqp = ipath_mcast_qp_alloc(qp);
> + if (mqp == NULL) {
> + ipath_mcast_free(mcast);
> + return -ENOMEM;
> + }
> + switch (ipath_mcast_add(mcast, mqp)) {
> + case ESRCH:
> + /* Neither was used: can't attach the same QP twice. */
> + ipath_mcast_qp_free(mqp);
> + ipath_mcast_free(mcast);
> + return -EINVAL;
> + case EEXIST: /* The mcast wasn't used */
> + ipath_mcast_free(mcast);
> + break;
> + default:
> + break;
> + }
> + return 0;
> +}
> +
> +static int ipath_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid,
> + u16 lid)
> +{
> + struct ipath_qp *qp = to_iqp(ibqp);
> + struct ipath_mcast *mcast = NULL;
> + struct ipath_mcast_qp *p, *tmp;
> + struct rb_node *n;
> + unsigned long flags;
> + int last = 0;
> +
> + spin_lock_irqsave(&mcast_lock, flags);
> +
> + /* Find the GID in the mcast table. */
> + n = mcast_tree.rb_node;
> + while (1) {
> + int ret;
> +
> + if (n == NULL) {
> + spin_unlock_irqrestore(&mcast_lock, flags);
> + return 0;
> + }
> +
> + mcast = rb_entry(n, struct ipath_mcast, rb_node);
> + ret = memcmp(gid->raw, mcast->mgid.raw, sizeof(union ib_gid));
> + if (ret < 0)
> + n = n->rb_left;
> + else if (ret > 0)
> + n = n->rb_right;
> + else
> + break;
> + }
> +
> + /* Search the QP list. */
> + list_for_each_entry_safe(p, tmp, &mcast->qp_list, list) {
> + if (p->qp != qp)
> + continue;
> + /*
> + * We found it, so remove it, but don't poison the forward link
> + * until we are sure there are no list walkers.
> + */
> + list_del_rcu(&p->list);

Ditto...
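(Again purely for comparison, and again hypothetical rather than taken
from the patch: the removal idiom that list_del_rcu() normally pairs
with waits for readers with synchronize_rcu() before freeing, e.g.:)

	spin_lock_irqsave(&mcast_lock, flags);
	list_del_rcu(&p->list);
	spin_unlock_irqrestore(&mcast_lock, flags);
	synchronize_rcu();	/* wait for any RCU readers still walking the list */
	ipath_mcast_qp_free(p);	/* then it is safe to free the element */

The quoted code below instead waits for mcast->refcount to drain
before freeing.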
> + spin_unlock_irqrestore(&mcast_lock, flags);
> +
> + /* If this was the last attached QP, remove the GID too. */
> + if (list_empty(&mcast->qp_list)) {
> + rb_erase(&mcast->rb_node, &mcast_tree);
> + last = 1;
> + }
> + break;
> + }
> +
> + spin_unlock_irqrestore(&mcast_lock, flags);
> +
> + if (p) {
> + /*
> + * Wait for any list walkers to finish before freeing the
> + * list element.
> + */
> + wait_event(mcast->wait, atomic_read(&mcast->refcount) <= 1);
> + ipath_mcast_qp_free(p);
> + }
> + if (last) {
> + atomic_dec(&mcast->refcount);
> + wait_event(mcast->wait, !atomic_read(&mcast->refcount));
> + ipath_mcast_free(mcast);
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * Copy data to SGE memory.
> + */
> +static void copy_sge(struct ipath_sge_state *ss, void *data, u32 length)
> +{
> + struct ipath_sge *sge = &ss->sge;
> +
> + while (length) {
> + u32 len = sge->length;
> +
> + BUG_ON(len == 0);
> + if (len > length)
> + len = length;
> + memcpy(sge->vaddr, data, len);
> + sge->vaddr += len;
> + sge->length -= len;
> + sge->sge_length -= len;
> + if (sge->sge_length == 0) {
> + if (--ss->num_sge)
> + *sge = *ss->sg_list++;
> + } else if (sge->length == 0 && sge->mr != NULL) {
> + if (++sge->n >= IPATH_SEGSZ) {
> + if (++sge->m >= sge->mr->mapsz)
> + break;
> + sge->n = 0;
> + }
> + sge->vaddr = sge->mr->map[sge->m]->segs[sge->n].vaddr;
> + sge->length = sge->mr->map[sge->m]->segs[sge->n].length;
> + }
> + data += len;
> + length -= len;
> + }
> +}
> +
> +/*
> + * Skip over length bytes of SGE memory.
> + */
> +static void skip_sge(struct ipath_sge_state *ss, u32 length)
> +{
> + struct ipath_sge *sge = &ss->sge;
> +
> + while (length > sge->sge_length) {
> + length -= sge->sge_length;
> + ss->sge = *ss->sg_list++;
> + }
> + while (length) {
> + u32 len = sge->length;
> +
> + BUG_ON(len == 0);
> + if (len > length)
> + len = length;
> + sge->vaddr += len;
> + sge->length -= len;
> + sge->sge_length -= len;
> + if (sge->sge_length == 0) {
> + if (--ss->num_sge)
> + *sge = *ss->sg_list++;
> + } else if (sge->length == 0 && sge->mr != NULL) {
> + if (++sge->n >= IPATH_SEGSZ) {
> + if (++sge->m >= sge->mr->mapsz)
> + break;
> + sge->n = 0;
> + }
> + sge->vaddr = sge->mr->map[sge->m]->segs[sge->n].vaddr;
> + sge->length = sge->mr->map[sge->m]->segs[sge->n].length;
> + }
> + length -= len;
> + }
> +}
> +
> +static inline u32 alloc_qpn(struct ipath_qp_table *qpt)
> +{
> + u32 i, offset, max_scan, qpn;
> + struct qpn_map *map;
> +
> + qpn = qpt->last + 1;
> + if (qpn >= QPN_MAX)
> + qpn = 2;
> + offset = qpn & BITS_PER_PAGE_MASK;
> + map = &qpt->map[qpn / BITS_PER_PAGE];
> + max_scan = qpt->nmaps - !offset;
> + for (i = 0;;) {
> + if (unlikely(!map->page)) {
> + unsigned long page = get_zeroed_page(GFP_KERNEL);
> + unsigned long flags;
> +
> + /*
> + * Free the page if someone raced with us
> + * installing it:
> + */
> + spin_lock_irqsave(&qpt->lock, flags);
> + if (map->page)
> + free_page(page);
> + else
> + map->page = (void *)page;
> + spin_unlock_irqrestore(&qpt->lock, flags);
> + if (unlikely(!map->page))
> + break;
> + }
> + if (likely(atomic_read(&map->n_free))) {
> + do {
> + if (!test_and_set_bit(offset, map->page)) {
> + atomic_dec(&map->n_free);
> + qpt->last = qpn;
> + return qpn;
> + }
> + offset = find_next_offset(map, offset);
> + qpn = mk_qpn(qpt, map, offset);
> + /*
> + * This test differs from alloc_pidmap().
> + * If find_next_offset() does find a zero bit,
> + * we don't need to check for QPN wrapping
> + * around past our starting QPN. We
> + * just need to be sure we don't loop forever.
> + */
> + } while (offset < BITS_PER_PAGE && qpn < QPN_MAX);
> + }
> + /*
> + * In order to keep the number of pages allocated to a minimum,
> + * we scan the all existing pages before increasing the size
> + * of the bitmap table.
> + */
> + if (++i > max_scan) {
> + if (qpt->nmaps == QPNMAP_ENTRIES)
> + break;
> + map = &qpt->map[qpt->nmaps++];
> + offset = 0;
> + } else if (map < &qpt->map[qpt->nmaps]) {
> + ++map;
> + offset = 0;
> + } else {
> + map = &qpt->map[0];
> + offset = 2;
> + }
> + qpn = mk_qpn(qpt, map, offset);
> + }
> + return 0;
> +}
> +
> +static inline void free_qpn(struct ipath_qp_table *qpt, u32 qpn)
> +{
> + struct qpn_map *map;
> +
> + map = qpt->map + qpn / BITS_PER_PAGE;
> + if (map->page)
> + clear_bit(qpn & BITS_PER_PAGE_MASK, map->page);
> + atomic_inc(&map->n_free);
> +}
> +
> +/*
> + * Allocate the next available QPN and put the QP into the hash table.
> + * The hash table holds a reference to the QP.
> + */
> +static int ipath_alloc_qpn(struct ipath_qp_table *qpt, struct ipath_qp *qp,
> + enum ib_qp_type type)
> +{
> + unsigned long flags;
> + u32 qpn;
> +
> + if (type == IB_QPT_SMI)
> + qpn = 0;
> + else if (type == IB_QPT_GSI)
> + qpn = 1;
> + else {
> + /* Allocate the next available QPN */
> + qpn = alloc_qpn(qpt);
> + if (qpn == 0) {
> + return -ENOMEM;
> + }
> + }
> + qp->ibqp.qp_num = qpn;
> +
> + /* Add the QP to the hash table. */
> + spin_lock_irqsave(&qpt->lock, flags);
> +
> + qpn %= qpt->max;
> + qp->next = qpt->table[qpn];
> + qpt->table[qpn] = qp;
> + atomic_inc(&qp->refcount);
> +
> + spin_unlock_irqrestore(&qpt->lock, flags);
> + return 0;
> +}
> +
> +/*
> + * Remove the QP from the table so it can't be found asynchronously by
> + * the receive interrupt routine.
> + */
> +static void ipath_free_qp(struct ipath_qp_table *qpt, struct ipath_qp *qp)
> +{
> + struct ipath_qp *q, **qpp;
> + unsigned long flags;
> + int fnd = 0;
> +
> + spin_lock_irqsave(&qpt->lock, flags);
> +
> + /* Remove QP from the hash table. */
> + qpp = &qpt->table[qp->ibqp.qp_num % qpt->max];
> + for (; (q = *qpp) != NULL; qpp = &q->next) {
> + if (q == qp) {
> + *qpp = qp->next;
> + qp->next = NULL;
> + atomic_dec(&qp->refcount);
> + fnd = 1;
> + break;
> + }
> + }
> +
> + spin_unlock_irqrestore(&qpt->lock, flags);
> +
> + if (!fnd)
> + return;
> +
> + /* If QPN is not reserved, mark QPN free in the bitmap. */
> + if (qp->ibqp.qp_num > 1)
> + free_qpn(qpt, qp->ibqp.qp_num);
> +
> + wait_event(qp->wait, !atomic_read(&qp->refcount));
> +}
> +
> +/*
> + * Remove all QPs from the table.
> + */
> +static void ipath_free_all_qps(struct ipath_qp_table *qpt)
> +{
> + unsigned long flags;
> + struct ipath_qp *qp, *nqp;
> + u32 n;
> +
> + for (n = 0; n < qpt->max; n++) {
> + spin_lock_irqsave(&qpt->lock, flags);
> + qp = qpt->table[n];
> + qpt->table[n] = NULL;
> + spin_unlock_irqrestore(&qpt->lock, flags);
> +
> + while (qp) {
> + nqp = qp->next;
> + if (qp->ibqp.qp_num > 1)
> + free_qpn(qpt, qp->ibqp.qp_num);
> + if (!atomic_dec_and_test(&qp->refcount) ||
> + !ipath_destroy_qp(&qp->ibqp))
> + _VERBS_INFO("QP memory leak!\n");
> + qp = nqp;
> + }
> + }
> +
> + for (n = 0; n < ARRAY_SIZE(qpt->map); n++) {
> + if (qpt->map[n].page)
> + free_page((unsigned long)qpt->map[n].page);
> + }
> +}
> +
> +/*
> + * Return the QP with the given QPN.
> + * The caller is responsible for decrementing the QP reference count when done.
> + */
> +static struct ipath_qp *ipath_lookup_qpn(struct ipath_qp_table *qpt, u32 qpn)
> +{
> + unsigned long flags;
> + struct ipath_qp *qp;
> +
> + spin_lock_irqsave(&qpt->lock, flags);
> +
> + for (qp = qpt->table[qpn % qpt->max]; qp; qp = qp->next) {
> + if (qp->ibqp.qp_num == qpn) {
> + atomic_inc(&qp->refcount);
> + break;
> + }
> + }
> +
> + spin_unlock_irqrestore(&qpt->lock, flags);
> + return qp;
> +}
> +
> +static int ipath_alloc_lkey(struct ipath_lkey_table *rkt,
> + struct ipath_mregion *mr)
> +{
> + unsigned long flags;
> + u32 r;
> + u32 n;
> +
> + spin_lock_irqsave(&rkt->lock, flags);
> +
> + /* Find the next available LKEY */
> + r = n = rkt->next;
> + for (;;) {
> + if (rkt->table[r] == NULL)
> + break;
> + r = (r + 1) & (rkt->max - 1);
> + if (r == n) {
> + spin_unlock_irqrestore(&rkt->lock, flags);
> + _VERBS_INFO("LKEY table full\n");
> + return 0;
> + }
> + }
> + rkt->next = (r + 1) & (rkt->max - 1);
> + /*
> + * Make sure lkey is never zero which is reserved to indicate an
> + * unrestricted LKEY.
> + */
> + rkt->gen++;
> + mr->lkey = (r << (32 - ib_ipath_lkey_table_size)) |
> + ((((1 << (24 - ib_ipath_lkey_table_size)) - 1) & rkt->gen) << 8);
> + if (mr->lkey == 0) {
> + mr->lkey |= 1 << 8;
> + rkt->gen++;
> + }
> + rkt->table[r] = mr;
> + spin_unlock_irqrestore(&rkt->lock, flags);
> +
> + return 1;
> +}
> +
> +static void ipath_free_lkey(struct ipath_lkey_table *rkt, u32 lkey)
> +{
> + unsigned long flags;
> + u32 r;
> +
> + if (lkey == 0)
> + return;
> + r = lkey >> (32 - ib_ipath_lkey_table_size);
> + spin_lock_irqsave(&rkt->lock, flags);
> + rkt->table[r] = NULL;
> + spin_unlock_irqrestore(&rkt->lock, flags);
> +}
> +
> +/*
> + * Check the IB SGE for validity and initialize our internal version of it.
> + * Return 1 if OK, else zero.
> + */
> +static int ipath_lkey_ok(struct ipath_lkey_table *rkt, struct ipath_sge *isge,
> + struct ib_sge *sge, int acc)
> +{
> + struct ipath_mregion *mr;
> + size_t off;
> +
> + /*
> + * We use LKEY == zero to mean a physical kmalloc() address.
> + * This is a bit of a hack since we rely on dma_map_single()
> + * being reversible by calling bus_to_virt().
> + */
> + if (sge->lkey == 0) {
> + isge->mr = NULL;
> + isge->vaddr = bus_to_virt(sge->addr);
> + isge->length = sge->length;
> + isge->sge_length = sge->length;
> + return 1;
> + }
> + spin_lock(&rkt->lock);
> + mr = rkt->table[(sge->lkey >> (32 - ib_ipath_lkey_table_size))];
> + spin_unlock(&rkt->lock);
> + if (unlikely(mr == NULL || mr->lkey != sge->lkey))
> + return 0;
> +
> + off = sge->addr - mr->user_base;
> + if (unlikely(sge->addr < mr->user_base ||
> + off + sge->length > mr->length ||
> + (mr->access_flags & acc) != acc))
> + return 0;
> +
> + off += mr->offset;
> + isge->mr = mr;
> + isge->m = 0;
> + isge->n = 0;
> + while (off >= mr->map[isge->m]->segs[isge->n].length) {
> + off -= mr->map[isge->m]->segs[isge->n].length;
> + if (++isge->n >= IPATH_SEGSZ) {
> + isge->m++;
> + isge->n = 0;
> + }
> + }
> + isge->vaddr = mr->map[isge->m]->segs[isge->n].vaddr + off;
> + isge->length = mr->map[isge->m]->segs[isge->n].length - off;
> + isge->sge_length = sge->length;
> + return 1;
> +}
> +
> +/*
> + * Initialize the qp->s_sge after a restart.
> + * The QP s_lock should be held.
> + */
> +static void ipath_init_restart(struct ipath_qp *qp, struct ipath_swqe *wqe)
> +{
> + struct ipath_ibdev *dev;
> + u32 len;
> +
> + len = ((qp->s_psn - wqe->psn) & 0xFFFFFF) *
> + ib_mtu_enum_to_int(qp->path_mtu);
> + qp->s_sge.sge = wqe->sg_list[0];
> + qp->s_sge.sg_list = wqe->sg_list + 1;
> + qp->s_sge.num_sge = wqe->wr.num_sge;
> + skip_sge(&qp->s_sge, len);
> + qp->s_len = wqe->length - len;
> + dev = to_idev(qp->ibqp.device);
> + spin_lock(&dev->pending_lock);
> + if (qp->timerwait.next == LIST_POISON1)
> + list_add_tail(&qp->timerwait,
> + &dev->pending[dev->pending_index]);
> + spin_unlock(&dev->pending_lock);
> +}
> +
> +/*
> + * Check the IB virtual address, length, and RKEY.
> + * Return 1 if OK, else zero.
> + * The QP r_rq.lock should be held.
> + */
> +static int ipath_rkey_ok(struct ipath_ibdev *dev, struct ipath_sge_state *ss,
> + u32 len, u64 vaddr, u32 rkey, int acc)
> +{
> + struct ipath_lkey_table *rkt = &dev->lk_table;
> + struct ipath_sge *sge = &ss->sge;
> + struct ipath_mregion *mr;
> + size_t off;
> +
> + spin_lock(&rkt->lock);
> + mr = rkt->table[(rkey >> (32 - ib_ipath_lkey_table_size))];
> + spin_unlock(&rkt->lock);
> + if (unlikely(mr == NULL || mr->lkey != rkey))
> + return 0;
> +
> + off = vaddr - mr->iova;
> + if (unlikely(vaddr < mr->iova || off + len > mr->length ||
> + (mr->access_flags & acc) == 0))
> + return 0;
> +
> + off += mr->offset;
> + sge->mr = mr;
> + sge->m = 0;
> + sge->n = 0;
> + while (off >= mr->map[sge->m]->segs[sge->n].length) {
> + off -= mr->map[sge->m]->segs[sge->n].length;
> + if (++sge->n >= IPATH_SEGSZ) {
> + sge->m++;
> + sge->n = 0;
> + }
> + }
> + sge->vaddr = mr->map[sge->m]->segs[sge->n].vaddr + off;
> + sge->length = mr->map[sge->m]->segs[sge->n].length - off;
> + sge->sge_length = len;
> + ss->sg_list = NULL;
> + ss->num_sge = 1;
> + return 1;
> +}
> +
> +/*
> + * Add a new entry to the completion queue.
> + * This may be called with one of the qp->s_lock or qp->r_rq.lock held.
> + */
> +static void ipath_cq_enter(struct ipath_cq *cq, struct ib_wc *entry, int sig)
> +{
> + unsigned long flags;
> + u32 next;
> +
> + spin_lock_irqsave(&cq->lock, flags);
> +
> + cq->queue[cq->head] = *entry;
> + next = cq->head + 1;
> + if (next == cq->ibcq.cqe)
> + next = 0;
> + if (next != cq->tail)
> + cq->head = next;
> + else {
> + /* XXX - need to mark current wr as having an error... */
> + }
> +
> + if (cq->notify == IB_CQ_NEXT_COMP ||
> + (cq->notify == IB_CQ_SOLICITED && sig)) {
> + cq->notify = IB_CQ_NONE;
> + cq->triggered++;
> + /*
> + * This will cause send_complete() to be called in
> + * another thread.
> + */
> + tasklet_schedule(&cq->comptask);
> + }
> +
> + spin_unlock_irqrestore(&cq->lock, flags);
> +
> + if (entry->status != IB_WC_SUCCESS)
> + to_idev(cq->ibcq.device)->n_wqe_errs++;
> +}
> +
> +static void send_complete(unsigned long data)
> +{
> + struct ipath_cq *cq = (struct ipath_cq *)data;
> +
> + /*
> + * The completion handler will most likely rearm the notification
> + * and poll for all pending entries. If a new completion entry
> + * is added while we are in this routine, tasklet_schedule()
> + * won't call us again until we return so we check triggered to
> + * see if we need to call the handler again.
> + */
> + for (;;) {
> + u8 triggered = cq->triggered;
> +
> + cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +
> + if (cq->triggered == triggered)
> + return;
> + }
> +}
> +
> +/*
> + * This is the QP state transition table.
> + * See ipath_modify_qp() for details.
> + */
> +static const struct {
> + int trans;
> + u32 req_param[IB_QPT_RAW_IPV6];
> + u32 opt_param[IB_QPT_RAW_IPV6];
> +} qp_state_table[IB_QPS_ERR + 1][IB_QPS_ERR + 1] = {
> + [IB_QPS_RESET] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR },
> + [IB_QPS_INIT] = {
> + .trans = IPATH_TRANS_RST2INIT,
> + .req_param = {
> + [IB_QPT_SMI] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_PKEY_INDEX |
> + IB_QP_PORT |
> + IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_PKEY_INDEX |
> + IB_QP_PORT |
> + IB_QP_ACCESS_FLAGS),
> + [IB_QPT_RC] = (IB_QP_PKEY_INDEX |
> + IB_QP_PORT |
> + IB_QP_ACCESS_FLAGS),
> + },
> + },
> + },
> + [IB_QPS_INIT] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR },
> + [IB_QPS_INIT] = {
> + .trans = IPATH_TRANS_INIT2INIT,
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_PKEY_INDEX |
> + IB_QP_PORT |
> + IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_PKEY_INDEX |
> + IB_QP_PORT |
> + IB_QP_ACCESS_FLAGS),
> + [IB_QPT_RC] = (IB_QP_PKEY_INDEX |
> + IB_QP_PORT |
> + IB_QP_ACCESS_FLAGS),
> + }
> + },
> + [IB_QPS_RTR] = {
> + .trans = IPATH_TRANS_INIT2RTR,
> + .req_param = {
> + [IB_QPT_UC] = (IB_QP_AV |
> + IB_QP_PATH_MTU |
> + IB_QP_DEST_QPN |
> + IB_QP_RQ_PSN),
> + [IB_QPT_RC] = (IB_QP_AV |
> + IB_QP_PATH_MTU |
> + IB_QP_DEST_QPN |
> + IB_QP_RQ_PSN |
> + IB_QP_MAX_DEST_RD_ATOMIC |
> + IB_QP_MIN_RNR_TIMER),
> + },
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_PKEY_INDEX |
> + IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PKEY_INDEX),
> + [IB_QPT_RC] = (IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PKEY_INDEX),
> + }
> + }
> + },
> + [IB_QPS_RTR] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR },
> + [IB_QPS_RTS] = {
> + .trans = IPATH_TRANS_RTR2RTS,
> + .req_param = {
> + [IB_QPT_SMI] = IB_QP_SQ_PSN,
> + [IB_QPT_GSI] = IB_QP_SQ_PSN,
> + [IB_QPT_UD] = IB_QP_SQ_PSN,
> + [IB_QPT_UC] = IB_QP_SQ_PSN,
> + [IB_QPT_RC] = (IB_QP_TIMEOUT |
> + IB_QP_RETRY_CNT |
> + IB_QP_RNR_RETRY |
> + IB_QP_SQ_PSN |
> + IB_QP_MAX_QP_RD_ATOMIC),
> + },
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_CUR_STATE |
> + IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PKEY_INDEX |
> + IB_QP_PATH_MIG_STATE),
> + [IB_QPT_RC] = (IB_QP_CUR_STATE |
> + IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PKEY_INDEX |
> + IB_QP_MIN_RNR_TIMER |
> + IB_QP_PATH_MIG_STATE),
> + }
> + }
> + },
> + [IB_QPS_RTS] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR },
> + [IB_QPS_RTS] = {
> + .trans = IPATH_TRANS_RTS2RTS,
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_ACCESS_FLAGS |
> + IB_QP_ALT_PATH |
> + IB_QP_PATH_MIG_STATE),
> + [IB_QPT_RC] = (IB_QP_ACCESS_FLAGS |
> + IB_QP_ALT_PATH |
> + IB_QP_PATH_MIG_STATE |
> + IB_QP_MIN_RNR_TIMER),
> + }
> + },
> + [IB_QPS_SQD] = {
> + .trans = IPATH_TRANS_RTS2SQD,
> + },
> + },
> + [IB_QPS_SQD] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR },
> + [IB_QPS_RTS] = {
> + .trans = IPATH_TRANS_SQD2RTS,
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_CUR_STATE |
> + IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PATH_MIG_STATE),
> + [IB_QPT_RC] = (IB_QP_CUR_STATE |
> + IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_MIN_RNR_TIMER |
> + IB_QP_PATH_MIG_STATE),
> + }
> + },
> + [IB_QPS_SQD] = {
> + .trans = IPATH_TRANS_SQD2SQD,
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_PKEY_INDEX | IB_QP_QKEY),
> + [IB_QPT_UC] = (IB_QP_AV |
> + IB_QP_TIMEOUT |
> + IB_QP_CUR_STATE |
> + IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PKEY_INDEX |
> + IB_QP_PATH_MIG_STATE),
> + [IB_QPT_RC] = (IB_QP_AV |
> + IB_QP_TIMEOUT |
> + IB_QP_RETRY_CNT |
> + IB_QP_RNR_RETRY |
> + IB_QP_MAX_QP_RD_ATOMIC |
> + IB_QP_MAX_DEST_RD_ATOMIC |
> + IB_QP_CUR_STATE |
> + IB_QP_ALT_PATH |
> + IB_QP_ACCESS_FLAGS |
> + IB_QP_PKEY_INDEX |
> + IB_QP_MIN_RNR_TIMER |
> + IB_QP_PATH_MIG_STATE),
> + }
> + }
> + },
> + [IB_QPS_SQE] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR },
> + [IB_QPS_RTS] = {
> + .trans = IPATH_TRANS_SQERR2RTS,
> + .opt_param = {
> + [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_GSI] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UD] = (IB_QP_CUR_STATE | IB_QP_QKEY),
> + [IB_QPT_UC] = IB_QP_CUR_STATE,
> + [IB_QPT_RC] = (IB_QP_CUR_STATE |
> + IB_QP_MIN_RNR_TIMER),
> + }
> + }
> + },
> + [IB_QPS_ERR] = {
> + [IB_QPS_RESET] = { .trans = IPATH_TRANS_ANY2RST },
> + [IB_QPS_ERR] = { .trans = IPATH_TRANS_ANY2ERR }
> + }
> +};
> +
> +/*
> + * Initialize the QP state to the reset state.
> + */
> +static void ipath_reset_qp(struct ipath_qp *qp)
> +{
> + qp->remote_qpn = 0;
> + qp->qkey = 0;
> + qp->qp_access_flags = 0;
> + qp->s_hdrwords = 0;
> + qp->s_psn = 0;
> + qp->r_psn = 0;
> + atomic_set(&qp->msn, 0);
> + if (qp->ibqp.qp_type == IB_QPT_RC) {
> + qp->s_state = IB_OPCODE_RC_SEND_LAST;
> + qp->r_state = IB_OPCODE_RC_SEND_LAST;
> + } else {
> + qp->s_state = IB_OPCODE_UC_SEND_LAST;
> + qp->r_state = IB_OPCODE_UC_SEND_LAST;
> + }
> + qp->s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE;
> + qp->s_nak_state = 0;
> + qp->s_rnr_timeout = 0;
> + qp->s_head = 0;
> + qp->s_tail = 0;
> + qp->s_cur = 0;
> + qp->s_last = 0;
> + qp->s_ssn = 1;
> + qp->s_lsn = 0;
> + qp->r_rq.head = 0;
> + qp->r_rq.tail = 0;
> + qp->r_reuse_sge = 0;
> +}
> +
> +/*
> + * Flush send work queue.
> + * The QP s_lock should be held.
> + */
> +static void ipath_sqerror_qp(struct ipath_qp *qp, struct ib_wc *wc)
> +{
> + struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
> + struct ipath_swqe *wqe = get_swqe_ptr(qp, qp->s_last);
> +
> + _VERBS_INFO("Send queue error on QP%d/%d: err: %d\n",
> + qp->ibqp.qp_num, qp->remote_qpn, wc->status);
> +
> + spin_lock(&dev->pending_lock);
> + /* XXX What if its already removed by the timeout code? */
> + if (qp->timerwait.next != LIST_POISON1)
> + list_del(&qp->timerwait);
> + if (qp->piowait.next != LIST_POISON1)
> + list_del(&qp->piowait);
> + spin_unlock(&dev->pending_lock);
> +
> + ipath_cq_enter(to_icq(qp->ibqp.send_cq), wc, 1);
> + if (++qp->s_last >= qp->s_size)
> + qp->s_last = 0;
> +
> + wc->status = IB_WC_WR_FLUSH_ERR;
> +
> + while (qp->s_last != qp->s_head) {
> + wc->wr_id = wqe->wr.wr_id;
> + wc->opcode = wc_opcode[wqe->wr.opcode];
> + ipath_cq_enter(to_icq(qp->ibqp.send_cq), wc, 1);
> + if (++qp->s_last >= qp->s_size)
> + qp->s_last = 0;
> + wqe = get_swqe_ptr(qp, qp->s_last);
> + }
> + qp->s_cur = qp->s_tail = qp->s_head;
> + qp->state = IB_QPS_SQE;
> +}
> +
> +/*
> + * Flush both send and receive work queues.
> + * QP r_rq.lock and s_lock should be held.
> + */
> +static void ipath_error_qp(struct ipath_qp *qp)
> +{
> + struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
> + struct ib_wc wc;
> +
> + _VERBS_INFO("QP%d/%d in error state\n",
> + qp->ibqp.qp_num, qp->remote_qpn);
> +
> + spin_lock(&dev->pending_lock);
> + /* XXX What if its already removed by the timeout code? */
> + if (qp->timerwait.next != LIST_POISON1)
> + list_del(&qp->timerwait);
> + if (qp->piowait.next != LIST_POISON1)
> + list_del(&qp->piowait);
> + spin_unlock(&dev->pending_lock);
> +
> + wc.status = IB_WC_WR_FLUSH_ERR;
> + wc.vendor_err = 0;
> + wc.byte_len = 0;
> + wc.imm_data = 0;
> + wc.qp_num = qp->ibqp.qp_num;
> + wc.src_qp = 0;
> + wc.wc_flags = 0;
> + wc.pkey_index = 0;
> + wc.slid = 0;
> + wc.sl = 0;
> + wc.dlid_path_bits = 0;
> + wc.port_num = 0;
> +
> + while (qp->s_last != qp->s_head) {
> + struct ipath_swqe *wqe = get_swqe_ptr(qp, qp->s_last);
> +
> + wc.wr_id = wqe->wr.wr_id;
> + wc.opcode = wc_opcode[wqe->wr.opcode];
> + if (++qp->s_last >= qp->s_size)
> + qp->s_last = 0;
> + ipath_cq_enter(to_icq(qp->ibqp.send_cq), &wc, 1);
> + }
> + qp->s_cur = qp->s_tail = qp->s_head;
> + qp->s_hdrwords = 0;
> + qp->s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE;
> +
> + wc.opcode = IB_WC_RECV;
> + while (qp->r_rq.tail != qp->r_rq.head) {
> + wc.wr_id = get_rwqe_ptr(&qp->r_rq, qp->r_rq.tail)->wr_id;
> + if (++qp->r_rq.tail >= qp->r_rq.size)
> + qp->r_rq.tail = 0;
> + ipath_cq_enter(to_icq(qp->ibqp.recv_cq), &wc, 1);
> + }
> +}
> +
> +static int ipath_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
> + int attr_mask)
> +{
> + struct ipath_qp *qp = to_iqp(ibqp);
> + enum ib_qp_state cur_state, new_state;
> + u32 req_param, opt_param;
> + unsigned long flags;
> +
> + if (attr_mask & IB_QP_CUR_STATE) {
> + cur_state = attr->cur_qp_state;
> + if (cur_state != IB_QPS_RTR &&
> + cur_state != IB_QPS_RTS &&
> + cur_state != IB_QPS_SQD && cur_state != IB_QPS_SQE)
> + return -EINVAL;
> + spin_lock_irqsave(&qp->r_rq.lock, flags);
> + spin_lock(&qp->s_lock);
> + } else {
> + spin_lock_irqsave(&qp->r_rq.lock, flags);
> + spin_lock(&qp->s_lock);
> + cur_state = qp->state;
> + }
> +
> + if (attr_mask & IB_QP_STATE) {
> + new_state = attr->qp_state;
> + if (new_state < 0 || new_state > IB_QPS_ERR)
> + goto inval;
> + } else
> + new_state = cur_state;
> +
> + switch (qp_state_table[cur_state][new_state].trans) {
> + case IPATH_TRANS_INVALID:
> + goto inval;
> +
> + case IPATH_TRANS_ANY2RST:
> + ipath_reset_qp(qp);
> + break;
> +
> + case IPATH_TRANS_ANY2ERR:
> + ipath_error_qp(qp);
> + break;
> +
> + }
> +
> + req_param =
> + qp_state_table[cur_state][new_state].req_param[qp->ibqp.qp_type];
> + opt_param =
> + qp_state_table[cur_state][new_state].opt_param[qp->ibqp.qp_type];
> +
> + if ((req_param & attr_mask) != req_param)
> + goto inval;
> +
> + if (attr_mask & ~(req_param | opt_param | IB_QP_STATE))
> + goto inval;
> +
> + if (attr_mask & IB_QP_PKEY_INDEX) {
> + struct ipath_ibdev *dev = to_idev(ibqp->device);
> +
> + if (attr->pkey_index >= ipath_layer_get_npkeys(dev->ib_unit))
> + goto inval;
> + qp->s_pkey_index = attr->pkey_index;
> + }
> +
> + if (attr_mask & IB_QP_DEST_QPN)
> + qp->remote_qpn = attr->dest_qp_num;
> +
> + if (attr_mask & IB_QP_SQ_PSN) {
> + qp->s_next_psn = attr->sq_psn;
> + qp->s_last_psn = qp->s_next_psn - 1;
> + }
> +
> + if (attr_mask & IB_QP_RQ_PSN)
> + qp->r_psn = attr->rq_psn;
> +
> + if (attr_mask & IB_QP_ACCESS_FLAGS)
> + qp->qp_access_flags = attr->qp_access_flags;
> +
> + if (attr_mask & IB_QP_AV)
> + qp->remote_ah_attr = attr->ah_attr;
> +
> + if (attr_mask & IB_QP_PATH_MTU)
> + qp->path_mtu = attr->path_mtu;
> +
> + if (attr_mask & IB_QP_RETRY_CNT)
> + qp->s_retry = qp->s_retry_cnt = attr->retry_cnt;
> +
> + if (attr_mask & IB_QP_RNR_RETRY) {
> + qp->s_rnr_retry = attr->rnr_retry;
> + if (qp->s_rnr_retry > 7)
> + qp->s_rnr_retry = 7;
> + qp->s_rnr_retry_cnt = qp->s_rnr_retry;
> + }
> +
> + if (attr_mask & IB_QP_MIN_RNR_TIMER)
> + qp->s_min_rnr_timer = attr->min_rnr_timer & 0x1F;
> +
> + if (attr_mask & IB_QP_QKEY)
> + qp->qkey = attr->qkey;
> +
> + if (attr_mask & IB_QP_PKEY_INDEX)
> + qp->s_pkey_index = attr->pkey_index;
> +
> + qp->state = new_state;
> + spin_unlock(&qp->s_lock);
> + spin_unlock_irqrestore(&qp->r_rq.lock, flags);
> +
> + /*
> + * Try to move to ARMED if QP1 changed to the RTS state.
> + */
> + if (qp->ibqp.qp_num == 1 && new_state == IB_QPS_RTS) {
> + struct ipath_ibdev *dev = to_idev(ibqp->device);
> +
> + /*
> + * Bounce the link even if it was active so the SM will
> + * reinitialize the SMA's state.
> + */
> + ipath_kset_linkstate((dev->ib_unit << 16) | IPATH_IB_LINKDOWN);
> + ipath_kset_linkstate((dev->ib_unit << 16) | IPATH_IB_LINKARM);
> + }
> + return 0;
> +
> +inval:
> + spin_unlock(&qp->s_lock);
> + spin_unlock_irqrestore(&qp->r_rq.lock, flags);
> + return -EINVAL;
> +}
> +
> +/*
> + * Compute the AETH (syndrome + MSN).
> + * The QP s_lock should be held.
> + */
> +static u32 ipath_compute_aeth(struct ipath_qp *qp)
> +{
> + u32 aeth = atomic_read(&qp->msn) & 0xFFFFFF;
> +
> + if (qp->s_nak_state) {
> + aeth |= qp->s_nak_state << 24;
> + } else if (qp->ibqp.srq) {
> + /* Shared receive queues don't generate credits. */
> + aeth |= 0x1F << 24;
> + } else {
> + u32 min, max, x;
> + u32 credits;
> +
> + /*
> + * Compute the number of credits available (RWQEs).
> + * XXX Not holding the r_rq.lock here so there is a small
> + * chance that the pair of reads are not atomic.
> + */
> + credits = qp->r_rq.head - qp->r_rq.tail;
> + if ((int)credits < 0)
> + credits += qp->r_rq.size;
> + /* Binary search the credit table to find the code to use. */
> + min = 0;
> + max = 31;
> + for (;;) {
> + x = (min + max) / 2;
> + if (credit_table[x] == credits)
> + break;
> + if (credit_table[x] > credits)
> + max = x;
> + else if (min == x)
> + break;
> + else
> + min = x;
> + }
> + aeth |= x << 24;
> + }
> + return cpu_to_be32(aeth);
> +}
> +
> +
> +static void no_bufs_available(struct ipath_qp *qp, struct ipath_ibdev *dev)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&dev->pending_lock, flags);
> + if (qp->piowait.next == LIST_POISON1)
> + list_add_tail(&qp->piowait, &dev->piowait);
> + spin_unlock_irqrestore(&dev->pending_lock, flags);
> + /*
> + * Note that as soon as ipath_layer_want_buffer() is called and
> + * possibly before it returns, ipath_ib_piobufavail()
> + * could be called. If we are still in the tasklet function,
> + * tasklet_schedule() will not call us until the next time
> + * tasklet_schedule() is called.
> + * We clear the tasklet flag now since we are committing to return
> + * from the tasklet function.
> + */
> + tasklet_unlock(&qp->s_task);
> + ipath_layer_want_buffer(dev->ib_unit);
> + dev->n_piowait++;
> +}
> +
> +/*
> + * Process entries in the send work queue until the queue is exhausted.
> + * Only allow one CPU to send a packet per QP (tasklet).
> + * Otherwise, after we drop the QP lock, two threads could send
> + * packets out of order.
> + * This is similar to do_rc_send() below except we don't have timeouts or
> + * resends.
> + */
> +static void do_uc_send(unsigned long data)
> +{
> + struct ipath_qp *qp = (struct ipath_qp *)data;
> + struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
> + struct ipath_swqe *wqe;
> + unsigned long flags;
> + u16 lrh0;
> + u32 hwords;
> + u32 nwords;
> + u32 extra_bytes;
> + u32 bth0;
> + u32 bth2;
> + u32 pmtu = ib_mtu_enum_to_int(qp->path_mtu);
> + u32 len;
> + struct ipath_other_headers *ohdr;
> + struct ib_wc wc;
> +
> + if (test_and_set_bit(IPATH_S_BUSY, &qp->s_flags))
> + return;
> +
> + if (unlikely(qp->remote_ah_attr.dlid ==
> + ipath_layer_get_lid(dev->ib_unit))) {
> + /* Pass in an uninitialized ib_wc to save stack space. */
> + ipath_ruc_loopback(qp, &wc);
> + clear_bit(IPATH_S_BUSY, &qp->s_flags);
> + return;
> + }
> +
> + ohdr = &qp->s_hdr.u.oth;
> + if (qp->remote_ah_attr.ah_flags & IB_AH_GRH)
> + ohdr = &qp->s_hdr.u.l.oth;
> +
> +again:
> + /* Check for a constructed packet to be sent. */
> + if (qp->s_hdrwords != 0) {
> + /*
> + * If no PIO bufs are available, return.
> + * An interrupt will call ipath_ib_piobufavail()
> + * when one is available.
> + */
> + if (ipath_verbs_send(dev->ib_unit, qp->s_hdrwords,
> + (uint32_t *) &qp->s_hdr,
> + qp->s_cur_size, qp->s_cur_sge)) {
> + no_bufs_available(qp, dev);
> + return;
> + }
> + /* Record that we sent the packet and s_hdr is empty. */
> + qp->s_hdrwords = 0;
> + }
> +
> + lrh0 = IPS_LRH_BTH;
> + /* header size in 32-bit words LRH+BTH = (8+12)/4. */
> + hwords = 5;
> +
> + /*
> + * The lock is needed to synchronize between
> + * setting qp->s_ack_state and post_send().
> + */
> + spin_lock_irqsave(&qp->s_lock, flags);
> +
> + if (!(state_ops[qp->state] & IPATH_PROCESS_SEND_OK))
> + goto done;
> +
> + bth0 = ipath_layer_get_pkey(dev->ib_unit, qp->s_pkey_index);
> +
> + /* Send a request. */
> + wqe = get_swqe_ptr(qp, qp->s_last);
> + switch (qp->s_state) {
> + default:
> + /* Signal the completion of the last send (if there is one).
*/<br />&gt; + if (qp-&gt;s_last != qp-&gt;s_tail) {<br />&gt; + if (++qp-&gt;s_last == qp-&gt;s_size)<br />&gt; + qp-&gt;s_last = 0;<br />&gt; + if (!test_bit(IPATH_S_SIGNAL_REQ_WR, &amp;qp-&gt;s_flags) ||<br />&gt; + (wqe-&gt;wr.send_flags &amp; IB_SEND_SIGNALED)) {<br />&gt; + wc.wr_id = wqe-&gt;wr.wr_id;<br />&gt; + wc.status = IB_WC_SUCCESS;<br />&gt; + wc.opcode = wc_opcode[wqe-&gt;wr.opcode];<br />&gt; + wc.vendor_err = 0;<br />&gt; + wc.byte_len = wqe-&gt;length;<br />&gt; + wc.qp_num = qp-&gt;ibqp.qp_num;<br />&gt; + wc.src_qp = qp-&gt;remote_qpn;<br />&gt; + wc.pkey_index = 0;<br />&gt; + wc.slid = qp-&gt;remote_ah_attr.dlid;<br />&gt; + wc.sl = qp-&gt;remote_ah_attr.sl;<br />&gt; + wc.dlid_path_bits = 0;<br />&gt; + wc.port_num = 0;<br />&gt; + ipath_cq_enter(to_icq(qp-&gt;ibqp.send_cq), &amp;wc,<br />&gt; + 0);<br />&gt; + }<br />&gt; + wqe = get_swqe_ptr(qp, qp-&gt;s_last);<br />&gt; + }<br />&gt; + /* Check if send work queue is empty. */<br />&gt; + if (qp-&gt;s_tail == qp-&gt;s_head)<br />&gt; + goto done;<br />&gt; + /*<br />&gt; + * Start a new request.<br />&gt; + */<br />&gt; + qp-&gt;s_psn = wqe-&gt;psn = qp-&gt;s_next_psn;<br />&gt; + qp-&gt;s_sge.sge = wqe-&gt;sg_list[0];<br />&gt; + qp-&gt;s_sge.sg_list = wqe-&gt;sg_list + 1;<br />&gt; + qp-&gt;s_sge.num_sge = wqe-&gt;wr.num_sge;<br />&gt; + qp-&gt;s_len = len = wqe-&gt;length;<br />&gt; + switch (wqe-&gt;wr.opcode) {<br />&gt; + case IB_WR_SEND:<br />&gt; + case IB_WR_SEND_WITH_IMM:<br />&gt; + if (len &gt; pmtu) {<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_SEND_FIRST;<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_SEND) {<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_SEND_ONLY;<br />&gt; + } else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_UC_SEND_ONLY_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the BTH */<br />&gt; + ohdr-&gt;u.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_WR_RDMA_WRITE:<br />&gt; + case IB_WR_RDMA_WRITE_WITH_IMM:<br />&gt; + ohdr-&gt;u.rc.reth.vaddr =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.rdma.remote_addr);<br />&gt; + ohdr-&gt;u.rc.reth.rkey =<br />&gt; + cpu_to_be32(wqe-&gt;wr.wr.rdma.rkey);<br />&gt; + ohdr-&gt;u.rc.reth.length = cpu_to_be32(len);<br />&gt; + hwords += sizeof(struct ib_reth) / 4;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_RDMA_WRITE_FIRST;<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_RDMA_WRITE) {<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_RDMA_WRITE_ONLY;<br />&gt; + } else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_UC_RDMA_WRITE_ONLY_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the RETH */<br />&gt; + ohdr-&gt;u.rc.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + }<br />&gt; + break;<br />&gt; +<br />&gt; + default:<br />&gt; + goto done;<br />&gt; + }<br />&gt; + if (++qp-&gt;s_tail &gt;= qp-&gt;s_size)<br />&gt; + qp-&gt;s_tail = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_UC_SEND_FIRST:<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_SEND_MIDDLE;<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_UC_SEND_MIDDLE:<br />&gt; + len = qp-&gt;s_len;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + len 
= pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_SEND)<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_SEND_LAST;<br />&gt; + else {<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_SEND_LAST_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the BTH */<br />&gt; + ohdr-&gt;u.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_UC_RDMA_WRITE_FIRST:<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_RDMA_WRITE_MIDDLE;<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_UC_RDMA_WRITE_MIDDLE:<br />&gt; + len = qp-&gt;s_len;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_RDMA_WRITE)<br />&gt; + qp-&gt;s_state = IB_OPCODE_UC_RDMA_WRITE_LAST;<br />&gt; + else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_UC_RDMA_WRITE_LAST_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the BTH */<br />&gt; + ohdr-&gt;u.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + }<br />&gt; + break;<br />&gt; + }<br />&gt; + bth2 = qp-&gt;s_next_psn++ &amp; 0xFFFFFF;<br />&gt; + qp-&gt;s_len -= len;<br />&gt; + bth0 |= qp-&gt;s_state &lt;&lt; 24;<br />&gt; +<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; +<br />&gt; + /* Construct the header. */<br />&gt; + extra_bytes = (4 - len) &amp; 3;<br />&gt; + nwords = (len + extra_bytes) &gt;&gt; 2;<br />&gt; + if (unlikely(qp-&gt;remote_ah_attr.ah_flags &amp; IB_AH_GRH)) {<br />&gt; + /* Header size in 32-bit words. */<br />&gt; + hwords += 10;<br />&gt; + lrh0 = IPS_LRH_GRH;<br />&gt; + qp-&gt;s_hdr.u.l.grh.version_tclass_flow =<br />&gt; + cpu_to_be32((6 &lt;&lt; 28) |<br />&gt; + (qp-&gt;remote_ah_attr.grh.traffic_class &lt;&lt; 20) |<br />&gt; + qp-&gt;remote_ah_attr.grh.flow_label);<br />&gt; + qp-&gt;s_hdr.u.l.grh.paylen =<br />&gt; + cpu_to_be16(((hwords - 12) + nwords + SIZE_OF_CRC) &lt;&lt; 2);<br />&gt; + qp-&gt;s_hdr.u.l.grh.next_hdr = 0x1B;<br />&gt; + qp-&gt;s_hdr.u.l.grh.hop_limit = qp-&gt;remote_ah_attr.grh.hop_limit;<br />&gt; + /* The SGID is 32-bit aligned. */<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.subnet_prefix = dev-&gt;gid_prefix;<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.interface_id =<br />&gt; + ipath_layer_get_guid(dev-&gt;ib_unit);<br />&gt; + qp-&gt;s_hdr.u.l.grh.dgid = qp-&gt;remote_ah_attr.grh.dgid;<br />&gt; + }<br />&gt; + qp-&gt;s_hdrwords = hwords;<br />&gt; + qp-&gt;s_cur_sge = &amp;qp-&gt;s_sge;<br />&gt; + qp-&gt;s_cur_size = len;<br />&gt; + lrh0 |= qp-&gt;remote_ah_attr.sl &lt;&lt; 4;<br />&gt; + qp-&gt;s_hdr.lrh[0] = cpu_to_be16(lrh0);<br />&gt; + /* DEST LID */<br />&gt; + qp-&gt;s_hdr.lrh[1] = cpu_to_be16(qp-&gt;remote_ah_attr.dlid);<br />&gt; + qp-&gt;s_hdr.lrh[2] = cpu_to_be16(hwords + nwords + SIZE_OF_CRC);<br />&gt; + qp-&gt;s_hdr.lrh[3] = cpu_to_be16(ipath_layer_get_lid(dev-&gt;ib_unit));<br />&gt; + bth0 |= extra_bytes &lt;&lt; 20;<br />&gt; + ohdr-&gt;bth[0] = cpu_to_be32(bth0);<br />&gt; + ohdr-&gt;bth[1] = cpu_to_be32(qp-&gt;remote_qpn);<br />&gt; + ohdr-&gt;bth[2] = cpu_to_be32(bth2);<br />&gt; +<br />&gt; + /* Check for more work to do. 
*/<br />&gt; + goto again;<br />&gt; +<br />&gt; +done:<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; + clear_bit(IPATH_S_BUSY, &amp;qp-&gt;s_flags);<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Process entries in the send work queue until credit or queue is exhausted.<br />&gt; + * Only allow one CPU to send a packet per QP (tasklet).<br />&gt; + * Otherwise, after we drop the QP s_lock, two threads could send<br />&gt; + * packets out of order.<br />&gt; + */<br />&gt; +static void do_rc_send(unsigned long data)<br />&gt; +{<br />&gt; + struct ipath_qp *qp = (struct ipath_qp *)data;<br />&gt; + struct ipath_ibdev *dev = to_idev(qp-&gt;ibqp.device);<br />&gt; + struct ipath_swqe *wqe;<br />&gt; + struct ipath_sge_state *ss;<br />&gt; + unsigned long flags;<br />&gt; + u16 lrh0;<br />&gt; + u32 hwords;<br />&gt; + u32 nwords;<br />&gt; + u32 extra_bytes;<br />&gt; + u32 bth0;<br />&gt; + u32 bth2;<br />&gt; + u32 pmtu = ib_mtu_enum_to_int(qp-&gt;path_mtu);<br />&gt; + u32 len;<br />&gt; + struct ipath_other_headers *ohdr;<br />&gt; + char newreq;<br />&gt; +<br />&gt; + if (test_and_set_bit(IPATH_S_BUSY, &amp;qp-&gt;s_flags))<br />&gt; + return;<br />&gt; +<br />&gt; + if (unlikely(qp-&gt;remote_ah_attr.dlid ==<br />&gt; + ipath_layer_get_lid(dev-&gt;ib_unit))) {<br />&gt; + struct ib_wc wc;<br />&gt; +<br />&gt; + /*<br />&gt; + * Pass in an uninitialized ib_wc to be consistent with<br />&gt; + * other places where ipath_ruc_loopback() is called.<br />&gt; + */<br />&gt; + ipath_ruc_loopback(qp, &amp;wc);<br />&gt; + clear_bit(IPATH_S_BUSY, &amp;qp-&gt;s_flags);<br />&gt; + return;<br />&gt; + }<br />&gt; +<br />&gt; + ohdr = &amp;qp-&gt;s_hdr.u.oth;<br />&gt; + if (qp-&gt;remote_ah_attr.ah_flags &amp; IB_AH_GRH)<br />&gt; + ohdr = &amp;qp-&gt;s_hdr.u.l.oth;<br />&gt; +<br />&gt; +again:<br />&gt; + /* Check for a constructed packet to be sent. */<br />&gt; + if (qp-&gt;s_hdrwords != 0) {<br />&gt; + /*<br />&gt; + * If no PIO bufs are available, return.<br />&gt; + * An interrupt will call ipath_ib_piobufavail()<br />&gt; + * when one is available.<br />&gt; + */<br />&gt; + if (ipath_verbs_send(dev-&gt;ib_unit, qp-&gt;s_hdrwords,<br />&gt; + (uint32_t *) &amp;qp-&gt;s_hdr,<br />&gt; + qp-&gt;s_cur_size, qp-&gt;s_cur_sge)) {<br />&gt; + no_bufs_available(qp, dev);<br />&gt; + return;<br />&gt; + }<br />&gt; + /* Record that we sent the packet and s_hdr is empty. */<br />&gt; + qp-&gt;s_hdrwords = 0;<br />&gt; + }<br />&gt; +<br />&gt; + lrh0 = IPS_LRH_BTH;<br />&gt; + /* header size in 32-bit words LRH+BTH = (8+12)/4. */<br />&gt; + hwords = 5;<br />&gt; +<br />&gt; + /*<br />&gt; + * The lock is needed to synchronize between<br />&gt; + * setting qp-&gt;s_ack_state, resend timer, and post_send().<br />&gt; + */<br />&gt; + spin_lock_irqsave(&amp;qp-&gt;s_lock, flags);<br />&gt; +<br />&gt; + bth0 = ipath_layer_get_pkey(dev-&gt;ib_unit, qp-&gt;s_pkey_index);<br />&gt; +<br />&gt; + /* Sending responses has higher priority over sending requests. 
*/<br />&gt; + if (qp-&gt;s_ack_state != IB_OPCODE_RC_ACKNOWLEDGE) {<br />&gt; + /*<br />&gt; + * Send a response.<br />&gt; + * Note that we are in the responder's side of the QP context.<br />&gt; + */<br />&gt; + switch (qp-&gt;s_ack_state) {<br />&gt; + case IB_OPCODE_RC_RDMA_READ_REQUEST:<br />&gt; + ss = &amp;qp-&gt;s_rdma_sge;<br />&gt; + len = qp-&gt;s_rdma_len;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + len = pmtu;<br />&gt; + qp-&gt;s_ack_state =<br />&gt; + IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST;<br />&gt; + } else {<br />&gt; + qp-&gt;s_ack_state =<br />&gt; + IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY;<br />&gt; + }<br />&gt; + qp-&gt;s_rdma_len -= len;<br />&gt; + bth0 |= qp-&gt;s_ack_state &lt;&lt; 24;<br />&gt; + ohdr-&gt;u.aeth = ipath_compute_aeth(qp);<br />&gt; + hwords++;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST:<br />&gt; + qp-&gt;s_ack_state =<br />&gt; + IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE;<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:<br />&gt; + ss = &amp;qp-&gt;s_rdma_sge;<br />&gt; + len = qp-&gt;s_rdma_len;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + len = pmtu;<br />&gt; + } else {<br />&gt; + ohdr-&gt;u.aeth = ipath_compute_aeth(qp);<br />&gt; + hwords++;<br />&gt; + qp-&gt;s_ack_state =<br />&gt; + IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST;<br />&gt; + }<br />&gt; + qp-&gt;s_rdma_len -= len;<br />&gt; + bth0 |= qp-&gt;s_ack_state &lt;&lt; 24;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST:<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY:<br />&gt; + /*<br />&gt; + * We have to prevent new requests from changing<br />&gt; + * the r_sge state while a ipath_verbs_send()<br />&gt; + * is in progress.<br />&gt; + * Changing r_state allows the receiver<br />&gt; + * to continue processing new packets.<br />&gt; + * We do it here now instead of above so<br />&gt; + * that we are sure the packet was sent before<br />&gt; + * changing the state.<br />&gt; + */<br />&gt; + qp-&gt;r_state = IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST;<br />&gt; + qp-&gt;s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE;<br />&gt; + goto send_req;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_COMPARE_SWAP:<br />&gt; + case IB_OPCODE_RC_FETCH_ADD:<br />&gt; + ss = NULL;<br />&gt; + len = 0;<br />&gt; + qp-&gt;r_state = IB_OPCODE_RC_SEND_LAST;<br />&gt; + qp-&gt;s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE;<br />&gt; + bth0 |= IB_OPCODE_ATOMIC_ACKNOWLEDGE &lt;&lt; 24;<br />&gt; + ohdr-&gt;u.at.aeth = ipath_compute_aeth(qp);<br />&gt; + ohdr-&gt;u.at.atomic_ack_eth =<br />&gt; + cpu_to_be64(qp-&gt;s_ack_atomic);<br />&gt; + hwords += sizeof(ohdr-&gt;u.at) / 4;<br />&gt; + break;<br />&gt; +<br />&gt; + default:<br />&gt; + /* Send a regular ACK. */<br />&gt; + ss = NULL;<br />&gt; + len = 0;<br />&gt; + qp-&gt;s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE;<br />&gt; + bth0 |= qp-&gt;s_ack_state &lt;&lt; 24;<br />&gt; + ohdr-&gt;u.aeth = ipath_compute_aeth(qp);<br />&gt; + hwords++;<br />&gt; + }<br />&gt; + bth2 = qp-&gt;s_ack_psn++ &amp; 0xFFFFFF;<br />&gt; + } else {<br />&gt; + send_req:<br />&gt; + if (!(state_ops[qp-&gt;state] &amp; IPATH_PROCESS_SEND_OK) ||<br />&gt; + qp-&gt;s_rnr_timeout)<br />&gt; + goto done;<br />&gt; +<br />&gt; + /* Send a request. 
*/<br />&gt; + wqe = get_swqe_ptr(qp, qp-&gt;s_cur);<br />&gt; + switch (qp-&gt;s_state) {<br />&gt; + default:<br />&gt; + /*<br />&gt; + * Resend an old request or start a new one.<br />&gt; + *<br />&gt; + * We keep track of the current SWQE so that<br />&gt; + * we don't reset the "furthest progress" state<br />&gt; + * if we need to back up.<br />&gt; + */<br />&gt; + newreq = 0;<br />&gt; + if (qp-&gt;s_cur == qp-&gt;s_tail) {<br />&gt; + /* Check if send work queue is empty. */<br />&gt; + if (qp-&gt;s_tail == qp-&gt;s_head)<br />&gt; + goto done;<br />&gt; + qp-&gt;s_psn = wqe-&gt;psn = qp-&gt;s_next_psn;<br />&gt; + newreq = 1;<br />&gt; + }<br />&gt; + /*<br />&gt; + * Note that we have to be careful not to modify the<br />&gt; + * original work request since we may need to resend<br />&gt; + * it.<br />&gt; + */<br />&gt; + qp-&gt;s_sge.sge = wqe-&gt;sg_list[0];<br />&gt; + qp-&gt;s_sge.sg_list = wqe-&gt;sg_list + 1;<br />&gt; + qp-&gt;s_sge.num_sge = wqe-&gt;wr.num_sge;<br />&gt; + qp-&gt;s_len = len = wqe-&gt;length;<br />&gt; + ss = &amp;qp-&gt;s_sge;<br />&gt; + bth2 = 0;<br />&gt; + switch (wqe-&gt;wr.opcode) {<br />&gt; + case IB_WR_SEND:<br />&gt; + case IB_WR_SEND_WITH_IMM:<br />&gt; + /* If no credit, return. */<br />&gt; + if (qp-&gt;s_lsn != (u32) -1 &amp;&amp;<br />&gt; + cmp24(wqe-&gt;ssn, qp-&gt;s_lsn + 1) &gt; 0) {<br />&gt; + goto done;<br />&gt; + }<br />&gt; + wqe-&gt;lpsn = wqe-&gt;psn;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + wqe-&gt;lpsn += (len - 1) / pmtu;<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_SEND_FIRST;<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_SEND) {<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_SEND_ONLY;<br />&gt; + } else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_RC_SEND_ONLY_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the BTH */<br />&gt; + ohdr-&gt;u.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + bth2 = 1 &lt;&lt; 31; /* Request ACK. */<br />&gt; + if (++qp-&gt;s_cur == qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_WR_RDMA_WRITE:<br />&gt; + if (newreq)<br />&gt; + qp-&gt;s_lsn++;<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_WR_RDMA_WRITE_WITH_IMM:<br />&gt; + /* If no credit, return. 
*/<br />&gt; + if (qp-&gt;s_lsn != (u32) -1 &amp;&amp;<br />&gt; + cmp24(wqe-&gt;ssn, qp-&gt;s_lsn + 1) &gt; 0) {<br />&gt; + goto done;<br />&gt; + }<br />&gt; + ohdr-&gt;u.rc.reth.vaddr =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.rdma.remote_addr);<br />&gt; + ohdr-&gt;u.rc.reth.rkey =<br />&gt; + cpu_to_be32(wqe-&gt;wr.wr.rdma.rkey);<br />&gt; + ohdr-&gt;u.rc.reth.length = cpu_to_be32(len);<br />&gt; + hwords += sizeof(struct ib_reth) / 4;<br />&gt; + wqe-&gt;lpsn = wqe-&gt;psn;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + wqe-&gt;lpsn += (len - 1) / pmtu;<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_RC_RDMA_WRITE_FIRST;<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_RDMA_WRITE) {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_RC_RDMA_WRITE_ONLY;<br />&gt; + } else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_RC_RDMA_WRITE_ONLY_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after RETH */<br />&gt; + ohdr-&gt;u.rc.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + if (wqe-&gt;wr.<br />&gt; + send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + }<br />&gt; + bth2 = 1 &lt;&lt; 31; /* Request ACK. */<br />&gt; + if (++qp-&gt;s_cur == qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_WR_RDMA_READ:<br />&gt; + ohdr-&gt;u.rc.reth.vaddr =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.rdma.remote_addr);<br />&gt; + ohdr-&gt;u.rc.reth.rkey =<br />&gt; + cpu_to_be32(wqe-&gt;wr.wr.rdma.rkey);<br />&gt; + ohdr-&gt;u.rc.reth.length = cpu_to_be32(len);<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_READ_REQUEST;<br />&gt; + hwords += sizeof(ohdr-&gt;u.rc.reth) / 4;<br />&gt; + if (newreq) {<br />&gt; + qp-&gt;s_lsn++;<br />&gt; + /*<br />&gt; + * Adjust s_next_psn to count the<br />&gt; + * expected number of responses.<br />&gt; + */<br />&gt; + if (len &gt; pmtu)<br />&gt; + qp-&gt;s_next_psn +=<br />&gt; + (len - 1) / pmtu;<br />&gt; + wqe-&gt;lpsn = qp-&gt;s_next_psn++;<br />&gt; + }<br />&gt; + ss = NULL;<br />&gt; + len = 0;<br />&gt; + if (++qp-&gt;s_cur == qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_WR_ATOMIC_CMP_AND_SWP:<br />&gt; + case IB_WR_ATOMIC_FETCH_AND_ADD:<br />&gt; + qp-&gt;s_state =<br />&gt; + wqe-&gt;wr.opcode == IB_WR_ATOMIC_CMP_AND_SWP ?<br />&gt; + IB_OPCODE_RC_COMPARE_SWAP :<br />&gt; + IB_OPCODE_RC_FETCH_ADD;<br />&gt; + ohdr-&gt;u.atomic_eth.vaddr =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.atomic.remote_addr);<br />&gt; + ohdr-&gt;u.atomic_eth.rkey =<br />&gt; + cpu_to_be32(wqe-&gt;wr.wr.atomic.rkey);<br />&gt; + ohdr-&gt;u.atomic_eth.swap_data =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.atomic.swap);<br />&gt; + ohdr-&gt;u.atomic_eth.compare_data =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.atomic.compare_add);<br />&gt; + hwords += sizeof(struct ib_atomic_eth) / 4;<br />&gt; + if (newreq) {<br />&gt; + qp-&gt;s_lsn++;<br />&gt; + wqe-&gt;lpsn = wqe-&gt;psn;<br />&gt; + }<br />&gt; + if (++qp-&gt;s_cur == qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + ss = NULL;<br />&gt; + len = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + default:<br />&gt; + goto done;<br />&gt; + }<br />&gt; + if (newreq) {<br />&gt; + if (++qp-&gt;s_tail &gt;= qp-&gt;s_size)<br />&gt; + qp-&gt;s_tail = 0;<br />&gt; + }<br />&gt; + bth2 |= qp-&gt;s_psn++ &amp; 0xFFFFFF;<br />&gt; + if ((int)(qp-&gt;s_psn - qp-&gt;s_next_psn) &gt; 0)<br />&gt; + qp-&gt;s_next_psn = qp-&gt;s_psn;<br />&gt; + 
spin_lock(&amp;dev-&gt;pending_lock);<br />&gt; + if (qp-&gt;timerwait.next == LIST_POISON1) {<br />&gt; + list_add_tail(&amp;qp-&gt;timerwait,<br />&gt; + &amp;dev-&gt;pending[dev-&gt;<br />&gt; + pending_index]);<br />&gt; + }<br />&gt; + spin_unlock(&amp;dev-&gt;pending_lock);<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST:<br />&gt; + /*<br />&gt; + * This case can only happen if a send is<br />&gt; + * restarted. See ipath_restart_rc().<br />&gt; + */<br />&gt; + ipath_init_restart(qp, wqe);<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_RC_SEND_FIRST:<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_SEND_MIDDLE;<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_RC_SEND_MIDDLE:<br />&gt; + bth2 = qp-&gt;s_psn++ &amp; 0xFFFFFF;<br />&gt; + if ((int)(qp-&gt;s_psn - qp-&gt;s_next_psn) &gt; 0)<br />&gt; + qp-&gt;s_next_psn = qp-&gt;s_psn;<br />&gt; + ss = &amp;qp-&gt;s_sge;<br />&gt; + len = qp-&gt;s_len;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + /*<br />&gt; + * Request an ACK every 1/2 MB to avoid<br />&gt; + * retransmit timeouts.<br />&gt; + */<br />&gt; + if (((wqe-&gt;length - len) % (512 * 1024)) == 0)<br />&gt; + bth2 |= 1 &lt;&lt; 31;<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_SEND)<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_SEND_LAST;<br />&gt; + else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_RC_SEND_LAST_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the BTH */<br />&gt; + ohdr-&gt;u.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + bth2 |= 1 &lt;&lt; 31; /* Request ACK. */<br />&gt; + if (++qp-&gt;s_cur &gt;= qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST:<br />&gt; + /*<br />&gt; + * This case can only happen if a RDMA write is<br />&gt; + * restarted. See ipath_restart_rc().<br />&gt; + */<br />&gt; + ipath_init_restart(qp, wqe);<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_RC_RDMA_WRITE_FIRST:<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_WRITE_MIDDLE;<br />&gt; + /* FALLTHROUGH */<br />&gt; + case IB_OPCODE_RC_RDMA_WRITE_MIDDLE:<br />&gt; + bth2 = qp-&gt;s_psn++ &amp; 0xFFFFFF;<br />&gt; + if ((int)(qp-&gt;s_psn - qp-&gt;s_next_psn) &gt; 0)<br />&gt; + qp-&gt;s_next_psn = qp-&gt;s_psn;<br />&gt; + ss = &amp;qp-&gt;s_sge;<br />&gt; + len = qp-&gt;s_len;<br />&gt; + if (len &gt; pmtu) {<br />&gt; + /*<br />&gt; + * Request an ACK every 1/2 MB to avoid<br />&gt; + * retransmit timeouts.<br />&gt; + */<br />&gt; + if (((wqe-&gt;length - len) % (512 * 1024)) == 0)<br />&gt; + bth2 |= 1 &lt;&lt; 31;<br />&gt; + len = pmtu;<br />&gt; + break;<br />&gt; + }<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_RDMA_WRITE)<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_WRITE_LAST;<br />&gt; + else {<br />&gt; + qp-&gt;s_state =<br />&gt; + IB_OPCODE_RC_RDMA_WRITE_LAST_WITH_IMMEDIATE;<br />&gt; + /* Immediate data comes after the BTH */<br />&gt; + ohdr-&gt;u.imm_data = wqe-&gt;wr.imm_data;<br />&gt; + hwords += 1;<br />&gt; + if (wqe-&gt;wr.send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + }<br />&gt; + bth2 |= 1 &lt;&lt; 31; /* Request ACK. 
*/<br />&gt; + if (++qp-&gt;s_cur &gt;= qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:<br />&gt; + /*<br />&gt; + * This case can only happen if a RDMA read is<br />&gt; + * restarted. See ipath_restart_rc().<br />&gt; + */<br />&gt; + ipath_init_restart(qp, wqe);<br />&gt; + len = ((qp-&gt;s_psn - wqe-&gt;psn) &amp; 0xFFFFFF) * pmtu;<br />&gt; + ohdr-&gt;u.rc.reth.vaddr =<br />&gt; + cpu_to_be64(wqe-&gt;wr.wr.rdma.remote_addr + len);<br />&gt; + ohdr-&gt;u.rc.reth.rkey =<br />&gt; + cpu_to_be32(wqe-&gt;wr.wr.rdma.rkey);<br />&gt; + ohdr-&gt;u.rc.reth.length = cpu_to_be32(qp-&gt;s_len);<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_READ_REQUEST;<br />&gt; + hwords += sizeof(ohdr-&gt;u.rc.reth) / 4;<br />&gt; + bth2 = qp-&gt;s_psn++ &amp; 0xFFFFFF;<br />&gt; + if ((int)(qp-&gt;s_psn - qp-&gt;s_next_psn) &gt; 0)<br />&gt; + qp-&gt;s_next_psn = qp-&gt;s_psn;<br />&gt; + ss = NULL;<br />&gt; + len = 0;<br />&gt; + if (++qp-&gt;s_cur == qp-&gt;s_size)<br />&gt; + qp-&gt;s_cur = 0;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_OPCODE_RC_RDMA_READ_REQUEST:<br />&gt; + case IB_OPCODE_RC_COMPARE_SWAP:<br />&gt; + case IB_OPCODE_RC_FETCH_ADD:<br />&gt; + /*<br />&gt; + * We shouldn't start anything new until this request<br />&gt; + * is finished. The ACK will handle rescheduling us.<br />&gt; + * XXX The number of outstanding ones is negotiated<br />&gt; + * at connection setup time (see pg. 258,289)?<br />&gt; + * XXX Also, if we support multiple outstanding<br />&gt; + * requests, we need to check the WQE IB_SEND_FENCE<br />&gt; + * flag and not send a new request if a RDMA read or<br />&gt; + * atomic is pending.<br />&gt; + */<br />&gt; + goto done;<br />&gt; + }<br />&gt; + qp-&gt;s_len -= len;<br />&gt; + bth0 |= qp-&gt;s_state &lt;&lt; 24;<br />&gt; + /* XXX queue resend timeout. */<br />&gt; + }<br />&gt; + /* Make sure it is non-zero before dropping the lock. */<br />&gt; + qp-&gt;s_hdrwords = hwords;<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; +<br />&gt; + /* Construct the header. */<br />&gt; + extra_bytes = (4 - len) &amp; 3;<br />&gt; + nwords = (len + extra_bytes) &gt;&gt; 2;<br />&gt; + if (unlikely(qp-&gt;remote_ah_attr.ah_flags &amp; IB_AH_GRH)) {<br />&gt; + /* Header size in 32-bit words. */<br />&gt; + hwords += 10;<br />&gt; + lrh0 = IPS_LRH_GRH;<br />&gt; + qp-&gt;s_hdr.u.l.grh.version_tclass_flow =<br />&gt; + cpu_to_be32((6 &lt;&lt; 28) |<br />&gt; + (qp-&gt;remote_ah_attr.grh.traffic_class &lt;&lt; 20) |<br />&gt; + qp-&gt;remote_ah_attr.grh.flow_label);<br />&gt; + qp-&gt;s_hdr.u.l.grh.paylen =<br />&gt; + cpu_to_be16(((hwords - 12) + nwords + SIZE_OF_CRC) &lt;&lt; 2);<br />&gt; + qp-&gt;s_hdr.u.l.grh.next_hdr = 0x1B;<br />&gt; + qp-&gt;s_hdr.u.l.grh.hop_limit = qp-&gt;remote_ah_attr.grh.hop_limit;<br />&gt; + /* The SGID is 32-bit aligned. 
*/<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.subnet_prefix = dev-&gt;gid_prefix;<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.interface_id =<br />&gt; + ipath_layer_get_guid(dev-&gt;ib_unit);<br />&gt; + qp-&gt;s_hdr.u.l.grh.dgid = qp-&gt;remote_ah_attr.grh.dgid;<br />&gt; + qp-&gt;s_hdrwords = hwords;<br />&gt; + }<br />&gt; + qp-&gt;s_cur_sge = ss;<br />&gt; + qp-&gt;s_cur_size = len;<br />&gt; + lrh0 |= qp-&gt;remote_ah_attr.sl &lt;&lt; 4;<br />&gt; + qp-&gt;s_hdr.lrh[0] = cpu_to_be16(lrh0);<br />&gt; + /* DEST LID */<br />&gt; + qp-&gt;s_hdr.lrh[1] = cpu_to_be16(qp-&gt;remote_ah_attr.dlid);<br />&gt; + qp-&gt;s_hdr.lrh[2] = cpu_to_be16(hwords + nwords + SIZE_OF_CRC);<br />&gt; + qp-&gt;s_hdr.lrh[3] = cpu_to_be16(ipath_layer_get_lid(dev-&gt;ib_unit));<br />&gt; + bth0 |= extra_bytes &lt;&lt; 20;<br />&gt; + ohdr-&gt;bth[0] = cpu_to_be32(bth0);<br />&gt; + ohdr-&gt;bth[1] = cpu_to_be32(qp-&gt;remote_qpn);<br />&gt; + ohdr-&gt;bth[2] = cpu_to_be32(bth2);<br />&gt; +<br />&gt; + /* Check for more work to do. */<br />&gt; + goto again;<br />&gt; +<br />&gt; +done:<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; + clear_bit(IPATH_S_BUSY, &amp;qp-&gt;s_flags);<br />&gt; +}<br />&gt; +<br />&gt; +static void send_rc_ack(struct ipath_qp *qp)<br />&gt; +{<br />&gt; + struct ipath_ibdev *dev = to_idev(qp-&gt;ibqp.device);<br />&gt; + u16 lrh0;<br />&gt; + u32 bth0;<br />&gt; + u32 hwords;<br />&gt; + struct ipath_other_headers *ohdr;<br />&gt; +<br />&gt; + /* Construct the header. */<br />&gt; + ohdr = &amp;qp-&gt;s_hdr.u.oth;<br />&gt; + lrh0 = IPS_LRH_BTH;<br />&gt; + /* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4. */<br />&gt; + hwords = 6;<br />&gt; + if (unlikely(qp-&gt;remote_ah_attr.ah_flags &amp; IB_AH_GRH)) {<br />&gt; + ohdr = &amp;qp-&gt;s_hdr.u.l.oth;<br />&gt; + /* Header size in 32-bit words. */<br />&gt; + hwords += 10;<br />&gt; + lrh0 = IPS_LRH_GRH;<br />&gt; + qp-&gt;s_hdr.u.l.grh.version_tclass_flow =<br />&gt; + cpu_to_be32((6 &lt;&lt; 28) |<br />&gt; + (qp-&gt;remote_ah_attr.grh.traffic_class &lt;&lt; 20) |<br />&gt; + qp-&gt;remote_ah_attr.grh.flow_label);<br />&gt; + qp-&gt;s_hdr.u.l.grh.paylen =<br />&gt; + cpu_to_be16(((hwords - 12) + SIZE_OF_CRC) &lt;&lt; 2);<br />&gt; + qp-&gt;s_hdr.u.l.grh.next_hdr = 0x1B;<br />&gt; + qp-&gt;s_hdr.u.l.grh.hop_limit = qp-&gt;remote_ah_attr.grh.hop_limit;<br />&gt; + /* The SGID is 32-bit aligned. 
*/<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.subnet_prefix = dev-&gt;gid_prefix;<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.interface_id =<br />&gt; + ipath_layer_get_guid(dev-&gt;ib_unit);<br />&gt; + qp-&gt;s_hdr.u.l.grh.dgid = qp-&gt;remote_ah_attr.grh.dgid;<br />&gt; + }<br />&gt; + bth0 = ipath_layer_get_pkey(dev-&gt;ib_unit, qp-&gt;s_pkey_index);<br />&gt; + ohdr-&gt;u.aeth = ipath_compute_aeth(qp);<br />&gt; + if (qp-&gt;s_ack_state &gt;= IB_OPCODE_RC_COMPARE_SWAP) {<br />&gt; + bth0 |= IB_OPCODE_ATOMIC_ACKNOWLEDGE &lt;&lt; 24;<br />&gt; + ohdr-&gt;u.at.atomic_ack_eth = cpu_to_be64(qp-&gt;s_ack_atomic);<br />&gt; + hwords += sizeof(ohdr-&gt;u.at.atomic_ack_eth) / 4;<br />&gt; + } else {<br />&gt; + bth0 |= IB_OPCODE_RC_ACKNOWLEDGE &lt;&lt; 24;<br />&gt; + }<br />&gt; + lrh0 |= qp-&gt;remote_ah_attr.sl &lt;&lt; 4;<br />&gt; + qp-&gt;s_hdr.lrh[0] = cpu_to_be16(lrh0);<br />&gt; + /* DEST LID */<br />&gt; + qp-&gt;s_hdr.lrh[1] = cpu_to_be16(qp-&gt;remote_ah_attr.dlid);<br />&gt; + qp-&gt;s_hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);<br />&gt; + qp-&gt;s_hdr.lrh[3] = cpu_to_be16(ipath_layer_get_lid(dev-&gt;ib_unit));<br />&gt; + ohdr-&gt;bth[0] = cpu_to_be32(bth0);<br />&gt; + ohdr-&gt;bth[1] = cpu_to_be32(qp-&gt;remote_qpn);<br />&gt; + ohdr-&gt;bth[2] = cpu_to_be32(qp-&gt;s_ack_psn &amp; 0xFFFFFF);<br />&gt; +<br />&gt; + /*<br />&gt; + * If we can send the ACK, clear the ACK state.<br />&gt; + */<br />&gt; + if (ipath_verbs_send(dev-&gt;ib_unit, hwords, (uint32_t *) &amp;qp-&gt;s_hdr,<br />&gt; + 0, NULL) == 0) {<br />&gt; + qp-&gt;s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE;<br />&gt; + dev-&gt;n_rc_qacks++;<br />&gt; + }<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Back up the requester to resend the last un-ACKed request.<br />&gt; + * The QP s_lock should be held.<br />&gt; + */<br />&gt; +static void ipath_restart_rc(struct ipath_qp *qp, u32 psn, struct ib_wc *wc)<br />&gt; +{<br />&gt; + struct ipath_swqe *wqe = get_swqe_ptr(qp, qp-&gt;s_last);<br />&gt; + struct ipath_ibdev *dev;<br />&gt; + u32 n;<br />&gt; +<br />&gt; + /*<br />&gt; + * If there are no requests pending, we are done.<br />&gt; + */<br />&gt; + if (cmp24(psn, qp-&gt;s_next_psn) &gt;= 0 || qp-&gt;s_last == qp-&gt;s_tail)<br />&gt; + goto done;<br />&gt; +<br />&gt; + if (qp-&gt;s_retry == 0) {<br />&gt; + wc-&gt;wr_id = wqe-&gt;wr.wr_id;<br />&gt; + wc-&gt;status = IB_WC_RETRY_EXC_ERR;<br />&gt; + wc-&gt;opcode = wc_opcode[wqe-&gt;wr.opcode];<br />&gt; + wc-&gt;vendor_err = 0;<br />&gt; + wc-&gt;byte_len = 0;<br />&gt; + wc-&gt;qp_num = qp-&gt;ibqp.qp_num;<br />&gt; + wc-&gt;src_qp = qp-&gt;remote_qpn;<br />&gt; + wc-&gt;pkey_index = 0;<br />&gt; + wc-&gt;slid = qp-&gt;remote_ah_attr.dlid;<br />&gt; + wc-&gt;sl = qp-&gt;remote_ah_attr.sl;<br />&gt; + wc-&gt;dlid_path_bits = 0;<br />&gt; + wc-&gt;port_num = 0;<br />&gt; + ipath_sqerror_qp(qp, wc);<br />&gt; + return;<br />&gt; + }<br />&gt; + qp-&gt;s_retry--;<br />&gt; +<br />&gt; + /*<br />&gt; + * Remove the QP from the timeout queue.<br />&gt; + * Note: it may already have been removed by ipath_ib_timer().<br />&gt; + */<br />&gt; + dev = to_idev(qp-&gt;ibqp.device);<br />&gt; + spin_lock(&amp;dev-&gt;pending_lock);<br />&gt; + if (qp-&gt;timerwait.next != LIST_POISON1)<br />&gt; + list_del(&amp;qp-&gt;timerwait);<br />&gt; + spin_unlock(&amp;dev-&gt;pending_lock);<br />&gt; +<br />&gt; + if (wqe-&gt;wr.opcode == IB_WR_RDMA_READ)<br />&gt; + dev-&gt;n_rc_resends++;<br />&gt; + else<br />&gt; + dev-&gt;n_rc_resends += (int)qp-&gt;s_psn - 
(int)psn;<br />&gt; +<br />&gt; + /*<br />&gt; + * If we are starting the request from the beginning, let the<br />&gt; + * normal send code handle initialization.<br />&gt; + */<br />&gt; + qp-&gt;s_cur = qp-&gt;s_last;<br />&gt; + if (cmp24(psn, wqe-&gt;psn) &lt;= 0) {<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_SEND_LAST;<br />&gt; + qp-&gt;s_psn = wqe-&gt;psn;<br />&gt; + } else {<br />&gt; + n = qp-&gt;s_cur;<br />&gt; + for (;;) {<br />&gt; + if (++n == qp-&gt;s_size)<br />&gt; + n = 0;<br />&gt; + if (n == qp-&gt;s_tail) {<br />&gt; + if (cmp24(psn, qp-&gt;s_next_psn) &gt;= 0) {<br />&gt; + qp-&gt;s_cur = n;<br />&gt; + wqe = get_swqe_ptr(qp, n);<br />&gt; + }<br />&gt; + break;<br />&gt; + }<br />&gt; + wqe = get_swqe_ptr(qp, n);<br />&gt; + if (cmp24(psn, wqe-&gt;psn) &lt; 0)<br />&gt; + break;<br />&gt; + qp-&gt;s_cur = n;<br />&gt; + }<br />&gt; + qp-&gt;s_psn = psn;<br />&gt; +<br />&gt; + /*<br />&gt; + * Reset the state to restart in the middle of a request.<br />&gt; + * Don't change the s_sge, s_cur_sge, or s_cur_size.<br />&gt; + * See do_rc_send().<br />&gt; + */<br />&gt; + switch (wqe-&gt;wr.opcode) {<br />&gt; + case IB_WR_SEND:<br />&gt; + case IB_WR_SEND_WITH_IMM:<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_WR_RDMA_WRITE:<br />&gt; + case IB_WR_RDMA_WRITE_WITH_IMM:<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST;<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_WR_RDMA_READ:<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE;<br />&gt; + break;<br />&gt; +<br />&gt; + default:<br />&gt; + /*<br />&gt; + * This case shouldn't happen since its only<br />&gt; + * one PSN per req.<br />&gt; + */<br />&gt; + qp-&gt;s_state = IB_OPCODE_RC_SEND_LAST;<br />&gt; + }<br />&gt; + }<br />&gt; +<br />&gt; +done:<br />&gt; + tasklet_schedule(&amp;qp-&gt;s_task);<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Handle RC and UC post sends.<br />&gt; + */<br />&gt; +static int ipath_post_rc_send(struct ipath_qp *qp, struct ib_send_wr *wr)<br />&gt; +{<br />&gt; + struct ipath_swqe *wqe;<br />&gt; + unsigned long flags;<br />&gt; + u32 next;<br />&gt; + int i, j;<br />&gt; + int acc;<br />&gt; +<br />&gt; + /*<br />&gt; + * Don't allow RDMA reads or atomic operations on UC or<br />&gt; + * undefined operations.<br />&gt; + * Make sure buffer is large enough to hold the result for atomics.<br />&gt; + */<br />&gt; + if (qp-&gt;ibqp.qp_type == IB_QPT_UC) {<br />&gt; + if ((unsigned) wr-&gt;opcode &gt;= IB_WR_RDMA_READ)<br />&gt; + return -EINVAL;<br />&gt; + } else if ((unsigned) wr-&gt;opcode &gt; IB_WR_ATOMIC_FETCH_AND_ADD)<br />&gt; + return -EINVAL;<br />&gt; + else if (wr-&gt;opcode &gt;= IB_WR_ATOMIC_CMP_AND_SWP &amp;&amp;<br />&gt; + (wr-&gt;num_sge == 0 || wr-&gt;sg_list[0].length &lt; sizeof(u64) ||<br />&gt; + wr-&gt;sg_list[0].addr &amp; 0x7))<br />&gt; + return -EINVAL;<br />&gt; +<br />&gt; + /* IB spec says that num_sge == 0 is OK. 
*/<br />&gt; + if (wr-&gt;num_sge &gt; qp-&gt;s_max_sge)<br />&gt; + return -ENOMEM;<br />&gt; +<br />&gt; + spin_lock_irqsave(&amp;qp-&gt;s_lock, flags);<br />&gt; + next = qp-&gt;s_head + 1;<br />&gt; + if (next &gt;= qp-&gt;s_size)<br />&gt; + next = 0;<br />&gt; + if (next == qp-&gt;s_last) {<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; +<br />&gt; + wqe = get_swqe_ptr(qp, qp-&gt;s_head);<br />&gt; + wqe-&gt;wr = *wr;<br />&gt; + wqe-&gt;ssn = qp-&gt;s_ssn++;<br />&gt; + wqe-&gt;sg_list[0].mr = NULL;<br />&gt; + wqe-&gt;sg_list[0].vaddr = NULL;<br />&gt; + wqe-&gt;sg_list[0].length = 0;<br />&gt; + wqe-&gt;sg_list[0].sge_length = 0;<br />&gt; + wqe-&gt;length = 0;<br />&gt; + acc = wr-&gt;opcode &gt;= IB_WR_RDMA_READ ? IB_ACCESS_LOCAL_WRITE : 0;<br />&gt; + for (i = 0, j = 0; i &lt; wr-&gt;num_sge; i++) {<br />&gt; + if (to_ipd(qp-&gt;ibqp.pd)-&gt;user &amp;&amp; wr-&gt;sg_list[i].lkey == 0) {<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + if (wr-&gt;sg_list[i].length == 0)<br />&gt; + continue;<br />&gt; + if (!ipath_lkey_ok(&amp;to_idev(qp-&gt;ibqp.device)-&gt;lk_table,<br />&gt; + &amp;wqe-&gt;sg_list[j], &amp;wr-&gt;sg_list[i], acc)) {<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + wqe-&gt;length += wr-&gt;sg_list[i].length;<br />&gt; + j++;<br />&gt; + }<br />&gt; + wqe-&gt;wr.num_sge = j;<br />&gt; + qp-&gt;s_head = next;<br />&gt; + /*<br />&gt; + * Wake up the send tasklet if the QP is not waiting<br />&gt; + * for an RNR timeout.<br />&gt; + */<br />&gt; + next = qp-&gt;s_rnr_timeout;<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;s_lock, flags);<br />&gt; +<br />&gt; + if (next == 0) {<br />&gt; + if (qp-&gt;ibqp.qp_type == IB_QPT_UC)<br />&gt; + do_uc_send((unsigned long) qp);<br />&gt; + else<br />&gt; + do_rc_send((unsigned long) qp);<br />&gt; + }<br />&gt; + return 0;<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * Note that we actually send the data as it is posted instead of putting<br />&gt; + * the request into a ring buffer. If we wanted to use a ring buffer,<br />&gt; + * we would need to save a reference to the destination address in the SWQE.<br />&gt; + */<br />&gt; +static int ipath_post_ud_send(struct ipath_qp *qp, struct ib_send_wr *wr)<br />&gt; +{<br />&gt; + struct ipath_ibdev *dev = to_idev(qp-&gt;ibqp.device);<br />&gt; + struct ipath_other_headers *ohdr;<br />&gt; + struct ib_ah_attr *ah_attr;<br />&gt; + struct ipath_sge_state ss;<br />&gt; + struct ipath_sge *sg_list;<br />&gt; + struct ib_wc wc;<br />&gt; + u32 hwords;<br />&gt; + u32 nwords;<br />&gt; + u32 len;<br />&gt; + u32 extra_bytes;<br />&gt; + u32 bth0;<br />&gt; + u16 lrh0;<br />&gt; + u16 lid;<br />&gt; + int i;<br />&gt; +<br />&gt; + if (!(state_ops[qp-&gt;state] &amp; IPATH_PROCESS_SEND_OK))<br />&gt; + return 0;<br />&gt; +<br />&gt; + /* IB spec says that num_sge == 0 is OK. */<br />&gt; + if (wr-&gt;num_sge &gt; qp-&gt;s_max_sge)<br />&gt; + return -EINVAL;<br />&gt; +<br />&gt; + if (wr-&gt;num_sge &gt; 1) {<br />&gt; + sg_list = kmalloc((qp-&gt;s_max_sge - 1) * sizeof(*sg_list),<br />&gt; + GFP_ATOMIC);<br />&gt; + if (!sg_list)<br />&gt; + return -ENOMEM;<br />&gt; + } else<br />&gt; + sg_list = NULL;<br />&gt; +<br />&gt; + /* Check the buffer to send. 
*/<br />&gt; + ss.sg_list = sg_list;<br />&gt; + ss.sge.mr = NULL;<br />&gt; + ss.sge.vaddr = NULL;<br />&gt; + ss.sge.length = 0;<br />&gt; + ss.sge.sge_length = 0;<br />&gt; + ss.num_sge = 0;<br />&gt; + len = 0;<br />&gt; + for (i = 0; i &lt; wr-&gt;num_sge; i++) {<br />&gt; + /* Check LKEY */<br />&gt; + if (to_ipd(qp-&gt;ibqp.pd)-&gt;user &amp;&amp; wr-&gt;sg_list[i].lkey == 0)<br />&gt; + return -EINVAL;<br />&gt; +<br />&gt; + if (wr-&gt;sg_list[i].length == 0)<br />&gt; + continue;<br />&gt; + if (!ipath_lkey_ok(&amp;dev-&gt;lk_table, ss.num_sge ?<br />&gt; + sg_list + ss.num_sge : &amp;ss.sge,<br />&gt; + &amp;wr-&gt;sg_list[i], 0)) {<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + len += wr-&gt;sg_list[i].length;<br />&gt; + ss.num_sge++;<br />&gt; + }<br />&gt; + extra_bytes = (4 - len) &amp; 3;<br />&gt; + nwords = (len + extra_bytes) &gt;&gt; 2;<br />&gt; +<br />&gt; + /* Construct the header. */<br />&gt; + ah_attr = &amp;to_iah(wr-&gt;wr.ud.ah)-&gt;attr;<br />&gt; + if (ah_attr-&gt;dlid &gt;= 0xC000 &amp;&amp; ah_attr-&gt;dlid &lt; 0xFFFF)<br />&gt; + dev-&gt;n_multicast_xmit++;<br />&gt; + if (unlikely(ah_attr-&gt;dlid == ipath_layer_get_lid(dev-&gt;ib_unit))) {<br />&gt; + /* Pass in an uninitialized ib_wc to save stack space. */<br />&gt; + ipath_ud_loopback(qp, &amp;ss, len, wr, &amp;wc);<br />&gt; + goto done;<br />&gt; + }<br />&gt; + if (ah_attr-&gt;ah_flags &amp; IB_AH_GRH) {<br />&gt; + /* Header size in 32-bit words. */<br />&gt; + hwords = 17;<br />&gt; + lrh0 = IPS_LRH_GRH;<br />&gt; + ohdr = &amp;qp-&gt;s_hdr.u.l.oth;<br />&gt; + qp-&gt;s_hdr.u.l.grh.version_tclass_flow =<br />&gt; + cpu_to_be32((6 &lt;&lt; 28) |<br />&gt; + (ah_attr-&gt;grh.traffic_class &lt;&lt; 20) |<br />&gt; + ah_attr-&gt;grh.flow_label);<br />&gt; + qp-&gt;s_hdr.u.l.grh.paylen =<br />&gt; + cpu_to_be16(((wr-&gt;opcode ==<br />&gt; + IB_WR_SEND_WITH_IMM ? 6 : 5) + nwords +<br />&gt; + SIZE_OF_CRC) &lt;&lt; 2);<br />&gt; + qp-&gt;s_hdr.u.l.grh.next_hdr = 0x1B;<br />&gt; + qp-&gt;s_hdr.u.l.grh.hop_limit = ah_attr-&gt;grh.hop_limit;<br />&gt; + /* The SGID is 32-bit aligned. */<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.subnet_prefix = dev-&gt;gid_prefix;<br />&gt; + qp-&gt;s_hdr.u.l.grh.sgid.global.interface_id =<br />&gt; + ipath_layer_get_guid(dev-&gt;ib_unit);<br />&gt; + qp-&gt;s_hdr.u.l.grh.dgid = ah_attr-&gt;grh.dgid;<br />&gt; + /*<br />&gt; + * Don't worry about sending to locally attached<br />&gt; + * multicast QPs. It is unspecified by the spec. what happens.<br />&gt; + */<br />&gt; + } else {<br />&gt; + /* Header size in 32-bit words. 
*/<br />&gt; + hwords = 7;<br />&gt; + lrh0 = IPS_LRH_BTH;<br />&gt; + ohdr = &amp;qp-&gt;s_hdr.u.oth;<br />&gt; + }<br />&gt; + if (wr-&gt;opcode == IB_WR_SEND_WITH_IMM) {<br />&gt; + ohdr-&gt;u.ud.imm_data = wr-&gt;imm_data;<br />&gt; + wc.imm_data = wr-&gt;imm_data;<br />&gt; + hwords += 1;<br />&gt; + bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE &lt;&lt; 24;<br />&gt; + } else if (wr-&gt;opcode == IB_WR_SEND) {<br />&gt; + wc.imm_data = 0;<br />&gt; + bth0 = IB_OPCODE_UD_SEND_ONLY &lt;&lt; 24;<br />&gt; + } else<br />&gt; + return -EINVAL;<br />&gt; + lrh0 |= ah_attr-&gt;sl &lt;&lt; 4;<br />&gt; + if (qp-&gt;ibqp.qp_type == IB_QPT_SMI)<br />&gt; + lrh0 |= 0xF000; /* Set VL */<br />&gt; + qp-&gt;s_hdr.lrh[0] = cpu_to_be16(lrh0);<br />&gt; + qp-&gt;s_hdr.lrh[1] = cpu_to_be16(ah_attr-&gt;dlid); /* DEST LID */<br />&gt; + qp-&gt;s_hdr.lrh[2] = cpu_to_be16(hwords + nwords + SIZE_OF_CRC);<br />&gt; + lid = ipath_layer_get_lid(dev-&gt;ib_unit);<br />&gt; + qp-&gt;s_hdr.lrh[3] = lid ? cpu_to_be16(lid) : IB_LID_PERMISSIVE;<br />&gt; + if (wr-&gt;send_flags &amp; IB_SEND_SOLICITED)<br />&gt; + bth0 |= 1 &lt;&lt; 23;<br />&gt; + bth0 |= extra_bytes &lt;&lt; 20;<br />&gt; + bth0 |= qp-&gt;ibqp.qp_type == IB_QPT_SMI ? IPS_DEFAULT_P_KEY :<br />&gt; + ipath_layer_get_pkey(dev-&gt;ib_unit, qp-&gt;s_pkey_index);<br />&gt; + ohdr-&gt;bth[0] = cpu_to_be32(bth0);<br />&gt; + ohdr-&gt;bth[1] = cpu_to_be32(wr-&gt;wr.ud.remote_qpn);<br />&gt; + /* XXX Could lose a PSN count but not worth locking */<br />&gt; + ohdr-&gt;bth[2] = cpu_to_be32(qp-&gt;s_psn++ &amp; 0xFFFFFF);<br />&gt; + /*<br />&gt; + * Qkeys with the high order bit set mean use the<br />&gt; + * qkey from the QP context instead of the WR.<br />&gt; + */<br />&gt; + ohdr-&gt;u.ud.deth[0] = cpu_to_be32((int)wr-&gt;wr.ud.remote_qkey &lt; 0 ?<br />&gt; + qp-&gt;qkey : wr-&gt;wr.ud.remote_qkey);<br />&gt; + ohdr-&gt;u.ud.deth[1] = cpu_to_be32(qp-&gt;ibqp.qp_num);<br />&gt; + if (ipath_verbs_send(dev-&gt;ib_unit, hwords, (uint32_t *) &amp;qp-&gt;s_hdr,<br />&gt; + len, &amp;ss))<br />&gt; + dev-&gt;n_no_piobuf++;<br />&gt; +<br />&gt; +done:<br />&gt; + /* Queue the completion status entry. */<br />&gt; + if (!test_bit(IPATH_S_SIGNAL_REQ_WR, &amp;qp-&gt;s_flags) ||<br />&gt; + (wr-&gt;send_flags &amp; IB_SEND_SIGNALED)) {<br />&gt; + wc.wr_id = wr-&gt;wr_id;<br />&gt; + wc.status = IB_WC_SUCCESS;<br />&gt; + wc.vendor_err = 0;<br />&gt; + wc.opcode = IB_WC_SEND;<br />&gt; + wc.byte_len = len;<br />&gt; + wc.qp_num = qp-&gt;ibqp.qp_num;<br />&gt; + wc.src_qp = 0;<br />&gt; + wc.wc_flags = 0;<br />&gt; + /* XXX initialize other fields? */<br />&gt; + ipath_cq_enter(to_icq(qp-&gt;ibqp.send_cq), &amp;wc, 0);<br />&gt; + }<br />&gt; + kfree(sg_list);<br />&gt; +<br />&gt; + return 0;<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * This may be called from interrupt context.<br />&gt; + */<br />&gt; +static int ipath_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,<br />&gt; + struct ib_send_wr **bad_wr)<br />&gt; +{<br />&gt; + struct ipath_qp *qp = to_iqp(ibqp);<br />&gt; + int err = 0;<br />&gt; +<br />&gt; + /* Check that state is OK to post send. 
*/<br />&gt; + if (!(state_ops[qp-&gt;state] &amp; IPATH_POST_SEND_OK)) {<br />&gt; + *bad_wr = wr;<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; +<br />&gt; + for (; wr; wr = wr-&gt;next) {<br />&gt; + switch (qp-&gt;ibqp.qp_type) {<br />&gt; + case IB_QPT_UC:<br />&gt; + case IB_QPT_RC:<br />&gt; + err = ipath_post_rc_send(qp, wr);<br />&gt; + break;<br />&gt; +<br />&gt; + case IB_QPT_SMI:<br />&gt; + case IB_QPT_GSI:<br />&gt; + case IB_QPT_UD:<br />&gt; + err = ipath_post_ud_send(qp, wr);<br />&gt; + break;<br />&gt; +<br />&gt; + default:<br />&gt; + err = -EINVAL;<br />&gt; + }<br />&gt; + if (err) {<br />&gt; + *bad_wr = wr;<br />&gt; + break;<br />&gt; + }<br />&gt; + }<br />&gt; + return err;<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * This may be called from interrupt context.<br />&gt; + */<br />&gt; +static int ipath_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,<br />&gt; + struct ib_recv_wr **bad_wr)<br />&gt; +{<br />&gt; + struct ipath_qp *qp = to_iqp(ibqp);<br />&gt; + unsigned long flags;<br />&gt; +<br />&gt; + /* Check that state is OK to post receive. */<br />&gt; + if (!(state_ops[qp-&gt;state] &amp; IPATH_POST_RECV_OK)) {<br />&gt; + *bad_wr = wr;<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; +<br />&gt; + for (; wr; wr = wr-&gt;next) {<br />&gt; + struct ipath_rwqe *wqe;<br />&gt; + u32 next;<br />&gt; + int i, j;<br />&gt; +<br />&gt; + if (wr-&gt;num_sge &gt; qp-&gt;r_rq.max_sge) {<br />&gt; + *bad_wr = wr;<br />&gt; + return -ENOMEM;<br />&gt; + }<br />&gt; +<br />&gt; + spin_lock_irqsave(&amp;qp-&gt;r_rq.lock, flags);<br />&gt; + next = qp-&gt;r_rq.head + 1;<br />&gt; + if (next &gt;= qp-&gt;r_rq.size)<br />&gt; + next = 0;<br />&gt; + if (next == qp-&gt;r_rq.tail) {<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;r_rq.lock, flags);<br />&gt; + *bad_wr = wr;<br />&gt; + return -ENOMEM;<br />&gt; + }<br />&gt; +<br />&gt; + wqe = get_rwqe_ptr(&amp;qp-&gt;r_rq, qp-&gt;r_rq.head);<br />&gt; + wqe-&gt;wr_id = wr-&gt;wr_id;<br />&gt; + wqe-&gt;sg_list[0].mr = NULL;<br />&gt; + wqe-&gt;sg_list[0].vaddr = NULL;<br />&gt; + wqe-&gt;sg_list[0].length = 0;<br />&gt; + wqe-&gt;sg_list[0].sge_length = 0;<br />&gt; + wqe-&gt;length = 0;<br />&gt; + for (i = 0, j = 0; i &lt; wr-&gt;num_sge; i++) {<br />&gt; + /* Check LKEY */<br />&gt; + if (to_ipd(qp-&gt;ibqp.pd)-&gt;user &amp;&amp;<br />&gt; + wr-&gt;sg_list[i].lkey == 0) {<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;r_rq.lock, flags);<br />&gt; + *bad_wr = wr;<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + if (wr-&gt;sg_list[i].length == 0)<br />&gt; + continue;<br />&gt; + if (!ipath_lkey_ok(&amp;to_idev(qp-&gt;ibqp.device)-&gt;lk_table,<br />&gt; + &amp;wqe-&gt;sg_list[j], &amp;wr-&gt;sg_list[i],<br />&gt; + IB_ACCESS_LOCAL_WRITE)) {<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;r_rq.lock, flags);<br />&gt; + *bad_wr = wr;<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + wqe-&gt;length += wr-&gt;sg_list[i].length;<br />&gt; + j++;<br />&gt; + }<br />&gt; + wqe-&gt;num_sge = j;<br />&gt; + qp-&gt;r_rq.head = next;<br />&gt; + spin_unlock_irqrestore(&amp;qp-&gt;r_rq.lock, flags);<br />&gt; + }<br />&gt; + return 0;<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * This may be called from interrupt context.<br />&gt; + */<br />&gt; +static int ipath_post_srq_receive(struct ib_srq *ibsrq, struct ib_recv_wr *wr,<br />&gt; + struct ib_recv_wr **bad_wr)<br />&gt; +{<br />&gt; + struct ipath_srq *srq = to_isrq(ibsrq);<br />&gt; + struct ipath_ibdev *dev = 
to_idev(ibsrq-&gt;device);<br />&gt; + unsigned long flags;<br />&gt; +<br />&gt; + for (; wr; wr = wr-&gt;next) {<br />&gt; + struct ipath_rwqe *wqe;<br />&gt; + u32 next;<br />&gt; + int i, j;<br />&gt; +<br />&gt; + if (wr-&gt;num_sge &gt; srq-&gt;rq.max_sge) {<br />&gt; + *bad_wr = wr;<br />&gt; + return -ENOMEM;<br />&gt; + }<br />&gt; +<br />&gt; + spin_lock_irqsave(&amp;srq-&gt;rq.lock, flags);<br />&gt; + next = srq-&gt;rq.head + 1;<br />&gt; + if (next &gt;= srq-&gt;rq.size)<br />&gt; + next = 0;<br />&gt; + if (next == srq-&gt;rq.tail) {<br />&gt; + spin_unlock_irqrestore(&amp;srq-&gt;rq.lock, flags);<br />&gt; + *bad_wr = wr;<br />&gt; + return -ENOMEM;<br />&gt; + }<br />&gt; +<br />&gt; + wqe = get_rwqe_ptr(&amp;srq-&gt;rq, srq-&gt;rq.head);<br />&gt; + wqe-&gt;wr_id = wr-&gt;wr_id;<br />&gt; + wqe-&gt;sg_list[0].mr = NULL;<br />&gt; + wqe-&gt;sg_list[0].vaddr = NULL;<br />&gt; + wqe-&gt;sg_list[0].length = 0;<br />&gt; + wqe-&gt;sg_list[0].sge_length = 0;<br />&gt; + wqe-&gt;length = 0;<br />&gt; + for (i = 0, j = 0; i &lt; wr-&gt;num_sge; i++) {<br />&gt; + /* Check LKEY */<br />&gt; + if (to_ipd(srq-&gt;ibsrq.pd)-&gt;user &amp;&amp;<br />&gt; + wr-&gt;sg_list[i].lkey == 0) {<br />&gt; + spin_unlock_irqrestore(&amp;srq-&gt;rq.lock, flags);<br />&gt; + *bad_wr = wr;<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + if (wr-&gt;sg_list[i].length == 0)<br />&gt; + continue;<br />&gt; + if (!ipath_lkey_ok(&amp;dev-&gt;lk_table,<br />&gt; + &amp;wqe-&gt;sg_list[j], &amp;wr-&gt;sg_list[i],<br />&gt; + IB_ACCESS_LOCAL_WRITE)) {<br />&gt; + spin_unlock_irqrestore(&amp;srq-&gt;rq.lock, flags);<br />&gt; + *bad_wr = wr;<br />&gt; + return -EINVAL;<br />&gt; + }<br />&gt; + wqe-&gt;length += wr-&gt;sg_list[i].length;<br />&gt; + j++;<br />&gt; + }<br />&gt; + wqe-&gt;num_sge = j;<br />&gt; + srq-&gt;rq.head = next;<br />&gt; + spin_unlock_irqrestore(&amp;srq-&gt;rq.lock, flags);<br />&gt; + }<br />&gt; + return 0;<br />&gt; +}<br />&gt; +<br />&gt; +/*<br />&gt; + * This is called from ipath_qp_rcv() to process an incomming UD packet<br />&gt; + * for the given QP.<br />&gt; + * Called at interrupt level.<br />&gt; + */<br />&gt; +static void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr,<br />&gt; + int has_grh, void *data, u32 tlen, struct ipath_qp *qp)<br />&gt; +{<br />&gt; + struct ipath_other_headers *ohdr;<br />&gt; + int opcode;<br />&gt; + u32 hdrsize;<br />&gt; + u32 pad;<br />&gt; + unsigned long flags;<br />&gt; + struct ib_wc wc;<br />&gt; + u32 qkey;<br />&gt; + u32 src_qp;<br />&gt; + struct ipath_rq *rq;<br />&gt; + struct ipath_srq *srq;<br />&gt; + struct ipath_rwqe *wqe;<br />&gt; +<br />&gt; + /* Check for GRH */<br />&gt; + if (!has_grh) {<br />&gt; + ohdr = &amp;hdr-&gt;u.oth;<br />&gt; + hdrsize = 8 + 12 + 8; /* LRH + BTH + DETH */<br />&gt; + qkey = be32_to_cpu(ohdr-&gt;u.ud.deth[0]);<br />&gt; + src_qp = be32_to_cpu(ohdr-&gt;u.ud.deth[1]);<br />&gt; + } else {<br />&gt; + ohdr = &amp;hdr-&gt;u.l.oth;<br />&gt; + hdrsize = 8 + 40 + 12 + 8; /* LRH + GRH + BTH + DETH */<br />&gt; + /*<br />&gt; + * The header with GRH is 68 bytes and the<br />&gt; + * core driver sets the eager header buffer<br />&gt; + * size to 56 bytes so the last 12 bytes of<br />&gt; + * the IB header is in the data buffer.<br />&gt; + */<br />&gt; + qkey = be32_to_cpu(((u32 *) data)[1]);<br />&gt; + src_qp = be32_to_cpu(((u32 *) data)[2]);<br />&gt; + data += 12;<br />&gt; + }<br />&gt; + src_qp &amp;= 0xFFFFFF;<br />&gt; +<br />&gt; + /* Check that the qkey matches. 
> +
> +/*
> + * This is called from ipath_qp_rcv() to process an incoming UD packet
> + * for the given QP.
> + * Called at interrupt level.
> + */
> +static void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr,
> +			 int has_grh, void *data, u32 tlen, struct ipath_qp *qp)
> +{
> +	struct ipath_other_headers *ohdr;
> +	int opcode;
> +	u32 hdrsize;
> +	u32 pad;
> +	unsigned long flags;
> +	struct ib_wc wc;
> +	u32 qkey;
> +	u32 src_qp;
> +	struct ipath_rq *rq;
> +	struct ipath_srq *srq;
> +	struct ipath_rwqe *wqe;
> +
> +	/* Check for GRH */
> +	if (!has_grh) {
> +		ohdr = &hdr->u.oth;
> +		hdrsize = 8 + 12 + 8;	/* LRH + BTH + DETH */
> +		qkey = be32_to_cpu(ohdr->u.ud.deth[0]);
> +		src_qp = be32_to_cpu(ohdr->u.ud.deth[1]);
> +	} else {
> +		ohdr = &hdr->u.l.oth;
> +		hdrsize = 8 + 40 + 12 + 8;	/* LRH + GRH + BTH + DETH */
> +		/*
> +		 * The header with GRH is 68 bytes and the
> +		 * core driver sets the eager header buffer
> +		 * size to 56 bytes so the last 12 bytes of
> +		 * the IB header is in the data buffer.
> +		 */
> +		qkey = be32_to_cpu(((u32 *) data)[1]);
> +		src_qp = be32_to_cpu(((u32 *) data)[2]);
> +		data += 12;
> +	}
> +	src_qp &= 0xFFFFFF;
> +
> +	/* Check that the qkey matches. */
> +	if (unlikely(qkey != qp->qkey)) {
> +		/* XXX OK to lose a count once in a while. */
> +		dev->qkey_violations++;
> +		dev->n_pkt_drops++;
> +		return;
> +	}
> +
> +	/* Get the number of bytes the message was padded by. */
> +	pad = (ohdr->bth[0] >> 12) & 3;
> +	if (unlikely(tlen < (hdrsize + pad + 4))) {
> +		/* Drop incomplete packets. */
> +		dev->n_pkt_drops++;
> +		return;
> +	}
> +
> +	/*
> +	 * A GRH is expected to precede the data even if not
> +	 * present on the wire.
> +	 */
> +	wc.byte_len = tlen - (hdrsize + pad + 4) + sizeof(struct ib_grh);
> +
> +	/*
> +	 * The opcode is in the low byte when it's in network order
> +	 * (top byte when in host order).
> +	 */
> +	opcode = *(u8 *) (&ohdr->bth[0]);
> +	if (opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) {
> +		if (has_grh) {
> +			wc.imm_data = *(u32 *) data;
> +			data += sizeof(u32);
> +		} else
> +			wc.imm_data = ohdr->u.ud.imm_data;
> +		wc.wc_flags = IB_WC_WITH_IMM;
> +		hdrsize += sizeof(u32);
> +	} else if (opcode == IB_OPCODE_UD_SEND_ONLY) {
> +		wc.imm_data = 0;
> +		wc.wc_flags = 0;
> +	} else {
> +		dev->n_pkt_drops++;
> +		return;
> +	}
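
A quick standalone illustration of the byte-order remark above: because bth[0] is kept in network byte order in memory, reading its first byte gives the opcode on any host, and it is equivalent to shifting the host-order value down by 24 bits (the constant is just an arbitrary example value, not pulled from the driver):

#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

int main(void)
{
	uint32_t bth0 = htonl(0x64000000 | 0x1234);	/* example opcode in the top byte */
	uint8_t opcode = *(uint8_t *) &bth0;		/* what the driver does */

	assert(opcode == (ntohl(bth0) >> 24));		/* both yield 0x64 */
	return 0;
}
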
> +
> +	/*
> +	 * Get the next work request entry to find where to put the data.
> +	 * Note that it is safe to drop the lock after changing rq->tail
> +	 * since ipath_post_receive() won't fill the empty slot.
> +	 */
> +	if (qp->ibqp.srq) {
> +		srq = to_isrq(qp->ibqp.srq);
> +		rq = &srq->rq;
> +	} else {
> +		srq = NULL;
> +		rq = &qp->r_rq;
> +	}
> +	spin_lock_irqsave(&rq->lock, flags);
> +	if (rq->tail == rq->head) {
> +		spin_unlock_irqrestore(&rq->lock, flags);
> +		dev->n_pkt_drops++;
> +		return;
> +	}
> +	/* Silently drop packets which are too big. */
> +	wqe = get_rwqe_ptr(rq, rq->tail);
> +	if (wc.byte_len > wqe->length) {
> +		spin_unlock_irqrestore(&rq->lock, flags);
> +		dev->n_pkt_drops++;
> +		return;
> +	}
> +	wc.wr_id = wqe->wr_id;
> +	qp->r_sge.sge = wqe->sg_list[0];
> +	qp->r_sge.sg_list = wqe->sg_list + 1;
> +	qp->r_sge.num_sge = wqe->num_sge;
> +	if (++rq->tail >= rq->size)
> +		rq->tail = 0;
> +	if (srq && srq->ibsrq.event_handler) {
> +		u32 n;
> +
> +		if (rq->head < rq->tail)
> +			n = rq->size + rq->head - rq->tail;
> +		else
> +			n = rq->head - rq->tail;
> +		if (n < srq->limit) {
> +			struct ib_event ev;
> +
> +			srq->limit = 0;
> +			spin_unlock_irqrestore(&rq->lock, flags);
> +			ev.device = qp->ibqp.device;
> +			ev.element.srq = qp->ibqp.srq;
> +			ev.event = IB_EVENT_SRQ_LIMIT_REACHED;
> +			srq->ibsrq.event_handler(&ev, srq->ibsrq.srq_context);
> +		} else
> +			spin_unlock_irqrestore(&rq->lock, flags);
> +	} else
> +		spin_unlock_irqrestore(&rq->lock, flags);
> +	if (has_grh) {
> +		copy_sge(&qp->r_sge, &hdr->u.l.grh, sizeof(struct ib_grh));
> +		wc.wc_flags |= IB_WC_GRH;
> +	} else
> +		skip_sge(&qp->r_sge, sizeof(struct ib_grh));
> +	copy_sge(&qp->r_sge, data, wc.byte_len - sizeof(struct ib_grh));
> +	wc.status = IB_WC_SUCCESS;
> +	wc.opcode = IB_WC_RECV;
> +	wc.vendor_err = 0;
> +	wc.qp_num = qp->ibqp.qp_num;
> +	wc.src_qp = src_qp;
> +	/* XXX do we know which pkey matched? Only needed for GSI. */
> +	wc.pkey_index = 0;
> +	wc.slid = be16_to_cpu(hdr->lrh[3]);
> +	wc.sl = (be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF;
> +	wc.dlid_path_bits = 0;
> +	/* Signal completion event if the solicited bit is set. */
> +	ipath_cq_enter(to_icq(qp->ibqp.recv_cq), &wc,
> +		       ohdr->bth[0] & __constant_cpu_to_be32(1 << 23));
> +}
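
The "n" computed in the SRQ-limit branch above is just the number of receive WQEs still posted, with wraparound taken into account. A standalone helper plus two worked cases (illustrative only, not driver code):

#include <assert.h>

static unsigned int rq_entries_posted(unsigned int size, unsigned int head,
				      unsigned int tail)
{
	/* Same arithmetic as the driver: wrap when the producer index
	 * has lapped around behind the consumer index. */
	return head >= tail ? head - tail : size + head - tail;
}

int main(void)
{
	/* size 8, producer at slot 2, consumer at slot 6: 4 entries left */
	assert(rq_entries_posted(8, 2, 6) == 4);
	/* size 8, producer at slot 6, consumer at slot 2: also 4 entries */
	assert(rq_entries_posted(8, 6, 2) == 4);
	return 0;
}

Once this count drops below srq->limit the IB_EVENT_SRQ_LIMIT_REACHED event fires, and limit is reset to 0 so it fires only once until the consumer re-arms it.
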
> +
> +/*
> + * This is called from ipath_post_ud_send() to forward a WQE addressed
> + * to the same HCA.
> + */
> +static void ipath_ud_loopback(struct ipath_qp *sqp, struct ipath_sge_state *ss,
> +			      u32 length, struct ib_send_wr *wr,
> +			      struct ib_wc *wc)
> +{
> +	struct ipath_ibdev *dev = to_idev(sqp->ibqp.device);
> +	struct ipath_qp *qp;
> +	struct ib_ah_attr *ah_attr;
> +	unsigned long flags;
> +	struct ipath_rq *rq;
> +	struct ipath_srq *srq;
> +	struct ipath_sge_state rsge;
> +	struct ipath_sge *sge;
> +	struct ipath_rwqe *wqe;
> +
> +	qp = ipath_lookup_qpn(&dev->qp_table, wr->wr.ud.remote_qpn);
> +	if (!qp)
> +		return;
> +
> +	/* Check that the qkey matches. */
> +	if (unlikely(wr->wr.ud.remote_qkey != qp->qkey)) {
> +		/* XXX OK to lose a count once in a while. */
> +		dev->qkey_violations++;
> +		dev->n_pkt_drops++;
> +		goto done;
> +	}
> +
> +	/*
> +	 * A GRH is expected to precede the data even if not
> +	 * present on the wire.
> +	 */
> +	wc->byte_len = length + sizeof(struct ib_grh);
> +
> +	if (wr->opcode == IB_WR_SEND_WITH_IMM) {
> +		wc->wc_flags = IB_WC_WITH_IMM;
> +		wc->imm_data = wr->imm_data;
> +	} else {
> +		wc->wc_flags = 0;
> +		wc->imm_data = 0;
> +	}
> +
> +	/*
> +	 * Get the next work request entry to find where to put the data.
> +	 * Note that it is safe to drop the lock after changing rq->tail
> +	 * since ipath_post_receive() won't fill the empty slot.
> +	 */
> +	if (qp->ibqp.srq) {
> +		srq = to_isrq(qp->ibqp.srq);
> +		rq = &srq->rq;
> +	} else {
> +		srq = NULL;
> +		rq = &qp->r_rq;
> +	}
> +	spin_lock_irqsave(&rq->lock, flags);
> +	if (rq->tail == rq->head) {
> +		spin_unlock_irqrestore(&rq->lock, flags);
> +		dev->n_pkt_drops++;
> +		goto done;
> +	}
> +	/* Silently drop packets which are too big. */
> +	wqe = get_rwqe_ptr(rq, rq->tail);
> +	if (wc->byte_len > wqe->length) {
> +		spin_unlock_irqrestore(&rq->lock, flags);
> +		dev->n_pkt_drops++;
> +		goto done;
> +	}
> +	wc->wr_id = wqe->wr_id;
> +	rsge.sge = wqe->sg_list[0];
> +	rsge.sg_list = wqe->sg_list + 1;
> +	rsge.num_sge = wqe->num_sge;
> +	if (++rq->tail >= rq->size)
> +		rq->tail = 0;
> +	if (srq && srq->ibsrq.event_handler) {
> +		u32 n;
> +
> +		if (rq->head < rq->tail)
> +			n = rq->size + rq->head - rq->tail;
> +		else
> +			n = rq->head - rq->tail;
> +		if (n < srq->limit) {
> +			struct ib_event ev;
> +
> +			srq->limit = 0;
> +			spin_unlock_irqrestore(&rq->lock, flags);
> +			ev.device = qp->ibqp.device;
> +			ev.element.srq = qp->ibqp.srq;
> +			ev.event = IB_EVENT_SRQ_LIMIT_REACHED;
> +			srq->ibsrq.event_handler(&ev, srq->ibsrq.srq_context);
> +		} else
> +			spin_unlock_irqrestore(&rq->lock, flags);
> +	} else
> +		spin_unlock_irqrestore(&rq->lock, flags);
> +	ah_attr = &to_iah(wr->wr.ud.ah)->attr;
> +	if (ah_attr->ah_flags & IB_AH_GRH) {
> +		copy_sge(&rsge, &ah_attr->grh, sizeof(struct ib_grh));
> +		wc->wc_flags |= IB_WC_GRH;
> +	} else
> +		skip_sge(&rsge, sizeof(struct ib_grh));
> +	sge = &ss->sge;
> +	while (length) {
> +		u32 len = sge->length;
> +
> +		if (len > length)
> +			len = length;
> +		BUG_ON(len == 0);
> +		copy_sge(&rsge, sge->vaddr, len);
> +		sge->vaddr += len;
> +		sge->length -= len;
> +		sge->sge_length -= len;
> +		if (sge->sge_length == 0) {
> +			if (--ss->num_sge)
> +				*sge = *ss->sg_list++;
> +		} else if (sge->length == 0 && sge->mr != NULL) {
> +			if (++sge->n >= IPATH_SEGSZ) {
> +				if (++sge->m >= sge->mr->mapsz)
> +					break;
> +				sge->n = 0;
> +			}
> +			sge->vaddr = sge->mr->map[sge->m]->segs[sge->n].vaddr;
> +			sge->length = sge->mr->map[sge->m]->segs[sge->n].length;
> +		}
> +		length -= len;
> +	}
> +	wc->status = IB_WC_SUCCESS;
> +	wc->opcode = IB_WC_RECV;
> +	wc->vendor_err = 0;
> +	wc->qp_num = qp->ibqp.qp_num;
> +	wc->src_qp = sqp->ibqp.qp_num;
> +	/* XXX do we know which pkey matched? Only needed for GSI. */
> +	wc->pkey_index = 0;
> +	wc->slid = ipath_layer_get_lid(dev->ib_unit);
> +	wc->sl = ah_attr->sl;
> +	wc->dlid_path_bits = 0;
> +	/* Signal completion event if the solicited bit is set. */
> +	ipath_cq_enter(to_icq(qp->ibqp.recv_cq), wc,
> +		       wr->send_flags & IB_SEND_SOLICITED);
> +
> +done:
> +	if (atomic_dec_and_test(&qp->refcount))
> +		wake_up(&qp->wait);
> +}
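
The while (length) loop above is a gather walk over the sender's SGE list. Stripped of copy_sge() and the MR segment maps, the core pattern looks roughly like this (hypothetical names, userspace memcpy, illustration only):

#include <stdio.h>
#include <string.h>

struct model_sge {
	const char	*vaddr;
	unsigned int	length;
};

static void gather_copy(char *dst, struct model_sge *sg, int num_sge,
			unsigned int length)
{
	struct model_sge *sge = sg;

	while (length) {
		unsigned int len = sge->length;

		if (len > length)
			len = length;
		memcpy(dst, sge->vaddr, len);
		dst += len;
		sge->vaddr += len;
		sge->length -= len;
		/* advance to the next segment once this one is drained */
		if (sge->length == 0 && --num_sge)
			sge++;
		length -= len;
	}
}

int main(void)
{
	struct model_sge sg[2] = { { "Hello, ", 7 }, { "ipath!", 6 } };
	char buf[16] = { 0 };

	gather_copy(buf, sg, 2, 13);
	printf("%s\n", buf);	/* prints: Hello, ipath! */
	return 0;
}
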
> +
> +/*
> + * Copy the next RWQE into the QP's RWQE.
> + * Return zero if no RWQE is available.
> + * Called at interrupt level with the QP r_rq.lock held.
> + */
> +static int get_rwqe(struct ipath_qp *qp, int wr_id_only)
> +{
> +	struct ipath_rq *rq;
> +	struct ipath_srq *srq;
> +	struct ipath_rwqe *wqe;
> +
> +	if (!qp->ibqp.srq) {
> +		rq = &qp->r_rq;
> +		if (unlikely(rq->tail == rq->head))
> +			return 0;
> +		wqe = get_rwqe_ptr(rq, rq->tail);
> +		qp->r_wr_id = wqe->wr_id;
> +		if (!wr_id_only) {
> +			qp->r_sge.sge = wqe->sg_list[0];
> +			qp->r_sge.sg_list = wqe->sg_list + 1;
> +			qp->r_sge.num_sge = wqe->num_sge;
> +			qp->r_len = wqe->length;
> +		}
> +		if (++rq->tail >= rq->size)
> +			rq->tail = 0;
> +		return 1;
> +	}
> +
> +	srq = to_isrq(qp->ibqp.srq);
> +	rq = &srq->rq;
> +	spin_lock(&rq->lock);
> +	if (unlikely(rq->tail == rq->head)) {
> +		spin_unlock(&rq->lock);
> +		return 0;
> +	}
> +	wqe = get_rwqe_ptr(rq, rq->tail);
> +	qp->r_wr_id = wqe->wr_id;
> +	if (!wr_id_only) {
> +		qp->r_sge.sge = wqe->sg_list[0];
> +		qp->r_sge.sg_list = wqe->sg_list + 1;
> +		qp->r_sge.num_sge = wqe->num_sge;
> +		qp->r_len = wqe->length;
> +	}
> +	if (++rq->tail >= rq->size)
> +		rq->tail = 0;
> +	if (srq->ibsrq.event_handler) {
> +		struct ib_event ev;
> +		u32 n;
> +
> +		if (rq->head < rq->tail)
> +			n = rq->size + rq->head - rq->tail;
> +		else
> +			n = rq->head - rq->tail;
> +		if (n < srq->limit) {
> +			srq->limit = 0;
> +			spin_unlock(&rq->lock);
> +			ev.device = qp->ibqp.device;
> +			ev.element.srq = qp->ibqp.srq;
> +			ev.event = IB_EVENT_SRQ_LIMIT_REACHED;
> +			srq->ibsrq.event_handler(&ev, srq->ibsrq.srq_context);
> +		} else
> +			spin_unlock(&rq->lock);
> +	} else
> +		spin_unlock(&rq->lock);
> +	return 1;
> +}
> -- 
> 0.99.9n
