Committers | Apache Spark

<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title> Committers | Apache Spark </title> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous"> <link rel="preconnect" href="https://fonts.googleapis.com"> <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> <link href="https://fonts.googleapis.com/css2?family=DM+Sans:ital,wght@0,400;0,500;0,700;1,400;1,500;1,700&Courier+Prime:wght@400;700&display=swap" rel="stylesheet"> <link href="/css/custom.css" rel="stylesheet">  <link href="/css/pygments-default.css" rel="stylesheet"> <link rel="icon" href="/favicon.ico" type="image/x-icon">  <script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="https://analytics.apache.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '40']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script>  </head> <body class="global"> <nav class="navbar navbar-expand-lg navbar-dark p-0 px-4" style="background: #1D6890;"> <a class="navbar-brand" href="/"> <img src="/images/spark-logo-rev.svg" alt="" width="141" height="72"> </a> <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarContent" aria-controls="navbarContent" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="collapse navbar-collapse col-md-12 col-lg-auto pt-4" id="navbarContent"> <ul 
class="navbar-nav me-auto"> <li class="nav-item"> <a class="nav-link active" aria-current="page" href="/downloads.html">Download</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="libraries" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Libraries </a> <ul class="dropdown-menu" aria-labelledby="libraries"> <li><a class="dropdown-item" href="/sql/">SQL and DataFrames</a></li> <li><a class="dropdown-item" href="/spark-connect/">Spark Connect</a></li> <li><a class="dropdown-item" href="/streaming/">Spark Streaming</a></li> <li><a class="dropdown-item" href="/pandas-on-spark/">pandas on Spark</a></li> <li><a class="dropdown-item" href="/mllib/">MLlib (machine learning)</a></li> <li><a class="dropdown-item" href="/graphx/">GraphX (graph)</a></li> <li> <hr class="dropdown-divider"> </li> <li><a class="dropdown-item" href="/third-party-projects.html">Third-Party Projects</a></li> </ul> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="documentation" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Documentation </a> <ul class="dropdown-menu" aria-labelledby="documentation"> <li><a class="dropdown-item" href="/docs/latest/">Latest Release</a></li> <li><a class="dropdown-item" href="/documentation.html">Older Versions and Other Resources</a></li> <li><a class="dropdown-item" href="/faq.html">Frequently Asked Questions</a></li> </ul> </li> <li class="nav-item"> <a class="nav-link active" aria-current="page" href="/examples.html">Examples</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="community" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Community </a> <ul class="dropdown-menu" aria-labelledby="community"> <li><a class="dropdown-item" href="/community.html">Mailing Lists & Resources</a></li> <li><a class="dropdown-item" href="/contributing.html">Contributing to Spark</a></li> <li><a class="dropdown-item" 
href="/improvement-proposals.html">Improvement Proposals (SPIP)</a> </li> <li><a class="dropdown-item" href="https://issues.apache.org/jira/browse/SPARK">Issue Tracker</a> </li> <li><a class="dropdown-item" href="/powered-by.html">Powered By</a></li> <li><a class="dropdown-item" href="/committers.html">Project Committers</a></li> <li><a class="dropdown-item" href="/history.html">Project History</a></li> </ul> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="developers" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Developers </a> <ul class="dropdown-menu" aria-labelledby="developers"> <li><a class="dropdown-item" href="/developer-tools.html">Useful Developer Tools</a></li> <li><a class="dropdown-item" href="/versioning-policy.html">Versioning Policy</a></li> <li><a class="dropdown-item" href="/release-process.html">Release Process</a></li> <li><a class="dropdown-item" href="/security.html">Security</a></li> </ul> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="github" role="button" data-bs-toggle="dropdown" aria-expanded="false"> GitHub </a> <ul class="dropdown-menu" aria-labelledby="github"> <li><a class="dropdown-item" href="https://github.com/apache/spark">spark</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-connect-go">spark-connect-go</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-docker">spark-docker</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-kubernetes-operator">spark-kubernetes-operator</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-website">spark-website</a></li> </ul> </li> </ul> <ul class="navbar-nav ml-auto"> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="apacheFoundation" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Apache Software Foundation </a> <ul class="dropdown-menu" 
aria-labelledby="apacheFoundation"> <li><a class="dropdown-item" href="https://www.apache.org/">Apache Homepage</a></li> <li><a class="dropdown-item" href="https://www.apache.org/licenses/">License</a></li> <li><a class="dropdown-item" href="https://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> <li><a class="dropdown-item" href="https://www.apache.org/foundation/thanks.html">Thanks</a></li> <li><a class="dropdown-item" href="https://www.apache.org/security/">Security</a></li> <li><a class="dropdown-item" href="https://www.apache.org/events/current-event">Event</a></li> </ul> </li> </ul> </div> </nav> <div class="container"> <div class="row mt-4"> <div class="col-12 col-md-9"> <h2>Current committers</h2> <table> <thead> <tr> <th>Name</th> <th>Organization</th> </tr> </thead> <tbody> <tr> <td>Sameer Agarwal</td> <td>Deductive AI</td> </tr> <tr> <td>Michael Armbrust</td> <td>Databricks</td> </tr> <tr> <td>Dilip Biswal</td> <td>Adobe</td> </tr> <tr> <td>Ryan Blue</td> <td>Tabular</td> </tr> <tr> <td>Joseph Bradley</td> <td>Databricks</td> </tr> <tr> <td>Matthew Cheah</td> <td>Palantir</td> </tr> <tr> <td>Felix Cheung</td> <td>NVIDIA</td> </tr> <tr> <td>Mosharaf Chowdhury</td> <td>University of Michigan, Ann Arbor</td> </tr> <tr> <td>Bryan Cutler</td> <td>IBM</td> </tr> <tr> <td>Jason Dai</td> <td>Intel</td> </tr> <tr> <td>Tathagata Das</td> <td>Databricks</td> </tr> <tr> <td>Ankur Dave</td> <td>Databricks</td> </tr> <tr> <td>Aaron Davidson</td> <td>Databricks</td> </tr> <tr> <td>Thomas Dudziak</td> <td>Meta</td> </tr> <tr> <td>Erik Erlandson</td> <td>Red Hat</td> </tr> <tr> <td>Robert Evans</td> <td>NVIDIA</td> </tr> <tr> <td>Wenchen Fan</td> <td>Databricks</td> </tr> <tr> <td>Huaxin Gao</td> <td>Apple</td> </tr> <tr> <td>Max Gekk</td> <td>Databricks</td> </tr> <tr> <td>Jiaan Geng</td> <td>DataCyber</td> </tr> <tr> <td>Joseph Gonzalez</td> <td>UC Berkeley</td> </tr> <tr> <td>Thomas Graves</td> <td>NVIDIA</td> </tr> <tr> <td>Martin Grund</td> 
<td>Databricks</td> </tr> <tr> <td>Stephen Haberman</td> <td>LinkedIn</td> </tr> <tr> <td>Mark Hamstra</td> <td>ClearStory Data</td> </tr> <tr> <td>Seth Hendrickson</td> <td>Stripe</td> </tr> <tr> <td>Herman van Hovell</td> <td>Databricks</td> </tr> <tr> <td>Liang-Chi Hsieh</td> <td>Apple</td> </tr> <tr> <td>Yin Huai</td> <td>Databricks</td> </tr> <tr> <td>Shane Huang</td> <td>Intel</td> </tr> <tr> <td>Dongjoon Hyun</td> <td>Apple</td> </tr> <tr> <td>Kazuaki Ishizaki</td> <td>IBM</td> </tr> <tr> <td>Xingbo Jiang</td> <td>Databricks</td> </tr> <tr> <td>Yikun Jiang</td> <td>Huawei</td> </tr> <tr> <td>Holden Karau</td> <td>Netflix</td> </tr> <tr> <td>Shane Knapp</td> <td>UC Berkeley</td> </tr> <tr> <td>Cody Koeninger</td> <td>Nexstar Digital</td> </tr> <tr> <td>Andy Konwinski</td> <td>Databricks</td> </tr> <tr> <td>Hyukjin Kwon</td> <td>Databricks</td> </tr> <tr> <td>Ryan LeCompte</td> <td>Quantifind</td> </tr> <tr> <td>Haejoon Lee</td> <td>Databricks</td> </tr> <tr> <td>Haoyuan Li</td> <td>Alluxio</td> </tr> <tr> <td>Xiao Li</td> <td>Databricks</td> </tr> <tr> <td>Yinan Li</td> <td>Google</td> </tr> <tr> <td>Yuanjian Li</td> <td>Databricks</td> </tr> <tr> <td>Davies Liu</td> <td>Juicedata</td> </tr> <tr> <td>Cheng Lian</td> <td>Databricks</td> </tr> <tr> <td>Yanbo Liang</td> <td>Facebook</td> </tr> <tr> <td>Jungtaek Lim</td> <td>Databricks</td> </tr> <tr> <td>Sean McNamara</td> <td>Oracle</td> </tr> <tr> <td>Xiangrui Meng</td> <td>Databricks</td> </tr> <tr> <td>Xinrong Meng</td> <td>Databricks</td> </tr> <tr> <td>Mridul Muralidharan</td> <td>LinkedIn</td> </tr> <tr> <td>Andrew Or</td> <td>Facebook</td> </tr> <tr> <td>Kay Ousterhout</td> <td>LightStep</td> </tr> <tr> <td>Sean Owen</td> <td>Databricks</td> </tr> <tr> <td>Bingkun Pan</td> <td>Baidu</td> </tr> <tr> <td>Tejas Patil</td> <td>Meta</td> </tr> <tr> <td>Nick Pentreath</td> <td>Automattic</td> </tr> <tr> <td>Attila Zsolt Piros</td> <td>Cloudera</td> </tr> <tr> <td>Anirudh Ramanathan</td> <td>Signadot</td> </tr> 
<tr> <td>Imran Rashid</td> <td>Cloudera</td> </tr> <tr> <td>Charles Reiss</td> <td>University of Virginia</td> </tr> <tr> <td>Josh Rosen</td> <td>Databricks</td> </tr> <tr> <td>Sandy Ryza</td> <td>Dagster</td> </tr> <tr> <td>Kousuke Saruta</td> <td>NTT Data</td> </tr> <tr> <td>Saisai Shao</td> <td>Datastrato</td> </tr> <tr> <td>Prashant Sharma</td> <td>IBM</td> </tr> <tr> <td>Gabor Somogyi</td> <td>Apple</td> </tr> <tr> <td>Ram Sriharsha</td> <td>Pinecone</td> </tr> <tr> <td>Chao Sun</td> <td>OpenAI</td> </tr> <tr> <td>Maciej Szymkiewicz</td> <td> </td> </tr> <tr> <td>Jose Torres</td> <td>Databricks</td> </tr> <tr> <td>Peter Toth</td> <td>Cloudera</td> </tr> <tr> <td>DB Tsai</td> <td>Apple</td> </tr> <tr> <td>Takuya Ueshin</td> <td>Databricks</td> </tr> <tr> <td>Marcelo Vanzin</td> <td>Cloudera</td> </tr> <tr> <td>Shivaram Venkataraman</td> <td>University of Wisconsin, Madison</td> </tr> <tr> <td>Allison Wang</td> <td>Databricks</td> </tr> <tr> <td>Gengliang Wang</td> <td>Databricks</td> </tr> <tr> <td>Yuming Wang</td> <td>eBay</td> </tr> <tr> <td>Zhenhua Wang</td> <td>Huawei</td> </tr> <tr> <td>Patrick Wendell</td> <td>Databricks</td> </tr> <tr> <td>Yi Wu</td> <td>Databricks</td> </tr> <tr> <td>Andrew Xia</td> <td>Alibaba</td> </tr> <tr> <td>Reynold Xin</td> <td>Databricks</td> </tr> <tr> <td>Weichen Xu</td> <td>Databricks</td> </tr> <tr> <td>Takeshi Yamamuro</td> <td>NTT</td> </tr> <tr> <td>Jie Yang</td> <td>Baidu</td> </tr> <tr> <td>Kent Yao</td> <td>NetEase</td> </tr> <tr> <td>Burak Yavuz</td> <td>Databricks</td> </tr> <tr> <td>Xiduo You</td> <td>NetEase</td> </tr> <tr> <td>Matei Zaharia</td> <td>Databricks, Stanford</td> </tr> <tr> <td>Ruifeng Zheng</td> <td>Databricks</td> </tr> <tr> <td>Shixiong Zhu</td> <td>Databricks</td> </tr> </tbody> </table> <h3>Becoming a committer</h3> <p>To get started contributing to Spark, learn <a href="/contributing.html">how to contribute</a> – anyone can submit patches, documentation and examples to the project.</p> <p>The PMC 
regularly adds new committers from among the active contributors, based on their contributions to Spark. The qualifications for new committers include:</p> <ol> <li>Sustained contributions to Spark: Committers should have a history of major contributions to Spark. An ideal committer will have contributed broadly throughout the project, and have contributed at least one major component where they have taken an “ownership” role. An ownership role means that existing contributors feel that they should run patches for this component by this person.</li> <li>Quality of contributions: Committers, more than any other community member, should submit simple, well-tested, and well-designed patches. In addition, they should show sufficient expertise to be able to review patches, including making sure they fit within Spark’s engineering practices (testability, documentation, API stability, code style, etc.). The committership is collectively responsible for the software quality and maintainability of Spark. Note that contributions to critical parts of Spark, like its core and SQL modules, will be held to a higher standard when assessing quality. Contributors to these areas will face more review of their changes.</li> <li>Community involvement: Committers should have a constructive and friendly attitude in all community interactions. They should also be active on the dev and user lists and help mentor newer contributors and users. In design discussions, committers should maintain a professional and diplomatic approach, even in the face of disagreement.</li> <li><a href="https://www.apache.org/theapacheway/">The Apache Way</a>: Committers should understand and follow <a href="https://www.apache.org/theapacheway/">The Apache Way</a>, including principles such as <a href="https://community.apache.org/committers/decisionMaking.html#lazy-consensus">Lazy Consensus</a>. 
<a href="https://community.apache.org/projectIndependence.html#apache-projects-are-managed-independently">Apache projects are managed independently</a>.</li> </ol> <blockquote> <p>A community that obviously favors one specific vendor in some exclusive way will often discourage new contributors from competing vendors, and this would be an issue for the long-term health of the project.</p> </blockquote> <p>The type and level of contributions considered may vary by project area – for example, we greatly encourage contributors who want to work mainly on the documentation, or mainly on platform support for specific OSes, storage systems, etc.</p> <p>The PMC also adds new PMC members. PMC members are expected to carry out PMC responsibilities as described in <a href="https://www.apache.org/dev/pmc.html#policy">Apache Guidance</a>, including helping vote on releases, enforce Apache project trademarks, take responsibility for legal and license issues, and ensure the project follows Apache project mechanics. The PMC periodically adds committers to the PMC who have shown they understand and can help with these activities.</p> <h3>Review process</h3> <p>All contributions should be reviewed before merging as described in <a href="/contributing.html">Contributing to Spark</a>. In particular, if you are working on an area of the codebase you are unfamiliar with, look at the Git history for that code to see who reviewed patches before. You can do this by running <code class="language-plaintext highlighter-rouge">git log --format=full &lt;filename&gt;</code> and examining the “Commit” field to see who committed each patch.</p> <h3>When to commit/merge a pull request</h3> <p>PRs shall not be merged during active, on-topic discussion unless they address issues such as critical security fixes of a public vulnerability. Under extenuating circumstances, PRs may be merged during active, off-topic discussion and the discussion directed to a more appropriate venue. 
Time should be given prior to merging for those involved with the conversation to explain if they believe they are on-topic.</p> <p>Lazy consensus requires giving time for discussion to settle while understanding that people may not be working on Spark as their full-time job and may take holidays. It is believed that by doing this, we can limit how often people feel the need to exercise their veto.</p> <p>All -1s with justification merit discussion. A -1 from a non-committer can be overridden only with input from multiple committers, and suitable time must be offered for any committer to raise concerns. A -1 from a committer who cannot be reached requires a consensus vote of the PMC under ASF voting rules to determine the next steps within the <a href="https://www.apache.org/foundation/voting.html">ASF guidelines for code vetoes</a>.</p> <p>These policies serve to reiterate the core principle that code must not be merged with a pending veto or before a consensus has been reached (lazy or otherwise).</p> <p>It is the PMC’s hope that vetoes continue to be infrequent, and when they occur, that all parties will take the time to build consensus prior to additional feature work.</p> <p>Being a committer means exercising your judgement while working in a community of people with diverse views. There is nothing wrong in getting a second (or third or fourth) opinion when you are uncertain. Thank you for your dedication to the Spark project; it is appreciated by the developers and users of Spark.</p> <p>It is hoped that these guidelines do not slow down development; rather, by removing some of the uncertainty, the goal is to make it easier for us to reach consensus. If you have ideas on how to improve these guidelines or other Spark project operating procedures, you should reach out on the dev@ list to start the discussion.</p> <h3>How to merge a pull request</h3> <p>Changes pushed to the master branch on Apache cannot be removed; that is, we can’t force-push to it. 
So please don’t push test commits or anything similar; push only real patches.</p> <h4>Setting up remotes</h4> <p>To use the <code class="language-plaintext highlighter-rouge">merge_spark_pr.py</code> script described below, you will need to add a git remote called <code class="language-plaintext highlighter-rouge">apache</code> at <code class="language-plaintext highlighter-rouge">https://github.com/apache/spark</code>, as well as one called <code class="language-plaintext highlighter-rouge">apache-github</code> at <code class="language-plaintext highlighter-rouge">git://github.com/apache/spark</code>.</p> <p>The <code class="language-plaintext highlighter-rouge">apache</code> remote (the default value of the <code class="language-plaintext highlighter-rouge">PUSH_REMOTE_NAME</code> environment variable) is used for pushing the squashed commits, and the <code class="language-plaintext highlighter-rouge">apache-github</code> remote (the default value of <code class="language-plaintext highlighter-rouge">PR_REMOTE_NAME</code>) is used for pulling the changes. Because these two actions use separate remotes, the result of <code class="language-plaintext highlighter-rouge">merge_spark_pr.py</code> can be tested without pushing it into the official Spark repo, just by specifying your fork in the <code class="language-plaintext highlighter-rouge">PUSH_REMOTE_NAME</code> variable.</p> <p>After cloning your fork of Spark, you already have a remote <code class="language-plaintext highlighter-rouge">origin</code> pointing there. 
If everything is set up correctly, the output of <code class="language-plaintext highlighter-rouge">git remote -v</code> contains at least these lines:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apache git@github.com:apache/spark.git (fetch)
apache git@github.com:apache/spark.git (push)
apache-github git@github.com:apache/spark.git (fetch)
apache-github git@github.com:apache/spark.git (push)
origin git@github.com:[your username]/spark.git (fetch)
origin git@github.com:[your username]/spark.git (push)
</code></pre></div></div> <p>For the <code class="language-plaintext highlighter-rouge">apache</code> repo, you will need to set up command-line authentication to GitHub. This may include setting up an SSH key and/or personal access token. See:</p> <ul> <li><a href="https://docs.github.com/en/authentication/connecting-to-github-with-ssh">https://docs.github.com/en/authentication/connecting-to-github-with-ssh</a></li> <li><a href="https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens">https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens</a></li> </ul> <p>To check whether the necessary write access has already been granted, please visit <a href="https://gitbox.apache.org/setup/">GitBox</a>.</p> <p>Ask <code class="language-plaintext highlighter-rouge">dev@spark.apache.org</code> if you have trouble with these steps, or want help doing your first merge.</p> <h4>Merge script</h4> <p>All merges should be done using the <a href="https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py">dev/merge_spark_pr.py</a> script, which squashes the pull request’s changes into one commit.</p> <p>The script is fairly self-explanatory and walks you through the steps and options interactively.</p> <p>If you want to amend a commit before merging – which should only be done for trivial touch-ups – then simply let the script wait at the point where 
it asks you if you want to push to Apache. Then, in a separate window, modify the code and push a commit. Run <code class="language-plaintext highlighter-rouge">git rebase -i HEAD~2</code> and “squash” your new commit. When prompted right afterwards, edit the combined commit message to remove the message of your new commit. You can verify that the result is a single change with <code class="language-plaintext highlighter-rouge">git log</code>. Then resume the script in the other window.</p> <p>Also, please remember to set the Assignee on JIRAs where applicable when they are resolved. The script can do this automatically in most cases.</p> <p>Once a PR is merged, please leave a comment on the PR stating which branch(es) it has been merged into.</p> <h3>Policy on backporting bug fixes</h3> <p>From <a href="https://www.mail-archive.com/dev@spark.apache.org/msg10284.html"><code class="language-plaintext highlighter-rouge">pwendell</code></a>:</p> <p>The trade-off when backporting is you get to deliver the fix to people running older versions (great!), but you risk introducing new or even worse bugs in maintenance releases (bad!). The decision point is when you have a bug fix and it’s not clear whether it is worth backporting.</p> <p>I think the following facets are important to consider:</p> <ul> <li>Backports are an extremely valuable service to the community and should be considered for any bug fix.</li> <li>Introducing a new bug in a maintenance release must be avoided at all costs. Over time, it would erode confidence in our release process.</li> <li>Distributions or advanced users can always backport risky patches on their own, if they see fit.</li> </ul> <p>For me, the consequence of these is that we should backport in the following situations:</p> <ul> <li>Both the bug and the fix are well understood and isolated. 
The code being modified is well tested.</li> <li>The bug being addressed is a high priority for the community.</li> <li>The backported fix does not vary widely from the master branch fix.</li> </ul> <p>We tend to avoid backports in the converse situations:</p> <ul> <li>The bug or fix is not well understood. For instance, it relates to interactions between complex components or third-party libraries (e.g. Hadoop libraries). The code is not well tested outside of the immediate bug being fixed.</li> <li>The bug is not clearly a high priority for the community.</li> <li>The backported fix is widely different from the master branch fix.</li> </ul> </div> <div class="col-12 col-md-3"> <div class="news" style="margin-bottom: 20px;"> <h5>Latest News</h5> <ul class="list-unstyled"> <li><a href="/news/spark-3-4-4-released.html">Spark 3.4.4 released</a> <span class="small">(Oct 27, 2024)</span></li> <li><a href="/news/spark-4.0.0-preview2.html">Preview release of Spark 4.0</a> <span class="small">(Sep 26, 2024)</span></li> <li><a href="/news/spark-3-5-3-released.html">Spark 3.5.3 released</a> <span class="small">(Sep 24, 2024)</span></li> <li><a href="/news/spark-3-5-2-released.html">Spark 3.5.2 released</a> <span class="small">(Aug 10, 2024)</span></li> </ul> <p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p> </div> <div style="text-align:center; margin-bottom: 20px;"> <a href="https://www.apache.org/events/current-event.html"> <img src="https://www.apache.org/events/current-event-234x60.png" style="max-width: 100%;"/> </a> </div> <div class="hidden-xs hidden-sm"> <a href="/downloads.html" class="btn btn-cta btn-lg d-grid" style="margin-bottom: 30px;"> Download Spark </a> <p style="font-size: 16px; font-weight: 500; color: #555;"> Built-in Libraries: </p> <ul class="list-none"> <li><a href="/sql/">SQL and DataFrames</a></li> <li><a href="/streaming/">Spark Streaming</a></li> <li><a href="/mllib/">MLlib (machine learning)</a></li> <li><a 
href="/graphx/">GraphX (graph)</a></li> </ul> <a href="/third-party-projects.html">Third-Party Projects</a> </div> </div> </div> <footer class="small"> <hr> Apache Spark, Spark, Apache, the Apache feather logo, and the Apache Spark project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries. See guidance on use of Apache Spark <a href="/trademarks.html">trademarks</a>. All other marks mentioned may be trademarks or registered trademarks of their respective owners. Copyright © 2018 The Apache Software Foundation, Licensed under the <a href="https://www.apache.org/licenses/">Apache License, Version 2.0</a>. </footer> </div> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/js/bootstrap.bundle.min.js" integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM" crossorigin="anonymous"></script> <script src="https://code.jquery.com/jquery.js"></script> <script src="/js/lang-tabs.js"></script> <script src="/js/downloads.js"></script> </body> </html>
