CINXE.COM

Apache Zeppelin 0.10.1 Documentation: Apache Zeppelin on Spark cluster mode

<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>Apache Zeppelin 0.10.1 Documentation: Apache Zeppelin on Spark cluster mode</title> <meta name="description" content="This document will guide you how you can build and configure the environment on 3 types of Spark cluster manager(Standalone, Hadoop Yarn, Apache Mesos) with Apache Zeppelin using docker scripts."> <meta name="author" content="The Apache Software Foundation"> <!-- Enable responsive viewport --> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <!-- Le HTML5 shim, for IE6-8 support of HTML elements --> <!--[if lt IE 9]> <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script> <![endif]--> <link href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"> <!-- Le styles --> <link href="/docs/0.10.1/assets/themes/zeppelin/bootstrap/css/bootstrap.css" rel="stylesheet"> <link href="/docs/0.10.1/assets/themes/zeppelin/css/style.css?body=1" rel="stylesheet" type="text/css"> <link href="/docs/0.10.1/assets/themes/zeppelin/css/syntax.css" rel="stylesheet" type="text/css" media="screen" /> <!-- Le fav and touch icons --> <!-- Update these with your own images <link rel="shortcut icon" href="images/favicon.ico"> <link rel="apple-touch-icon" href="images/apple-touch-icon.png"> <link rel="apple-touch-icon" sizes="72x72" href="images/apple-touch-icon-72x72.png"> <link rel="apple-touch-icon" sizes="114x114" href="images/apple-touch-icon-114x114.png"> --> <!-- Js --> <script src="https://code.jquery.com/jquery-1.10.2.min.js"></script> <script src="/docs/0.10.1/assets/themes/zeppelin/bootstrap/js/bootstrap.min.js"></script> <script src="/docs/0.10.1/assets/themes/zeppelin/js/docs.js"></script> <script src="/docs/0.10.1/assets/themes/zeppelin/js/anchor.min.js"></script> <script src="/docs/0.10.1/assets/themes/zeppelin/js/toc.js"></script> <script src="/docs/0.10.1/assets/themes/zeppelin/js/lunr.min.js"></script> <script src="/docs/0.10.1/assets/themes/zeppelin/js/search.js"></script> <!-- atom & rss feed --> <link href="/docs/0.10.1/atom.xml" type="application/atom+xml" rel="alternate" title="Sitewide ATOM Feed"> <link href="/docs/0.10.1/rss.xml" type="application/rss+xml" rel="alternate" title="Sitewide RSS Feed"> </head> <body> <div id="menu" class="navbar navbar-inverse navbar-fixed-top" role="navigation"> <div class="container navbar-container"> <div class="navbar-header"> <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse"> <span class="sr-only">Toggle navigation</span> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button> <div class="navbar-brand"> <a class="navbar-brand-main" href="http://zeppelin.apache.org"> <img src="/docs/0.10.1/assets/themes/zeppelin/img/zeppelin_logo.png" width="50" style="margin-top: -2px;" alt="I'm zeppelin"> <span style="margin-left: 5px; font-size: 27px;">Zeppelin</span> <a class="navbar-brand-version" href="/docs/0.10.1" style="font-size: 15px; color: white;"> 0.10.1 </a> </a> </div> </div> <nav class="navbar-collapse collapse" role="navigation"> <ul class="nav navbar-nav"> <li> <a href="#" data-toggle="dropdown" class="dropdown-toggle">Quick Start <b class="caret"></b></a> <ul class="dropdown-menu"> <li class="title"><span>Getting Started</span></li> <li><a href="/docs/0.10.1/quickstart/install.html">Install</a></li> <li><a href="/docs/0.10.1/quickstart/explore_ui.html">Explore UI</a></li> <li><a href="/docs/0.10.1/quickstart/tutorial.html">Tutorial</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Run Mode</span></li> <li><a href="/docs/0.10.1/quickstart/kubernetes.html">Kubernetes</a></li> <li><a href="/docs/0.10.1/quickstart/docker.html">Docker</a></li> <li><a href="/docs/0.10.1/quickstart/yarn.html">Yarn</a></li> <li role="separator" class="divider"></li> <li><a href="/docs/0.10.1/quickstart/spark_with_zeppelin.html">Spark with Zeppelin</a></li> <li><a href="/docs/0.10.1/quickstart/flink_with_zeppelin.html">Flink with Zeppelin</a></li> <li><a href="/docs/0.10.1/quickstart/sql_with_zeppelin.html">SQL with Zeppelin</a></li> <li><a href="/docs/0.10.1/quickstart/python_with_zeppelin.html">Python with Zeppelin</a></li> <li><a href="/docs/0.10.1/quickstart/r_with_zeppelin.html">R with Zeppelin</a></li> </ul> </li> <li> <a href="#" data-toggle="dropdown" class="dropdown-toggle">Usage<b class="caret"></b></a> <ul class="dropdown-menu scrollable-menu"> <li class="title"><span>Dynamic Form</span></li> <li><a href="/docs/0.10.1/usage/dynamic_form/intro.html">What is Dynamic Form?</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Display System</span></li> <li><a href="/docs/0.10.1/usage/display_system/basic.html#text">Text Display</a></li> <li><a href="/docs/0.10.1/usage/display_system/basic.html#html">HTML Display</a></li> <li><a href="/docs/0.10.1/usage/display_system/basic.html#table">Table Display</a></li> <li><a href="/docs/0.10.1/usage/display_system/basic.html#network">Network Display</a></li> <li><a href="/docs/0.10.1/usage/display_system/angular_backend.html">Angular Display using Backend API</a></li> <li><a href="/docs/0.10.1/usage/display_system/angular_frontend.html">Angular Display using Frontend API</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Interpreter</span></li> <li><a href="/docs/0.10.1/usage/interpreter/overview.html">Overview</a></li> <li><a href="/docs/0.10.1/usage/interpreter/interpreter_binding_mode.html">Interpreter Binding Mode</a></li> <li><a href="/docs/0.10.1/usage/interpreter/user_impersonation.html">User Impersonation</a></li> <li><a href="/docs/0.10.1/usage/interpreter/dependency_management.html">Dependency Management</a></li> <li><a href="/docs/0.10.1/usage/interpreter/installation.html">Installing Interpreters</a></li> <!--<li><a href="/docs/0.10.1/usage/interpreter/dynamic_loading.html">Dynamic Interpreter Loading (Experimental)</a></li>--> <li><a href="/docs/0.10.1/usage/interpreter/execution_hooks.html">Execution Hooks (Experimental)</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Other Features</span></li> <li><a href="/docs/0.10.1/usage/other_features/publishing_paragraphs.html">Publishing Paragraphs</a></li> <li><a href="/docs/0.10.1/usage/other_features/personalized_mode.html">Personalized Mode</a></li> <li><a href="/docs/0.10.1/usage/other_features/customizing_homepage.html">Customizing Zeppelin Homepage</a></li> <li><a href="/docs/0.10.1/usage/other_features/notebook_actions.html">Notebook Actions</a></li> <li><a href="/docs/0.10.1/usage/other_features/cron_scheduler.html">Cron Scheduler</a></li> <li><a href="/docs/0.10.1/usage/other_features/zeppelin_context.html">Zeppelin Context</a></li> <li role="separator" class="divider"></li> <li class="title"><span>REST API</span></li> <li><a href="/docs/0.10.1/usage/rest_api/interpreter.html">Interpreter API</a></li> <li><a href="/docs/0.10.1/usage/rest_api/zeppelin_server.html">Zeppelin Server API</a></li> <li><a href="/docs/0.10.1/usage/rest_api/notebook.html">Notebook API</a></li> <li><a href="/docs/0.10.1/usage/rest_api/notebook_repository.html">Notebook Repository API</a></li> <li><a href="/docs/0.10.1/usage/rest_api/configuration.html">Configuration API</a></li> <li><a href="/docs/0.10.1/usage/rest_api/credential.html">Credential API</a></li> <li><a href="/docs/0.10.1/usage/rest_api/helium.html">Helium API</a></li> <li class="title"><span>Zeppelin SDK</span></li> <li><a href="/docs/0.10.1/usage/zeppelin_sdk/client_api.html">Client API</a></li> <li><a href="/docs/0.10.1/usage/zeppelin_sdk/session_api.html">Session API</a></li> </ul> </li> <li> <a href="#" data-toggle="dropdown" class="dropdown-toggle">Setup<b class="caret"></b></a> <ul class="dropdown-menu scrollable-menu"> <li class="title"><span>Basics</span></li> <li><a href="/docs/0.10.1/setup/basics/how_to_build.html">How to Build Zeppelin</a></li> <li><a href="/docs/0.10.1/setup/basics/hadoop_integration.html">Hadoop Integration</a></li> <li><a href="/docs/0.10.1/setup/basics/multi_user_support.html">Multi-user Support</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Deployment</span></li> <!--<li><a href="/docs/0.10.1/setup/deployment/docker.html">Docker Image for Zeppelin</a></li>--> <li><a href="/docs/0.10.1/setup/deployment/spark_cluster_mode.html#spark-standalone-mode">Spark Cluster Mode: Standalone</a></li> <li><a href="/docs/0.10.1/setup/deployment/spark_cluster_mode.html#spark-on-yarn-mode">Spark Cluster Mode: YARN</a></li> <li><a href="/docs/0.10.1/setup/deployment/spark_cluster_mode.html#spark-on-mesos-mode">Spark Cluster Mode: Mesos</a></li> <li><a href="/docs/0.10.1/setup/deployment/flink_and_spark_cluster.html">Zeppelin with Flink, Spark Cluster</a></li> <li><a href="/docs/0.10.1/setup/deployment/cdh.html">Zeppelin on CDH</a></li> <li><a href="/docs/0.10.1/setup/deployment/virtual_machine.html">Zeppelin on VM: Vagrant</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Security</span></li> <li><a href="/docs/0.10.1/setup/security/authentication_nginx.html">HTTP Basic Auth using NGINX</a></li> <li><a href="/docs/0.10.1/setup/security/shiro_authentication.html">Shiro Authentication</a></li> <li><a href="/docs/0.10.1/setup/security/notebook_authorization.html">Notebook Authorization</a></li> <li><a href="/docs/0.10.1/setup/security/datasource_authorization.html">Data Source Authorization</a></li> <li><a href="/docs/0.10.1/setup/security/http_security_headers.html">HTTP Security Headers</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Notebook Storage</span></li> <li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-local-git-repository">Git Storage</a></li> <li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-s3">S3 Storage</a></li> <li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-azure">Azure Storage</a></li> <li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-oss">OSS Storage</a></li> <li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-zeppelinhub">ZeppelinHub Storage</a></li> <li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-mongodb">MongoDB Storage</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Operation</span></li> <li><a href="/docs/0.10.1/setup/operation/configuration.html">Configuration</a></li> <li><a href="/docs/0.10.1/setup/operation/proxy_setting.html">Proxy Setting</a></li> <li><a href="/docs/0.10.1/setup/operation/upgrading.html">Upgrading</a></li> <li><a href="/docs/0.10.1/setup/operation/trouble_shooting.html">Trouble Shooting</a></li> </ul> </li> <li> <a href="#" data-toggle="dropdown" class="dropdown-toggle">Interpreter <b class="caret"></b></a> <ul class="dropdown-menu scrollable-menu"> <li class="title"><span>Interpreters</span></li> <li><a href="/docs/0.10.1/usage/interpreter/overview.html">Overview</a></li> <li role="separator" class="divider"></li> <li><a href="/docs/0.10.1/interpreter/spark.html">Spark</a></li> <li><a href="/docs/0.10.1/interpreter/flink.html">Flink</a></li> <li><a href="/docs/0.10.1/interpreter/jdbc.html">JDBC</a></li> <li><a href="/docs/0.10.1/interpreter/python.html">Python</a></li> <li><a href="/docs/0.10.1/interpreter/r.html">R</a></li> <li role="separator" class="divider"></li> <li><a href="/docs/0.10.1/interpreter/alluxio.html">Alluxio</a></li> <li><a href="/docs/0.10.1/interpreter/beam.html">Beam</a></li> <li><a href="/docs/0.10.1/interpreter/bigquery.html">BigQuery</a></li> <li><a href="/docs/0.10.1/interpreter/cassandra.html">Cassandra</a></li> <li><a href="/docs/0.10.1/interpreter/elasticsearch.html">Elasticsearch</a></li> <li><a href="/docs/0.10.1/interpreter/geode.html">Geode</a></li> <li><a href="/docs/0.10.1/interpreter/groovy.html">Groovy</a></li> <li><a href="/docs/0.10.1/interpreter/hazelcastjet.html">Hazelcast Jet</a></li> <li><a href="/docs/0.10.1/interpreter/hbase.html">HBase</a></li> <li><a href="/docs/0.10.1/interpreter/hdfs.html">HDFS</a></li> <li><a href="/docs/0.10.1/interpreter/hive.html">Hive</a></li> <li><a href="/docs/0.10.1/interpreter/ignite.html">Ignite</a></li> <li><a href="/docs/0.10.1/interpreter/influxdb.html">influxDB</a></li> <li><a href="/docs/0.10.1/interpreter/java.html">Java</a></li> <li><a href="/docs/0.10.1/interpreter/jupyter.html">Jupyter</a></li> <li><a href="/docs/0.10.1/interpreter/kotlin.html">Kotlin</a></li> <li><a href="/docs/0.10.1/interpreter/ksql.html">KSQL</a></li> <li><a href="/docs/0.10.1/interpreter/kylin.html">Kylin</a></li> <li><a href="/docs/0.10.1/interpreter/lens.html">Lens</a></li> <li><a href="/docs/0.10.1/interpreter/livy.html">Livy</a></li> <li><a href="/docs/0.10.1/interpreter/mahout.html">Mahout</a></li> <li><a href="/docs/0.10.1/interpreter/markdown.html">Markdown</a></li> <li><a href="/docs/0.10.1/interpreter/mongodb.html">MongoDB</a></li> <li><a href="/docs/0.10.1/interpreter/neo4j.html">Neo4j</a></li> <li><a href="/docs/0.10.1/interpreter/pig.html">Pig</a></li> <li><a href="/docs/0.10.1/interpreter/postgresql.html">Postgresql, HAWQ</a></li> <li><a href="/docs/0.10.1/interpreter/sap.html">SAP</a></li> <li><a href="/docs/0.10.1/interpreter/scalding.html">Scalding</a></li> <li><a href="/docs/0.10.1/interpreter/scio.html">Scio</a></li> <li><a href="/docs/0.10.1/interpreter/shell.html">Shell</a></li> <li><a href="/docs/0.10.1/interpreter/sparql.html">Sparql</a></li> <li><a href="/docs/0.10.1/interpreter/submarine.html">Submarine</a></li> </ul> </li> <li> <a href="#" data-toggle="dropdown" class="dropdown-toggle">More<b class="caret"></b></a> <ul class="dropdown-menu scrollable-menu" style="right: 0; left: auto;"> <li class="title"><span>Extending Zeppelin</span></li> <li><a href="/docs/0.10.1/development/writing_zeppelin_interpreter.html">Writing Zeppelin Interpreter</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Helium (Experimental)</span></li> <li><a href="/docs/0.10.1/development/helium/overview.html">Overview</a></li> <li><a href="/docs/0.10.1/development/helium/writing_application.html">Writing Helium Application</a></li> <li><a href="/docs/0.10.1/development/helium/writing_spell.html">Writing Helium Spell</a></li> <li><a href="/docs/0.10.1/development/helium/writing_visualization_basic.html">Writing Helium Visualization: Basics</a></li> <li><a href="/docs/0.10.1/development/helium/writing_visualization_transformation.html">Writing Helium Visualization: Transformation</a></li> <li role="separator" class="divider"></li> <li class="title"><span>Contributing to Zeppelin</span></li> <li><a href="/docs/0.10.1/setup/basics/how_to_build.html">How to Build Zeppelin</a></li> <li><a href="/docs/0.10.1/development/contribution/useful_developer_tools.html">Useful Developer Tools</a></li> <li><a href="/docs/0.10.1/development/contribution/how_to_contribute_code.html">How to Contribute (code)</a></li> <li><a href="/docs/0.10.1/development/contribution/how_to_contribute_website.html">How to Contribute (website)</a></li> <li role="separator" class="divider"></li> <li class="title"><span>External Resources</span></li> <li><a target="_blank" rel="noopener noreferrer" href="https://zeppelin.apache.org/community.html">Mailing List</a></li> <li><a target="_blank" rel="noopener noreferrer" href="https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Home">Apache Zeppelin Wiki</a></li> <li><a target="_blank" rel="noopener noreferrer" href="http://stackoverflow.com/questions/tagged/apache-zeppelin">Stackoverflow Questions about Zeppelin</a></li> </ul> </li> <li> <a href="/docs/0.10.1/search.html" class="nav-search-link"> <span class="fa fa-search nav-search-icon"></span> </a> </li> </ul> </nav><!--/.navbar-collapse --> </div> </div> <div class="content"> <!--<div class="hero-unit Apache Zeppelin on Spark cluster mode"> <h1></h1> </div> --> <div class="row"> <div class="col-md-12"> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> <h1>Apache Zeppelin on Spark Cluster Mode</h1> <div id="toc"></div> <h2>Overview</h2> <p><a href="http://spark.apache.org/">Apache Spark</a> has supported three cluster manager types(<a href="http://spark.apache.org/docs/latest/spark-standalone.html">Standalone</a>, <a href="http://spark.apache.org/docs/latest/running-on-mesos.html">Apache Mesos</a> and <a href="http://spark.apache.org/docs/latest/running-on-yarn.html">Hadoop YARN</a>) so far. This document will guide you how you can build and configure the environment on 3 types of Spark cluster manager with Apache Zeppelin using <a href="https://www.docker.com/">Docker</a> scripts. So <a href="https://docs.docker.com/engine/installation/">install docker</a> on the machine first.</p> <h2>Spark standalone mode</h2> <p><a href="http://spark.apache.org/docs/latest/spark-standalone.html">Spark standalone</a> is a simple cluster manager included with Spark that makes it easy to set up a cluster. You can simply set up Spark standalone environment with below steps.</p> <blockquote> <p><strong>Note :</strong> Since Apache Zeppelin and Spark use same <code>8080</code> port for their web UI, you might need to change <code>zeppelin.server.port</code> in <code>conf/zeppelin-site.xml</code>.</p> </blockquote> <h3>1. Build Docker file</h3> <p>You can find docker script files under <code>scripts/docker/spark-cluster-managers</code>.</p> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash"><span class="nb">cd</span> <span class="nv">$ZEPPELIN_HOME</span>/scripts/docker/spark-cluster-managers/spark_standalone docker build -t <span class="s2">&quot;spark_standalone&quot;</span> . </code></pre></div> <h3>2. Run docker</h3> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash">docker run -it <span class="se">\</span> -p 8080:8080 <span class="se">\</span> -p 7077:7077 <span class="se">\</span> -p 8888:8888 <span class="se">\</span> -p 8081:8081 <span class="se">\</span> -h sparkmaster <span class="se">\</span> --name spark_standalone <span class="se">\</span> spark_standalone bash<span class="p">;</span> </code></pre></div> <p>Note that <code>sparkmaster</code> hostname used here to run docker container should be defined in your <code>/etc/hosts</code>.</p> <h3>3. Configure Spark interpreter in Zeppelin</h3> <p>Set Spark master as <code>spark://&lt;hostname&gt;:7077</code> in Zeppelin <strong>Interpreters</strong> setting page.</p> <p><img src="/docs/0.10.1/assets/themes/zeppelin/img/docs-img/standalone_conf.png" /></p> <h3>4. Run Zeppelin with Spark interpreter</h3> <p>After running single paragraph with Spark interpreter in Zeppelin, browse <code>https://&lt;hostname&gt;:8080</code> and check whether Spark cluster is running well or not.</p> <p><img src="/docs/0.10.1/assets/themes/zeppelin/img/docs-img/spark_ui.png" /></p> <p>You can also simply verify that Spark is running well in Docker with below command.</p> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash">ps -ef <span class="p">|</span> grep spark </code></pre></div> <h2>Spark on YARN mode</h2> <p>You can simply set up <a href="http://spark.apache.org/docs/latest/running-on-yarn.html">Spark on YARN</a> docker environment with below steps.</p> <blockquote> <p><strong>Note :</strong> Since Apache Zeppelin and Spark use same <code>8080</code> port for their web UI, you might need to change <code>zeppelin.server.port</code> in <code>conf/zeppelin-site.xml</code>.</p> </blockquote> <h3>1. Build Docker file</h3> <p>You can find docker script files under <code>scripts/docker/spark-cluster-managers</code>.</p> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash"><span class="nb">cd</span> <span class="nv">$ZEPPELIN_HOME</span>/scripts/docker/spark-cluster-managers/spark_yarn_cluster docker build -t <span class="s2">&quot;spark_yarn&quot;</span> . </code></pre></div> <h3>2. Run docker</h3> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash">docker run -it <span class="se">\</span> -p 5000:5000 <span class="se">\</span> -p 9000:9000 <span class="se">\</span> -p 9001:9001 <span class="se">\</span> -p 8088:8088 <span class="se">\</span> -p 8042:8042 <span class="se">\</span> -p 8030:8030 <span class="se">\</span> -p 8031:8031 <span class="se">\</span> -p 8032:8032 <span class="se">\</span> -p 8033:8033 <span class="se">\</span> -p 8080:8080 <span class="se">\</span> -p 7077:7077 <span class="se">\</span> -p 8888:8888 <span class="se">\</span> -p 8081:8081 <span class="se">\</span> -p 50010:50010 <span class="se">\</span> -p 50075:50075 <span class="se">\</span> -p 50020:50020 <span class="se">\</span> -p 50070:50070 <span class="se">\</span> --name spark_yarn <span class="se">\</span> -h sparkmaster <span class="se">\</span> spark_yarn bash<span class="p">;</span> </code></pre></div> <p>Note that <code>sparkmaster</code> hostname used here to run docker container should be defined in your <code>/etc/hosts</code>.</p> <h3>3. Verify running Spark on YARN.</h3> <p>You can simply verify the processes of Spark and YARN are running well in Docker with below command.</p> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash">ps -ef </code></pre></div> <p>You can also check each application web UI for HDFS on <code>http://&lt;hostname&gt;:50070/</code>, YARN on <code>http://&lt;hostname&gt;:8088/cluster</code> and Spark on <code>http://&lt;hostname&gt;:8080/</code>.</p> <h3>4. Configure Spark interpreter in Zeppelin</h3> <p>Set following configurations to <code>conf/zeppelin-env.sh</code>.</p> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash"><span class="nb">export </span><span class="nv">HADOOP_CONF_DIR</span><span class="o">=[</span>your_hadoop_conf_path<span class="o">]</span> <span class="nb">export </span><span class="nv">SPARK_HOME</span><span class="o">=[</span>your_spark_home_path<span class="o">]</span> </code></pre></div> <p><code>HADOOP_CONF_DIR</code>(Hadoop configuration path) is defined in <code>/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf</code>.</p> <p>Don&#39;t forget to set Spark <code>spark.master</code> as <code>yarn-client</code> in Zeppelin <strong>Interpreters</strong> setting page like below.</p> <p><img src="/docs/0.10.1/assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png" /></p> <h3>5. Run Zeppelin with Spark interpreter</h3> <p>After running a single paragraph with Spark interpreter in Zeppelin, browse <code>http://&lt;hostname&gt;:8088/cluster/apps</code> and check Zeppelin application is running well or not.</p> <p><img src="/docs/0.10.1/assets/themes/zeppelin/img/docs-img/yarn_applications.png" /></p> <h2>Spark on Mesos mode</h2> <p>You can simply set up <a href="http://spark.apache.org/docs/latest/running-on-mesos.html">Spark on Mesos</a> docker environment with below steps.</p> <h3>1. Build Docker file</h3> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash"><span class="nb">cd</span> <span class="nv">$ZEPPELIN_HOME</span>/scripts/docker/spark-cluster-managers/spark_mesos docker build -t <span class="s2">&quot;spark_mesos&quot;</span> . </code></pre></div> <h3>2. Run docker</h3> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash">docker run --net<span class="o">=</span>host -it <span class="se">\</span> -p 8080:8080 <span class="se">\</span> -p 7077:7077 <span class="se">\</span> -p 8888:8888 <span class="se">\</span> -p 8081:8081 <span class="se">\</span> -p 8082:8082 <span class="se">\</span> -p 5050:5050 <span class="se">\</span> -p 5051:5051 <span class="se">\</span> -p 4040:4040 <span class="se">\</span> -h sparkmaster <span class="se">\</span> --name spark_mesos <span class="se">\</span> spark_mesos bash<span class="p">;</span> </code></pre></div> <p>Note that <code>sparkmaster</code> hostname used here to run docker container should be defined in your <code>/etc/hosts</code>.</p> <h3>3. Verify running Spark on Mesos.</h3> <p>You can simply verify the processes of Spark and Mesos are running well in Docker with below command.</p> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash">ps -ef </code></pre></div> <p>You can also check each application web UI for Mesos on <code>http://&lt;hostname&gt;:5050/cluster</code> and Spark on <code>http://&lt;hostname&gt;:8080/</code>.</p> <h3>4. Configure Spark interpreter in Zeppelin</h3> <div class="highlight"><pre><code class="bash language-bash" data-lang="bash"><span class="nb">export </span><span class="nv">MESOS_NATIVE_JAVA_LIBRARY</span><span class="o">=[</span>PATH OF libmesos.so<span class="o">]</span> <span class="nb">export </span><span class="nv">SPARK_HOME</span><span class="o">=[</span>PATH OF SPARK HOME<span class="o">]</span> </code></pre></div> <p>Don&#39;t forget to set Spark <code>spark.master</code> as <code>mesos://127.0.1.1:5050</code> in Zeppelin <strong>Interpreters</strong> setting page like below.</p> <p><img src="/docs/0.10.1/assets/themes/zeppelin/img/docs-img/zeppelin_mesos_conf.png" /></p> <h3>5. Run Zeppelin with Spark interpreter</h3> <p>After running a single paragraph with Spark interpreter in Zeppelin, browse <code>http://&lt;hostname&gt;:5050/#/frameworks</code> and check Zeppelin application is running well or not.</p> <p><img src="/docs/0.10.1/assets/themes/zeppelin/img/docs-img/mesos_frameworks.png" /></p> <h3>Troubleshooting for Spark on Mesos</h3> <ul> <li>If you have problem with hostname, use <code>--add-host</code> option when executing <code>dockerrun</code></li> </ul> <div class="highlight"><pre><code class="text language-text" data-lang="text">## use `--add-host=moby:127.0.0.1` option to resolve ## since docker container couldn&#39;t resolve `moby` : java.net.UnknownHostException: moby: moby: Name or service not known at java.net.InetAddress.getLocalHost(InetAddress.java:1496) at org.apache.spark.util.Utils$.findLocalInetAddress(Utils.scala:789) at org.apache.spark.util.Utils$.org$apache$spark$util$Utils$$localIpAddress$lzycompute(Utils.scala:782) at org.apache.spark.util.Utils$.org$apache$spark$util$Utils$$localIpAddress(Utils.scala:782) </code></pre></div> <ul> <li>If you have problem with mesos master, try <code>mesos://127.0.0.1</code> instead of <code>mesos://127.0.1.1</code></li> </ul> <div class="highlight"><pre><code class="text language-text" data-lang="text">I0103 20:17:22.329269 340 sched.cpp:330] New master detected at master@127.0.1.1:5050 I0103 20:17:22.330749 340 sched.cpp:341] No credentials provided. Attempting to register without authentication W0103 20:17:22.333531 340 sched.cpp:736] Ignoring framework registered message because it was sentfrom &#39;master@127.0.0.1:5050&#39; instead of the leading master &#39;master@127.0.1.1:5050&#39; W0103 20:17:24.040252 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom &#39;master@127.0.0.1:5050&#39; instead of the leading master &#39;master@127.0.1.1:5050&#39; W0103 20:17:26.150250 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom &#39;master@127.0.0.1:5050&#39; instead of the leading master &#39;master@127.0.1.1:5050&#39; W0103 20:17:26.737604 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom &#39;master@127.0.0.1:5050&#39; instead of the leading master &#39;master@127.0.1.1:5050&#39; W0103 20:17:35.241714 336 sched.cpp:736] Ignoring framework registered message because it was sentfrom &#39;master@127.0.0.1:5050&#39; instead of the leading master &#39;master@127.0.1.1:5050&#39; </code></pre></div> </div> </div> <hr> <footer> <!-- <p>&copy; 2022 The Apache Software Foundation</p>--> </footer> </div> <script type="text/javascript"> (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); ga('create', 'UA-45176241-5', 'zeppelin.apache.org'); ga('require', 'linkid', 'linkid.js'); ga('send', 'pageview'); </script> </body> </html>

Pages: 1 2 3 4 5 6 7 8 9 10