<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -->
<link href="book.css" rel="stylesheet"/>
<img src="knox-logo.gif" alt="Knox"/>
<img src="apache-logo.gif" align="right" alt="Apache"/>
<h1><a id="Apache+Knox+Gateway+2.1.x+Developer's+Guide">Apache Knox Gateway 2.1.x Developer’s Guide</a> <a href="#Apache+Knox+Gateway+2.1.x+Developer's+Guide"><img src="markbook-section-link.png"/></a></h1>
<h2><a id="Table+Of+Contents">Table Of Contents</a> <a href="#Table+Of+Contents"><img src="markbook-section-link.png"/></a></h2>
<ul>
<li><a href="#Overview">Overview</a>
<ul>
<li><a href="#Architecture+Overview">Architecture Overview</a></li>
<li><a href="#Project+Overview">Project Overview</a></li>
<li><a href="#Development+Processes">Development Processes</a></li>
<li><a href="#Docker+Image">Docker Image</a></li>
</ul> </li>
<li><a href="#Behavior">Behavior</a>
<ul>
<li><a href="#Runtime+Behavior">Runtime Behavior</a></li>
<li><a href="#Deployment+Behavior">Deployment Behavior</a></li>
</ul> </li>
<li><a href="#Extension+Logistics">Extension Logistics</a>
<ul>
<li><a href="#Providers">Providers</a></li>
<li><a href="#Services">Services</a></li>
<li><a href="#Service+Discovery">Service Discovery</a></li>
</ul> </li>
<li><a href="#Standard+Providers">Standard Providers</a>
<ul>
<li><a href="#Rewrite+Provider">Rewrite Provider</a></li>
</ul> </li>
<li><a href="#Gateway+Services">Gateway Services</a></li>
<li><a href="#KnoxSSO+Integration">KnoxSSO Integration</a></li>
<li><a href="#Health+Monitoring+API">Health Monitoring API</a></li>
<li><a href="#Auditing">Auditing</a></li>
<li><a href="#Logging">Logging</a></li>
<li><a href="#Internationalization">Internationalization</a></li>
<li><a href="#Admin+UI">Admin UI</a></li>
</ul>
<h2><a id="Overview">Overview</a> <a href="#Overview"><img src="markbook-section-link.png"/></a></h2>
<p>The Apache Knox gateway is a specialized reverse proxy gateway for various Hadoop REST APIs. However, the gateway is built entirely upon a fairly generic framework. This framework is used to “plug in” all of the behavior that makes it specific to Hadoop in general and to any particular Hadoop REST API. It would be equally possible to create a customized reverse proxy for other non-Hadoop HTTP endpoints. This approach is taken to ensure that the Apache Knox gateway can scale with the rapidly evolving Hadoop ecosystem.</p>
<p>Throughout this guide we will be using a publicly available REST API to demonstrate the development of various extension mechanisms: <a href="http://openweathermap.org/">http://openweathermap.org/</a></p>
<h3><a id="Architecture+Overview">Architecture Overview</a> <a href="#Architecture+Overview"><img src="markbook-section-link.png"/></a></h3>
<p>The gateway itself is a layer over an embedded Jetty JEE server.
At the very highest level, the gateway processes requests by using the request URL to look up a specific JEE Servlet Filter chain that is used to process the request. The gateway framework provides extensible mechanisms to assemble chains of custom filters that support secured access to services.</p>
<p>The gateway has two primary extensibility mechanisms: Service and Provider. The Service extensibility framework provides a way to add support for new HTTP/REST endpoints. For example, the support for WebHdfs is plugged into the Knox gateway as a Service. The Provider extensibility framework allows adding new features to the gateway that can be used across Services. An example of a Provider is an authentication provider. Providers can also expose APIs that other service and provider extensions can utilize.</p>
<p>Service and Provider integrations interact with the gateway framework in two distinct phases: Deployment and Runtime. The gateway framework can be thought of as a layer over the JEE Servlet framework. Specifically, all runtime processing within the gateway is performed by JEE Servlet Filters. The two phases interact with this JEE Servlet Filter based model in very different ways. The first phase, Deployment, is responsible for converting fairly simple-to-understand configuration, called a topology, into JEE Web Archive (WAR) based implementation details. The second phase, Runtime, is the processing of requests via the set of Filters configured in the WAR.</p>
<p>From an “ethos” perspective, Service and Provider extensions should attempt to incur the complexity associated with configuration in the deployment phase. This allows for streamlined request processing that is high performance and easily testable. The preference at runtime, in OO style, is for small classes that perform a specific function. The ideal set of implementation classes are then assembled by the Service and Provider plugins during deployment.</p>
<p>A second critical design consideration is streaming. The processing infrastructure is built around JEE Servlet Filters as they provide a natural streaming interception model. All Provider implementations should make every attempt to maintain this streaming characteristic.</p>
<h3><a id="Project+Overview">Project Overview</a> <a href="#Project+Overview"><img src="markbook-section-link.png"/></a></h3>
<p>The table below describes the purpose of the current modules in the project. Of particular importance are the root pom.xml and the gateway-release module. The root pom.xml is critical because this is where all dependency versions must be declared. There should be no dependency version information in module pom.xml files. The gateway-release module is critical because the dependencies declared there essentially define the classpath of the released gateway server. This is also true of the other -release modules in the project.</p>
<table>
<thead>
<tr> <th>File/Module </th> <th>Description </th> </tr>
</thead>
<tbody>
<tr> <td>LICENSE </td> <td>The license for all source files in the release. </td> </tr>
<tr> <td>NOTICE </td> <td>Attributions required by dependencies. </td> </tr>
<tr> <td>README </td> <td>A brief overview of the Knox project. </td> </tr>
<tr> <td>CHANGES </td> <td>A description of the changes for each release. </td> </tr>
<tr> <td>ISSUES </td> <td>The Knox issues for the current release. </td> </tr>
<tr> <td>gateway-util-common </td> <td>Common low-level utilities used by many modules.
</td> </tr>
<tr> <td>gateway-util-launcher </td> <td>The launcher framework. </td> </tr>
<tr> <td>gateway-util-urltemplate </td> <td>The URL template and rewrite utilities. </td> </tr>
<tr> <td>gateway-i18n </td> <td>The i18n logging and resource framework. </td> </tr>
<tr> <td>gateway-i18n-logging-log4j </td> <td>The integration of i18n logging with Log4j. </td> </tr>
<tr> <td>gateway-i18n-logging-sl4j </td> <td>The integration of i18n logging with SLF4J. </td> </tr>
<tr> <td>gateway-spi </td> <td>The SPI for service and provider extensions. </td> </tr>
<tr> <td>gateway-provider-identity-assertion-common </td> <td>The identity assertion provider base. </td> </tr>
<tr> <td>gateway-provider-identity-assertion-concat </td> <td>An identity assertion provider that facilitates prefix and suffix concatenation. </td> </tr>
<tr> <td>gateway-provider-identity-assertion-pseudo </td> <td>The default identity assertion provider. </td> </tr>
<tr> <td>gateway-provider-jersey </td> <td>The Jersey display provider. </td> </tr>
<tr> <td>gateway-provider-rewrite </td> <td>The URL rewrite provider. </td> </tr>
<tr> <td>gateway-provider-rewrite-func-hostmap-static </td> <td>Host mapping function extension to rewrite. </td> </tr>
<tr> <td>gateway-provider-rewrite-func-service-registry </td> <td>Service registry function extension to rewrite. </td> </tr>
<tr> <td>gateway-provider-rewrite-step-secure-query </td> <td>Crypto step extension to rewrite. </td> </tr>
<tr> <td>gateway-provider-security-authz-acls </td> <td>Service level authorization. </td> </tr>
<tr> <td>gateway-provider-security-jwt </td> <td>JSON Web Token utilities. </td> </tr>
<tr> <td>gateway-provider-security-preauth </td> <td>Preauthenticated SSO header support. </td> </tr>
<tr> <td>gateway-provider-security-shiro </td> <td>Shiro authentication integration. </td> </tr>
<tr> <td>gateway-provider-security-webappsec </td> <td>Filters to prevent common webapp security issues. </td> </tr>
<tr> <td>gateway-service-as </td> <td>The implementation of the Access service POC. </td> </tr>
<tr> <td>gateway-service-definitions </td> <td>The implementation of the Service definition and rewrite files. </td> </tr>
<tr> <td>gateway-service-hbase </td> <td>The implementation of the HBase service. </td> </tr>
<tr> <td>gateway-service-hive </td> <td>The implementation of the Hive service. </td> </tr>
<tr> <td>gateway-service-oozie </td> <td>The implementation of the Oozie service. </td> </tr>
<tr> <td>gateway-service-tgs </td> <td>The implementation of the Ticket Granting service POC. </td> </tr>
<tr> <td>gateway-service-webhdfs </td> <td>The implementation of the WebHdfs service. </td> </tr>
<tr> <td>gateway-discovery-ambari </td> <td>The Ambari service URL discovery implementation. </td> </tr>
<tr> <td>gateway-service-remoteconfig </td> <td>The implementation of the RemoteConfigurationRegistryClientService. </td> </tr>
<tr> <td>gateway-server </td> <td>The implementation of the Knox gateway server. </td> </tr>
<tr> <td>gateway-shell </td> <td>The implementation of the Knox Groovy shell. </td> </tr>
<tr> <td>gateway-test-ldap </td> <td>Pulls in all of the dependencies of the test LDAP server. </td> </tr>
<tr> <td>gateway-server-launcher </td> <td>The launcher definition for the gateway. </td> </tr>
<tr> <td>gateway-shell-launcher </td> <td>The launcher definition for the shell. </td> </tr>
<tr> <td>knox-cli-launcher </td> <td>A module to pull in all of the dependencies of the CLI.
</td> </tr>
<tr> <td>gateway-test-ldap-launcher </td> <td>The launcher definition for the test LDAP server. </td> </tr>
<tr> <td>gateway-release </td> <td>The definition of the gateway binary release. Contains content and dependencies to be included in the binary gateway package. </td> </tr>
<tr> <td>gateway-test-utils </td> <td>Various utilities used in unit and system tests. </td> </tr>
<tr> <td>gateway-test </td> <td>The functional tests. </td> </tr>
<tr> <td>pom.xml </td> <td>The top level pom. </td> </tr>
<tr> <td>build.xml </td> <td>A collection of utilities for building and releasing. </td> </tr>
</tbody>
</table>
<h3><a id="Development+Processes">Development Processes</a> <a href="#Development+Processes"><img src="markbook-section-link.png"/></a></h3>
<p>The project uses Maven in general with a few convenience Ant targets.</p>
<p>The project can be built via either Maven or Ant. The two commands below are equivalent.</p>
<pre><code>mvn clean install
ant
</code></pre>
<p>A more complete build can be done that also generates the unsigned ZIP release artifacts. You will find these in the target/{version} directory (e.g. target/0.XX.0-SNAPSHOT).</p>
<pre><code>mvn -Ppackage clean install
ant release
</code></pre>
<p>There are a few other Ant targets that are especially convenient for testing.</p>
<p>This command installs the gateway into the <code>install</code> directory of the project. Note that this command does not first build the project.</p>
<pre><code>ant install-test-home
</code></pre>
<p>This command starts the gateway and LDAP servers installed by the command above in a test GATEWAY_HOME (i.e. install). Note that this command does not first install the test home.</p>
<pre><code>ant start-test-servers
</code></pre>
<p>Putting things together, the following Ant command will build a release, install it, and start the servers ready for manual testing.</p>
<pre><code>ant release install-test-home start-test-servers
</code></pre>
<h3><a id="Docker+Image">Docker Image</a> <a href="#Docker+Image"><img src="markbook-section-link.png"/></a></h3>
<p>Apache Knox ships with a <code>docker</code> Maven module that will build a Docker image. To build the Knox Docker image, you must have Docker running on your machine. The following Maven command will build Knox and package it into a Docker image.</p>
<pre><code>mvn -Ppackage,release,docker clean package
</code></pre>
<p>This will build two Docker images:</p>
<ul>
<li><code>apache/knox:gateway-2.1.0-SNAPSHOT</code></li>
<li><code>apache/knox:ldap-2.1.0-SNAPSHOT</code></li>
</ul>
<p>The <code>gateway</code> image will use an entrypoint to start Knox Gateway. The <code>ldap</code> image will use an entrypoint to start Knox Demo LDAP.</p>
<p>An example of using the Docker images would be the following:</p>
<pre><code>docker run -d --name knox-ldap -p 33389:33389 apache/knox:ldap-2.1.0-SNAPSHOT
docker run -d --name knox-gateway -p 8443:8443 apache/knox:gateway-2.1.0-SNAPSHOT
</code></pre>
<p>Using docker-compose, that would look like this:</p>
<pre><code>docker-compose -f gateway-docker/src/main/resources/docker-compose.yml up
</code></pre>
<p>The images are designed to be a base that can be built on to add your own providers, descriptors, and topologies as necessary.</p>
<h2><a id="Behavior">Behavior</a> <a href="#Behavior"><img src="markbook-section-link.png"/></a></h2>
<p>There are two distinct phases in the behavior of the gateway. These are the deployment and runtime phases.
The deployment phase is responsible for converting topology descriptors into an executable JEE-style WAR. The runtime phase is the processing of requests via the WAR created during the deployment phase.</p>
<p>The deployment phase is arguably the more complex of the two phases. This is because runtime relies on well-known JEE constructs while deployment introduces new framework concepts. The base concept of the deployment framework is that of a “contributor”. In the framework, contributors are pluggable components responsible for generating JEE WAR artifacts from topology files.</p>
<h3><a id="Deployment+Behavior">Deployment Behavior</a> <a href="#Deployment+Behavior"><img src="markbook-section-link.png"/></a></h3>
<p>The goal of the deployment phase is to take easy-to-understand topology descriptions and convert them into optimized runtime artifacts. The goal is not only for the topology descriptors to be easy to understand, but also for them to be easy for a management system (e.g. Ambari) to generate. Think of deployment as compiling an assembly descriptor into a JEE WAR. WARs are then deployed to an embedded JEE container (i.e. Jetty).</p>
<p>Consider the results of starting the gateway for the first time. There are two sets of files that are relevant for deployment. The first is the topology file <code><GATEWAY_HOME>/conf/topologies/sandbox.xml</code>. The second is the WAR structure created during the deployment of the topology file.</p>
<pre><code>data/deployments/sandbox.war.143bfef07f0/WEB-INF
  web.xml
  gateway.xml
  shiro.ini
  rewrite.xml
  hostmap.txt
</code></pre>
<p>Notice that the directory <code>sandbox.war.143bfef07f0</code> is an “unzipped” representation of a JEE WAR file. This specifically means that it contains a <code>WEB-INF</code> directory which contains a <code>web.xml</code> file. For the curious, the strange number (i.e. 143bfef07f0) in the name of the WAR directory is an encoded timestamp. This is the timestamp of the topology file (i.e. sandbox.xml) at the time the deployment occurred. This value is used to determine when topology files have changed and redeployment is required.</p>
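<p>As an aside, the encoding can be checked with a couple of lines of Java. This is only a sketch based on the assumption that the suffix is the topology file’s last-modified time rendered as hexadecimal epoch milliseconds; verify against your own deployment directory.</p>
<pre><code class="java">import java.time.Instant;

public class WarSuffixDecoder {
  public static void main( String[] args ) {
    // Assumption: the suffix is Long.toHexString( topologyFile.lastModified() ).
    long millis = Long.parseLong( "143bfef07f0", 16 );
    // For this particular suffix this prints an instant in early 2014.
    System.out.println( Instant.ofEpochMilli( millis ) );
  }
}
</code></pre>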
<p>Here is a brief overview of the purpose of each file in the WAR structure.</p>
<dl>
<dt>web.xml </dt> <dd>A standard JEE WAR descriptor. In this case a built-in GatewayServlet is mapped to the URL pattern /*.</dd>
<dt>gateway.xml </dt> <dd>The configuration file for the GatewayServlet. Defines the filter chain that will be applied to each service’s various URLs.</dd>
<dt>shiro.ini </dt> <dd>The configuration file for the Shiro authentication provider’s filters. This information is derived from the information in the provider section of the topology file.</dd>
<dt>rewrite.xml </dt> <dd>The configuration file for the rewrite provider’s filter. This captures all of the rewrite rules for the services. These rules are contributed by the contributors for each service.</dd>
<dt>hostmap.txt </dt> <dd>The configuration file for the hostmap provider’s filter. This information is derived from the information in the provider section of the topology file.</dd>
</dl>
<p>The deployment framework follows “visitor” style patterns. Each topology file is parsed and the various constructs within it are “visited”. The appropriate contributor for each visited construct is selected by the framework. The contributor is then passed the construct from the topology file and asked to update the JEE WAR artifacts. Each contributor is free to inspect and modify any portion of the WAR artifacts.</p>
<p>The diagram below provides an overview of the deployment processing. Detailed descriptions of each step follow the diagram.</p>
<img src='deployment-overview.png'/>
<ol>
<li> <p>The gateway server loads a topology file from conf/topologies into an internal structure.</p></li>
<li> <p>The gateway server delegates to a deployment factory to create the JEE WAR structure.</p></li>
<li> <p>The deployment factory first creates a basic WAR structure with WEB-INF/web.xml.</p></li>
<li> <p>Each provider and service in the topology is visited and the appropriate deployment contributor invoked. Each contributor is passed the appropriate information from the topology and modifies the WAR structure.</p></li>
<li> <p>A complete WAR structure is returned to the gateway server.</p></li>
<li> <p>The gateway server uses internal container APIs to dynamically deploy the WAR.</p></li>
</ol>
<p>The Java method below is the actual code from the DeploymentFactory that implements this behavior. You will note the initialize, contribute, finalize sequence. Each contributor is given three opportunities to interact with the topology and archive. This allows the various contributors to interact if required. For example, the service contributors use the deployment descriptor added to the WAR by the rewrite provider.</p>
<pre><code class="java">public static WebArchive createDeployment( GatewayConfig config, Topology topology ) {
  Map<String,List<ProviderDeploymentContributor>> providers;
  Map<String,List<ServiceDeploymentContributor>> services;
  DeploymentContext context;

  providers = selectContextProviders( topology );
  services = selectContextServices( topology );
  context = createDeploymentContext( config, topology.getName(), topology, providers, services );

  initialize( context, providers, services );
  contribute( context, providers, services );
  finalize( context, providers, services );

  return context.getWebArchive();
}
</code></pre>
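<p>To make the three-pass sequence concrete, the initialize pass conceptually amounts to the loop below. This is a simplified sketch, assuming both contributor types expose the initializeContribution method shown elsewhere in this guide; the actual DeploymentFactory code also handles default providers and error cases.</p>
<pre><code class="java">// Simplified sketch of the initialize pass; not the actual implementation.
private static void initialize( DeploymentContext context,
    Map<String,List<ProviderDeploymentContributor>> providers,
    Map<String,List<ServiceDeploymentContributor>> services ) {
  for( List<ProviderDeploymentContributor> providerList : providers.values() ) {
    for( ProviderDeploymentContributor provider : providerList ) {
      provider.initializeContribution( context );
    }
  }
  for( List<ServiceDeploymentContributor> serviceList : services.values() ) {
    for( ServiceDeploymentContributor service : serviceList ) {
      service.initializeContribution( context );
    }
  }
}
</code></pre>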
<p>Below is a diagram that provides more detail. This diagram focuses on the interactions between the deployment factory and the service deployment contributors. Detailed descriptions of each step follow the diagram.</p>
<img src='deployment-service.png'/>
<ol>
<li> <p>The gateway server loads global configuration (i.e. <GATEWAY_HOME>/conf/gateway-site.xml).</p></li>
<li> <p>The gateway server loads a topology descriptor file.</p></li>
<li> <p>The gateway server delegates to the deployment factory to create a deployable WAR structure.</p></li>
<li> <p>The deployment factory creates a runtime descriptor to configure the gateway servlet.</p></li>
<li> <p>The deployment factory creates a basic WAR structure and adds the gateway servlet runtime descriptor to it.</p></li>
<li> <p>The deployment factory creates a deployment context object and adds the WAR structure to it.</p></li>
<li> <p>For each service defined in the topology descriptor file the appropriate service deployment contributor is selected and invoked. The correct service deployment contributor is determined by matching the role of a service in the topology descriptor to a value provided by the getRole() method of the ServiceDeploymentContributor interface. The initializeContribution method from <em>each</em> service identified in the topology is called. Each service deployment contributor is expected to set up any runtime artifacts in the WAR that other services or providers may need.</p></li>
<li> <p>The contributeService method from <em>each</em> service identified in the topology is called. This is where the service deployment contributors will modify any runtime descriptors.</p></li>
<li> <p>One of the ways that a service deployment contributor can modify the runtime descriptors is by asking the framework to contribute filters. This is how services are loosely coupled to the providers of features. For example, a service deployment contributor might ask the framework to contribute the filters required for authorization. The deployment framework will then delegate to the correct provider deployment contributor to add filters for that feature.</p></li>
<li> <p>Finally, the finalizeContribution method for each service is invoked. This provides an opportunity to react to anything done via the contributeService invocations and tie up any loose ends.</p></li>
<li> <p>The populated WAR is returned to the gateway server.</p></li>
</ol>
<p>The following diagram provides expanded detail on the behavior of provider deployment contributors. Much of the beginning and end of the sequence shown overlaps with the service deployment sequence above. Those steps (i.e. 1-6, 17) will not be described below for brevity. The remaining steps have detailed descriptions following the diagram.</p>
<img src='deployment-provider.png'/>
<ol>
<li> <p>For each provider the appropriate provider deployment contributor is selected and invoked. The correct provider deployment contributor is determined by first matching the role of a provider in the topology descriptor to a value provided by the getRole() method of the ProviderDeploymentContributor interface. If this is ambiguous, the name from the topology is used to match the value provided by the getName() method of the ProviderDeploymentContributor interface. The initializeContribution method from <em>each</em> provider identified in the topology is called. Each provider deployment contributor is expected to set up any runtime artifacts in the WAR that other services or providers may need. Note: In addition, other providers not explicitly referenced in the topology may have their initializeContribution method called. If this is the case, only one default instance for each role declared via the getRole() method will be used. The method used to determine the default instance is non-deterministic, so it is best to select a particular named instance of a provider for each role.</p></li>
<li> <p>Each provider deployment contributor will typically add any runtime deployment descriptors it requires for operation. These descriptors are added to the WAR structure within the deployment context.</p></li>
<li> <p>The contributeProvider method of each configured or default provider deployment contributor is invoked.</p></li>
<li> <p>Each provider deployment contributor populates any runtime deployment descriptors based on information in the topology.</p></li>
<li> <p>Provider deployment contributors are never asked to contribute to the deployment directly. Instead, a service deployment contributor will ask to have a particular provider role (e.g.
authentication) contribute to the deployment.</p></li>
<li> <p>A service deployment contributor asks the framework to contribute filters for a given provider role.</p></li>
<li> <p>The framework selects the appropriate provider deployment contributor and invokes its contributeFilter method.</p></li>
<li> <p>During this invocation the provider deployment contributor populates service-specific information. In particular it will add filters to the gateway servlet’s runtime descriptor by adding JEE Servlet Filters. These filters will be added to the resources (or URLs) identified by the service deployment contributor.</p></li>
<li> <p>The finalizeContribution method of all referenced and default provider deployment contributors is invoked.</p></li>
<li> <p>The provider deployment contributor is expected to perform any final modifications to the runtime descriptors in the WAR structure.</p></li>
</ol>
<h3><a id="Runtime+Behavior">Runtime Behavior</a> <a href="#Runtime+Behavior"><img src="markbook-section-link.png"/></a></h3>
<p>The runtime behavior of the gateway is somewhat simpler as it more or less follows well-known JEE models. There is one significant wrinkle: the filter chains are managed within the GatewayServlet as opposed to being managed by the JEE container. This is the result of an early decision made in the project. The intention is to allow more powerful URL matching than is provided by the JEE Servlet mapping mechanisms.</p>
<p>The diagram below provides a high level overview of the runtime processing. An explanation for each step is provided after the diagram.</p>
<img src='runtime-overview.png'/>
<ol>
<li> <p>A REST client makes an HTTP request that is received by the embedded JEE container.</p></li>
<li> <p>A filter chain is looked up in a map of URLs to filter chains.</p></li>
<li> <p>The filter chain, which is itself a filter, is invoked.</p></li>
<li> <p>Each filter invokes the filters that follow it in the chain. The request and response objects can be wrapped in typical JEE Filter fashion. A filter may also decline to continue chain processing and return, if that is appropriate.</p></li>
<li> <p>Eventually the last filter in the chain is invoked. Typically this is a special “dispatch” filter that is responsible for dispatching the request to the ultimate endpoint. Dispatch filters are also responsible for reading the response.</p></li>
<li> <p>The response may be in the form of a number of content types (e.g. application/json, text/xml).</p></li>
<li> <p>The response entity is streamed through the various response wrappers added by the filters. These response wrappers may rewrite various portions of the headers and body as per their configuration.</p></li>
<li> <p>The return of the response entity to the client is ultimately “pulled through” the filter response wrapper by the container.</p></li>
<li> <p>The response entity is returned to the original client.</p></li>
</ol>
<p>This diagram provides a more detailed breakdown of the request processing. Again, descriptions of each step follow the diagram.</p>
<img src='runtime-request-processing.png'/>
<ol>
<li> <p>A REST client makes an HTTP request that is received by the embedded JEE container.</p></li>
<li> <p>The embedded container looks up the servlet mapped to the URL and invokes the service method. In our case the GatewayServlet is mapped to /* and therefore receives all requests for a given topology.
Keep in mind that the WAR itself is deployed on a root context path that typically contains a level for the gateway and the name of the topology. This means that there is a single GatewayServlet per topology and it is effectively mapped to <gateway>/<topology>/*.</p></li>
<li> <p>The GatewayServlet holds a single reference to a GatewayFilter, which is a specialized JEE Servlet Filter. This choice was made to allow the GatewayServlet to dynamically deploy modified topologies. This is done by building a new GatewayFilter instance and replacing the old one in an atomic fashion.</p></li>
<li> <p>The GatewayFilter contains another layer of URL mapping as defined in the gateway.xml runtime descriptor. The various service deployment contributors added these mappings at deployment time. Each service may add a number of different sub-URLs depending on its requirements. These sub-URLs will all be mapped to independently configured filter chains.</p></li>
<li> <p>The GatewayFilter invokes the doFilter method on the selected chain.</p></li>
<li> <p>The chain invokes the doFilter method of the first filter in the chain.</p></li>
<li> <p>Each filter in the chain continues processing by invoking the doFilter on the next filter in the chain. Ultimately a dispatch filter forwards the request to the real service instead of invoking another filter. This is sometimes referred to as pivoting.</p></li>
</ol>
<h2><a id="Gateway+Servlet+&+Gateway+Filter">Gateway Servlet & Gateway Filter</a> <a href="#Gateway+Servlet+&+Gateway+Filter"><img src="markbook-section-link.png"/></a></h2>
<p>TODO</p>
<pre><code class="xml"><web-app>
  <servlet>
    <servlet-name>sample</servlet-name>
    <servlet-class>org.apache.knox.gateway.GatewayServlet</servlet-class>
    <init-param>
      <param-name>gatewayDescriptorLocation</param-name>
      <param-value>gateway.xml</param-value>
    </init-param>
  </servlet>
  <servlet-mapping>
    <servlet-name>sample</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
  <listener>
    <listener-class>org.apache.knox.gateway.services.GatewayServicesContextListener</listener-class>
  </listener>
  ...
</web-app>
</code></pre>
<pre><code class="xml"><gateway>
  <resource>
    <role>WEATHER</role>
    <pattern>/weather/**?**</pattern>
    <filter>
      <role>authentication</role>
      <name>sample</name>
      <class>...</class>
    </filter>
    <filter>...</filter>*
  </resource>
</gateway>
</code></pre>
<pre><code class="java">@Test
public void testDevGuideSample() throws Exception {
  Template pattern, input;
  Matcher<String> matcher;
  Matcher<String>.Match match;

  // GET http://api.openweathermap.org/data/2.5/weather?q=Palo+Alto
  pattern = Parser.parse( "/weather/**?**" );
  input = Parser.parse( "/weather/2.5?q=Palo+Alto" );

  matcher = new Matcher<String>();
  matcher.add( pattern, "fake-chain" );
  match = matcher.match( input );

  assertThat( match.getValue(), is( "fake-chain") );
}
</code></pre>
<h2><a id="Extension+Logistics">Extension Logistics</a> <a href="#Extension+Logistics"><img src="markbook-section-link.png"/></a></h2>
<p>There are a number of extension points available in the gateway: services, providers, rewrite steps and functions, etc. All of these use the Java ServiceLoader mechanism for their discovery. There are two ways to make these extensions available on the class path at runtime. The first way is to add a new module to the project and have the extension “built-in”. The second is to add the extension to the class path of the server after it is installed.
Both mechanisms are described in more detail below.</p>
<h3><a id="Service+Loaders">Service Loaders</a> <a href="#Service+Loaders"><img src="markbook-section-link.png"/></a></h3>
<p>Extensions are discovered via Java’s <a href="http://docs.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html">Service Loader</a> mechanism. There are good <a href="http://docs.oracle.com/javase/tutorial/ext/basics/spi.html">tutorials</a> available for learning more about this. The basics come down to two things.</p>
<ol>
<li> <p>Implement the service contract interface (e.g. ServiceDeploymentContributor, ProviderDeploymentContributor).</p></li>
<li> <p>Create a file in META-INF/services of the JAR that will contain the extension. This file will be named as the fully qualified name of the contract interface (e.g. org.apache.knox.gateway.deploy.ProviderDeploymentContributor). The contents of the file will be the fully qualified names of any implementations of that contract interface in that JAR. An example is shown after this list.</p></li>
</ol>
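<p>For instance, a JAR containing the WeatherDeploymentContributor developed later in this guide might register it with a file like the following. The package name here is purely illustrative; use the fully qualified name of your actual implementation class.</p>
<pre><code># File: META-INF/services/org.apache.knox.gateway.deploy.ServiceDeploymentContributor
org.apache.knox.gateway.weather.WeatherDeploymentContributor
</code></pre>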
<p>One tip is to include a simple test with each of your extensions to ensure that it will be properly discovered. This is very helpful in situations where a refactoring fails to update a class name in the META-INF/services files. An example of one such test from the project is shown below.</p>
<pre><code class="java">@Test
public void testServiceLoader() throws Exception {
  ServiceLoader loader = ServiceLoader.load( ProviderDeploymentContributor.class );
  Iterator iterator = loader.iterator();
  assertThat( "Service iterator empty.", iterator.hasNext() );
  while( iterator.hasNext() ) {
    Object object = iterator.next();
    if( object instanceof ShiroDeploymentContributor ) {
      return;
    }
  }
  fail( "Failed to find " + ShiroDeploymentContributor.class.getName() + " via service loader." );
}
</code></pre>
<h3><a id="Class+Path">Class Path</a> <a href="#Class+Path"><img src="markbook-section-link.png"/></a></h3>
<p>One way to extend the functionality of the server without having to recompile is to add the extension JARs to the server’s class path. As an extensible server this is made straightforward, but it requires some understanding of how the server’s class path is set up. In the <GATEWAY_HOME> directory there are four class path related directories (i.e. bin, lib, dep, ext).</p>
<p>The bin directory contains very small “launcher” jars that contain only enough code to read configuration and set up a class path. By default the configuration of a launcher is embedded within the launcher JAR but it may also be extracted into a .cfg file. In that file you will see how the class path is defined.</p>
<pre><code>class.path=../lib/*.jar,../dep/*.jar;../ext;../ext/*.jar
</code></pre>
<p>The paths are all relative to the directory that contains the launcher JAR.</p>
<dl>
<dt>../lib/*.jar </dt> <dd>These are the “built-in” jars that are part of the project itself. Information is provided elsewhere in this document for how to integrate a built-in extension.</dd>
<dt>../dep/*.jar </dt> <dd>These are the JARs for all of the external dependencies of the project. This separation between the generated JARs and dependencies helps keep licensing issues straight.</dd>
<dt>../ext </dt> <dd>This directory is for post-install extensions and is empty by default. Including the directory (vs *.jar) allows for individual classes to be placed in this directory.</dd>
<dt>../ext/*.jar </dt> <dd>This would pick up all extension JARs placed in the ext directory.</dd>
</dl>
<p>Note that order is significant. The lib JARs take precedence over dep JARs and they take precedence over ext classes and JARs.</p>
<h3><a id="Maven+Module">Maven Module</a> <a href="#Maven+Module"><img src="markbook-section-link.png"/></a></h3>
<p>Integrating an extension into the project follows well established Maven patterns for adding modules. Below are several points that are somewhat unique to the Knox project.</p>
<ol>
<li> <p>Add the module to the root pom.xml file’s <modules> list. Take care to ensure that the module is in the correct place in the list based on its dependencies. Note: In general modules should not have non-test dependencies on gateway-server but rather gateway-spi.</p></li>
<li> <p>Any new dependencies must be represented in the root pom.xml file’s <dependencyManagement> section. The required version of the dependencies will be declared there. The new sub-module’s pom.xml file must not include dependency version information. This helps prevent dependency version conflict issues.</p></li>
<li> <p>If the extension is to be “built into” the released gateway server it needs to be added as a dependency to the gateway-release module. This is done by adding it to the <dependencies> section of the gateway-release pom.xml file. If this isn’t done the JARs for the module will not be automatically packaged into the release artifacts. This can be useful while an extension is under development but not yet ready for inclusion in the release.</p></li>
</ol>
<p>More detailed examples of adding both a service and a provider extension are provided in subsequent sections.</p>
<h3><a id="Services">Services</a> <a href="#Services"><img src="markbook-section-link.png"/></a></h3>
<p>Services are extensions that are responsible for converting information in the topology file to runtime descriptors. Typically services do not require their own runtime descriptors. Rather, they modify either the gateway runtime descriptor (i.e. gateway.xml) or the descriptors of other providers (e.g. rewrite.xml).</p>
<p>The service provider interface for a Service is ServiceDeploymentContributor and is shown below.</p>
<pre><code class="java">package org.apache.knox.gateway.deploy;

import org.apache.knox.gateway.topology.Service;

public interface ServiceDeploymentContributor {
  String getRole();
  void initializeContribution( DeploymentContext context );
  void contributeService( DeploymentContext context, Service service ) throws Exception;
  void finalizeContribution( DeploymentContext context );
}
</code></pre>
<p>Each service provides an implementation of this interface that is discovered via the ServiceLoader mechanism previously described. The meaning of this is best understood in the context of the structure of the topology file. A fragment of a topology file is shown below.</p>
<pre><code class="xml"><topology>
  <gateway>
    ....
  </gateway>
  <service>
    <role>WEATHER</role>
    <url>http://api.openweathermap.org/data</url>
  </service>
  ....
</topology>
</code></pre>
<p>With these two things in mind, a more detailed description of the purpose of each ServiceDeploymentContributor method should be helpful.</p>
<dl>
<dt>String getRole(); </dt> <dd>This is the value the framework uses to associate a given <code><service><role></code> with a particular ServiceDeploymentContributor implementation. See below how the example WeatherDeploymentContributor implementation returns the role WEATHER that matches the value in the topology file.
This will result in the WeatherDeploymentContributor’s methods being invoked when a WEATHER service is encountered in the topology file.</dd>
</dl>
<pre><code class="java">public class WeatherDeploymentContributor extends ServiceDeploymentContributorBase {

  private static final String ROLE = "WEATHER";

  @Override
  public String getRole() {
    return ROLE;
  }

  ...
}
</code></pre>
<dl>
<dt>void initializeContribution( DeploymentContext context ); </dt> <dd>In this method a contributor would create, initialize and add any descriptors it is responsible for to the deployment context. For the weather service example this isn’t required so the empty method isn’t shown here.</dd>
<dt>void contributeService( DeploymentContext context, Service service ) throws Exception; </dt> <dd>In this method a service contributor typically adds and configures any features it requires. This method will be dissected in more detail below.</dd>
<dt>void finalizeContribution( DeploymentContext context ); </dt> <dd>In this method a contributor would finalize any descriptors it added to the deployment context. For the weather service example this isn’t required so the empty method isn’t shown here.</dd>
</dl>
<h4><a id="Service+Contribution+Behavior">Service Contribution Behavior</a> <a href="#Service+Contribution+Behavior"><img src="markbook-section-link.png"/></a></h4>
<p>In order to understand the job of the ServiceDeploymentContributor a few runtime descriptors need to be introduced.</p>
<dl>
<dt>Gateway Runtime Descriptor: WEB-INF/gateway.xml </dt> <dd>This runtime descriptor controls the behavior of the GatewayFilter. It defines a mapping between resources (i.e. URL patterns) and filter chains. The sample gateway runtime descriptor helps illustrate.</dd>
</dl>
<pre><code class="xml"><gateway>
  <resource>
    <role>WEATHER</role>
    <pattern>/weather/**?**</pattern>
    <filter>
      <role>authentication</role>
      <name>sample</name>
      <class>...</class>
    </filter>
    <filter>...</filter>*
    ...
  </resource>
</gateway>
</code></pre>
<dl>
<dt>Rewrite Provider Runtime Descriptor: WEB-INF/rewrite.xml </dt> <dd>The rewrite provider runtime descriptor controls the behavior of the rewrite filter. Each service contributor is responsible for adding the rules required to control the URL rewriting required by that service. Later sections will provide more detail about the capabilities of the rewrite provider.</dd>
</dl>
<pre><code class="xml"><rules>
  <rule dir="IN" name="WEATHER/openweathermap/inbound/versioned/file"
      pattern="*://*:*/**/weather/{version}?{**}">
    <rewrite template="{$serviceUrl[WEATHER]}/{version}/weather?{**}"/>
  </rule>
</rules>
</code></pre>
<p>With these two descriptors in mind, a detailed breakdown of the WeatherDeploymentContributor’s contributeService method will make more sense. At a high level the important concept is that contributeService is invoked by the framework for each <service> in the topology file.</p>
<pre><code class="java">public class WeatherDeploymentContributor extends ServiceDeploymentContributorBase {
  ...
  @Override
  public void contributeService( DeploymentContext context, Service service ) throws Exception {
    contributeResources( context, service );
    contributeRewriteRules( context );
  }

  private void contributeResources( DeploymentContext context, Service service ) throws URISyntaxException {
    ResourceDescriptor resource = context.getGatewayDescriptor().addResource();
    resource.role( service.getRole() );
    resource.pattern( "/weather/**?**" );
    addAuthenticationFilter( context, service, resource );
    addRewriteFilter( context, service, resource );
    addDispatchFilter( context, service, resource );
  }

  private void contributeRewriteRules( DeploymentContext context ) throws IOException {
    UrlRewriteRulesDescriptor allRules = context.getDescriptor( "rewrite" );
    UrlRewriteRulesDescriptor newRules = loadRulesFromClassPath();
    allRules.addRules( newRules );
  }

  ...
}
</code></pre>
<p>The DeploymentContext parameter contains information about the deployment as well as the WAR structure being created via deployment. The Service parameter is the object representation of the <service> element in the topology file. Details about particularly important lines follow the code block.</p>
<dl>
<dt>ResourceDescriptor resource = context.getGatewayDescriptor().addResource(); </dt> <dd>Obtains a reference to the gateway runtime descriptor and adds a new resource element. Note that many of the APIs in the deployment framework follow a fluent rather than bean style.</dd>
<dt>resource.role( service.getRole() ); </dt> <dd>Sets the role for a particular resource. Many of the filters may need access to this role information in order to make runtime decisions.</dd>
<dt>resource.pattern( “/weather/**?**” ); </dt> <dd>Sets the URL pattern to which the filter chain that follows will be mapped within the GatewayFilter.</dd>
<dt>add*Filter( context, service, resource ); </dt> <dd>These are taken from a base class. A representation of the implementation of that method from the base class is shown below. Notice how this essentially delegates back to the framework to add the filters required by a particular provider role (e.g. “rewrite”).</dd>
</dl>
<pre><code class="java">protected void addRewriteFilter( DeploymentContext context, Service service, ResourceDescriptor resource ) {
  context.contributeFilter( service, resource, "rewrite", null, null );
}
</code></pre>
<dl>
<dt>UrlRewriteRulesDescriptor allRules = context.getDescriptor( “rewrite” ); </dt> <dd>Here the rewrite provider runtime descriptor is obtained by name from the deployment context. This does represent a tight coupling in this case between this service and the default rewrite provider. The rewrite provider however is unlikely to be replaced with an alternate implementation.</dd>
<dt>UrlRewriteRulesDescriptor newRules = loadRulesFromClassPath(); </dt> <dd>This is a convenience method for loading partial rewrite descriptor information from the classpath. Developing and maintaining these rewrite rules is far easier as an external resource.
The rewrite descriptor API could however have been used to achieve the same result.</dd>
<dt>allRules.addRules( newRules ); </dt> <dd>Here the rewrite rules for the weather service are merged into the larger set of rewrite rules.</dd>
</dl>
<p>Finally, the pom.xml for the example gateway-service-weather module is shown below.</p>
<pre><code class="xml"><project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.knox</groupId>
    <artifactId>gateway</artifactId>
    <version>2.1.0-SNAPSHOT</version>
  </parent>
  <artifactId>gateway-service-weather</artifactId>
  <name>gateway-service-weather</name>
  <description>A sample extension to the gateway for a weather REST API.</description>
  <licenses>
    <license>
      <name>The Apache Software License, Version 2.0</name>
      <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
      <distribution>repo</distribution>
    </license>
  </licenses>
  <dependencies>
    <dependency>
      <groupId>${gateway-group}</groupId>
      <artifactId>gateway-spi</artifactId>
    </dependency>
    <dependency>
      <groupId>${gateway-group}</groupId>
      <artifactId>gateway-provider-rewrite</artifactId>
    </dependency>
    ... Test Dependencies ...
  </dependencies>
</project>
</code></pre>
<h4><a id="Service+Definition+Files">Service Definition Files</a> <a href="#Service+Definition+Files"><img src="markbook-section-link.png"/></a></h4>
<p>As of release 0.6.0, the gateway also supports a declarative way of plugging in a new Service. A Service can be defined with a combination of two files:</p>
<pre><code>service.xml
rewrite.xml
</code></pre>
<p>The rewrite.xml file contains the rewrite rules as defined in other sections of this guide, and the service.xml file contains the various routes (paths) to be provided by the Service and the rewrite rule bindings to those paths. This will be described in further detail in this section.</p>
<p>While the service.xml file is absolutely required, the rewrite.xml file in theory is optional (though it is highly unlikely that no rewrite rules are needed).</p>
<p>To add a new service, simply add a service.xml and rewrite.xml file in an appropriate directory (see <a href="#Service+Definition+Directory+Structure">Service Definition Directory Structure</a>) in the module gateway-service-definitions to make the new service part of the Knox build.</p>
<h5><a id="service.xml">service.xml</a> <a href="#service.xml"><img src="markbook-section-link.png"/></a></h5>
<p>Below is a sample of a very simple service.xml file, taking the same weather API example.</p>
<pre><code class="xml"><service role="WEATHER" name="weather" version="0.1.0">
  <routes>
    <route path="/weather/**?**"/>
  </routes>
</service>
</code></pre>
<dl>
<dt><strong>service</strong></dt> <dd>The root tag is ‘service’, which has three required attributes: <em>role</em>, <em>name</em> and <em>version</em>. These three values disambiguate this service definition from others. To ensure the exact same service definition is being used in a topology file, all values should be specified.</dd>
</dl>
<pre><code class="xml"><topology>
  <gateway>
    ....
  </gateway>
  <service>
    <role>WEATHER</role>
    <name>weather</name>
    <version>0.1.0</version>
    <url>http://api.openweathermap.org/data</url>
    <dispatch>
      <contributor-name>custom-client</contributor-name>
      <ha-contributor-name>ha-client</ha-contributor-name>
      <classname>org.apache.knox.gateway.dispatch.PassAllHeadersDispatch</classname>
      <ha-classname></ha-classname>
      <http-client-factory></http-client-factory>
      <use-two-way-ssl>false</use-two-way-ssl>
    </dispatch>
  </service>
  ....
</topology>
</code></pre>
<p>If only <em>role</em> is specified in the topology file (the only required element other than <em>url</em>), then the service definition with that role and the highest version will be used. Similarly, if only the <em>version</em> is omitted from the topology specification of the service, the service definition of the highest version will be used. It is therefore important to specify a version for a service if it is desired that a topology be locked down to a specific version of a service.</p>
<p>The optional <code>dispatch</code> element can be used to override the dispatch specified in the service.xml file for the service. This can be useful in cases where you want a specific topology to override the dispatch, for example for topology-based federation from one Knox instance to another. <code>dispatch/classname</code> is the most commonly used option, but other options are available if one wants more fine-grained control over the topology dispatch override.</p>
<dl>
<dt><strong>routes</strong></dt> <dd>Wrapper element for one or more routes.</dd>
<dt><strong>route</strong></dt> <dd>A route specifies the <em>path</em> that the service is routing as well as any rewrite bindings or policy bindings. Another child element that may be used here is a <em>dispatch</em> element.</dd>
<dt><strong>rewrite</strong></dt> <dd>A rewrite rule or function that is to be applied to the path. A rewrite element contains an <em>apply</em> attribute that references the rewrite function or rule by name. Along with the <em>apply</em> attribute, a <em>to</em> attribute must be used. The <em>to</em> attribute specifies what part of the request or response to rewrite. The valid values for the <em>to</em> attribute are:</dd>
</dl>
<ul>
<li>request.url</li>
<li>request.headers</li>
<li>request.cookies</li>
<li>request.body</li>
<li>response.headers</li>
<li>response.cookies</li>
<li>response.body</li>
</ul>
<p>Below is an example of a snippet from the WebHDFS service definition.</p>
<pre><code class="xml"><route path="/webhdfs/v1/**?**">
  <rewrite apply="WEBHDFS/webhdfs/inbound/namenode/file" to="request.url"/>
  <rewrite apply="WEBHDFS/webhdfs/outbound/namenode/headers" to="response.headers"/>
</route>
</code></pre>
<dl>
<dt><strong>dispatch</strong></dt> <dd>The dispatch element can be used to plug in a custom dispatch class. The interface for Dispatch can be found in the module gateway-spi, org.apache.knox.gateway.dispatch.Dispatch.</dd>
</dl>
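<p>For illustration, a minimal custom dispatch might look like the sketch below. This is not taken from the project: it assumes that your Knox version’s DefaultDispatch exposes executeRequest as an overridable hook, and the package, class and header names are hypothetical.</p>
<pre><code class="java">package org.example.dispatch; // hypothetical package

import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.knox.gateway.dispatch.DefaultDispatch;

public class HeaderTaggingDispatch extends DefaultDispatch {

  @Override
  protected void executeRequest( HttpUriRequest outboundRequest, HttpServletRequest inboundRequest,
      HttpServletResponse outboundResponse ) throws IOException {
    // Tag every outbound request, then delegate to the default dispatch behavior.
    outboundRequest.addHeader( "X-Sample-Dispatch", "true" );
    super.executeRequest( outboundRequest, inboundRequest, outboundResponse );
  }
}
</code></pre>
<p>Such a class would then be referenced from the <em>classname</em> attribute described below.</p>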
<p>This element can be used at the service level (i.e. as a child of the service tag) or at the route level. A dispatch specified at the route level takes precedence over a dispatch specified at the service level. By default the dispatch used is org.apache.knox.gateway.dispatch.DefaultDispatch.</p>
<p>The dispatch tag has four attributes that can be specified.</p>
<p><em>contributor-name</em> : This attribute can be used to specify a deployment contributor to be invoked for a custom dispatch.</p>
<p><em>classname</em> : This attribute can be used to specify a custom dispatch class.</p>
<p><em>ha-contributor-name</em> : This attribute can be used to specify a deployment contributor to be invoked for custom HA dispatch functionality.</p>
<p><em>ha-classname</em> : This attribute can be used to specify a custom dispatch class with HA functionality.</p>
<p>Only one of contributor-name or classname should be specified, and only one of ha-contributor-name or ha-classname should be specified.</p>
<p>If providing a custom dispatch, either a JAR should be provided (see <a href="#Class+Path">Class Path</a>) or a <a href="#Maven+Module">Maven Module</a> should be created.</p>
<p>See <a href="#ConfigurableDispatch">ConfigurableDispatch</a> for details about the configurable dispatch type.</p>
<dl>
<dt><strong>policies</strong></dt> <dd>This is a wrapper tag for <em>policy</em> elements and can be a child of the <em>service</em> tag or the <em>route</em> tag. Once again, just like with dispatch, the policies defined at the route level override those defined at the service level.</dd>
</dl>
<p>This element can contain one or more <em>policy</em> elements. The order of the <em>policy</em> elements is important as that will be the order of execution.</p>
<dl>
<dt><strong>policy</strong></dt> <dd>At this time the policy element just has two attributes, <em>role</em> and <em>name</em>. These are used to execute a deployment contributor by that role and name. Therefore new policies must be added by using the deployment contributor mechanism.</dd>
</dl>
<p>For example,</p>
<pre><code class="xml"><service role="FOO" name="foo" version="1.6.0">
  <policies>
    <policy role="webappsec"/>
    <policy role="authentication"/>
    <policy role="rewrite"/>
    <policy role="identity-assertion"/>
    <policy role="authorization"/>
  </policies>
  <routes>
    <route path="/foo/?**">
      <rewrite apply="FOO/foo/inbound" to="request.url"/>
      <policies>
        <policy role="webappsec"/>
        <policy role="federation"/>
        <policy role="identity-assertion"/>
        <policy role="authorization"/>
        <policy role="rewrite"/>
      </policies>
      <dispatch contributor-name="http-client" />
    </route>
  </routes>
  <dispatch contributor-name="custom-client" ha-contributor-name="ha-client"/>
</service>
</code></pre>
<h5><a id="ConfigurableDispatch">ConfigurableDispatch</a> <a href="#ConfigurableDispatch"><img src="markbook-section-link.png"/></a></h5>
<p>The <code>ConfigurableDispatch</code> allows service definition writers to:</p>
<ul>
<li>exclude certain header(s) from the outbound HTTP request</li>
<li>exclude certain header(s) from the outbound HTTP response</li>
<li>declare whether parameters should be URL-encoded or not</li>
</ul>
<p>This dispatch type can be set in <code>service.xml</code> as follows:</p>
<pre><code><dispatch classname="org.apache.knox.gateway.dispatch.ConfigurableDispatch"
          ha-classname="org.apache.knox.gateway.ha.dispatch.ConfigurableHADispatch">
  <param>
    <name>requestExcludeHeaders</name>
    <value>Authorization,Content-Length</value>
  </param>
  <param>
    <name>responseExcludeHeaders</name>
    <value>SET-COOKIE,WWW-AUTHENTICATE</value>
  </param>
  <param>
    <name>removeUrlEncoding</name>
    <value>true</value>
  </param>
</dispatch>
</code></pre>
<p>The default values of these parameters are:</p>
<ul>
<li><code>requestExcludeHeaders</code> = <code>Host,Authorization,Content-Length,Transfer-Encoding</code></li>
<li><code>responseExcludeHeaders</code> = <code>SET-COOKIE,WWW-AUTHENTICATE</code></li>
<li><code>removeUrlEncoding</code> = <code>false</code></li>
</ul>
<p>The <code>responseExcludeHeaders</code> handling allows excluding only certain directives of the <code>SET-COOKIE</code> HTTP header. The following sample shows how to exclude only the <code>HttpOnly</code> directive from <code>SET-COOKIE</code>, and the <code>WWW-AUTHENTICATE</code> header entirely, in the outbound response HTTP headers:</p>
<pre><code><dispatch classname="org.apache.knox.gateway.dispatch.ConfigurableDispatch"
          ha-classname="org.apache.knox.gateway.ha.dispatch.ConfigurableHADispatch">
  <param>
    <name>responseExcludeHeaders</name>
    <value>WWW-AUTHENTICATE,SET-COOKIE:HttpOnly</value>
  </param>
</dispatch>
</code></pre>
<h5><a id="rewrite.xml">rewrite.xml</a> <a href="#rewrite.xml"><img src="markbook-section-link.png"/></a></h5>
<p>The rewrite.xml file that accompanies the service.xml file follows the same rules as described in the section <a href="#Rewrite+Provider">Rewrite Provider</a>.</p>
<h4><a id="Service+Definition+Directory+Structure">Service Definition Directory Structure</a> <a href="#Service+Definition+Directory+Structure"><img src="markbook-section-link.png"/></a></h4>
<p>On installation of the Knox gateway, the following directory structure can be found under ${GATEWAY_HOME}/data. This is a mirror of the directories and files under the module gateway-service-definitions.</p>
<pre><code>services
|______ service name
        |______ version
                |______ service.xml
                |______ rewrite.xml
</code></pre>
<p>For example,</p>
<pre><code>services
|______ webhdfs
        |______ 2.4.0
                |______ service.xml
                |______ rewrite.xml
</code></pre>
<p>To test out a new service, you can just add the appropriate files (service.xml and rewrite.xml) in a directory under ${GATEWAY_HOME}/data/services. If you want to contribute the service to the Knox build, the files need to go in the gateway-service-definitions module.</p>
<h4><a id="Service+Definition+Runtime+Behavior">Service Definition Runtime Behavior</a> <a href="#Service+Definition+Runtime+Behavior"><img src="markbook-section-link.png"/></a></h4>
<p>The runtime artifacts as well as the behavior do not change whether the service is plugged in via the deployment descriptors or through a service.xml file.</p>
<h4><a id="Custom+Dispatch+Dependency+Injection">Custom Dispatch Dependency Injection</a> <a href="#Custom+Dispatch+Dependency+Injection"><img src="markbook-section-link.png"/></a></h4>
<p>When writing a custom dispatch class, one often needs configuration or gateway services. A lightweight dependency injection system is used that can inject instances of classes or primitives available in the filter configuration’s init params or as a servlet context attribute.</p>
<p>Details of this can be found in the module gateway-util-configinjector; an example use of it is in the class org.apache.knox.gateway.dispatch.DefaultDispatch. Look at the following method, for example:</p>
<pre><code class="java">@Configure
protected void setReplayBufferSize(@Default("8") int size) {
  replayBufferSize = size;
}
</code></pre>
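<p>A custom dispatch can use the same pattern to make its own settings configurable. The sketch below is illustrative only: the connectionTimeout parameter and its default are hypothetical, and the exact name-resolution rules of the injector should be checked in gateway-util-configinjector.</p>
<pre><code class="java">public class MyCustomDispatch extends DefaultDispatch { // hypothetical class

  private int connectionTimeout;

  // Injected from the filter's init params or a servlet context attribute,
  // falling back to the declared default when the parameter is absent.
  @Configure
  protected void setConnectionTimeout(@Default("30000") int timeout) {
    this.connectionTimeout = timeout;
  }
}
</code></pre>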
<h3><a id="Service+Discovery">Service Discovery</a> <a href="#Service+Discovery"><img src="markbook-section-link.png"/></a></h3> <p>Knox supports the ability to dynamically determine endpoint URLs in topologies for supported Hadoop services. <em>This functionality is currently only supported for Ambari-managed Hadoop clusters.</em> A number of services are officially supported, and this set can be extended by modifying the source or by specifying external configuration.</p> <h4><a id="Service+URL+Definitions">Service URL Definitions</a> <a href="#Service+URL+Definitions"><img src="markbook-section-link.png"/></a></h4> <p>The service discovery system determines service URLs by processing mappings of Hadoop service configuration properties and corresponding URL templates. The knowledge about converting these arbitrary service configuration properties into correct service endpoint URLs is defined in a configuration file internal to the Ambari service discovery module.</p> <h5><a id="Configuration+Details">Configuration Details</a> <a href="#Configuration+Details"><img src="markbook-section-link.png"/></a></h5> <p>This internal configuration file ( <strong>ambari-service-discovery-url-mappings.xml</strong> ) in the <strong>gateway-discovery-ambari</strong> module is used to specify a URL template and the associated configuration properties to populate it. A limited degree of conditional logic is supported to accommodate things like <em>http</em> vs <em>https</em> configurations.</p> <p>The simplest way to describe its contents will be by examples.</p> <p><strong>Example 1</strong></p> <p>The simplest example of one such mapping involves a service component for which there is a single configuration property which specifies the complete URL:</p> <pre><code><!-- This is the service mapping declaration. The name must match what is specified as a Knox topology service role --> <service name="OOZIE"> <!-- This is the URL pattern with placeholder(s) for values provided by properties --> <url-pattern>{OOZIE_URL}</url-pattern> <properties> <!-- This is a property, which in this simple case, matches a template placeholder --> <property name="OOZIE_URL"> <!-- This is the component whose configuration will be used to lookup the value of the subsequent config property name --> <component>OOZIE_SERVER</component> <!-- This is the name of the component config property whose value should be assigned to the OOZIE_URL value --> <config-property>oozie.base.url</config-property> </property> </properties> </service> </code></pre> <p>The <em>OOZIE_SERVER</em> component configuration is <strong>oozie-site</strong>. If <strong>oozie-site.xml</strong> has the property named <strong>oozie.base.url</strong> with the value <a href="http://ooziehost:11000">http://ooziehost:11000</a>, then the resulting URL for the <em>OOZIE</em> service will be <a href="http://ooziehost:11000">http://ooziehost:11000</a>.</p> <p><strong>Example 2</strong></p> <p>A slightly more complicated example involves a service component for which the complete URL is not described by a single detail, but rather by multiple endpoint URL details:</p> <pre><code><service name="WEBHCAT"> <url-pattern>http://{HOST}:{PORT}/templeton</url-pattern> <properties> <property name="HOST"> <component>WEBHCAT_SERVER</component> <!-- This tells discovery to get the hostname for the WEBHCAT_SERVER component from Ambari --> <hostname/> </property> <property name="PORT"> <component>WEBHCAT_SERVER</component> <!-- This is the name of the component config property whose value should be assigned to --> <!-- the PORT value --> <config-property>templeton.port</config-property> </property> </properties> </service> </code></pre> <p><strong>Example 3</strong></p> <p>An even more complicated example
involves a service for which <em>HTTPS</em> is supported, and which employs the limited conditional logic support:</p> <pre><code><service name="ATLAS"> <url-pattern>{SCHEME}://{HOST}:{PORT}</url-pattern> <properties> <!-- Property for getting the ATLAS_SERVER component hostname from Ambari --> <property name="HOST"> <component>ATLAS_SERVER</component> <hostname/> </property> <!-- Property for capturing whether TLS is enabled or not; This is not a template placeholder property --> <property name="TLS_ENABLED"> <component>ATLAS_SERVER</component> <config-property>atlas.enableTLS</config-property> </property> <!-- Property for getting the http port; also NOT a template placeholder property --> <property name="HTTP_PORT"> <component>ATLAS_SERVER</component> <config-property>atlas.server.http.port</config-property> </property> <!-- Property for getting the https port; also NOT a template placeholder property --> <property name="HTTPS_PORT"> <component>ATLAS_SERVER</component> <config-property>atlas.server.https.port</config-property> </property> <!-- Template placeholder property, dependent on the TLS_ENABLED property value --> <property name="PORT"> <config-property> <if property="TLS_ENABLED" value="true"> <then>HTTPS_PORT</then> <else>HTTP_PORT</else> </if> </config-property> </property> <!-- Template placeholder property, dependent on the TLS_ENABLED property value --> <property name="SCHEME"> <config-property> <if property="TLS_ENABLED" value="true"> <then>https</then> <else>http</else> </if> </config-property> </property> </properties> </service> </code></pre> <h5><a id="External+Configuration">External Configuration</a> <a href="#External+Configuration"><img src="markbook-section-link.png"/></a></h5> <p>The internal configuration for URL construction can be overridden or augmented by way of a configuration file in the gateway configuration directory, or an alternative file specified by a Java system property. This mechanism is useful for developing support for new services, for custom solutions, or any scenario for which rebuilding Knox is not desirable.</p> <p>The default file, for which Knox will search first, is <strong><em>{GATEWAY_HOME}</em>/conf/ambari-discovery-url-mappings.xml</strong></p> <p>If Knox doesn’t find that file, it will check for a Java system property named <strong>org.apache.gateway.topology.discovery.ambari.config</strong>, whose value is the fully-qualified path to an XML file. This file’s contents must adhere to the format outlined above.</p> <p>If this configuration exists, Knox will apply it as if it were part of the internal configuration.</p> <p><strong>Example</strong></p> <p>If Apache Solr weren’t supported, then it could be added by creating the following definition in <strong><em>{GATEWAY_HOME}</em>/conf/ambari-discovery-url-mappings.xml</strong> :</p> <pre><code><?xml version="1.0" encoding="utf-8"?> <service-discovery-url-mappings> <service name="SOLR"> <url-pattern>http://{HOST}:{PORT}</url-pattern> <properties> <property name="HOST"> <component>INFRA_SOLR</component> <hostname/> </property> <property name="PORT"> <component>INFRA_SOLR</component> <config-property>infra_solr_port</config-property> </property> </properties> </service> </service-discovery-url-mappings> </code></pre>
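<p>If the system property is used instead of the default file, the same mapping content is supplied through a JVM option. A minimal sketch, assuming your installation allows passing JVM options to the gateway process (the file path here is purely illustrative):</p> <pre><code>-Dorg.apache.gateway.topology.discovery.ambari.config=/etc/knox/custom/my-url-mappings.xml </code></pre>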
<p><strong><em>N.B. Knox must be restarted for changes to this external configuration to be applied.</em></strong></p> <h4><a id="Component+Configuration+Mapping">Component Configuration Mapping</a> <a href="#Component+Configuration+Mapping"><img src="markbook-section-link.png"/></a></h4> <p>To support URL construction from service configuration files, Ambari service discovery requires knowledge of the service component types and their respective relationships to configuration types. This knowledge is defined in a configuration file internal to the Ambari service discovery module.</p> <h5><a id="Configuration+Details">Configuration Details</a> <a href="#Configuration+Details"><img src="markbook-section-link.png"/></a></h5> <p>This internal configuration file ( <strong>ambari-service-discovery-component-config-mapping.properties</strong> ) in the <strong>gateway-discovery-ambari</strong> module is used to define the mapping of Hadoop service component names to the configuration type from which Knox will lookup property values.</p> <p><strong>Example</strong></p> <pre><code>NAMENODE=hdfs-site RESOURCEMANAGER=yarn-site HISTORYSERVER=mapred-site OOZIE_SERVER=oozie-site HIVE_SERVER=hive-site WEBHCAT_SERVER=webhcat-site </code></pre> <h5><a id="External+Configuration">External Configuration</a> <a href="#External+Configuration"><img src="markbook-section-link.png"/></a></h5> <p>The internal configuration for component configuration mappings can be overridden or augmented by way of a configuration file in the gateway configuration directory, or an alternative file specified by a Java system property. This mechanism is useful for developing support for new services, for custom solutions, or any scenario for which rebuilding Knox is not desirable.</p> <p>The default file, for which Knox will search first, is <strong><em>{GATEWAY_HOME}</em>/conf/ambari-discovery-component-config.properties</strong>. If Knox doesn’t find that file, it will check for a Java system property named <strong>org.apache.knox.gateway.topology.discovery.ambari.component.mapping</strong>, whose value is the fully-qualified path to a properties file. This file’s contents must adhere to the format outlined above.</p> <p>If this configuration exists, Knox will apply it as if it were part of the internal configuration.</p> <p><strong>Example</strong></p> <p>Following the aforementioned SOLR example, Knox needs to know in which configuration file to find the <em>INFRA_SOLR</em> component configuration property, so the following property must be defined in <strong><em>{GATEWAY_HOME}</em>/conf/ambari-discovery-component-config.properties</strong> :</p> <pre><code>INFRA_SOLR=infra-solr-env </code></pre> <p>This tells Knox to look for the <strong>infra_solr_port</strong> property in the <strong>infra-solr-env</strong> configuration.</p> <p><strong><em>N.B. Knox must be restarted for changes to this external configuration to be applied.</em></strong></p> <h3><a id="Validator">Validator</a> <a href="#Validator"><img src="markbook-section-link.png"/></a></h3> <p>Apache Knox provides preauth federation authentication, and supports two built-in validators for verifying incoming requests. In this section, we describe how to write a custom validator for this scenario. The provided validators include: </p> <ul> <li><em>preauth.default.validation</em> : This default behavior does not perform any validation check.
All requests will pass.</li> <li><em>preauth.ip.validation</em> : This validation checks whether a request originated from an IP address that is configured in the Knox service through the property <em>preauth.ip.addresses</em>.</li> </ul> <p>However, these built-in validation choices may not fulfill the internal requirements of some organizations. Therefore, Knox supports (since 0.12) a pluggable framework where anyone can include a custom validator. </p> <p>In essence, a user can add a custom validator by following these steps; the corresponding code examples are incorporated after the list:</p> <ol> <li>Create a separate Java package (e.g. com.company.knox.validator) in a new or existing Maven project.</li> <li>Create a new class (e.g. <em>CustomValidator</em>) that implements <em>org.apache.knox.gateway.preauth.filter.PreAuthValidator</em>.</li> <li>The class should implement the method <em>String getName()</em>, which returns a string constant. Step 9 will need this user-defined string constant.</li> <li>The class should implement the method <em>boolean validate(HttpServletRequest httpRequest, FilterConfig filterConfig)</em>. This is the key method, which will validate the request based on ‘httpRequest’ and ‘filterConfig’. In the most common cases, the user may need to use HTTP header values to validate. For example, a client can get a token from an authentication service and pass it as an HTTP header. The validate method then needs to extract that header and verify the token. In some instances, the server may need to contact the same authentication service to validate.</li> <li>Create a text file under src/main/resources/META-INF/services, named after the validator interface (<em>org.apache.knox.gateway.preauth.filter.PreAuthValidator</em>), and add the fully qualified name of your custom validator class (e.g. <em>com.company.knox.validator.CustomValidator</em>).</li> <li>You may need to include the dependencies “org.apache.knox:gateway-provider-security-preauth” of version 0.12+ and “javax.servlet:javax.servlet-api” of version 3.1.0+ in pom.xml.</li> <li>Build your custom jar.</li> <li>Deploy the jar in the $GATEWAY_HOME/ext directory.</li> <li>Add/modify a parameter called <em>preauth.validation.method</em> with the name of the validator used in step 3. Optionally, you may add any new parameter that may be required only for your CustomValidator.</li> </ol> <p><strong>Validator Class (Step 2-4)</strong> </p> <pre><code>package com.company.knox.validator; import org.apache.knox.gateway.preauth.filter.PreAuthValidationException; import org.apache.knox.gateway.preauth.filter.PreAuthValidator; import com.google.common.base.Strings; import java.util.logging.Logger; import javax.servlet.FilterConfig; import javax.servlet.http.HttpServletRequest; public class CustomValidator extends PreAuthValidator { //JDK logging keeps this sample dependency-free; substitute your preferred logging framework. private static final Logger log = Logger.getLogger(CustomValidator.class.getName()); //Any string constant value should work for these variables. //This string will be used in the 'services' file. public static final String CUSTOM_VALIDATOR_NAME = "fooValidator"; //Optional: User may want to pass something through an HTTP header.
(per client request) public static final String CUSTOM_TOKEN_HEADER_NAME = "foo_claim"; /** * @param httpRequest * @param filterConfig * @return * @throws PreAuthValidationException */ @Override public boolean validate(HttpServletRequest httpRequest, FilterConfig filterConfig) throws PreAuthValidationException { String claimToken = httpRequest.getHeader(CUSTOM_TOKEN_HEADER_NAME); if (!Strings.isNullOrEmpty(claimToken)) { return checkCustomToken(claimToken); //to be implemented } else { log.warning("Claim token was empty for header name '" + CUSTOM_TOKEN_HEADER_NAME + "'"); return false; } } /** * Define unique validator name * * @return */ @Override public String getName() { return CUSTOM_VALIDATOR_NAME; } } </code></pre> <p><strong>META-INF/services contents (Step-5)</strong></p> <p><code>com.company.knox.validator.CustomValidator</code></p> <p><strong>POM file (Step-6)</strong></p> <pre><code><dependency> <groupId>javax.servlet</groupId> <artifactId>javax.servlet-api</artifactId> <scope>provided</scope> </dependency> <dependency> <groupId>org.apache.knox</groupId> <artifactId>gateway-test-utils</artifactId> <scope>test</scope> </dependency> <dependency> <groupId>org.apache.knox</groupId> <artifactId>gateway-provider-security-preauth</artifactId> <scope>provided</scope> </dependency> </code></pre> <p><strong>Deploy Custom Jar (Step-7-8)</strong></p> <p>Build the jar (e.g. customValidation.jar) using <code>mvn clean package</code>, then copy it into place: <code>cp customValidation.jar $GATEWAY_HOME/ext/</code></p> <p><strong>Topology Config (Step-9)</strong></p> <pre><code><provider> <role>federation</role> <name>HeaderPreAuth</name> <enabled>true</enabled> <param><name>preauth.validation.method</name> <!-- Same as CustomValidator.CUSTOM_VALIDATOR_NAME --> <value>fooValidator</value></param> </provider> </code></pre> <h3><a id="Providers">Providers</a> <a href="#Providers"><img src="markbook-section-link.png"/></a></h3> <p>A provider extension implements the <em>ProviderDeploymentContributor</em> interface shown below; the sample pom that follows illustrates the module setup for such an extension.</p> <pre><code class="java">public interface ProviderDeploymentContributor { String getRole(); String getName(); void initializeContribution( DeploymentContext context ); void contributeProvider( DeploymentContext context, Provider provider ); void contributeFilter( DeploymentContext context, Provider provider, Service service, ResourceDescriptor resource, List<FilterParamDescriptor> params ); void finalizeContribution( DeploymentContext context ); } </code></pre> <pre><code class="xml"><project> <modelVersion>4.0.0</modelVersion> <parent> <groupId>org.apache.knox</groupId> <artifactId>gateway</artifactId> <version>2.1.0-SNAPSHOT</version> </parent> <artifactId>gateway-provider-security-authn-sample</artifactId> <name>gateway-provider-security-authn-sample</name> <description>A simple sample authorization provider.</description> <licenses> <license> <name>The Apache Software License, Version 2.0</name> <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url> <distribution>repo</distribution> </license> </licenses> <dependencies> <dependency> <groupId>${gateway-group}</groupId> <artifactId>gateway-spi</artifactId> </dependency> </dependencies> </project> </code></pre>
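<p>The following is a minimal sketch of an implementation of this interface. The class, package, and filter names are hypothetical, and the fluent <code>ResourceDescriptor.addFilter()</code> calls follow the pattern used by the built-in contributors; like the validator above, the implementation would be advertised through a META-INF/services entry so the framework can discover it:</p> <pre><code class="java">package com.example.deploy; // hypothetical package

import java.util.List;
import org.apache.knox.gateway.deploy.DeploymentContext;
import org.apache.knox.gateway.deploy.ProviderDeploymentContributor;
import org.apache.knox.gateway.descriptor.FilterParamDescriptor;
import org.apache.knox.gateway.descriptor.ResourceDescriptor;
import org.apache.knox.gateway.topology.Provider;
import org.apache.knox.gateway.topology.Service;

public class SampleAuthnDeploymentContributor implements ProviderDeploymentContributor {

  @Override
  public String getRole() { return "authentication"; }

  @Override
  public String getName() { return "SampleAuthn"; } // matches <name> in the topology

  @Override
  public void initializeContribution( DeploymentContext context ) { /* nothing to set up */ }

  @Override
  public void contributeProvider( DeploymentContext context, Provider provider ) { /* no WAR-level artifacts */ }

  @Override
  public void contributeFilter( DeploymentContext context, Provider provider, Service service,
      ResourceDescriptor resource, List<FilterParamDescriptor> params ) {
    // Assemble this provider's servlet filter into the chain for the resource
    // during the deployment phase (a hypothetical filter class is named here).
    resource.addFilter().name( getName() ).role( getRole() )
        .impl( "com.example.SampleAuthnFilter" ).params( params );
  }

  @Override
  public void finalizeContribution( DeploymentContext context ) { /* nothing to finalize */ }
} </code></pre>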
<h3><a id="Deployment+Context">Deployment Context</a> <a href="#Deployment+Context"><img src="markbook-section-link.png"/></a></h3> <pre><code class="java">package org.apache.knox.gateway.deploy; import ... public interface DeploymentContext { GatewayConfig getGatewayConfig(); Topology getTopology(); WebArchive getWebArchive(); WebAppDescriptor getWebAppDescriptor(); GatewayDescriptor getGatewayDescriptor(); void contributeFilter( Service service, ResourceDescriptor resource, String role, String name, List<FilterParamDescriptor> params ); void addDescriptor( String name, Object descriptor ); <T> T getDescriptor( String name ); } </code></pre> <pre><code class="java">public class Topology { public URI getUri() {...} public void setUri( URI uri ) {...} public String getName() {...} public void setName( String name ) {...} public long getTimestamp() {...} public void setTimestamp( long timestamp ) {...} public Collection<Service> getServices() {...} public Service getService( String role, String name ) {...} public void addService( Service service ) {...} public Collection<Provider> getProviders() {...} public Provider getProvider( String role, String name ) {...} public void addProvider( Provider provider ) {...} } </code></pre> <pre><code class="java">public interface GatewayDescriptor { List<GatewayParamDescriptor> params(); GatewayParamDescriptor addParam(); GatewayParamDescriptor createParam(); void addParam( GatewayParamDescriptor param ); void addParams( List<GatewayParamDescriptor> params ); List<ResourceDescriptor> resources(); ResourceDescriptor addResource(); ResourceDescriptor createResource(); void addResource( ResourceDescriptor resource ); } </code></pre> <h3><a id="Gateway+Services">Gateway Services</a> <a href="#Gateway+Services"><img src="markbook-section-link.png"/></a></h3> <p>TODO - Describe the service registry and other global services.</p> <h2><a id="Standard+Providers">Standard Providers</a> <a href="#Standard+Providers"><img src="markbook-section-link.png"/></a></h2> <h3><a id="Rewrite+Provider">Rewrite Provider</a> <a href="#Rewrite+Provider"><img src="markbook-section-link.png"/></a></h3> <p>The rewrite provider is implemented in the <em>gateway-provider-rewrite</em> module; rewrite rules are represented by <em>org.apache.knox.gateway.filter.rewrite.api.UrlRewriteRulesDescriptor</em>.</p> <pre><code class="xml"><rules> <rule dir="IN" name="WEATHER/openweathermap/inbound/versioned/file" pattern="*://*:*/**/weather/{version}?{**}"> <rewrite template="{$serviceUrl[WEATHER]}/{version}/weather?{**}"/> </rule> </rules> </code></pre> <pre><code class="xml"><rules> <filter name="WEBHBASE/webhbase/status/outbound"> <content type="*/json"> <apply path="$[LiveNodes][*][name]" rule="WEBHBASE/webhbase/address/outbound"/> </content> <content type="*/xml"> <apply path="/ClusterStatus/LiveNodes/Node/@name" rule="WEBHBASE/webhbase/address/outbound"/> </content> </filter> </rules> </code></pre> <pre><code class="java">@Test public void testDevGuideSample() throws Exception { URI inputUri, outputUri; Matcher<Void> matcher; Matcher<Void>.Match match; Template input, pattern, template; inputUri = new URI( "http://sample-host:8443/gateway/topology/weather/2.5?q=Palo+Alto" ); input = Parser.parse( inputUri.toString() ); pattern = Parser.parse( "*://*:*/**/weather/{version}?{**}" ); template = Parser.parse( "http://api.openweathermap.org/data/{version}/weather?{**}" ); matcher = new Matcher<Void>(); matcher.add( pattern, null ); match = matcher.match( input ); outputUri = Expander.expand( template, match.getParams(), null ); assertThat( outputUri.toString(), is( "http://api.openweathermap.org/data/2.5/weather?q=Palo+Alto" ) ); } </code></pre> <pre><code class="java">@Test public void testDevGuideSampleWithEvaluator() throws Exception { URI inputUri, outputUri; Matcher<Void> matcher;
Matcher<Void>.Match match; Template input, pattern, template; Evaluator evaluator; inputUri = new URI( "http://sample-host:8443/gateway/topology/weather/2.5?q=Palo+Alto" ); input = Parser.parse( inputUri.toString() ); pattern = Parser.parse( "*://*:*/**/weather/{version}?{**}" ); template = Parser.parse( "{$serviceUrl[WEATHER]}/{version}/weather?{**}" ); matcher = new Matcher<Void>(); matcher.add( pattern, null ); match = matcher.match( input ); evaluator = new Evaluator() { @Override public List<String> evaluate( String function, List<String> parameters ) { return Arrays.asList( "http://api.openweathermap.org/data" ); } }; outputUri = Expander.expand( template, match.getParams(), evaluator ); assertThat( outputUri.toString(), is( "http://api.openweathermap.org/data/2.5/weather?q=Palo+Alto" ) ); } </code></pre> <h4><a id="Rewrite+Filters">Rewrite Filters</a> <a href="#Rewrite+Filters"><img src="markbook-section-link.png"/></a></h4> <p>TODO - Cover the supported content types. TODO - Provide an XML and JSON “properties” example where one NVP is modified based on the value of another name.</p> <pre><code class="xml"><rules> <filter name="WEBHBASE/webhbase/regions/outbound"> <content type="*/json"> <apply path="$[Region][*][location]" rule="WEBHBASE/webhbase/address/outbound"/> </content> <content type="*/xml"> <apply path="/TableInfo/Region/@location" rule="WEBHBASE/webhbase/address/outbound"/> </content> </filter> </rules> </code></pre> <pre><code class="xml"><gateway> ... <resource> <role>WEBHBASE</role> <pattern>/hbase/*/regions?**</pattern> ... <filter> <role>rewrite</role> <name>url-rewrite</name> <class>org.apache.knox.gateway.filter.rewrite.api.UrlRewriteServletFilter</class> <param> <name>response.body</name> <value>WEBHBASE/webhbase/regions/outbound</value> </param> </filter> ... </resource> ... </gateway> </code></pre> <p>For example, from HBaseDeploymentContributor:</p> <pre><code class="java"> params = new ArrayList<FilterParamDescriptor>(); params.add( regionResource.createFilterParam().name( "response.body" ).value( "WEBHBASE/webhbase/regions/outbound" ) ); addRewriteFilter( context, service, regionResource, params ); </code></pre> <h4><a id="Rewrite+Functions">Rewrite Functions</a> <a href="#Rewrite+Functions"><img src="markbook-section-link.png"/></a></h4> <p>TODO - Provide a lowercase function as an example.</p> <pre><code class="xml"><rules> <functions> <hostmap config="/WEB-INF/hostmap.txt"/> </functions> ...
</rules> </code></pre> <h4><a id="Rewrite+Steps">Rewrite Steps</a> <a href="#Rewrite+Steps"><img src="markbook-section-link.png"/></a></h4> <p>TODO - Provide a lowercase step as an example.</p> <pre><code class="xml"><rules> <rule dir="OUT" name="WEBHDFS/webhdfs/outbound/namenode/headers/location"> <match pattern="{scheme}://{host}:{port}/{path=**}?{**}"/> <rewrite template="{gateway.url}/webhdfs/data/v1/{path=**}?{scheme}?host={$hostmap(host)}?{port}?{**}"/> <encrypt-query/> </rule> </rules> </code></pre> <h3><a id="Identity+Assertion+Provider">Identity Assertion Provider</a> <a href="#Identity+Assertion+Provider"><img src="markbook-section-link.png"/></a></h3> <p>Adding a new identity assertion provider is as simple as extending the AbstractIdentityAsserterDeploymentContributor and the CommonIdentityAssertionFilter from the gateway-provider-identity-assertion-common module to initialize any specific configuration from filter init params and implement two methods:</p> <ol> <li>String mapUserPrincipal(String principalName);</li> <li>String[] mapGroupPrincipals(String principalName, Subject subject);</li> </ol> <p>To implement a simple toUpper or toLower identity assertion provider:</p> <pre><code class="java">package org.apache.knox.gateway.identityasserter.caseshifter.filter; import org.apache.knox.gateway.identityasserter.common.filter.AbstractIdentityAsserterDeploymentContributor; public class CaseShifterIdentityAsserterDeploymentContributor extends AbstractIdentityAsserterDeploymentContributor { @Override public String getName() { return "CaseShifter"; } protected String getFilterClassname() { return CaseShifterIdentityAssertionFilter.class.getName(); } } </code></pre> <p>We merely need to provide the provider name for use in the topology and the filter classname for the contributor to add to the filter chain.</p> <p>For the identity assertion filter itself, it is just a matter of extension and the implementation of the two methods described earlier:</p> <pre><code class="java">package org.apache.knox.gateway.identityasserter.caseshifter.filter; import javax.security.auth.Subject; import javax.servlet.FilterConfig; import javax.servlet.ServletException; import org.apache.knox.gateway.identityasserter.common.filter.CommonIdentityAssertionFilter; public class CaseShifterIdentityAssertionFilter extends CommonIdentityAssertionFilter { private boolean toUpper = false; @Override public void init(FilterConfig filterConfig) throws ServletException { String upper = filterConfig.getInitParameter("caseshift.upper"); if ("true".equals(upper)) { toUpper = true; } } @Override public String[] mapGroupPrincipals(String mappedPrincipalName, Subject subject) { return null; } @Override public String mapUserPrincipal(String principalName) { if (toUpper) { principalName = principalName.toUpperCase(); } else { principalName = principalName.toLowerCase(); } return principalName; } } </code></pre>
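<p>To put the provider into effect, a topology references it by the name returned from getName() above, with the init param the filter interrogates. A sketch of such a provider element (the param value is illustrative):</p> <pre><code><provider>
    <role>identity-assertion</role>
    <name>CaseShifter</name>
    <enabled>true</enabled>
    <param>
        <name>caseshift.upper</name>
        <value>true</value>
    </param>
</provider> </code></pre>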
<p>Note that the filter implementation above: </p> <ol> <li>looks for specific filter init parameters for configuration of whether to convert to upper or to lower case</li> <li>no-ops mapGroupPrincipals so that it returns null. This indicates that there are no changes needed to the groups contained within the Subject. If there are groups then they should continue to flow through the system unchanged. This is actually the same implementation as the base class and is therefore not required to be overridden. We include it here for illustration.</li> <li>based upon the configuration interrogated in the init method, converts the principalName to either upper or lower case.</li> </ol> <p>That is the extent of what is needed to implement a new identity assertion provider module.</p> <h3><a id="Jersey+Provider">Jersey Provider</a> <a href="#Jersey+Provider"><img src="markbook-section-link.png"/></a></h3> <p>TODO</p> <h3><a id="KnoxSSO+Integration">KnoxSSO Integration</a> <a href="#KnoxSSO+Integration"><img src="markbook-section-link.png"/></a></h3> <h1>Knox SSO Integration for UIs</h1> <h2>Introduction</h2> <p>KnoxSSO provides an abstraction for integrating any number of authentication systems and SSO solutions and enables participating web applications to scale to those solutions more easily. Without the token exchange capabilities offered by KnoxSSO each component UI would need to integrate with each desired solution on its own. </p> <p>This document examines the way to integrate with Knox SSO in the form of a Servlet Filter. This approach should be easily extrapolated into other frameworks - e.g. Spring Security.</p> <h3><a id="General+Flow">General Flow</a> <a href="#General+Flow"><img src="markbook-section-link.png"/></a></h3> <p>The following is a generic sequence diagram for SAML integration through KnoxSSO.</p> <img src='general_saml_flow.png'/> <h4><a id="KnoxSSO+Setup">KnoxSSO Setup</a> <a href="#KnoxSSO+Setup"><img src="markbook-section-link.png"/></a></h4> <h5><a id="knoxsso.xml+Topology">knoxsso.xml Topology</a> <a href="#knoxsso.xml+Topology"><img src="markbook-section-link.png"/></a></h5> <p>In order to enable KnoxSSO, we need to configure the IdP topology. The following is an example of this topology that is configured to use HTTP Basic Auth against the Knox Demo LDAP server. This is the lowest barrier of entry for your development environment that actually authenticates against a real user store. What’s great is that if you work against the IdP with Basic Auth then you will work with SAML or anything else as well.</p> <pre><code> <?xml version="1.0" encoding="utf-8"?> <topology> <gateway> <provider> <role>authentication</role> <name>ShiroProvider</name> <enabled>true</enabled> <param> <name>sessionTimeout</name> <value>30</value> </param> <param> <name>main.ldapRealm</name> <value>org.apache.knox.gateway.shirorealm.KnoxLdapRealm</value> </param> <param> <name>main.ldapContextFactory</name> <value>org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory</value> </param> <param> <name>main.ldapRealm.contextFactory</name> <value>$ldapContextFactory</value> </param> <param> <name>main.ldapRealm.userDnTemplate</name> <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value> </param> <param> <name>main.ldapRealm.contextFactory.url</name> <value>ldap://localhost:33389</value> </param> <param> <name>main.ldapRealm.contextFactory.authenticationMechanism</name> <value>simple</value> </param> <param> <name>urls./**</name> <value>authcBasic</value> </param> </provider> <provider> <role>identity-assertion</role> <name>Default</name> <enabled>true</enabled> </provider> </gateway> <service> <role>KNOXSSO</role> <param> <name>knoxsso.cookie.secure.only</name> <value>true</value> </param> <param> <name>knoxsso.token.ttl</name> <value>100000</value> </param> </service> </topology> </code></pre> <p>Just as with any Knox service, the KNOXSSO service is protected by the gateway providers defined above it. In this case, the ShiroProvider is taking care of HTTP Basic Auth against LDAP for us.
Once the user authenticates, request processing continues to the KNOXSSO service, which will create the required cookie and do the necessary redirects.</p> <p>The authentication/federation provider can be swapped out to fit your deployment environment.</p> <h5><a id="sandbox.xml+Topology">sandbox.xml Topology</a> <a href="#sandbox.xml+Topology"><img src="markbook-section-link.png"/></a></h5> <p>In order to see the end-to-end story and use it as an example in your development, you can configure one of the cluster topologies to use the SSOCookieProvider instead of the out-of-the-box ShiroProvider. The following is an example sandbox.xml topology that is configured for using KnoxSSO to protect access to the Hadoop REST APIs.</p> <pre><code> <?xml version="1.0" encoding="utf-8"?> <topology> <gateway> <provider> <role>federation</role> <name>SSOCookieProvider</name> <enabled>true</enabled> <param> <name>sso.authentication.provider.url</name> <value>https://localhost:9443/gateway/idp/api/v1/websso</value> </param> </provider> <provider> <role>identity-assertion</role> <name>Default</name> <enabled>true</enabled> </provider> </gateway> <service> <role>NAMENODE</role> <url>hdfs://localhost:8020</url> </service> <service> <role>JOBTRACKER</role> <url>rpc://localhost:8050</url> </service> <service> <role>WEBHDFS</role> <url>http://localhost:50070/webhdfs</url> </service> <service> <role>WEBHCAT</role> <url>http://localhost:50111/templeton</url> </service> <service> <role>OOZIE</role> <url>http://localhost:11000/oozie</url> </service> <service> <role>WEBHBASE</role> <url>http://localhost:60080</url> </service> <service> <role>HIVE</role> <url>http://localhost:10001/cliservice</url> </service> <service> <role>RESOURCEMANAGER</role> <url>http://localhost:8088/ws</url> </service> </topology> </code></pre> <ul> <li>NOTE: Be aware that when using Chrome as your browser, cookies don’t seem to work for “localhost”. Either use a VM or simply use 127.0.0.1. Safari works with localhost without problems.</li> </ul> <p>As you can see above, the only thing being configured is the SSO provider URL. Since Knox is the issuer of the cookie and token, we don’t need to configure the public key; we have programmatic access to the actual keystore for use at verification time.</p> <h4><a id="Curl+the+Flow">Curl the Flow</a> <a href="#Curl+the+Flow"><img src="markbook-section-link.png"/></a></h4> <p>We should now be able to walk through the SSO Flow at the command line with curl to see everything that happens.</p> <p>First, issue a request to WEBHDFS through Knox.</p> <pre><code> bash-3.2$ curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS HTTP/1.1 302 Found Location: https://localhost:8443/gateway/idp/api/v1/websso?originalUrl=https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS Content-Length: 0 Server: Jetty(8.1.14.v20131031) </code></pre> <p>Note the redirect to the knoxsso endpoint and the loginUrl with the originalUrl request parameter.
We will need to see that same redirect come from your integration as well.</p> <p>Let’s manually follow that redirect with curl now:</p> <pre><code> bash-3.2$ curl -iku guest:guest-password "https://localhost:8443/gateway/idp/api/v1/websso?originalUrl=https://localhost:9443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS" HTTP/1.1 307 Temporary Redirect Set-Cookie: JSESSIONID=mlkda4crv7z01jd0q0668nsxp;Path=/gateway/idp;Secure;HttpOnly Set-Cookie: hadoop-jwt=eyJhbGciOiJSUzI1NiJ9.eyJleHAiOjE0NDM1ODUzNzEsInN1YiI6Imd1ZXN0IiwiYXVkIjoiSFNTTyIsImlzcyI6IkhTU08ifQ.RpA84Qdr6RxEZjg21PyVCk0G1kogvkuJI2bo302bpwbvmc-i01gCwKNeoGYzUW27MBXf6a40vylHVR3aZuuBUxsJW3aa_ltrx0R5ztKKnTWeJedOqvFKSrVlBzJJ90PzmDKCqJxA7JUhyo800_lDHLTcDWOiY-ueWYV2RMlCO0w;Path=/;Domain=localhost;Secure;HttpOnly Expires: Thu, 01 Jan 1970 00:00:00 GMT Location: https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS Content-Length: 0 Server: Jetty(8.1.14.v20131031) </code></pre> <p>Note the redirect back to the original URL in the Location header and the Set-Cookie for the hadoop-jwt cookie. This is what the SSOCookieProvider in sandbox (and ultimately in your integration) will be looking for.</p> <p>Finally, we should be able to take the above cookie and pass it to the original url as indicated in the Location header for our originally requested resource:</p> <pre><code> bash-3.2$ curl -ikH "Cookie: hadoop-jwt=eyJhbGciOiJSUzI1NiJ9.eyJleHAiOjE0NDM1ODY2OTIsInN1YiI6Imd1ZXN0IiwiYXVkIjoiSFNTTyIsImlzcyI6IkhTU08ifQ.Os5HEfVBYiOIVNLRIvpYyjeLgAIMbBGXHBWMVRAEdiYcNlJRcbJJ5aSUl1aciNs1zd_SHijfB9gOdwnlvQ_0BCeGHlJBzHGyxeypIoGj9aOwEf36h-HVgqzGlBLYUk40gWAQk3aRehpIrHZT2hHm8Pu8W-zJCAwUd8HR3y6LF3M;Path=/;Domain=localhost;Secure;HttpOnly" https://localhost:9443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS TODO: cluster was down and needs to be recreated :/ </code></pre> <h4><a id="Browse+the+Flow">Browse the Flow</a> <a href="#Browse+the+Flow"><img src="markbook-section-link.png"/></a></h4> <p>At this point, we can use a web browser instead of the command line and see how the browser will challenge the user for Basic Auth Credentials and then manage the cookies such that the SSO and token exchange aspects of the flow are hidden from the user.</p> <p>Simply try to invoke the same webhdfs API from the browser URL bar.</p> <pre><code> https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS </code></pre> <p>Based on our understanding of the flow, it should behave as follows:</p> <ul> <li>SSOCookieProvider checks for hadoop-jwt cookie and in its absence redirects to the configured SSO provider URL (knoxsso endpoint)</li> <li>ShiroProvider on the KnoxSSO endpoint returns a 401 and the browser challenges the user for username/password</li> <li>The ShiroProvider authenticates the user against the Demo LDAP Server using a simple LDAP bind and establishes the security context for the WebSSO request</li> <li>The WebSSO service exchanges the normalized Java Subject into a JWT token and sets it on the response as a cookie named hadoop-jwt</li> <li>The WebSSO service then redirects the user agent back to the originally requested URL - the webhdfs Knox service. Subsequent invocations will find the cookie in the incoming request and will not need to engage the WebSSO service again until it expires.</li> </ul> <h4><a id="Filter+by+Example">Filter by Example</a> <a href="#Filter+by+Example"><img src="markbook-section-link.png"/></a></h4> <p>We have added a federation provider to Knox for accepting KnoxSSO cookies for REST APIs.
This provides us with a couple of benefits: KnoxSSO support for REST APIs and for XmlHttpRequests from JavaScript (basic CORS functionality is also included, though this is still rather basic and considered beta code), and a model and real-world use case for others to base their integrations on.</p> <p>In addition, <a href="https://issues.apache.org/jira/browse/HADOOP-11717">https://issues.apache.org/jira/browse/HADOOP-11717</a> added support for the Hadoop UIs to the hadoop-auth module and it can be used as another example.</p> <p>We will examine the new SSOCookieFederationFilter in Knox here.</p> <pre><code>package org.apache.knox.gateway.provider.federation.jwt.filter; import java.io.IOException; import java.security.Principal; import java.security.PrivilegedActionException; import java.security.PrivilegedExceptionAction; import java.util.ArrayList; import java.util.Date; import java.util.HashSet; import java.util.List; import java.util.Set; import javax.security.auth.Subject; import javax.servlet.Filter; import javax.servlet.FilterChain; import javax.servlet.FilterConfig; import javax.servlet.ServletException; import javax.servlet.ServletRequest; import javax.servlet.ServletResponse; import javax.servlet.http.Cookie; import javax.servlet.http.HttpServletRequest; import javax.servlet.http.HttpServletResponse; import org.apache.knox.gateway.i18n.messages.MessagesFactory; import org.apache.knox.gateway.provider.federation.jwt.JWTMessages; import org.apache.knox.gateway.security.PrimaryPrincipal; import org.apache.knox.gateway.services.GatewayServices; import org.apache.knox.gateway.services.security.token.JWTokenAuthority; import org.apache.knox.gateway.services.security.token.TokenServiceException; import org.apache.knox.gateway.services.security.token.impl.JWTToken; public class SSOCookieFederationFilter implements Filter { private static JWTMessages log = MessagesFactory.get( JWTMessages.class ); private static final String ORIGINAL_URL_QUERY_PARAM = "originalUrl="; private static final String SSO_COOKIE_NAME = "sso.cookie.name"; private static final String SSO_EXPECTED_AUDIENCES = "sso.expected.audiences"; private static final String SSO_AUTHENTICATION_PROVIDER_URL = "sso.authentication.provider.url"; private static final String DEFAULT_SSO_COOKIE_NAME = "hadoop-jwt"; </code></pre> <p>The above represents the configurable aspects of the integration.</p> <pre><code> private JWTokenAuthority authority = null; private String cookieName = null; private List<String> audiences = null; private String authenticationProviderUrl = null; @Override public void init( FilterConfig filterConfig ) throws ServletException { GatewayServices services = (GatewayServices) filterConfig.getServletContext().getAttribute(GatewayServices.GATEWAY_SERVICES_ATTRIBUTE); authority = (JWTokenAuthority)services.getService(GatewayServices.TOKEN_SERVICE); </code></pre> <p>The above is a Knox specific internal service that we use to issue and verify JWT tokens. This will be covered separately, and you will need to implement something similar in your filter implementation.</p> <pre><code> // configured cookieName cookieName = filterConfig.getInitParameter(SSO_COOKIE_NAME); if (cookieName == null) { cookieName = DEFAULT_SSO_COOKIE_NAME; } </code></pre> <p>The configurable cookie name can be used to change the cookie name to fit your deployment environment. The default name is hadoop-jwt, which is also the default in the Hadoop implementation.
This name must match the name being used by the KnoxSSO endpoint when setting the cookie.</p> <pre><code> // expected audiences or null String expectedAudiences = filterConfig.getInitParameter(SSO_EXPECTED_AUDIENCES); if (expectedAudiences != null) { audiences = parseExpectedAudiences(expectedAudiences); } </code></pre> <p>Audiences are configured as a comma-separated list of audience strings - names of intended recipients or intents. The semantics we use for this processing are: if no audiences are configured, then any (or no) audience is accepted; if audiences are configured, then the token is accepted as long as one of the expected audiences is found in its set of claims.</p> <pre><code> // url to SSO authentication provider authenticationProviderUrl = filterConfig.getInitParameter(SSO_AUTHENTICATION_PROVIDER_URL); if (authenticationProviderUrl == null) { log.missingAuthenticationProviderUrlConfiguration(); } } </code></pre> <p>This is the URL to the KnoxSSO endpoint. It is required, and SSO/token exchange will not work without this set correctly.</p> <pre><code> /** * @param expectedAudiences * @return */ private List<String> parseExpectedAudiences(String expectedAudiences) { ArrayList<String> audList = null; // setup the list of valid audiences for token validation if (expectedAudiences != null) { // parse into the list String[] audArray = expectedAudiences.split(","); audList = new ArrayList<String>(); for (String a : audArray) { audList.add(a); } } return audList; } </code></pre> <p>The above method parses the comma-separated list of expected audiences and makes it available for interrogation during token validation.</p> <pre><code> public void destroy() { } public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException { String wireToken = null; HttpServletRequest req = (HttpServletRequest) request; String loginURL = constructLoginURL(req); wireToken = getJWTFromCookie(req); if (wireToken == null) { if (req.getMethod().equals("OPTIONS")) { // CORS preflight requests to determine allowed origins and related config // must be able to continue without being redirected Subject sub = new Subject(); sub.getPrincipals().add(new PrimaryPrincipal("anonymous")); continueWithEstablishedSecurityContext(sub, req, (HttpServletResponse) response, chain); } log.sendRedirectToLoginURL(loginURL); ((HttpServletResponse) response).sendRedirect(loginURL); } else { JWTToken token = new JWTToken(wireToken); boolean verified = false; try { verified = authority.verifyToken(token); if (verified) { Date expires = token.getExpiresDate(); if (expires == null || new Date().before(expires)) { boolean audValid = validateAudiences(token); if (audValid) { Subject subject = createSubjectFromToken(token); continueWithEstablishedSecurityContext(subject, (HttpServletRequest)request, (HttpServletResponse)response, chain); } else { log.failedToValidateAudience(); ((HttpServletResponse) response).sendRedirect(loginURL); } } else { log.tokenHasExpired(); ((HttpServletResponse) response).sendRedirect(loginURL); } } else { log.failedToVerifyTokenSignature(); ((HttpServletResponse) response).sendRedirect(loginURL); } } catch (TokenServiceException e) { log.unableToVerifyToken(e); ((HttpServletResponse) response).sendRedirect(loginURL); } } } </code></pre> <p>The doFilter method above is where all the real work is done. We look for a cookie by the configured name.
If it isn’t there, then we redirect to the configured SSO provider URL in order to acquire one. That is, unless it is an OPTIONS request, which may be a preflight CORS request. You shouldn’t need to worry about this aspect; it is really a REST API concern, not a web app UI one.</p> <p>Once we get a cookie, the underlying JWT token is extracted and returned as the wireToken, from which we create a Knox-specific JWTToken. This abstraction is around the use of the nimbus JWT library, which you can use directly. We will cover those details separately.</p> <p>We then ask the token authority component to verify the token. This involves signature validation of the signed token. In order to verify the signature of the token, you will need to have the public key of the Knox SSO server configured and provided to the nimbus library through its API at verification time. NOTE: This is a good place to look at the Hadoop implementation as an example.</p> <p>Once we know the token is signed by a trusted party, we then validate whether it has expired and whether it contains an expected audience claim (or none at all).</p> <p>Finally, when we have a valid token, we create a Java Subject from it and continue the request through the filterChain as the authenticated user.</p> <pre><code> /** * Encapsulate the acquisition of the JWT token from HTTP cookies within the * request. * * @param req servlet request to get the JWT token from * @return serialized JWT token */ protected String getJWTFromCookie(HttpServletRequest req) { String serializedJWT = null; Cookie[] cookies = req.getCookies(); if (cookies != null) { for (Cookie cookie : cookies) { if (cookieName.equals(cookie.getName())) { log.cookieHasBeenFound(cookieName); serializedJWT = cookie.getValue(); break; } } } return serializedJWT; } </code></pre> <p>The above method extracts the serialized token from the cookie and returns it as the wireToken.</p> <pre><code> /** * Create the URL to be used for authentication of the user in the absence of * a JWT token within the incoming request. * * @param request for getting the original request URL * @return url to use as login url for redirect */ protected String constructLoginURL(HttpServletRequest request) { String delimiter = "?"; if (authenticationProviderUrl.contains("?")) { delimiter = "&"; } String loginURL = authenticationProviderUrl + delimiter + ORIGINAL_URL_QUERY_PARAM + request.getRequestURL().toString()+ getOriginalQueryString(request); return loginURL; } private String getOriginalQueryString(HttpServletRequest request) { String originalQueryString = request.getQueryString(); return (originalQueryString == null) ? "" : "?" + originalQueryString; } </code></pre> <p>The above method creates the full URL to be used in redirecting to the KnoxSSO endpoint. It includes the SSO provider URL as well as the original request URL so that we can redirect back to it after authentication and token exchange.</p> <pre><code> /** * Validate whether any of the accepted audience claims is present in the * issued token claims list for audience. Override this method in subclasses * in order to customize the audience validation behavior.
* * @param jwtToken * the JWT token where the allowed audiences will be found * @return true if an expected audience is present, otherwise false */ protected boolean validateAudiences(JWTToken jwtToken) { boolean valid = false; String[] tokenAudienceList = jwtToken.getAudienceClaims(); // if there were no expected audiences configured then just // consider any audience acceptable if (audiences == null) { valid = true; } else { // if any of the configured audiences is found then consider it // acceptable for (String aud : tokenAudienceList) { if (audiences.contains(aud)) { //log.debug("JWT token audience has been successfully validated"); log.jwtAudienceValidated(); valid = true; break; } } } return valid; } </code></pre> <p>The above method implements the audience claim semantics explained earlier.</p> <pre><code> private void continueWithEstablishedSecurityContext(Subject subject, final HttpServletRequest request, final HttpServletResponse response, final FilterChain chain) throws IOException, ServletException { try { Subject.doAs( subject, new PrivilegedExceptionAction<Object>() { @Override public Object run() throws Exception { chain.doFilter(request, response); return null; } } ); } catch (PrivilegedActionException e) { Throwable t = e.getCause(); if (t instanceof IOException) { throw (IOException) t; } else if (t instanceof ServletException) { throw (ServletException) t; } else { throw new ServletException(t); } } } </code></pre> <p>This method continues the filter chain processing upon successful validation of the token. This would need to be replaced with your environment’s equivalent of continuing the request or login to the app as the authenticated user.</p> <pre><code> private Subject createSubjectFromToken(JWTToken token) { final String principal = token.getSubject(); @SuppressWarnings("rawtypes") HashSet emptySet = new HashSet(); Set<Principal> principals = new HashSet<Principal>(); Principal p = new PrimaryPrincipal(principal); principals.add(p); javax.security.auth.Subject subject = new javax.security.auth.Subject(true, principals, emptySet, emptySet); return subject; } </code></pre> <p>This method takes a JWTToken and creates a Java Subject with the principals expected by the rest of the Knox processing. This would need to be implemented in a way appropriate for your operating environment as well. For instance, the Hadoop handler implementation returns a Hadoop AuthenticationToken to the calling filter which in turn ends up in the Hadoop auth cookie.</p> <pre><code> } </code></pre>
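<p>Pulling together the configurable aspects walked through above, a topology provider element that engages this filter might look like the following sketch; the param names come from the constants at the top of the filter, while the cookie name and audience values here are illustrative only:</p> <pre><code><provider>
    <role>federation</role>
    <name>SSOCookieProvider</name>
    <enabled>true</enabled>
    <param>
        <name>sso.authentication.provider.url</name>
        <value>https://localhost:9443/gateway/idp/api/v1/websso</value>
    </param>
    <param>
        <name>sso.cookie.name</name>
        <value>hadoop-jwt</value>
    </param>
    <param>
        <name>sso.expected.audiences</name>
        <value>KNOXSSO,myapp</value>
    </param>
</provider> </code></pre>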
<h4><a id="Token+Signature+Validation">Token Signature Validation</a> <a href="#Token+Signature+Validation"><img src="markbook-section-link.png"/></a></h4> <p>The following is the method from the Hadoop handler implementation that validates the signature.</p> <pre><code> /** * Verify the signature of the JWT token in this method. This method depends on the * public key that was established during init based upon the provisioned public key. * Override this method in subclasses in order to customize the signature verification behavior. * @param jwtToken the token that contains the signature to be validated * @return valid true if signature verifies successfully; false otherwise */ protected boolean validateSignature(SignedJWT jwtToken){ boolean valid=false; if (JWSObject.State.SIGNED == jwtToken.getState()) { LOG.debug("JWT token is in a SIGNED state"); if (jwtToken.getSignature() != null) { LOG.debug("JWT token signature is not null"); try { JWSVerifier verifier=new RSASSAVerifier(publicKey); if (jwtToken.verify(verifier)) { valid=true; LOG.debug("JWT token has been successfully verified"); } else { LOG.warn("JWT signature verification failed."); } } catch (JOSEException je) { LOG.warn("Error while validating signature",je); } } } return valid; } </code></pre> <p><strong>Hadoop Configuration Example</strong></p> <p>The following is like the configuration in the Hadoop handler implementation.</p> <p>OBSOLETE, but in the proper spirit of <a href="https://issues.apache.org/jira/browse/HADOOP-11717">HADOOP-11717</a> (Add Redirecting WebSSO behavior with JWT Token in Hadoop Auth).</p> <pre><code> <property> <name>hadoop.http.authentication.type</name> <value>org.apache.hadoop.security.authentication.server.JWTRedirectAuthenticationHandler</value> </property> </code></pre> <p>This is the handler classname in Hadoop auth for JWT token (KnoxSSO) support.</p> <pre><code> <property> <name>hadoop.http.authentication.authentication.provider.url</name> <value>http://c6401.ambari.apache.org:8888/knoxsso</value> </property> </code></pre> <p>The above property is the SSO provider URL that points to the knoxsso endpoint.</p> <pre><code> <property> <name>hadoop.http.authentication.public.key.pem</name> <value>MIICVjCCAb+gAwIBAgIJAPPvOtuTxFeiMA0GCSqGSIb3DQEBBQUAMG0xCzAJBgNV BAYTAlVTMQ0wCwYDVQQIEwRUZXN0MQ0wCwYDVQQHEwRUZXN0MQ8wDQYDVQQKEwZI YWRvb3AxDTALBgNVBAsTBFRlc3QxIDAeBgNVBAMTF2M2NDAxLmFtYmFyaS5hcGFj aGUub3JnMB4XDTE1MDcxNjE4NDcyM1oXDTE2MDcxNTE4NDcyM1owbTELMAkGA1UE BhMCVVMxDTALBgNVBAgTBFRlc3QxDTALBgNVBAcTBFRlc3QxDzANBgNVBAoTBkhh ZG9vcDENMAsGA1UECxMEVGVzdDEgMB4GA1UEAxMXYzY0MDEuYW1iYXJpLmFwYWNo ZS5vcmcwgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBAMFs/rymbiNvg8lDhsdA qvh5uHP6iMtfv9IYpDleShjkS1C+IqId6bwGIEO8yhIS5BnfUR/fcnHi2ZNrXX7x QUtQe7M9tDIKu48w//InnZ6VpAqjGShWxcSzR6UB/YoGe5ytHS6MrXaormfBg3VW tDoy2MS83W8pweS6p5JnK7S5AgMBAAEwDQYJKoZIhvcNAQEFBQADgYEANyVg6EzE 2q84gq7wQfLt9t047nYFkxcRfzhNVL3LB8p6IkM4RUrzWq4kLA+z+bpY2OdpkTOe wUpEdVKzOQd4V7vRxpdANxtbG/XXrJAAcY/S+eMy1eDK73cmaVPnxPUGWmMnQXUi TLab+w8tBQhNbq6BOQ42aOrLxA8k/M4cV1A=</value> </property> </code></pre> <p>The above property holds the KnoxSSO server’s public key for signature verification. Adding it directly to the config like this is convenient and is easily done through Ambari to existing config files that take custom properties.
Config files are generally protected as root access only as well, so it is a pretty good solution.</p> <h4><a id="Public+Key+Parsing">Public Key Parsing</a> <a href="#Public+Key+Parsing"><img src="markbook-section-link.png"/></a></h4> <p>In order to turn the PEM-encoded config item into a public key, the Hadoop handler implementation does the following in the init() method.</p> <pre><code> if (publicKey == null) { String pemPublicKey = config.getProperty(PUBLIC_KEY_PEM); if (pemPublicKey == null) { throw new ServletException( "Public key for signature validation must be provisioned."); } publicKey = CertificateUtil.parseRSAPublicKey(pemPublicKey); } </code></pre> <p>and the CertificateUtil class is below:</p> <pre><code> package org.apache.hadoop.security.authentication.util; import java.io.ByteArrayInputStream; import java.io.UnsupportedEncodingException; import java.security.PublicKey; import java.security.cert.CertificateException; import java.security.cert.CertificateFactory; import java.security.cert.X509Certificate; import java.security.interfaces.RSAPublicKey; import javax.servlet.ServletException; public class CertificateUtil { private static final String PEM_HEADER = "-----BEGIN CERTIFICATE-----\n"; private static final String PEM_FOOTER = "\n-----END CERTIFICATE-----"; /** * Gets an RSAPublicKey from the provided PEM encoding. * * @param pem * - the pem encoding from config without the header and footer * @return RSAPublicKey the RSA public key * @throws ServletException thrown if a processing error occurred */ public static RSAPublicKey parseRSAPublicKey(String pem) throws ServletException { String fullPem = PEM_HEADER + pem + PEM_FOOTER; PublicKey key = null; try { CertificateFactory fact = CertificateFactory.getInstance("X.509"); ByteArrayInputStream is = new ByteArrayInputStream( fullPem.getBytes("UTF8")); X509Certificate cer = (X509Certificate) fact.generateCertificate(is); key = cer.getPublicKey(); } catch (CertificateException ce) { String message = null; if (pem.startsWith(PEM_HEADER)) { message = "CertificateException - be sure not to include PEM header " + "and footer in the PEM configuration element."; } else { message = "CertificateException - PEM may be corrupt"; } throw new ServletException(message, ce); } catch (UnsupportedEncodingException uee) { throw new ServletException(uee); } return (RSAPublicKey) key; } } </code></pre> <h3><a id="Health+Monitoring+API">Health Monitoring API</a> <a href="#Health+Monitoring+API"><img src="markbook-section-link.png"/></a></h3> <h1>Health Monitoring REST API</h1> <p>Knox provides a RESTful API for monitoring the core service. It primarily exposes the health of the Knox service, including service status (up/down) as well as other health metrics. This is a work-in-progress feature, which started with an extensible framework to support basic functionality.
In particular, it currently supports A) an API to <em>ping</em> the service and B) time-based statistics related to all API calls.</p> <h4><a id="Health+Monitoring+Setup">Health Monitoring Setup</a> <a href="#Health+Monitoring+Setup"><img src="markbook-section-link.png"/></a></h4> <p>The basic setup includes two major steps: A) adding configuration to enable metrics collection and reporting, and B) writing a topology file and uploading it into the <em>topologies</em> directory.</p> <h5><a id="Service+Configurations">Service Configurations</a> <a href="#Service+Configurations"><img src="markbook-section-link.png"/></a></h5> <p>First, we need to make sure the gateway configurations to gather metrics and report them to JMX are turned on in <em>gateway-site.xml</em>. Adding the following two properties to <em>gateway-site.xml</em> will serve the purpose.</p> <pre><code><property> <name>gateway.metrics.enabled</name> <value>true</value> <description>Boolean flag indicates whether to enable the metrics collection</description> </property> <property> <name>gateway.jmx.metrics.reporting.enabled</name> <value>true</value> <description>Boolean flag indicates whether to enable the metrics reporting using JMX</description> </property> </code></pre> <h5><a id="health.xml+Topology">health.xml Topology</a> <a href="#health.xml+Topology"><img src="markbook-section-link.png"/></a></h5> <p>In order to enable the health monitoring REST service, you need to add a new topology file (i.e. <em>health.xml</em>). The following is an example that is configured to test the basic functionality of the Knox service. It is highly recommended to use a more restrictive authentication mechanism.</p> <pre><code><topology> <gateway> <provider> <role>authentication</role> <name>ShiroProvider</name> <enabled>true</enabled> <param> <!-- Session timeout in minutes. This is really an idle timeout. It defaults to 30 minutes if the property value is not defined. The current client authentication will expire if the client idles continuously for more than this value. --> <name>sessionTimeout</name> <value>30</value> </param> <param> <name>main.ldapRealm</name> <value>org.apache.knox.gateway.shirorealm.KnoxLdapRealm</value> </param> <param> <name>main.ldapContextFactory</name> <value>org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory</value> </param> <param> <name>main.ldapRealm.contextFactory</name> <value>$ldapContextFactory</value> </param> <param> <name>main.ldapRealm.userDnTemplate</name> <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value> </param> <param> <name>main.ldapRealm.contextFactory.url</name> <value>ldap://localhost:33389</value> </param> <param> <name>main.ldapRealm.contextFactory.authenticationMechanism</name> <value>simple</value> </param> <param> <name>urls./**</name> <value>authcBasic</value> </param> </provider> <provider> <role>authorization</role> <name>AclsAuthz</name> <enabled>false</enabled> <param> <name>knox.acl</name> <value>admin;*;*</value> </param> </provider> <provider> <role>identity-assertion</role> <name>Default</name> <enabled>false</enabled> </provider> <provider> <role>hostmap</role> <name>static</name> <enabled>true</enabled> <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param> </provider> </gateway> <service> <role>HEALTH</role> </service> </topology> </code></pre> <p>Just as with any Knox service, the gateway providers protect the health monitoring REST service defined above them. In this case, the ShiroProvider is taking care of HTTP Basic Auth using LDAP.
<h5><a id="health.xml+Topology">health.xml Topology</a> <a href="#health.xml+Topology"><img src="markbook-section-link.png"/></a></h5> <p>To enable the health monitoring REST service, you need to add a new topology file (e.g. <em>health.xml</em>). The following example is configured to test the basic functionality of the Knox service. Using a more restrictive authentication mechanism is highly recommended.</p>
<pre><code><topology>
    <gateway>
        <provider>
            <role>authentication</role>
            <name>ShiroProvider</name>
            <enabled>true</enabled>
            <param>
                <!--
                Session timeout in minutes. This is really the idle timeout;
                it defaults to 30 minutes if the property value is not defined.
                The current client authentication expires if the client idles
                continuously for more than this value.
                -->
                <name>sessionTimeout</name>
                <value>30</value>
            </param>
            <param>
                <name>main.ldapRealm</name>
                <value>org.apache.knox.gateway.shirorealm.KnoxLdapRealm</value>
            </param>
            <param>
                <name>main.ldapContextFactory</name>
                <value>org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory</name>
                <value>$ldapContextFactory</value>
            </param>
            <param>
                <name>main.ldapRealm.userDnTemplate</name>
                <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory.url</name>
                <value>ldap://localhost:33389</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
                <value>simple</value>
            </param>
            <param>
                <name>urls./**</name>
                <value>authcBasic</value>
            </param>
        </provider>
        <provider>
            <role>authorization</role>
            <name>AclsAuthz</name>
            <enabled>false</enabled>
            <param>
                <name>knox.acl</name>
                <value>admin;*;*</value>
            </param>
        </provider>
        <provider>
            <role>identity-assertion</role>
            <name>Default</name>
            <enabled>false</enabled>
        </provider>
        <provider>
            <role>hostmap</role>
            <name>static</name>
            <enabled>true</enabled>
            <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
        </provider>
    </gateway>
    <service>
        <role>HEALTH</role>
    </service>
</topology>
</code></pre>
<p>Just as with any Knox service, the gateway providers protect the health monitoring REST service defined beneath them. In this case, the ShiroProvider handles HTTP Basic Auth against LDAP. Once the user authenticates with LDAP, request processing continues to the <em>Health</em> service, which performs the necessary actions.</p> <p>The authentication/federation provider can be swapped out to fit your deployment environment.</p> <p>After creating <em>health.xml</em> with the above contents, copy the file to the <em>KNOX_HOME/conf/topologies</em> directory. If the Knox gateway service is not running, you can start it using “<em>bin/gateway.sh start</em>”; otherwise it automatically picks up the new ‘<em>health</em>’ topology. When the gateway registers the new service, it displays the following log messages in <em>log/gateway.log</em>.</p>
<pre><code>2017-08-22 03:44:25,045 INFO knox.gateway (GatewayServer.java:handleCreateDeployment(677)) - Deploying topology health to /home/joe/knox/knox-0.12.0/bin/../data/deployments/health.topo.15e080a91c0
2017-08-22 03:44:25,045 INFO knox.gateway (GatewayServer.java:internalDeactivateTopology(596)) - Deactivating topology health
2017-08-22 03:44:25,119 INFO knox.gateway (DefaultGatewayServices.java:initializeContribution(197)) - Creating credential store for the cluster: health
2017-08-22 03:44:25,142 INFO knox.gateway (GatewayServer.java:internalActivateTopology(566)) - Activating topology health
2017-08-22 03:44:25,142 INFO knox.gateway (GatewayServer.java:internalActivateArchive(576)) - Activating topology health archive %2F
</code></pre>
<h5><a id="Verify">Verify</a> <a href="#Verify"><img src="markbook-section-link.png"/></a></h5> <p>Once the health service is active, you can verify it with the following <em>curl</em> command. The ‘<em>ping</em>’ endpoint indicates whether the service is up and can therefore be used for monitoring the basic health of a Knox service.</p>
<pre><code>$ curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/health/v1/ping'

HTTP/1.1 200 OK
Date: Tue, 22 Aug 2017 07:09:37 GMT
Set-Cookie: JSESSIONID=1o82bcvoqbhbb1apt7zs8ubybb;Path=/gateway/health;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/health; Max-Age=0; Expires=Mon, 21-Aug-2017 07:09:37 GMT
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/plain; charset=ISO-8859-1
Content-Length: 3
Server: Jetty(9.2.15.v20160210)

OK
</code></pre>
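<p>The same check can be scripted for a monitoring system. Below is a minimal sketch of a Java probe equivalent to the <em>curl</em> call above; the host, port, and credentials are the sample values used throughout this section, and it assumes the gateway's TLS certificate is already trusted by the client JVM (the <em>-k</em> shortcut that curl offers is deliberately not replicated here).</p>
<pre><code class="java">import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class KnoxPingProbe {
  public static void main(String[] args) throws IOException {
    // Sample endpoint and credentials from this section's examples.
    URL url = new URL("https://localhost:8445/gateway/health/v1/ping");
    String credentials = Base64.getEncoder().encodeToString(
        "guest:guest-password".getBytes(StandardCharsets.UTF_8));

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try {
      conn.setRequestMethod("GET");
      conn.setRequestProperty("Authorization", "Basic " + credentials);
      // A 200 response (body "OK") means the gateway is up.
      int status = conn.getResponseCode();
      System.out.println(status == 200 ? "Knox is up" : "Knox check failed: HTTP " + status);
    } finally {
      conn.disconnect();
    }
  }
}
</code></pre>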
<p>To retrieve meaningful metrics for the various service calls, you may first need to issue a few REST calls such as the following. After that, execute the metrics REST call as shown below; as the sample output illustrates, the metrics are returned in JSON format.</p>
<pre><code>curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS'
</code></pre>
<pre><code>$ curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/health/v1/metrics?pretty=true'

HTTP/1.1 200 OK
Date: Tue, 22 Aug 2017 07:10:44 GMT
Set-Cookie: JSESSIONID=kqntcdaje9uai3pup7ffvfw4;Path=/gateway/health;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/health; Max-Age=0; Expires=Mon, 21-Aug-2017 07:10:44 GMT
Content-Type: application/json
Cache-Control: must-revalidate,no-cache,no-store
Transfer-Encoding: chunked
Server: Jetty(9.2.15.v20160210)

{
  "version" : "3.0.0",
  "gauges" : { },
  "counters" : { },
  "histograms" : { },
  "meters" : { },
  "timers" : {
    "client./gateway/health/v1/metrics.GET-requests" : {
      "count" : 5,
      "max" : 0.624587973,
      "mean" : 0.027655743001736188,
      "min" : 0.006145587,
      "p50" : 0.010020548,
      "p75" : 0.010020548,
      "p95" : 0.074454725,
      "p98" : 0.624587973,
      "p99" : 0.624587973,
      "p999" : 0.624587973,
      "stddev" : 0.0929226225229978,
      "m15_rate" : 2.657500857422334E-7,
      "m1_rate" : 5.770087852901534E-89,
      "m5_rate" : 4.769163772973399E-19,
      "mean_rate" : 4.0952378345310894E-4,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "client./gateway/health/v1/ping.GET-requests" : {
      "count" : 1,
      "max" : 0.017257638000000002,
      "mean" : 0.017257638000000002,
      "min" : 0.017257638000000002,
      "p50" : 0.017257638000000002,
      "p75" : 0.017257638000000002,
      "p95" : 0.017257638000000002,
      "p98" : 0.017257638000000002,
      "p99" : 0.017257638000000002,
      "p999" : 0.017257638000000002,
      "stddev" : 0.0,
      "m15_rate" : 0.18710139700632353,
      "m1_rate" : 0.0735758882342885,
      "m5_rate" : 0.1637461506155964,
      "mean_rate" : 0.014990517517814805,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "client./gateway/sandbox/health/v1/.GET-requests" : {
      "count" : 1,
      "max" : 4.01873E-4,
      "mean" : 4.01873E-4,
      "min" : 4.01873E-4,
      "p50" : 4.01873E-4,
      "p75" : 4.01873E-4,
      "p95" : 4.01873E-4,
      "p98" : 4.01873E-4,
      "p99" : 4.01873E-4,
      "p999" : 4.01873E-4,
      "stddev" : 0.0,
      "m15_rate" : 2.536740427767808E-7,
      "m1_rate" : 7.074903404511115E-90,
      "m5_rate" : 4.081014139447941E-19,
      "mean_rate" : 8.179827684854002E-5,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "client./gateway/sandbox/v1/health/.GET-requests" : {
      "count" : 1,
      "max" : 5.470700000000001E-4,
      "mean" : 5.470700000000001E-4,
      "min" : 5.470700000000001E-4,
      "p50" : 5.470700000000001E-4,
      "p75" : 5.470700000000001E-4,
      "p95" : 5.470700000000001E-4,
      "p98" : 5.470700000000001E-4,
      "p99" : 5.470700000000001E-4,
      "p999" : 5.470700000000001E-4,
      "stddev" : 0.0,
      "m15_rate" : 2.413022137213267E-7,
      "m1_rate" : 3.341947732164585E-90,
      "m5_rate" : 3.512561421726287E-19,
      "mean_rate" : 8.149518570285245E-5,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "client./gateway/sandbox/webhdfs/v1/.GET-requests" : {
      "count" : 4,
      "max" : 0.463745401,
      "mean" : 0.024924118143299912,
      "min" : 0.016542244,
      "p50" : 0.024799078000000002,
      "p75" : 0.033933548,
      "p95" : 0.033933548,
      "p98" : 0.033933548,
      "p99" : 0.033933548,
      "p999" : 0.033933548,
      "stddev" : 0.007284773511002474,
      "m15_rate" : 2.120680068580741E-8,
      "m1_rate" : 4.7541228609699333E-91,
      "m5_rate" : 1.5806080232092864E-20,
      "mean_rate" : 2.7314359915623396E-4,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "service./gateway/sandbox/webhdfs/v1/.get-requests" : {
      "count" : 3,
      "max" : 0.014635496000000001,
      "mean" : 0.00342438191233768,
      "min" : 0.0020088890000000002,
      "p50" : 0.0020088890000000002,
      "p75" : 0.005144646,
      "p95" : 0.005144646,
      "p98" : 0.005144646,
      "p99" : 0.005144646,
      "p999" : 0.005144646,
      "stddev" : 0.0015604555820128599,
      "m15_rate" : 1.9913776931949195E-8,
      "m1_rate" : 3.1334281325640874E-91,
      "m5_rate" : 1.055281734633953E-20,
      "mean_rate" : 2.0486339070804923E-4,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    }
  }
}
</code></pre>
<h4><a id="REST+End+Points">REST End Points</a> <a href="#REST+End+Points"><img src="markbook-section-link.png"/></a></h4> <p>As mentioned above, Knox currently provides a few monitoring endpoints to start with. The list will gradually grow to support new use cases.</p> <h5><a id="/ping">/ping</a> <a href="#/ping"><img src="markbook-section-link.png"/></a></h5> <p>This endpoint can be used to determine whether a Knox gateway service is alive. It is useful for basic health monitoring of the core service. Although most REST call results are in JSON format, this one (<em>/ping</em>) is plain text.</p> <p>Sample response</p>
<pre><code>OK
</code></pre>
<h5><a id="/metrics">/metrics</a> <a href="#/metrics"><img src="markbook-section-link.png"/></a></h5> <p>This endpoint returns all Knox metrics grouped by individual call type. For example, timer metrics for all <em>webhdfs</em> calls are aggregated into one set of metrics and returned in a separate JSON element. The endpoint also supports an option (<em>/metrics?pretty=true</em>) to pretty-print the output.</p> <p>A sample response with <em>pretty=true</em> is shown below:</p>
<pre><code>{
  "version" : "3.0.0",
  "gauges" : { },
  "counters" : { },
  "histograms" : { },
  "meters" : { },
  "timers" : {
    "client./gateway/health/v1/ping.GET-requests" : {
      "count" : 1,
      "max" : 0.017257638000000002,
      "mean" : 0.017257638000000002,
      "min" : 0.017257638000000002,
      "p50" : 0.017257638000000002,
      "p75" : 0.017257638000000002,
      "p95" : 0.017257638000000002,
      "p98" : 0.017257638000000002,
      "p99" : 0.017257638000000002,
      "p999" : 0.017257638000000002,
      "stddev" : 0.0,
      "m15_rate" : 0.18710139700632353,
      "m1_rate" : 0.0735758882342885,
      "m5_rate" : 0.1637461506155964,
      "mean_rate" : 0.014990517517814805,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "client./gateway/sandbox/v1/health/.GET-requests" : {
      "count" : 1,
      "max" : 5.470700000000001E-4,
      "mean" : 5.470700000000001E-4,
      "min" : 5.470700000000001E-4,
      "p50" : 5.470700000000001E-4,
      "p75" : 5.470700000000001E-4,
      "p95" : 5.470700000000001E-4,
      "p98" : 5.470700000000001E-4,
      "p99" : 5.470700000000001E-4,
      "p999" : 5.470700000000001E-4,
      "stddev" : 0.0,
      "m15_rate" : 2.413022137213267E-7,
      "m1_rate" : 3.341947732164585E-90,
      "m5_rate" : 3.512561421726287E-19,
      "mean_rate" : 8.149518570285245E-5,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    },
    "client./gateway/sandbox/webhdfs/v1/.GET-requests" : {
      "count" : 4,
      "max" : 0.463745401,
      "mean" : 0.024924118143299912,
      "min" : 0.016542244,
      "p50" : 0.024799078000000002,
      "p75" : 0.033933548,
      "p95" : 0.033933548,
      "p98" : 0.033933548,
      "p99" : 0.033933548,
      "p999" : 0.033933548,
      "stddev" : 0.007284773511002474,
      "m15_rate" : 2.120680068580741E-8,
      "m1_rate" : 4.7541228609699333E-91,
      "m5_rate" : 1.5806080232092864E-20,
      "mean_rate" : 2.7314359915623396E-4,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    }
  }
}
</code></pre>
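<p>Monitoring systems will typically consume the <em>/metrics</em> JSON programmatically. The sketch below extracts the per-endpoint request counts and mean latencies from a response body; it assumes the Jackson library is on the classpath and that the response has already been fetched (for example with the probe shown earlier), using a truncated hypothetical payload for illustration.</p>
<pre><code class="java">import java.util.Iterator;
import java.util.Map;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class MetricsReportSample {
  public static void main(String[] args) throws Exception {
    // Hypothetical, abbreviated response body from /gateway/health/v1/metrics.
    String json = "{ \"timers\" : { \"client./gateway/health/v1/ping.GET-requests\" "
        + ": { \"count\" : 1, \"mean\" : 0.017257638 } } }";

    JsonNode timers = new ObjectMapper().readTree(json).path("timers");
    // Print the call count and mean latency for every timed endpoint.
    for (Iterator<Map.Entry<String, JsonNode>> it = timers.fields(); it.hasNext(); ) {
      Map.Entry<String, JsonNode> timer = it.next();
      System.out.printf("%s: count=%d mean=%.6fs%n",
          timer.getKey(),
          timer.getValue().path("count").asLong(),
          timer.getValue().path("mean").asDouble());
    }
  }
}
</code></pre>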
<h2><a id="Auditing">Auditing</a> <a href="#Auditing"><img src="markbook-section-link.png"/></a></h2> <p>Audit events are recorded through an <em>Auditor</em> obtained from the <em>AuditServiceFactory</em>, as the sample below illustrates.</p>
<pre><code class="java">public class AuditingSample {

  private static Auditor AUDITOR = AuditServiceFactory.getAuditService().getAuditor(
      "sample-channel", "sample-service", "sample-component" );

  public void sampleMethod() {
      ...
      AUDITOR.audit( Action.AUTHORIZATION, sourceUrl, ResourceType.URI, ActionOutcome.SUCCESS );
      ...
  }
}
</code></pre>
<h2><a id="Logging">Logging</a> <a href="#Logging"><img src="markbook-section-link.png"/></a></h2> <p>Log messages are declared on an annotated interface and emitted through an implementation that the framework generates, as shown below.</p>
<pre><code class="java">@Messages( logger = "org.apache.project.module" )
public interface CustomMessages {

  @Message( level = MessageLevel.FATAL, text = "Failed to parse command line: {0}" )
  void failedToParseCommandLine( @StackTrace( level = MessageLevel.DEBUG ) ParseException e );

}
</code></pre>
<pre><code class="java">public class CustomLoggingSample {

  private static CustomMessages MSG = MessagesFactory.get( CustomMessages.class );

  public void sampleMethod() {
      ...
      MSG.failedToParseCommandLine( e );
      ...
  }
}
</code></pre>
<h2><a id="Internationalization">Internationalization</a> <a href="#Internationalization"><img src="markbook-section-link.png"/></a></h2> <p>Externalized resource strings follow the same pattern, using the @Resources annotation and the ResourcesFactory.</p>
<pre><code class="java">@Resources
public interface CustomResources {

  @Resource( text = "Apache Hadoop Gateway {0} ({1})" )
  String gatewayVersionMessage( String version, String hash );

}
</code></pre>
<pre><code class="java">public class CustomResourceSample {

  private static CustomResources RES = ResourcesFactory.get( CustomResources.class );

  public void sampleMethod() {
      ...
      String s = RES.gatewayVersionMessage( "0.0.0", "XXXXXXX" );
      ...
  }
}
</code></pre>
<h2><a id="Admin+UI">Admin UI</a> <a href="#Admin+UI"><img src="markbook-section-link.png"/></a></h2> <h3><a id="Introduction">Introduction</a> <a href="#Introduction"><img src="markbook-section-link.png"/></a></h3> <p>The Admin UI is a work in progress. It started as a simple web interface for Admin API functions, but will hopefully grow to also provide visibility into the gateway in terms of logs and metrics.</p> <h3><a id="Source+and+Binaries">Source and Binaries</a> <a href="#Source+and+Binaries"><img src="markbook-section-link.png"/></a></h3> <p>The Admin UI application follows the architecture of a hosted application in Knox. To that end it needs to be packaged up in the gateway-applications module in the source tree so that in an installation it winds up here:</p> <p><code><GATEWAY_HOME>/data/applications/admin-ui</code></p> <p>However, since the application is built using Angular and various Node modules, the source tree is not something we want to place into the gateway-applications module. Instead we place the production ‘binaries’ in gateway-applications and keep the source in a module called ‘gateway-admin-ui’.</p> <p>To work with the Angular application you need to install some prerequisite tools. The main tool needed is the <a href="https://github.com/angular/angular-cli#installation">Angular CLI</a>; installing it pulls in dependencies that should fulfill any other <a href="https://github.com/angular/angular-cli#prerequisites">prerequisites</a>.</p> <h3><a id="Manager+Topology">Manager Topology</a> <a href="#Manager+Topology"><img src="markbook-section-link.png"/></a></h3> <p>The Admin UI is deployed to a fixed topology. The topology file can be found under</p> <p><code><GATEWAY_HOME>/conf/topologies/manager.xml</code></p> <p>The topology hosts an instance of the Admin API for the UI to use. The reason for this is that the existing Admin API needs to have a different security model from that used by the Admin UI.</p>
<p>The key components of this topology are:</p>
<pre><code class="xml"><provider>
    <role>webappsec</role>
    <name>WebAppSec</name>
    <enabled>true</enabled>
    <param><name>csrf.enabled</name><value>true</value></param>
    <param><name>csrf.customHeader</name><value>X-XSRF-Header</value></param>
    <param><name>csrf.methodsToIgnore</name><value>GET,OPTIONS,HEAD</value></param>
    <param><name>xframe-options.enabled</name><value>true</value></param>
    <param><name>strict.transport.enabled</name><value>true</value></param>
</provider>
</code></pre>
<p>and</p>
<pre><code class="xml"><application>
    <role>admin-ui</role>
</application>
</code></pre>
<h2><a id="Trademarks">Trademarks</a> <a href="#Trademarks"><img src="markbook-section-link.png"/></a></h2> <p>Apache Knox, Apache Knox Gateway, Apache, the Apache feather logo and the Apache Knox Gateway project logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p> <h2><a id="License">License</a> <a href="#License"><img src="markbook-section-link.png"/></a></h2> <p>Apache Knox uses the standard <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache license</a>.</p> <h2><a id="Privacy+Policy">Privacy Policy</a> <a href="#Privacy+Policy"><img src="markbook-section-link.png"/></a></h2> <p>Apache Knox uses the standard Apache privacy policy.</p> <p>Information about your use of this website is collected using server access logs and a tracking cookie. The collected information consists of the following:</p> <ul> <li>The IP address from which you access the website;</li> <li>The type of browser and operating system you use to access our site;</li> <li>The date and time you access our site;</li> <li>The pages you visit; and</li> <li>The addresses of pages from where you followed a link to our site.</li> </ul> <p>Part of this information is gathered using a tracking cookie set by the <a href="http://www.google.com/analytics/">Google Analytics</a> service. Google’s policy for the use of this information is described in their <a href="http://www.google.com/privacy.html">privacy policy</a>. See your browser’s documentation for instructions on how to disable the cookie if you prefer not to share this data with Google.</p> <p>We use the gathered information to help us make our site more useful to visitors and to better understand how and when our site is used. We do not track or collect personally identifiable information or associate gathered data with any personally identifying information from other sources.</p> <p>By using this website, you consent to the collection of this data in the manner and for the purpose described above.</p>