Authentication and Web Resource Authorization of Web Services Clients
Tip: See Authentication from Third-Party Applications for examples of web services clients.
This document covers the following topics:
- Web services security basics
- How existing and new web services are secured in Pentaho today
- How existing and new web services will be secured in Pentaho in the future
The ultimate destination of resources provided by the Pentaho BI Server is your users. Those resources can be consumed directly using a web browser or indirectly using web services. In the former case, the user has explicitly requested the resource. In the latter case, code has requested the resource (on behalf of the user).
The authentication process differs when dealing with a web browser as opposed to a web service client. Why? To find out, let's look at the diagram below.
In the above diagram, we see a Pentaho service at the lowest layer. In this context, a Pentaho service is any Pentaho BI Server code that one wishes to execute remotely. One layer above the service, we see two authentication mechanisms: one for web service clients and one for web browsers. At the top of the diagram, a server icon represents a web service client and a workstation icon represents a web browser. Both make requests, as indicated by the request labels on the down arrows. If all goes well, both receive the resource that they requested, as indicated by the resource labels on the up arrows. But what happens when things don't go well? If the BI Server knows who you are yet still decides to deny access, the requester receives a forbidden response (HTTP response code 403). What about when the BI Server doesn't know who you are?
Notice that only one thing differs between the interactions of web browsers and web service clients: the response returned when the BI Server doesn't yet know who you are. Why do they differ? One response is best processed by a human, as indicated by the redirect (to the login page). The other is best processed by code, as indicated by the auth req. The auth req, or "authentication request", is HTTP response code 401 (Unauthorized).
Technical Information: What happens as a result of submitting bad credentials (e.g. an unknown username and/or incorrect password)? In digest authentication, the response code is still 401. In form authentication, the login page is redisplayed, this time with an error message. The response code in this case is 200 (OK).
So a web service client can easily tell whether the BI Server recognizes it by checking for response code 401. Why can't it just check for response code 302 (a redirect)? A redirect is generic and is used in contexts other than authentication, so its presence alone proves nothing. Even if the web service client is smart enough to follow the redirect (and request the target specified in the redirect instruction), the login page will be returned as if it were the requested resource, and the client will be unable to tell the difference.
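The status codes discussed so far can be summarized from the client's point of view. The following is a minimal sketch, not Pentaho code: the function name classify_response and the labels it returns are hypothetical and chosen only for illustration.

```python
def classify_response(status_code):
    """Map an HTTP status code to what it tells a web service client."""
    if status_code == 401:
        # The server does not know who the caller is, or rejected the credentials.
        return "authenticate"
    if status_code == 403:
        # The server knows the caller but denies access to the resource.
        return "forbidden"
    if status_code in (301, 302, 303, 307):
        # A redirect is generic: it may point at the login page or anywhere else.
        return "ambiguous"
    if 200 <= status_code < 300:
        # Trustworthy only if the URL answers 401 when unauthenticated;
        # otherwise this could be the login page served with 200.
        return "ok"
    return "error"
```

A client would typically supply credentials on "authenticate", give up on "forbidden", and treat "ambiguous" as a sign that the URL is being handled by the browser-oriented authentication mechanism.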
If you wish to deploy a new web service today, you must add the URL for the web service to web.xml's /web-app/servlet/init-param[param-name="send401List"]/param-value. This parameter guarantees that if a user is not authenticated, then the web service client will receive a 401 instead of the login page.
The information in this section describes the segmentation of URLs such that each segment can be handled by a different authentication mechanism. This segmentation is the default as of 2010-JUNE-01. A defect has been filed to move away from the legacy method to the new method: BISERVER-1290.
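To check which entries a given web.xml already lists, the path above can be evaluated with a small script. This is a sketch under stated assumptions: the sample web.xml is hypothetical and heavily trimmed, the servlet name and URLs are made up, and the value is assumed to be a comma-separated list (verify the separator against your own web.xml).

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed web.xml; servlet name and URLs are invented for illustration.
SAMPLE_WEB_XML = """
<web-app>
  <servlet>
    <servlet-name>ExampleServlet</servlet-name>
    <init-param>
      <param-name>send401List</param-name>
      <param-value>/webservices/example,/webservices/other</param-value>
    </init-param>
  </servlet>
</web-app>
"""

def send401_entries(web_xml_text):
    """Return the entries found in every send401List init-param."""
    root = ET.fromstring(web_xml_text.strip())  # root element is <web-app>
    entries = []
    # Relative form of /web-app/servlet/init-param[param-name="send401List"]/param-value
    path = "servlet/init-param[param-name='send401List']/param-value"
    for node in root.findall(path):
        # Assumption: entries are separated by commas.
        for item in (node.text or "").split(","):
            if item.strip():
                entries.append(item.strip())
    return entries
```

Note that a real web.xml may declare an XML namespace, in which case the element names in the path would need to be namespace-qualified.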
The examples below show different ways of configuring the Pentaho BI Server for secure web services along with client-side code examples.
Basic authentication is enabled out-of-the-box. However, the server, as configured out-of-the-box, will never send a challenge (response code 401). That's because the Basic authentication processing mechanism is initiated only when credentials are supplied. Why? Because the Pentaho BI Server is also guarded using Form Authentication, a request without Basic credentials is assumed to come from a human client. Using this setup, the way to get your credentials processed is to send them with your request, without waiting for the challenge.
Ideally, one set of URLs is requested by humans and another set of URLs is requested by code. This example shows that partitioning.
Because we want a different authentication mechanism depending on the client type (web browser or web service client), we define two filter chains. (What is implied here is that all web service calls live under /webservices/.) Notice that the two filter chains are quite different, yet some filters appear in both. As we'll see, some filters can be shared between the two chains and others cannot.
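Sending credentials without waiting for a challenge is often called preemptive authentication. A minimal client-side sketch follows. It only builds the request and does not send it. The function name, URL, and credentials are placeholders, not values defined by Pentaho.

```python
import base64
import urllib.request

def preemptive_basic_request(url, username, password):
    """Build a request that already carries Basic credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    # Attach the header up front; the server will never ask for it.
    request.add_header("Authorization", "Basic " + token)
    return request

# Placeholder URL and credentials, for illustration only.
example = preemptive_basic_request(
    "http://localhost:8080/pentaho/webservices/example", "joe", "password"
)
```

The request can then be sent with urllib.request.urlopen(example). Because Basic credentials are only base64-encoded, not encrypted, this scheme should be used over HTTPS.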
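The idea of choosing a filter chain by URL can be modeled in a few lines. This is an illustrative sketch, not the Spring Security configuration itself: the chains are abbreviated, they list only filter names mentioned on this page, digestProcessingFilter is an assumed bean name, and real pattern matching is more general than the prefix test used here.

```python
# Abbreviated, illustrative chains; a real configuration contains more filters.
FILTER_CHAINS = [
    ("/webservices/", [
        "httpSessionContextIntegrationFilter",
        "digestProcessingFilter",  # assumed bean name
        "filterInvocationInterceptorForWS",
        "pentahoSecurityStartupFilter",
    ]),
    ("/", [
        "httpSessionContextIntegrationFilter",
        "authenticationProcessingFilter",
        "pentahoSecurityStartupFilter",
        "filterInvocationInterceptor",
    ]),
]

def select_chain(path):
    """Return the first chain whose URL prefix matches the request path."""
    for prefix, chain in FILTER_CHAINS:
        if path.startswith(prefix):
            return chain
    raise LookupError("no filter chain for " + path)
```

The more specific prefix is listed first so that web service URLs never fall through to the browser chain.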
Warning: Note that the end-of-line backslashes that occur in the excerpt below are present for formatting purposes only and should not be present in the actual file.
From the Javadoc for HttpSessionContextIntegrationFilter:
If for whatever reason no HttpSession should ever be created (eg this filter is only being used with Basic authentication or similar clients that will never present the same jsessionid etc), the setAllowSessionCreation(boolean) should be set to false.
And another quote from the Spring Security documentation stating something similar:
As web services will never present a jsessionid on future requests, creating HttpSession instances for such user agents would be wasteful. If you had a high-volume application which required maximum scalability, we recommend you use the approach shown [in this section of the documentation]. For smaller applications, using a single HttpSessionContextIntegrationFilter (with its default allowSessionCreation as true) would likely be sufficient.
While Spring Security recommends setting allowSessionCreation to false for web services, the example on this page shares a single HttpSessionContextIntegrationFilter (with allowSessionCreation as true) for reasons of simplicity.
Shown below is a single Pentaho service. It is a servlet that is exposed via two URLs. One is applicable for browsers (the first one) and one is applicable for web service clients (the second one). Note that both entry points (URLs) are protected; they each have a filter chain that is applicable.
Warning: The order of the filters in the filter chains is critical! Notice that the filter pentahoSecurityStartupFilter occurs after filterInvocationInterceptorForWS in the first filter chain but before filterInvocationInterceptor in the second filter chain.
The reason that pentahoSecurityStartupFilter must come after filterInvocationInterceptorForWS is this: DigestProcessingFilter populates the SecurityContext with an Authentication instance but does not set it to authenticated. That is done later in FilterSecurityInterceptor.beforeInvocation(). This is also when roles are fetched. And until roles are fetched, we don't want pentahoSecurityStartupFilter to run. (In the second filter chain, pentahoSecurityStartupFilter can precede filterInvocationInterceptor since authenticationProcessingFilter fetches the roles for the user.)
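The ordering dependency can be made visible with a toy model. Everything here is a simplification for illustration: the classes and functions are hypothetical stand-ins, the user and role names are made up, and the toy startup filter raises an error when roles are missing only to expose the dependency (how the real filter behaves in that situation is not described here).

```python
class SecurityContext:
    """Toy stand-in for the per-request security state."""
    def __init__(self):
        self.user = None
        self.authenticated = False
        self.roles = None
        self.startup_roles = None

def digest_processing(ctx):
    # Populates the context but does not mark it authenticated.
    ctx.user = "joe"

def invocation_interceptor(ctx):
    # Authentication is completed here, and roles are fetched.
    ctx.authenticated = True
    ctx.roles = ["Authenticated"]

def security_startup(ctx):
    # Depends on roles; fails loudly in this toy model if they are missing.
    if ctx.roles is None:
        raise RuntimeError("roles have not been fetched yet")
    ctx.startup_roles = list(ctx.roles)

def run_chain(filters):
    """Run the filters in order against a fresh context."""
    ctx = SecurityContext()
    for apply_filter in filters:
        apply_filter(ctx)
    return ctx
```

Running the chain as digest_processing, invocation_interceptor, security_startup succeeds, while moving security_startup ahead of invocation_interceptor raises the error.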
The example below uses digest authentication.
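On the client side, digest authentication works by answering the server's 401 challenge, so the client needs a handler that reacts to it. A minimal sketch using the Python standard library follows. The function name, base URL, and credentials are placeholders, and no request is sent here.

```python
import urllib.request

def build_digest_opener(base_url, username, password):
    """Return an opener that answers Digest challenges for base_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None: use these credentials for any realm under base_url.
    password_mgr.add_password(None, base_url, username, password)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)

# Placeholder URL and credentials, for illustration only.
opener = build_digest_opener(
    "http://localhost:8080/pentaho/webservices/", "joe", "password"
)
```

Calling opener.open(url) sends the request without credentials first. When the server responds with 401 and a Digest challenge, the handler computes the Authorization header and retries.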
The SecurityContextHolderAwareRequestFilter takes an Authentication instance (populated by Spring Security) and exposes that Authentication instance via Servlet API calls such as getRemoteUser() and getUserPrincipal(). This filter is necessary since SecurityStartupFilter, a custom Pentaho filter, depends on the aforementioned Servlet API calls.
SecurityContextHolderAwareRequestFilter makes use of wrappers. There are two wrappers provided by Spring Security: SecurityContextHolderAwareRequestWrapper and SavedRequestAwareWrapper. Note that SavedRequestAwareWrapper, a subclass of SecurityContextHolderAwareRequestWrapper, is not appropriate for web service clients! It attempts to save the original request made by the client so that it can be recalled after authentication. This is bad for two reasons. First, using the SavedRequestAwareWrapper adversely affects web service calls (see information box below). Second, for a web service client the request and the login are part of the same request, so saving the request is unnecessary.
Using SavedRequestAwareWrapper adversely affects web service calls. Why? Consider the following example. The user requests a resource without an Authorization header (a header required to authenticate via digest authentication). The server stores the request and responds with 401. The user is prompted with a dialog for username and password. The user enters a username and password and submits. The server pulls the saved request. DigestProcessingFilter asks the request for the Authorization header. Since the request is the original request, no Authorization header is found and the user is prompted with the dialog again.
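The loop described in this scenario can be reproduced with a toy simulation. This is a hypothetical model, not Spring Security code: the function names are invented, requests are reduced to header dictionaries, and the Authorization value is a placeholder.

```python
def handle(headers, saved, replay_saved):
    """One toy server step; returns (status, saved_request)."""
    # With replay enabled, the server inspects the saved original request.
    effective = saved if (replay_saved and saved is not None) else headers
    if "Authorization" in effective:
        return 200, None
    if replay_saved:
        # Keep the first stored request, as in the scenario above.
        return 401, saved if saved is not None else dict(headers)
    return 401, None

def simulate(replay_saved, attempts=3):
    """Return the status codes a client sees over successive attempts."""
    saved = None
    statuses = []
    headers = {}  # the first request carries no Authorization header
    for _ in range(attempts):
        status, saved = handle(headers, saved, replay_saved)
        statuses.append(status)
        if status == 200:
            break
        # The client answers the challenge on the next attempt (placeholder value).
        headers = {"Authorization": "Digest ..."}
    return statuses
```

With replay enabled, simulate(True) yields 401 on every attempt, because the saved request never contains the header. With replay disabled, simulate(False) yields 401 followed by 200.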
Web services URLs are protected with different access rules.