Hitachi Vantara Pentaho Community Wiki

The previous section described our Solution Oriented approach and its benefits. The rest of this document provides a description of the Solution Engine and the documents and tools required to build solutions in the Pentaho BI Platform.


This section is intended for people interested in building solutions and creating content. It is also valuable for anyone who needs to interface with or develop portions of the Pentaho BI Platform. Before reading this document, you should have read and understood the "Pentaho Technical White Paper", which can be downloaded from SourceForge.

This document has examples and references to files distributed with the Pentaho BI Platform Pre-Configured Install (PCI). It is recommended that the PCI be downloaded and installed. It is available at SourceForge.

Introduction to the Solution Engine

The Solution Engine is the focal point for activity within the Pentaho BI Platform. It sits between the outside world (Web Client, Services, System Monitor, etc.) and the Component Layer. See Figure 1 - Architecture Diagram. Requests to do work come into the Solution Engine and are routed to the appropriate component or components for execution. The following terms will be used in the discussion of the Solution Engine:

    • Solution - A Solution consists of a collection of documents that collectively define the processes and activities that are the system's part in implementing a solution to a business problem. These documents include Action Sequence Definitions, workflow process definitions, report definitions, images, rules, queries etc.
    • Solution Repository - The location where solution definitions and the metadata they rely on are stored and maintained. Requests made to the platform to have actions executed rely on the action being defined in the Solution Repository.
    • Solution Engine - The engine that retrieves the definition of an action from the Solution Repository and directs its execution.
    • Component - The component layer provides a standard interface between the Solution Engine and the application that executes business logic. A component may contain all of the code required to perform a task or may just be an interface to another application or system. Data and instructions to the component are provided via an Action Definition.
    • Action Definition - An XML definition specifying the parameters, resources and settings required for the execution of a task within a single component. The Action Definition defines which component to call, what data to pass into and receive from the component, and any component-specific information required. An Action Definition is not a standalone document; it is a part of an Action Sequence Definition.
    • Action Sequence Definition - An XML document that defines the interaction between one or more Action Definitions. It defines the smallest complete task that the Solution Engine can perform. When the Solution Engine is told to execute, it is given an Action Sequence document to execute. The execution of the Action Sequence can be completed autonomously or may execute as part of another Action Sequence. Action Sequence Definitions are stored in the Solution Repository.
    • Runtime Context - Action Sequences are transformed from XML by the Solution Engine into objects that are interpreted by the Runtime Context. The Runtime Context maintains a contract between the Solution Engine and the Action Sequence and enforces a contract between the Action Sequence and the components.

The architecture diagram below shows how the Solution Engine, Solution Repository, and components fit into the architecture. The Solutions, Action Definitions and Action Sequence Definitions are stored in the Solution Repository. The Solution Engine creates Runtime Contexts every time a request is received.
Figure 1 - Pentaho Architecture Diagram

The Solution Repository
A solution is not a single document; it's a collection of documents. It's a logical grouping of Action Sequence Definitions and the resources they require. The grouping is maintained by the Solution Repository. You can see the structure of the solution repository by navigating to the pentaho-solutions directory in the top level PCI install directory. The default location is: /pentaho-demo/pentaho-solutions.

In the pentaho-solutions directory, there are two predefined solutions: test and samples. Notice that there are subdirectories under the test solution. Action Sequence documents and resources may be located anywhere below the solution directory (in this case, test), with any level of folder nesting. This provides a way to logically group content. Within the test solution, Action Sequences are grouped by component type: email, reporting, workflow etc. For a production system the grouping may be by department, by role, or by whatever structure makes sense for that solution.

To locate an Action Sequence in the repository, use the three part address: solution id, path and action sequence name. In the case of the HelloWorld Action Sequence the address is: samples, getting-started, HelloWorld.xaction.
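
The three-part address can be pictured as a small value object. The sketch below is illustrative only: the class name and the to_file_path helper are hypothetical, and it assumes that the address maps directly onto folders below the pentaho-solutions directory, as the directory layout described above suggests.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ActionSequenceAddress:
    """Hypothetical holder for the three-part address of an Action Sequence."""
    solution: str  # solution id, e.g. "samples"
    path: str      # folder path below the solution; may be nested or empty
    action: str    # Action Sequence document name, e.g. "HelloWorld.xaction"

    def to_file_path(self, repository_root: str) -> PurePosixPath:
        # Assumption: root / solution / path / action mirrors the folder layout.
        folders = [part for part in self.path.split("/") if part]
        return PurePosixPath(repository_root).joinpath(
            self.solution, *folders, self.action
        )


hello = ActionSequenceAddress("samples", "getting-started", "HelloWorld.xaction")
print(hello.to_file_path("/pentaho-demo/pentaho-solutions"))
# /pentaho-demo/pentaho-solutions/samples/getting-started/HelloWorld.xaction
```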

The test solution is used to verify that the different components in the system are set up and function correctly. The samples solution has... well, samples. The samples are a good starting point for learning and building new Action Sequences. The Action Sequences referenced in this document are located in samples, getting-started.

The Design Studio

The Pentaho Design Studio provides a graphical environment for building, managing, and testing your solution repository. It provides a collection of templates, editors and wizards to help create and maintain your solution repository.

It's built using the Eclipse framework, which is an open source software development project dedicated to providing a robust, full-featured, commercial-quality industry platform for the development of highly integrated tools. It provides building blocks and a foundation for constructing and running integrated software-development tools. The Eclipse Platform allows tool builders to independently develop tools that integrate with other people's tools so seamlessly you can't tell where one tool ends and another starts. Leveraging Eclipse gives us a number of advantages, including: an existing, well-known and well-defined framework; the ability to integrate different tools while maintaining a common look and feel; reuse of existing components; and a large saving in development time.

Plug-in Architecture

Eclipse has a plug-in based architecture which means that almost all of its functionality comes from the contributions of independent modules. By selectively installing specific plug-ins, the Eclipse platform can be completely customized and have as much or as little functionality as desired. The Pentaho Design Studio is distributed as an individual plug-in that may be installed into an existing Eclipse platform for users already using Eclipse. For users new to Eclipse, or users that want a separate "application" that handles building BI solutions, the Pentaho Design Studio can also be downloaded as a bundle that includes both the Design Studio plug-in and the Eclipse framework.

Installing the full Design Studio

Java JRE:
The Design Studio does not include a Java runtime environment (JRE). You will need a Java runtime or Java development kit (JDK) at version 1.4.2 or higher installed on your machine in order to run the Design Studio. If you have the Pentaho BI Platform Pre-Configured Install (sometimes called the Pentaho Demo) installed on your machine, then you already have a usable JRE.

To use the JRE that comes with the Pentaho Pre-Configured Install:
Add the "JRE/bin" directory to your path: pentaho-demo/jre/bin

You can also download a free JRE from Sun Microsystems. The Java™ 2 SDK, Standard Edition Version 1.4.2 can be found here:

The 1.5 version of the JRE (also known as J2SE 5.0) should also work although it hasn't been fully tested. If you use J2SE 5.0 please report your results in the Pentaho Design Studio forum.

Design Studio:
A zip file containing the Pentaho Design Studio for the Windows platform (A Linux RPM is coming soon) is available for download:


Unzip the file to your location of choice. It will create a directory named "pentaho-design-studio" and extract all its files there. No other setup is required. To run the Design Studio, execute PentahoDesignStudio.exe.

Installing the Design Studio Plug-ins

A zip file containing the Pentaho Design Studio plug-in for Eclipse 3.1 on all platforms is available for download:


Simply unzip the file into your Eclipse top level directory; the zip will extract the files into the proper place in the plug-ins directory.
Restart Eclipse and you should see the Design Studio icon on the Eclipse toolbar.

Running the Design Studio

At this point, you have either installed the standalone Design Studio or installed the Design Studio plug-in into Eclipse, and you have a working install of the Pentaho samples. The samples have been tested and work within your browser. Your Pentaho BI Server is running and waiting for requests.

If you haven't done so already, start the Design Studio as described under installation. If the welcome screen appears, close it by clicking on the X next to Welcome.

The first thing we need to do is hook up the Design Studio to the samples solution. Select File->New->Project. Then select Simple from the New Project wizard and press the Next> button. Enter "Pentaho Solutions" as the project name (any name is fine, but this document will refer to Pentaho Solutions). Uncheck the Use default checkbox and browse to the pentaho-solutions directory. If you are using the PCI, this will be /pentaho-demo/pentaho-solutions. Select Finish and you are ready to go.

Browsing the Solution Repository

You should now see your Pentaho Solutions project displayed in the tree on the left side of the Design Studio. If you expand the solution folder you'll see plenty of files. These are the files that make up your solution and are managed with the Design Studio. Let's take a look at one to get a feel for what the Design Studio can do for us. In the left-hand tree, open the Pentaho Solutions/samples/getting-started folder. Double-click on the HelloWorld.xaction file and the Action Sequence editor will open in the edit pane.

Action Sequences

The Action Sequence is an XML document that defines the smallest complete task that the Solution Engine can perform. It is executed by a very lightweight process flow engine and defines the order of execution of one or more of the components of the Pentaho BI Platform. We avoid calling this a process flow because it is missing many of the capabilities of a true process flow engine. It is good for sequencing small, linear, success-oriented tasks like reporting and bursting. It has the ability to loop through a result set, call another Action Sequence and conditionally execute components. The Action Sequence document should have a ".xaction" suffix.

Introducing the Action Sequence Editor

The Action Sequence editor has four tabs along the bottom: General, Define Process, XML Source and Test. The function of each tab will be discussed in more detail later. Their basic functions are:

    • General - Basic properties like title, help etc.
    • Define Process - Defines the inputs, outputs and resources required by the Action Sequence and allows you to program the interactions between the Action Sequence and the Pentaho Components
    • XML Source - The raw XML that the editor is generating
    • Test - Interface for executing the Action Sequence on the Pentaho BI Server

Click through each tab to get familiar with the editor. Check out the XML Source tab to get an idea of what the editor is saving you from. Now let's look a bit more closely at the HelloWorld.xaction.

General Information

As we mentioned earlier, the "General" tab contains some general information about the action sequence, such as the title, author, icon, description, and help text to be displayed in the browser window. Notice that the Design Studio shows the title for the action sequence as "%title". The "%" indicates that this is the name of a string whose value is defined in a properties file with the same name as the action sequence. This is how action sequences accommodate internationalization. Additionally, you can indicate the logging level you would like to use for this action sequence. Logged messages will appear in the pentaho-demo/jboss/server/default/log/server.log file. If you're having problems getting your action sequences working, the log file is a good place to look for clues as to what the problem might be.
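
As a rough illustration of the "%" lookup described above, the following sketch resolves a "%name" value against key=value pairs read from a properties file. It is not platform code: parse_properties and localize are hypothetical helpers, the property keys and values are made up, and the fallback of returning the raw string when a key is missing is an assumption.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an "=" separator
            props[key.strip()] = value.strip()
    return props


def localize(value: str, props: dict) -> str:
    """Replace "%name" with its property value; leave other strings alone."""
    if value.startswith("%"):
        # Assumption: an unknown key falls back to the raw "%name" string.
        return props.get(value[1:], value)
    return value


# Hypothetical properties content for an action sequence.
props = parse_properties("# sample strings\ntitle=Hello World\nhelp=A first example\n")
print(localize("%title", props))       # Hello World
print(localize("Plain title", props))  # Plain title
```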

Inputs and Resources

Now press the "Define Process" tab. You should see a section labeled "Process Inputs" which lists the inputs and resources used by the action sequence. The inputs are the pieces of information the action sequence will need from the outside world when it runs. They can come from five sources: runtime, request, session, global and default. Runtime parameters are parameters that are stored in the Runtime Context. Remember, the Runtime Context stores the inputs and outputs from previous instances and makes them available to future executions with the same runtime instance id. Request parameters are the name-value pairs specified on a URL. Session parameters are variables that are stored in the user's session and may contain unique values for each user. Global parameters are similar to session parameters except they have the same values for all users. Default values can be specified for each input in the Action Sequence document and are used as a last resort.
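
The lookup across input sources can be sketched as a simple ordered search. This is a minimal sketch, not the platform's implementation: resolve_input is a hypothetical helper, and the order in which the non-default sources are consulted is an assumption made for the example. The text above only states that default values are used as a last resort.

```python
_MISSING = object()  # sentinel so that None can be a legitimate default


def resolve_input(name, sources, default=_MISSING):
    """Return (source_name, value) from the first source that defines name.

    sources is an ordered list of (source_name, mapping) pairs. The default
    is consulted only after every source has been tried.
    """
    for source_name, values in sources:
        if name in values:
            return source_name, values[name]
    if default is not _MISSING:
        return "default", default
    raise KeyError(f"no value available for input '{name}'")


# Hypothetical parameter values; the ordering here is an assumption.
sources = [
    ("runtime", {}),
    ("request", {"region": "Eastern"}),
    ("session", {"region": "Central", "user": "joe"}),
    ("global", {"currency": "USD"}),
]
print(resolve_input("region", sources, default="Western"))   # ('request', 'Eastern')
print(resolve_input("user", sources))                        # ('session', 'joe')
print(resolve_input("department", sources, default="Sales")) # ('default', 'Sales')
```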

Session and Global parameters can be used to provide secure filtering of data within the Action Sequence. A session parameter gets initialized by executing an action sequence when the user logs onto the system. The Action Sequence called upon login can be set up to perform a query using the user's login name in the where clause. The result is stored in the user's session and is available to subsequent Action Sequences. Global parameters are initialized when the system starts up and are available for all users. See the "Securing Data Access with Session and Global Filters" document for information on how to set up the filters and use them.

There are two implicit inputs, instance-id and solution-id, that are always available and do not need to be specified as inputs or outputs. They are the... well I'm sure you can guess what they are.

Resources are the files needed by the action sequence to complete its job. For example: if the action sequence is going to run a JFree report, one of the resources would be the location of the JFree report definition file.

Using the Design Studio let's take a look at some examples of inputs and resources. Browse to the samples/reporting directory in your "Pentaho Solutions" project and double-click on the BIRT-quadrant-budget-for-region-hsql.xaction. Select each of the process inputs to view the details about each of the inputs and resources used by this action sequence.


The action sequence outputs are what the action sequence will leave behind when it's complete. Outputs can have three destinations: runtime, session, or content. The first two destinations correspond to the input sources discussed above. The third destination indicates that the output will be put in the HTTP response header or content.

Flow Control

The Action Sequence is not meant to be a replacement for workflow. That being said, there are two ways to control the sequence of execution: loops and conditions. An Action Sequence can execute a group of actions multiple times. The most common usage is to perform the set of actions once for each row in a query result set. The data types that can be specified for a loop are string-list, result-set and property-map-list.

A group of actions can also be executed conditionally. The condition is evaluated as JavaScript, and the group is executed only when the result is true.
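
The two flow-control mechanisms can be sketched as follows. This is only an illustration of the idea: run_loop and format_subject are hypothetical, a list of dictionaries stands in for a result-set, the column names are made up, and a Python callable stands in for the JavaScript condition used by the platform.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]
Action = Callable[[Row], Row]


def run_loop(rows: Iterable[Row], actions: List[Action],
             condition: Optional[Callable[[Row], bool]] = None) -> List[Row]:
    """Run the group of actions once per row, skipping rows that fail the condition."""
    results = []
    for row in rows:
        context = dict(row)                  # parameters visible to the actions
        if condition is not None and not condition(context):
            continue                         # conditional execution: group skipped
        for action in actions:               # actions run in the order listed
            context.update(action(context))  # outputs become available as inputs
        results.append(context)
    return results


def format_subject(ctx: Row) -> Row:
    return {"subject": f"Budget report for {ctx['REGION']}"}


rows = [
    {"REGION": "Eastern", "MANAGER": "A. Manager"},
    {"REGION": "Western", "MANAGER": ""},
]
# Only process rows that have a manager assigned.
sent = run_loop(rows, [format_subject], condition=lambda ctx: bool(ctx["MANAGER"]))
print([r["subject"] for r in sent])  # ['Budget report for Eastern']
```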

Actions (Components)

Within the design studio open the samples/bursting/BurstActionSequence.xaction. What you see in the Process Actions section is a list of all the actions to be performed by this action sequence. Note that the order is important here. The topmost action will be run first, followed by the one below it, and so on. The second action, the one that starts with "Action Loop" probably deserves special mention. It's a loop action that will perform the actions it contains multiple times, depending on what it's set to loop on. In this case it looks like there are five actions contained in the loop. Click on the first action in the list.

On the right side you can view the action details. You'll notice that there is a place for entering a brief description for the action. It's not necessary to enter anything here, but it's a good idea, as it makes the action sequence much easier to read. Each component has its own editor. Since this action uses the SQL query component, there is an area to specify the database connection, the query, and the expected contents of the query result. Now let's click on the "+" sign next to this action in the Process Actions tree. Notice that there are four items listed under the action. These are the outputs from this action. The rule-result output is where the results of the query are stored. The remaining three outputs correspond to particular columns within the rule-result output. Actions that follow can use these outputs as their inputs. Additionally, each action has available to it the action sequence inputs we discussed earlier. The idea is that each little action has something it can do really well. It takes in some input, does some work, and leaves some output for some other action(s) to use. Your job is to tie these individual actions together to do something meaningful.

Let's now take a quick look at each of the actions in this action sequence and see how they're working together to get something useful done. As we go through this don't get tied up in all the little details. The idea is to get a feel for how the actions work together. Later we'll learn about the details of each individual type of action.

Let's start with the first action in the actions tree. It performs a SQL query to extract some region, manager, and email information from a database. As mentioned earlier, it leaves some outputs behind. The query results are saved in an output called rule-result, and the other three outputs tell the world the column names for information in the results. Anytime you run a SQL query action make sure you include the column names in the output. That way other actions know what data is available in the rule result.

Next is the action loop. If you select it you'll notice that there isn't a whole lot to it. One notable point is that whenever you see "<" and ">" around a string it indicates that a parameter is being referenced. In this case the parameter is "rule-result". It's not by coincidence that this happens to be the name of the output from the preceding SQL Query action. This is an example of an action using the outputs from a previous action. So the five actions within the loop will each be performed once for each row that was returned in the outputs of the previous query action.

The next three actions are all similar. They're each string formatting actions. They take some input strings, place them into a formatted message, and leave the formatted string as an output. Basically they get things in order for the last two actions. If you click on the "+" sign next to each of these actions in the Process Actions tree, you'll see the outputs that each leaves behind.

Click on the action titled "Generate the report". This action will be generating a JFree report. In the configuration section you'll find that the JFree report specification and report format are being referenced using parameters named "report-definition" and "output-type" respectively. Notice both of these parameters are defined under the Process Inputs. Additionally, the configuration section contains the database connection and query information that will be passed to the JFree report. Note that since these values are not enclosed within "<>" they are not parameter names, but are constant values. Finally, you'll notice that the report is being saved in an output called "report-output". So we've generated this report and it's sitting in an action output parameter called "report-output". Now what?
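
The difference between a parameter reference in "<>" and a constant value can be sketched like this. The helper below only models the convention shown in the editor. It is not platform code, and resolve_value and the sample values are hypothetical.

```python
def resolve_value(value: str, parameters: dict):
    """Return the referenced parameter for "<name>", or the constant unchanged."""
    if len(value) > 2 and value.startswith("<") and value.endswith(">"):
        name = value[1:-1]
        if name not in parameters:
            raise KeyError(f"unknown parameter: {name}")
        return parameters[name]
    return value  # not wrapped in <>: treated as a constant


# Hypothetical parameter values for the two process inputs named in the text.
parameters = {"report-definition": "report.xml", "output-type": "pdf"}
print(resolve_value("<output-type>", parameters))  # pdf
print(resolve_value("SampleData", parameters))     # SampleData
```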

Select the last action in the sequence. The name says it all. This action will email the report to the manager of the region for which the JFree report was generated. Take a look at the configuration details and you'll see how this action ties all the pieces together to send the report off as an attachment.

Again don't be too concerned if you don't understand every detail. Each type of action has its own set of inputs and outputs. Once you get familiar with them you'll soon be putting them together to do all kinds of useful stuff.

Executing an Action Sequence

There are several ways to execute an Action Sequence: via the Design Studio, a URL, Java code or a Web Service call.

Design Studio

Click on the Test tab of the HelloWorld.xaction editor. At the top of the test page, there is a field titled "Pentaho Server URL." If your Pentaho server is running, enter the URL of your Pentaho BI Server, which is likely to be http://localhost:8080/pentaho if you are running the PCI. Click the "Test Server" button and verify that you see the top level samples page displayed on the test page. Click on "Run" to execute the HelloWorld Action Sequence. You should see the familiar "Hello World. Greetings from the Pentaho BI Platform." message. In the unlikely event that you are not able to see the Hello World message, make sure the server is running and that you typed the Server URL correctly. Verify that you can run the samples from your browser. If all else fails, try checking the Design Studio forum.


URL

The samples that come with the preconfigured install are launched via URL using the ViewAction (org.pentaho.ui.servlet.ViewAction) servlet. The following URL will launch the HelloWorld Action Sequence:


The result returned depends on the Action Sequence Document. You may get a report to view, a text message or just "Action Successful." The following parameters can be entered on the URL:

    • solution, path, action - The location of the Action Sequence document to load.
    • instance_id - The instance id of a previous Runtime Context
    • debug - Set to "true" in order to have debug information written to the execution log.
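
The parameters listed above can be assembled into a request URL as in the sketch below. The base URL and the HelloWorld address come from this document. The servlet mapping "ViewAction" appended to the base URL is an assumption, and build_action_url is a hypothetical helper, so check the URLs used by the samples in your own install.

```python
from urllib.parse import urlencode


def build_action_url(base_url, solution, path, action,
                     instance_id=None, debug=False, servlet="ViewAction"):
    """Build a URL that asks the server to run one Action Sequence.

    Assumption: the servlet is reachable at <base_url>/<servlet>.
    """
    params = {"solution": solution, "path": path, "action": action}
    if instance_id is not None:
        params["instance_id"] = instance_id  # reuse a previous Runtime Context
    if debug:
        params["debug"] = "true"             # write debug info to the execution log
    return f"{base_url.rstrip('/')}/{servlet}?{urlencode(params)}"


url = build_action_url("http://localhost:8080/pentaho",
                       "samples", "getting-started", "HelloWorld.xaction")
print(url)
# http://localhost:8080/pentaho/ViewAction?solution=samples&path=getting-started&action=HelloWorld.xaction
```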

Web Service Call

In the "Settings and Services" group of the samples that come with the preconfigured install is a Web Service Example. It is still a URL call, this time to the servlet ServiceAction (org.pentaho.ui.servlet.HttpWebService). The following URL will launch the HelloWorld Action Sequence:


In this case, the result returned is an XML SOAP Response. The following parameters can be entered on the URL:

    • solution, path, action - The location of the Action Sequence document to load.
    • instance_id - The instance id of a previous Runtime Context
    • debug - Set to "true" in order to have debug information written to the execution log.

Java Call

An Action Sequence can be executed directly from a Java application. For an example of how to do this, open the Java file "" and look at the JUnit test for HelloWorld. This class code can be found by accessing the Pentaho public repository at svn://

Action Sequence Recap
The inputs, outputs and resources in the Action Sequence header define a contract between the Action Sequence and the outside world. The Sequence requires the specified inputs and resources to be passed in and will return the specified outputs.

The action-definition defines a contract between each component and the Action Sequence. The action-inputs and action-resources define the parameters that a component requires to execute. The action-outputs define what parameters will be available after the component completes executing. Outputs from one component can be used as inputs to another component. The mapping attribute of the action-inputs allows outputs from one component that have different names to be used as inputs to another component.

Specifying the input/output relationships and their data types allows the system to validate an Action Sequence or set of Action Sequences without actually executing the components. A complete solution can be validated and "locked down" to prevent modification of the Action Sequence documents and eliminate errors due to "broken links" between these documents.
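
The idea of validating input/output contracts without executing anything can be sketched as a walk over the actions in order. This is an assumption-laden illustration, not the platform's validator: validate_sequence is hypothetical, the dictionaries are a simplified stand-in for action definitions, and the action names and the "data" and "attachment" inputs are made up.

```python
def validate_sequence(sequence_inputs, actions):
    """Report action inputs that no sequence input or earlier output provides.

    Each action is a dict with "name", "inputs", "outputs" and an optional
    "mapping" of input name to the differently named parameter feeding it.
    """
    available = set(sequence_inputs)
    errors = []
    for action in actions:
        mapping = action.get("mapping", {})
        for needed in action["inputs"]:
            source = mapping.get(needed, needed)  # mapped name, else same name
            if source not in available:
                errors.append(
                    f"{action['name']}: input '{needed}' has no source '{source}'"
                )
        available.update(action["outputs"])       # visible to later actions only
    return errors


actions = [
    {"name": "query", "inputs": [], "outputs": ["rule-result"]},
    {"name": "report", "inputs": ["data", "output-type"],
     "mapping": {"data": "rule-result"}, "outputs": ["report-output"]},
    {"name": "email", "inputs": ["attachment"], "outputs": []},
]
print(validate_sequence(["output-type"], actions))
# ["email: input 'attachment' has no source 'attachment'"]
```

In this sketch the "broken link" is the email action's unmapped input. Adding a mapping from "attachment" to "report-output" would make the list of errors empty.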
