March 25, 2008
Contributed by Mike Tarallo, Pentaho Team Member
Evaluating Pentaho Analysis Using Your Own Data
This document contains instructions and information that helps you evaluate Pentaho Analysis using your own data. This document can be used with version 1.7.X of the Pentaho Open BI Suite. If you are using version 1.7.x, It is assumed that you are familiar with the Pentaho Design Studio and the Pentaho BI Platform. Please note that with version 1.7 you can still use these instructions to create the analysis schema. After you create the schema, with version 1.7, you can easily create a Pentaho Analysis View using the web based interface available in the "Go" menu item of the PCI. That means you do not have to use the Pentaho Design Studio to create the initial view as you would with 1.6.x. However as you become more advanced with the Pentaho Platform you can use the Pentaho Design Studio to manipulate the MDX query to create some comprehensive dynamic analysis views.
Pentaho Analysis is built on the popular Mondrian OLAP engine. The Pentaho BI Platform embeds the Mondrian engine as part of its architecture. Along with the Pentaho BI Platform, the Mondrian engine processes MDX requests with the ROLAP (Relational OLAP) schemas. These schema files are XML metadata models that are created in a specific structure used by the Mondrian engine. These XML models can be considered cube-like structures which utilize existing FACT and DIMENSION tables found in your RDBMS. It does not require that an actual physical cube is built or maintained; only that the metadata model is created. The structures and related API are outlined in the Mondrian Technical Guide that accompanies the Mondrian distribution as well as the Schema Workbench.
This XML-based metadata structure can be created using a text editor or the Schema Workbench. I have found the Schema Workbench to be the best tool for this task; After the schema is created and published to the Pentaho BI Server there are some additional steps necessary to create the Pentaho Analysis View.
Additional information is available at the following locations:
This document is based on the following assumptions:
- You have a properly installed and configured Pentaho BI Server
- Oracle data was used for this example(you can use whatever data source you want, I just used Oracle and its sample data for this example)
Note: When using Oracle JDBC data connections make sure you use "thin" not "oci" as in: jdbc:oracle:thin:@localhost:1521:XE
- There is one fact table and some dimension tables
- References provide supplementation when additional information may be necessary
- The Java application server being used is Tomcat
- Your application server and the way Java JNDI data sources are configured may vary
- Java 1.5 must be used for the Schema Workbench to run
Creating a Pentaho Analysis View Using the Schema Workbench
Follow the five basic steps below when creating a Pentaho Analysis View using the Schema Workbench to create the ROLAP model:
- Configure your RDBMS JNDI data source for your Java application server so that the Pentaho BI Platform accesses the desired data version 1.7
Or in version 2.0 use the Admin Console to easily configure your data source (recommended)
- Create your Pentaho Analysis Schema Model using Schema Workbench
- Publish your Pentaho Analysis Schema Model to the Pentaho BI Server
- Create the Pentaho Analysis View Action Sequence
- Execute the Pentaho Analysis view using a Web browser
Pentaho Version 2.0 configuring your data source click here <-- NEW
Pentaho Version 1.7 configuring your RDBMS JNDI data source on your application server
Use the resources listed below to find instructions for configuring your RDBMS JNDI data source:
Other: Refer to your Java application server's documentation on how to configure JNDI data sources
Creating your Pentaho Analysis Schema Using the Schema Workbench
Follow the instructions below to create your schema:
- Download and unzip the Schema Workbench
- Create a "drivers" folder in the location where you extracted the files: C:\Pentaho\Install\workbench-184.108.40.20647\drivers
- Copy your JDBC driver to that folder. For example for Oracle 10g, the latest ojdbc14.jar file (found from Oracle's Web site) to that location were copied.
- Follow the schema_workbench.pdf documentation; it can be found in the "doc" sub-directory: C:\Pentaho\Install\workbench-220.127.116.1147\doc.
- The best way to learn how to create a simple schema using the Schema Workbench is to open an existing *.mondrian.xml file provided with the Pentaho Open BI Server, such as the one found in the ..\pentaho-solutions\samples\analysis directory in the solution repository.
Note: The sample *.mondrian.xml files require you to place the Hypersonic or MySQL JDBC driver in the "drivers" directory and set up the appropriate connection URL and CLASS in Tools > Preferences section prior to opening them.
- Use the Mondrian Technical Guide documentation as a reference to understand what the Mondrian structures are. When developing using the GUI please note that you can add additional elements that the Schema Workbench might not expose in the GUI. These items can be added in the actual XML code that is generated.
Creating a Simple Schema
Make sure data types used in the schema are appropriate with what is in your RDBMS. Sometimes dates in RDBMS need to be defined as Strings in the schema design depending on how your RDBMS reads and uses the date data type. Notice in the example below that GAME_DAY is a DATE, GAME_MONTH is a VARCHAR, and GAME_SEASON (which is YEAR) is a number. Make sure your data types in the schema match accordingly.
Follow the instructions below to create a simple schema:
- Go to Tools > Preferences to enter your JDBC parameters for your RDBMS
- Make sure the database is running before you click Accept.
- If an error occurs, start the database, then close and restart the Schema Workbench.
- Go to File > New > Schema to name the schema in name attribute value.
- Right-click Schema > Add Cube to name the cube in name attribute value.
- Select Table:Table, select schema ID, if applicable, and the fact table name.
- Make sure that Table: displays Table:<your_fact_table_here>.
- Right-click the cube name and select Add Measure.
- Name the measure, select an aggregator such as sum; select the column and data type.
- Right-click the cube name and select Add Dimension.
- Select the appropriate foreign key and dimension type such as a Standard Dimension or Time Dimension.
- Expand the newly created dimension and select Hierarchy.
- Name the hierarchy, select primaryKeyTable and primaryKey.
- Select Table:Table, select the schema and the Table name
- Right-click Hierarchy and add level.
- At minimum, select the Name level, the table, and the column. Sometimes it is necessary to specify the levelType for certain date dimension data types to work correctly.
- Repeat step 15 and 16 for different dimensional levels.
Publishing and Testing Your Analysis Schema on the Pentaho BI Platform
Note: With version 1.7 of the BI Platform and the new version of Schema Workbench (workbench-18.104.22.16813) you can easily publish your schema this way, click here
- File, New, MDX Query
(make sure the correct Schema is selected from the drop down or the Schema window is selected if you move the windows around)
- Enter query in query window on top following this usage example:
MDX Usage : select [<Dimension name>].[<Hierarchy name] on columns from [<Cube name>]
MDX Example: select [Category].[All Categories] on columns from [My Cube]
You should get something similar to this back:
Note: If query fails to return data then there may be an issue. Adjust the schema design accordingly and retest. Most common problems are the data types being used. It is possible for the schema to still run successfully when deployed to the Pentaho Platform.
Version 2.0 BI Server
Version 1.7 BI Server
- Go to File > Save; save the file to the Pentaho Solution repository in this structure: ..\pentaho-solutions\samples\analysis or to a directory on your local workstation, which you can then transfer to the Pentaho Solution repository on the remote BI server.
- Save the file as <yourfilename>.mondrian.xml>;
Creating the Pentaho Analysis using the Pentaho User Console
- Log in to the Pentaho User Console using a web browser: http://localhost:8080/pentaho
- Select File, New, Analysis View
- Select the appropriate schema and cube that you published
- Click OK
Your view will display in a tab to the right
A word on editing and then re-publishing or saving a Schema
Please note that any changes/updates/additions made to the ROLAP Analysis Schema using Schema Workbench or another editor (such as Design Studio or text editor) will require a refresh of the Pentaho Solution Repository and a refresh of the Mondrian Schema Cache. This can be done in the Pentaho User Console.
- Log in to the Pentaho User Console with an Admin equivalent ID
- Tools, Refresh, Repository Cache
- Tools, Refresh, Mondrian Schema Cache
These steps ensure that you are using the latest versions of the ROLAP schema when creating or using Pentaho Analysis Views.
Manually Creating the Pentaho Analysis View Action Sequence (advanced)
Follow the instructions below to create the action sequence manually that will display a Pentaho Analysis View. (This is a way where you can create the Pentaho action sequence manually to create the Pentaho Analysis View.) The advantage here is that you can have added control over the Pentaho Analysis query and the tool bar buttons available in the Pentaho Analysis Viewer. Users who are already familiar with Pentaho Design Studio and parameterized queries will be able to work with this section more effectiely.
- In the Pentaho Design Studio, select your solution path, right-click the folder, and select BI Platform > New Action Sequence.
- Enter the name for action sequence
- Select Create Analysis View in the Template drop down and click Finish
- Select Analysis View under Process Actions.
- Modify the Data Model and Data Source parameters accordingly. The Data Model is the path location and XML file you saved in Step 1 under Publishing your Analysis Schema Model on the Pentaho Open BI Server. The Data Source is the JNDI name you used when creating your data source in the section, Configuring your RDBMS JNDI data source on your application server, or the name you gave the data source in the Admin Console when using version 2.0 BI Server.
- Leave "default" as the query, later you can copy the MDX from the Analysis View MDX editor window and paste in
this query dialog to create parameterized and more advanced Pentaho Analysis Views.
- Save and test your Pentaho Analysis View using the URL with the appropriate solution= and path= directory names where you saved your .xaction and file.mondrian.xml files. You can use the TEST tab in Design Studio or copy and paste the URL into a browser or access it from the Pentaho Solution Repository via the Pentaho User Console.