March 25, 2008
Contributed by Mike Tarallo, Pentaho Team Member
Evaluating Pentaho Analysis Using Your Own Data
This document contains instructions and information that helps you evaluate Pentaho Analysis using your own data. This document should be used with the current version of the Pentaho Open BI Suite, version 1.6. It is assumed that you are familiar with the Pentaho Design Studio and the Pentaho BI Platform.
Pentaho Analysis is built on the popular Mondrian OLAP engine. The Pentaho BI Platform embeds the Mondrian engine as part of its architecture. Along with the Pentaho BI Platform, the Mondrian engine processes MDX requests with the ROLAP (Relational OLAP) schemas. These schema files are XML metadata models that are created in a specific structure used by the Mondrian engine. These XML models can be considered cube-like structures which utilize existing FACT and DIMENSION tables found in your RDBMS. It does not require that an actual physical cube is built or maintained; only that the metadata model is created. The structures and related API are outlined in the Mondrian Technical Guide that accompanies the Mondrian distribution as well as the Schema Workbench.
This XML-based metadata structure can be created using a text editor, Pentaho Cube Designer, or the Schema Workbench. I have found the Schema Workbench to be the best tool for this task; however, there are some additional steps necessary to create the Pentaho Analysis View. This process will be made easier in future releases.
Additional information is available at the following locations:
Mondrian Schema Workbench Software
(software to build Pentaho Analysis Schemas)
This document adheres to the following assumptions:
- Oracle data is used in this example.
Note: When using Oracle JDBC data connections make sure you use "thin" not "oci" as in: jdbc:oracle:thin:@localhost:1521:XE
- There is one fact table and some dimension tables
- References provide supplementation when additional information may be necessary
- The Java application server being used is Tomcat
- Your application server and the way Java JNDI data sources are configured may vary
- Java 1.5 must be used for the Schema Workbench to run
Creating a Pentaho Analysis View Using the Schema Workbench
Follow the five basic steps below when creating a Pentaho Analysis View using the Schema Workbench to create the ROLAP model:
- Configure your RDBMS JNDI data source for your Java application server so that the Pentaho BI Platform accesses the desired data.
- Create your Pentaho Analysis Schema Model using Schema Workbench
- Publish your Pentaho Analysis Schema Model to the Pentaho BI Server
- Create the Pentaho Analysis View Action Sequence
- Execute the Pentaho Analysis view using a Web browser
Configuring your RDBMS JNDI data source on your application server
Use the resources listed below to find instructions for configuring your RDBMS JNDI data source:
- Other: Refer to your Java application server's documentation on how to configure JNDI datasources
Creating your Pentaho Analysis Schema Using the Schema Workbench
Follow the instructions below to create your schema:
- Download and unzip the Schema Workbench
- Create a "drivers" folder in the location where you extracted the files: C:\Pentaho\Install\workbench-22.214.171.12447\drivers
- Copy your JDBC driver to that folder. For example for Oracle 10g, the latest ojdbc14.jar file (found from Oracle's Web site) to that location were copied.
- Follow the schema_workbench.pdf documentation; it can be found in the "doc" sub-directory: C:\Pentaho\Install\workbench-126.96.36.19947\doc.
- The best way to learn how to create a simple schema using the Schema Workbench is to open an existing *.mondrian.xml file provided with the Pentaho Open BI Server, such as the one found in the ..\pentaho-solutions\samples\analysis directory in the solution repository.
Note: The sample *.mondrian.xml files require you to place the Hypersonic or MySQL JDBC driver in the "drivers" directory and set up the appropriate connection URL and CLASS in Tools > Preferences section prior to opening them.
- Use the Mondrian Technical Guide documentation as a reference to understand what the Mondrian structures are. When developing using the GUI you can add additional elements that the Schema Workbench might not expose in the GUI.
Creating a Simple Schema
Make sure data types used in the schema are appropriate with what is in your RDBMS. Sometimes dates in RDBMS need to be defined as Strings in the schema design depending on how your RDBMS reads and uses the date data type. Notice in the example below that GAME_DAY is a DATE, GAME_MONTH is a VARCHAR, and GAME_SEASON (which is YEAR) is a number. Make sure your data types in the schema match accordingly.
Follow the instructions below to create a simple schema:
- Go to Tools > Preferences to enter your JDBC parameters for your RDBMS
- Make sure the database is running before you click Accept.
- If an error occurs, start the database, then close and restart the Schema Workbench.
- Go to File > New > Schema to name the schema in name attribute value.
- Right-click Schema > Add Cube to name the cube in name attribute value.
- Select Table:Table, select schema ID, if applicable, and the fact table name.
- Make sure that Table: displays Table:<your_fact_table_here>.
- Right-click the cube name and select Add Measure.
- Name the measure, select an aggregator such as sum; select the column and data type.
- Right-click the cube name and select Add Dimension.
- Select the appropriate foreign key and dimension type such as a Standard Dimension or Time Dimension.
- Expand the newly created dimension and select Hierarchy.
- Name the hierarchy, select primaryKeyTable and primaryKey.
- Select Table:Table, select the schema and the Table name
- Right-click Hierarchy and add level.
- At minimum, select the Name level, the table, and the column. Sometimes it is necessary to specify the levelType for certain date dimension data types to work correctly.
- Repeat step 15 and 16 for different dimensional levels.
Publishing Your Analysis Schema on the Pentaho BI Platform
Follow the instructions below to publish your analysis schema:
- Go to File > Save; save the file to the Pentaho Solution repository in this structure: ..\pentaho-solutions\samples\analysis or to a directory on your local workstation, which you can then transfer to the Pentaho Solution repository on the remote BI server.
- Save the file as <yourfilename>.mondrian.xml>; you must specify *.mondrian.xml).
- Test a simple MDX query by clicking, File > New > MDX Query. A success message appears.
- Enter query in query window on top.
Note: If query fails to return data then there may be an issue. Adjust the schema design accordingly and retest. Most common problems are the data types being used. It is possible for the schema to still run successfully when deployed to the Pentaho Platform.
Creating the Pentaho Analysis View Action Sequence
Follow the instructions below to create the action sequence:
- In the Pentaho Design Studio, select your solution path, right-click the folder, and select BI Platform > New Action Sequence.
- Enter the name for action sequence and select Create Analysis View and click Finish.
- Select Analysis View under Process Actions.
- Modify the Data Model and Data Source parameters accordingly. The Data Model is the path location and XML file you saved in Step 1 under Publishing your Analysis Schema Model on the Pentaho Open BI Server. The Data Source is the JNDI name you used when creating your data source in section, Configuring your RDBMS JNDI data source on your application server.
- Save and test your Pentaho Analysis View using the URL with the appropriate solution= and path= directory names where you saved your .xaction and file.mondrian.xml files. You can use the TEST tab in Design Studio or copy and paste the URL into a browser.