Hitachi Vantara Pentaho Community Wiki
Child pages
  • Pentaho metadata data source

Versions Compared


  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migration of unmigrated content due to installation of a new plugin

The Metadata Source

Why use this source?

titleWhat is metadata?

Metadata is information about data, which sounds redundant, but for reporting systems is a very powerful tool. Metadata allows a designer to describe database fields, give them default formats, join tables in a meaningful way, calculate new values amongst other things.  This allows the dry, technical database structure that may only be understood by a few people to be transformed into an easy to understand business model that can reliably be used by many report designers.

  • fields are named using business language
  • meaningful table joins have already been created
  • authorisation controls for critical data elements have already been applied
  • meaningful formats for data values have already been applied

Adding the Metadata Source

Select the "Add Data Sources" button in the toolbar, or right click with your mouse on the data-sources node to open up the context menu. 

Select the "Metadata" entry.
The Metadata source editor opens up.

Browse to find the "metadata.xmi" file for the solution you are working with.  The metadata information is always in a metadata.xmi file, which is stored in each solution folder (assuming that metadata has been prepared for the folder).

titleDomain Id value is critical

The "Domain Id / BI-Server Solution Name" is always the name of the folder that your report will be stored in when it is delivered.  That's also the folder that contains the metadata.xmi file.  If you are delivering your report on a server you must use the name of the folder on the server.  So that you can test your report on a locally hosted Pentaho User Console you should use the same folder name locally. 

Each metadata data source can hold multiple queries.  The metadata is shared between all queries. You can use meaningful names for the queries in the metadata source. 
Create a new query click on the green plus in the upper right corner next to the "Available Queries" box. A new query named "Query1" gets added to the list of available queries.  Change this to a more meaningful name.

Select this query name and click the edit button 

Instead of typing SQL to create the query a query editor will open.

Select the "Business Model" that will best support your report.  The "Categories / Columns" window will populate with the available fields.
Select each column which you need from this window and move it into the "Selected Columns" section by clicking on the yellow arrow next to that section.
For each column you move into this section you may choose from the aggregations that the metadata definition allows.  For instance a column containing a number
may allow the "Sum", "Average", "Minimum", "Maximum" and "Count" aggregations, while a column containing a string may only allow "Count".
Similarly select the columns you want to use to select ythe data for your report and move these to the "Conditions" section using the yellow arrow next to that section.
You must apply any sorting that you need for your report within the query because The Pentaho Reporting System does not sort data itself. Instead, it relies on the databases to provide suitably sorted results. Databases are highly optimized products that can employ advanced techniques like indexes and smart sort algorithms to produce sorted results as fast as possible.
To tell the database that the results must be sorted select the columns to sort on and move them to the "Order by" section using the third yellow arrow.  Choose the appropriate sort order (ASCending or DESCending).

Click OK to return to the Data Source Editor, where you will see the query in the bottom window. 

Click on the Preview button to see what gets returned.  You should see this dialog:

If you compare this result to the preview in section 01, where the JDBC data source was used you will already see one difference.  The column headings / field names are more readable - they can contain spaces and upper and lower case characters.  In this case they are almost the same as the original field names, but are more readable.  In other cases these names will be more meaningful too.
Now the creation of the report is the same as for any other data source.  The available fields and functions can be seen in the data source window and dragged to the report.

titleFormatting is inherited from the metadata

As you create your report you may find that the columns are already formatted (font size, colour, ...), and that you are unable to change this formatting. This is because these attributes are coming from the metadata description. You can override the attributes in the style table by unchecking the "Inherit" box.