Hitachi Vantara Pentaho Community Wiki

Report Pre-Processors

A report-pre-processor is a specialized function that is executed after the report processor has queried the data from the data-factory, but before the report's functions are initialized and the actual report-processing starts. Each master- and sub-report can have its own set of report-pre-processors. A pre-processor is executed only once for each report-definition. At this point in the processing chain, a pre-processor has full control over the layout and data-computation aspects of the report-definition.

A pre-processor can:

  • create, remove or alter groups
  • create, remove or reconfigure report-functions
  • reconfigure existing layout-objects, including adding new subreports, bands or elements, or removing or replacing existing ones
  • reconfigure the data-sources of any of its report's subreports

A pre-processor cannot:

  • modify the datasource of the current report
  • affect the parameters or parameter-processing of the current report
  • change the report-configuration or influence the layouter-configuration and output-processor

Pre-Processors are specified via the "wizard::pre-processor" attribute on either the master-report or the sub-report. Pre-Processors specified on the master-report will not affect sub-reports and will not be called when a sub-report is executed.

In the reporting system, there are two classes of pre-processors. The pre-processors that can be specified in the report-designer are user-level pre-processors. In addition to these Pre-Processors, system-level Pre-Processors are also used by the reporting engine to optimize the report-processing and to perform necessary data-processing based on the elements found in the report. These automatic Pre-Processors are always executed after the user-level Pre-Processors have run.
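
The ordering described above can be sketched in plain Java. The sketch does not use the engine's classes: Report, the pre-processor names and the process method are hypothetical stand-ins, and the relative order of the two automatic pre-processors is chosen arbitrarily. For simplicity it also walks sub-reports right after their parent, whereas the engine handles a sub-report when that sub-report is executed. It only illustrates that each report-definition runs its own user-level pre-processors first and the automatic ones afterwards, and that pre-processors of the master-report are not applied to sub-reports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PreProcessorOrderSketch
{
  /** Hypothetical stand-in for a report-definition; not an engine class. */
  static final class Report
  {
    final String name;
    /** Names of the user-level pre-processors configured on this report only. */
    final List<String> userLevel = new ArrayList<String>();
    final List<Report> subReports = new ArrayList<Report>();

    Report(final String name)
    {
      this.name = name;
    }
  }

  /** Placeholder names for the automatic pre-processors; their order here is arbitrary. */
  static final List<String> SYSTEM_LEVEL = Arrays.asList("aggregate-field", "legacy-chart");

  /** Records, in execution order, which pre-processor runs on which report. */
  static void process(final Report report, final List<String> log)
  {
    // User-level pre-processors of this report come first ...
    for (final String p : report.userLevel)
    {
      log.add(report.name + ":" + p);
    }
    // ... followed by the automatic ones.
    for (final String p : SYSTEM_LEVEL)
    {
      log.add(report.name + ":" + p);
    }
    // Sub-reports only see their own user-level list, never the master's.
    for (final Report sub : report.subReports)
    {
      process(sub, log);
    }
  }

  public static void main(final String[] args)
  {
    final Report master = new Report("master");
    master.userLevel.add("custom");
    master.subReports.add(new Report("sub"));

    final List<String> log = new ArrayList<String>();
    process(master, log);
    System.out.println(log);
  }
}
```

In this sketch the "custom" entry appears only for the master-report, because the sub-report has no user-level pre-processor of its own.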

Built-in automatic Pre-Processors


The Aggregate-Field-Pre-Processor introspects all data-elements of a report and activates any aggregation-function that may be specified in the "wizard::aggregation-type" attribute.


The Legacy-Chart-Preprocessor extracts a pre-configured chart-expression and one or more chart-data-collector-functions from the report and activates them as regular expressions with an auto-generated name. This is a helper pre-processor to make the legacy-chart-element work.

User-Level Pre-Processors


This is an empty Pre-Processor. It returns the report-definition unchanged. This class only serves as a base-class for other implementations.


The RelationalAutoGeneratorPreProcessor inspects the result-set of the current data-source and generates a generic banded list-report without any groups. The auto-generated fields will be placed in the itemband and the details-header will be configured to contain suitable labels for these fields.

The target-bands for the pre-processor follow the wizard-target-selection rules. A band is therefore considered a suitable target for generated fields if it is either the empty root-level band or has the "wizard::generated-content-marker" attribute set to true.

The helper method org.pentaho.reporting.engine.classic.core.wizard.AutoGeneratorUtility#findGeneratedContent can be used in your own implementations to perform such a lookup. The methods org.pentaho.reporting.engine.classic.core.wizard.AutoGeneratorUtility#generateDetailsElement and org.pentaho.reporting.engine.classic.core.wizard.AutoGeneratorUtility#generateHeaderElement are used to generate the fields and header-labels during the processing.
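
The target-selection rule stated above can be illustrated with a small stand-alone sketch. It is not the implementation of AutoGeneratorUtility#findGeneratedContent: the Band class, the string-valued attribute map and the depth-first search order are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TargetBandSketch
{
  static final String MARKER = "wizard::generated-content-marker";

  /** Hypothetical stand-in for a layout band; not an engine class. */
  static final class Band
  {
    final String name;
    final Map<String, String> attributes = new HashMap<String, String>();
    final List<Band> children = new ArrayList<Band>();

    Band(final String name)
    {
      this.name = name;
    }
  }

  /** Returns the band that would receive generated fields, or null if none qualifies. */
  static Band findTarget(final Band root)
  {
    if (root.children.isEmpty())
    {
      // Rule 1: the empty root-level band is a valid target.
      return root;
    }
    // Rule 2: otherwise look for a band carrying the marker attribute.
    return findMarked(root);
  }

  /** Depth-first search for the first band whose marker attribute is "true". */
  private static Band findMarked(final Band band)
  {
    if ("true".equals(band.attributes.get(MARKER)))
    {
      return band;
    }
    for (final Band child : band.children)
    {
      final Band hit = findMarked(child);
      if (hit != null)
      {
        return hit;
      }
    }
    return null;
  }

  public static void main(final String[] args)
  {
    final Band root = new Band("details-header");
    final Band inner = new Band("generated-labels");
    inner.attributes.put(MARKER, "true");
    root.children.add(new Band("static-title"));
    root.children.add(inner);
    System.out.println(findTarget(root).name);
  }
}
```

Note that an empty band below the root does not qualify in this sketch unless it carries the marker; only the root-level band is accepted when empty.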


The WizardProcessor is the more configurable version of the RelationalAutoGeneratorPreProcessor. The WizardProcessor configures a report based on a wizard-specification-document stored in the report-bundle. The document itself can be edited via the Report-Design-Wizard inside the report-designer. It is also possible to use a Java-API to configure this wizard-specification at runtime.

The wizard-processor can either generate a report from scratch based on the data returned by the data-source and the fields defined in the wizard-specification document or can preserve a previously generated report-definition, so that only the meta-data style and attribute information is refreshed. The "wizard::enable" attribute controls whether a generate ("true") or refresh ("false") strategy is used.

Be aware that to make use of this WizardProcessor at runtime, you have to include "wizard-core" in your runtime classpath.


The BSF (Bean-Scripting-Framework) PreProcessor is a scripted/programmable Pre-Processor. It can be used to prototype custom PreProcessor-implementations. As with all scripting functions, I do not recommend this kind of functionality for production use if it is used in multiple reports, as it generates report-definitions that are hard to maintain and prone to errors during upgrades. For production use, implement a real Java-Class that can be parametrized from within the report-designer.

The BSF-PreProcessor offers access to two variables to perform the pre-processing work.

  • definition: An instance of "AbstractReportDefinition", either a master-report or a sub-report.
  • flowController: An org.pentaho.reporting.engine.classic.core.states.datarow.DefaultFlowController, which grants access to the data (via flowController.getMasterRow().getGlobalView()) and to the data-schema, which contains all metadata (via flowController.getMasterRow().getDataSchema())

Within the script, you can either use the Classic-Engine-Core API to modify the report-definition or create or modify a Wizard-Specification. I recommend generating a Wizard-Specification when building an AdHoc-report. This way, you can describe the report in higher-level terms and don't have to deal with the detailed styling of the elements.

Note: All examples here are given as BeanShell/Java code. If you use a different language, the exact syntax may differ. In this case, please refer to the BeanScriptingHost documentation and your language's specification for details on the syntax.

For the next few code-examples, a couple of reporting-engine classes are used, so we have to import them to make the script run. You can find the whole example in the "Advanced-Samples" of the report-designer 3.6.

import java.util.ArrayList;

import org.pentaho.reporting.engine.classic.core.ReportPreProcessor;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.ReportProcessingException;
import org.pentaho.reporting.engine.classic.core.SubReport;
import org.pentaho.reporting.engine.classic.core.wizard.DataSchema;
import org.pentaho.reporting.engine.classic.core.states.datarow.DefaultFlowController;
import org.pentaho.reporting.engine.classic.wizard.WizardProcessorUtil;
import org.pentaho.reporting.engine.classic.wizard.model.WizardSpecification;
import org.pentaho.reporting.engine.classic.wizard.model.DefaultWizardSpecification;
import org.pentaho.reporting.engine.classic.wizard.model.DefaultGroupDefinition;
import org.pentaho.reporting.engine.classic.wizard.model.GroupType;
import org.pentaho.reporting.engine.classic.wizard.model.GroupDefinition;
import org.pentaho.reporting.engine.classic.wizard.model.DefaultDetailFieldDefinition;
import org.pentaho.reporting.engine.classic.wizard.model.DetailFieldDefinition;

To load a wizard-specification document, use org.pentaho.reporting.engine.classic.wizard.WizardProcessorUtil#loadWizardSpecification. If the report has no specification yet, start with an empty one.

DefaultWizardSpecification specification = (DefaultWizardSpecification)
        WizardProcessorUtil.loadWizardSpecification(definition, definition.getResourceManager());
if (specification == null)
  specification = new DefaultWizardSpecification();

You can now use the data-schema to query the available fields to populate the wizard-specification. The code here assumes that the first two columns will be group fields.

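int numberOfGroupFields = 2;
final DataSchema schema = flowController.getMasterRow().getDataSchema();
final String[] names = schema.getNames();

// NOTE: the setField(..), add(..) and set...Definitions(..) calls below are reconstructed;
// verify the setter names against your wizard-core version.
final ArrayList groupList = new ArrayList();
for (int i = 0; i < Math.min(numberOfGroupFields, names.length); i++)
{
  String name = names[i];
  final DefaultGroupDefinition o = new DefaultGroupDefinition();
  o.setField(name);
  o.setGroupName("Group " + i);
  groupList.add(o);
}
specification.setGroupDefinitions(
  (GroupDefinition[]) groupList.toArray(new GroupDefinition[groupList.size()]));

// All remaining columns become detail fields.
final ArrayList detailsList = new ArrayList();
for (int i = numberOfGroupFields; i < names.length; i++)
{
  String name = names[i];
  final DefaultDetailFieldDefinition o = new DefaultDetailFieldDefinition();
  o.setField(name);
  detailsList.add(o);
}
specification.setDetailFieldDefinitions(
  (DetailFieldDefinition[]) detailsList.toArray(new DetailFieldDefinition[detailsList.size()]));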
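
The split described above (the leading columns become group fields, the remaining columns become detail fields) can be tried outside the engine. The following sketch works on plain column names only; the class, the method names and the sample column names are made up for illustration. It derives the start of the detail range from the same count that bounds the group range, so both halves stay in step even when the data-source has fewer columns than requested groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColumnSplitSketch
{
  /** Clamps the requested group count to the range 0..names.length. */
  static int boundary(final String[] names, final int numberOfGroupFields)
  {
    return Math.min(Math.max(numberOfGroupFields, 0), names.length);
  }

  /** The first columns, up to the boundary, act as group fields. */
  static List<String> groupFields(final String[] names, final int numberOfGroupFields)
  {
    final int end = boundary(names, numberOfGroupFields);
    return new ArrayList<String>(Arrays.asList(names).subList(0, end));
  }

  /** Everything after the boundary acts as a detail field. */
  static List<String> detailFields(final String[] names, final int numberOfGroupFields)
  {
    final int start = boundary(names, numberOfGroupFields);
    return new ArrayList<String>(Arrays.asList(names).subList(start, names.length));
  }

  public static void main(final String[] args)
  {
    // Hypothetical column names of a result-set.
    final String[] names = {"REGION", "YEAR", "PRODUCT", "SALES"};
    System.out.println("groups:  " + groupFields(names, 2));
    System.out.println("details: " + detailFields(names, 2));
  }
}
```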

And finally, after all changes to the wizard-specification have been made, apply the new wizard-specification to the report and return the report-definition.

WizardProcessorUtil.applyWizardSpec(definition, specification);
return definition;

Implementing Report-Pre-Processors

Pre-Processors are ordinary Java classes, which have to implement the interface org.pentaho.reporting.engine.classic.core.ReportPreProcessor.

Let's walk through an example of a pre-processor that has three properties and apparently does nothing at all.

package example;

import org.pentaho.reporting.engine.classic.core.ReportPreProcessor;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.ReportProcessingException;
import org.pentaho.reporting.engine.classic.core.SubReport;
import org.pentaho.reporting.engine.classic.core.states.datarow.DefaultFlowController;

public class SamplePreProcessor implements ReportPreProcessor
{
  private String myFieldProperty;
  private String myIntProperty;
  private String myStringProperty;

  public SamplePreProcessor()
  {
  }

  public String getMyFieldProperty()
  {
    return myFieldProperty;
  }

  public void setMyFieldProperty(final String myFieldProperty)
  {
    this.myFieldProperty = myFieldProperty;
  }

  public String getMyIntProperty()
  {
    return myIntProperty;
  }

  public void setMyIntProperty(final String myIntProperty)
  {
    this.myIntProperty = myIntProperty;
  }

  public String getMyStringProperty()
  {
    return myStringProperty;
  }

  public void setMyStringProperty(final String myStringProperty)
  {
    this.myStringProperty = myStringProperty;
  }

  public Object clone() throws CloneNotSupportedException
  {
    return super.clone();
  }

  public MasterReport performPreProcessing(final MasterReport definition,
                                           final DefaultFlowController flowController) throws ReportProcessingException
  {
    // yes we should do some work here
    return definition;
  }

  public SubReport performPreProcessing(final SubReport definition,
                                        final DefaultFlowController flowController) throws ReportProcessingException
  {
    // yes we should do some work here
    return definition;
  }
}

To make pre-processors visible in the Pentaho-Report-Designer, you have to provide metadata for them at boot-time. The easiest way to ensure this is to create your own module and to register the pre-processor while the module is initialized.

Pre-Processor metadata should be specified in an XML file called "meta-report-preprocessors.xml" located in your module.
The display names and design-time groupings of the metadata are held in a java.util.ResourceBundle named "example.SamplePreProcessorBundle" (and thus stored in a file called "example/").
pre-processor.example.SamplePreProcessor.grouping=Example Corp.
pre-processor.example.SamplePreProcessor.deprecated= Field Integer String holding a font-name
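
Since the display texts live in an ordinary java.util.ResourceBundle, a lookup can be tried with the standard library alone. The sketch below builds an in-memory bundle from the one key whose value is shown above and reads it back. The helper names grouping and inMemoryBundle are made up for this illustration, and the key pattern is simply read off the listing above; how the engine itself resolves these keys is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class BundleLookupSketch
{
  /** Reads the design-time grouping for a pre-processor class from the given bundle. */
  static String grouping(final ResourceBundle bundle, final String className)
  {
    return bundle.getString("pre-processor." + className + ".grouping");
  }

  /** Builds a bundle from properties text instead of a file on the classpath. */
  static ResourceBundle inMemoryBundle()
  {
    final String text = "pre-processor.example.SamplePreProcessor.grouping=Example Corp.\n";
    try
    {
      return new PropertyResourceBundle(new StringReader(text));
    }
    catch (final IOException e)
    {
      // Not expected for an in-memory reader; rethrown unchecked to keep callers simple.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(final String[] args)
  {
    System.out.println(grouping(inMemoryBundle(), "example.SamplePreProcessor"));
  }
}
```

In a real module the bundle would normally be loaded by name with ResourceBundle.getBundle("example.SamplePreProcessorBundle") from the classpath rather than from a string.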
<meta-data xmlns="">
  <pre-processor class="example.SamplePreProcessor"
                 expert="false" hidden="false" preferred="false">
    <property name="myFieldProperty" value-role="Field"/>
    <property name="myIntProperty"/>
    <property name="myStringProperty" propertyEditor="org.pentaho.reporting.engine.classic.core.metadata.propertyeditors.FontFamilyPropertyEditor"/>
  </pre-processor>
</meta-data>

Within the module-initializer, you now can register the Pre-processor via