Hitachi Vantara Pentaho Community Wiki

There are several disadvantages to the original approach to bursting. First, it is very difficult to scale the bursting to large datasets because a single query or dataset does not lend itself to clustering or distributed processing. Second, large datasets result in a large single process that cannot be controlled easily. Finally, it is difficult to introduce complex logic or variations at the level of the individual report.

Pentaho Bursting's innovative approach uses a re-usable template to define a Burst Rule and a Burst Process.

The Burst Rule

The Burst Rule identifies the particular cases, situations, or triggers that require content to be generated or information to be delivered. For example, it could determine which departments have exceeded their budgets, which employees have too much overtime, or which suppliers have made too many incorrect deliveries. This rule can be a simple query or a complex workflow involving multiple business rules and multiple data sources. The Burst Rule typically iterates over the cases identified by the business rules and processes each case one at a time. The entire dataset needed to run all the rules and generate the content is never read into memory at one time.
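
The one-case-at-a-time iteration described above can be illustrated with a minimal sketch. It is not Pentaho code: the `departments` table, its columns, and the functions `burst_rule` and `make_sample_db` are hypothetical names chosen for the example, and an in-memory SQLite database stands in for the real data source.

```python
import sqlite3
from typing import Dict, Iterator


def make_sample_db() -> sqlite3.Connection:
    """Build a tiny in-memory database (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE departments (name TEXT, budget REAL, spent REAL)")
    conn.executemany(
        "INSERT INTO departments VALUES (?, ?, ?)",
        [("Finance", 100.0, 90.0), ("Marketing", 50.0, 75.0), ("Sales", 200.0, 260.0)],
    )
    return conn


def burst_rule(conn: sqlite3.Connection) -> Iterator[Dict[str, object]]:
    """Yield one case per department that has exceeded its budget.

    Iterating the cursor fetches rows as they are needed, so the caller
    handles each case in turn instead of loading the full result set.
    """
    cursor = conn.execute(
        "SELECT name, budget, spent FROM departments "
        "WHERE spent > budget ORDER BY name"
    )
    for name, budget, spent in cursor:
        yield {"department": name, "overrun": spent - budget}


if __name__ == "__main__":
    for case in burst_rule(make_sample_db()):
        print(case)  # each case would be handed to the Burst Process here
```

Writing the rule as a generator keeps the caller unaware of how the cases are found, so a simple query could later be swapped for a more complex workflow without changing the code that consumes the cases.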

To distribute and cluster the bursting process, the Burst Rule can use messaging (JMS) to broadcast each case as a JMS message for a cluster of reporting servers to work on. A persistent message queue guarantees delivery and ensures that the Burst Process can be resumed after a hardware failure.
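
The fan-out pattern can be sketched as follows. This is only an analogy under stated assumptions: Python's in-process `queue.Queue` and worker threads stand in for a JMS queue and a cluster of reporting servers, the queue here is not persistent (so it gives none of the delivery or recovery guarantees mentioned above), and `publish_cases`, `worker`, and `run_cluster` are hypothetical names.

```python
import json
import queue
import threading
from typing import Dict, Iterable, List


def publish_cases(cases: Iterable[Dict[str, object]], case_queue: queue.Queue) -> None:
    """Burst Rule side: serialize each case into its own message."""
    for case in cases:
        case_queue.put(json.dumps(case))


def worker(case_queue: queue.Queue, results: List[str], lock: threading.Lock) -> None:
    """Reporting-server side: consume messages until a stop marker arrives."""
    while True:
        message = case_queue.get()
        if message is None:  # stop marker, one is sent per worker
            break
        case = json.loads(message)
        with lock:
            # A real worker would run the Burst Process for this case.
            results.append(case["department"])


def run_cluster(cases: Iterable[Dict[str, object]], worker_count: int = 2) -> List[str]:
    """Start the workers, publish every case, and wait for completion."""
    case_queue: queue.Queue = queue.Queue()
    results: List[str] = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(case_queue, results, lock))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    publish_cases(cases, case_queue)
    for _ in threads:
        case_queue.put(None)
    for thread in threads:
        thread.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_cluster([{"department": "Sales"}, {"department": "Marketing"}]))
```

Because each message carries a single case, any idle worker can pick up the next one, which is what lets the work spread across servers.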

The Burst Process

The Burst Process is a workflow that generates content and saves or delivers it. This process is designed to handle a single situation identified by the Burst Rule. The process receives parameters from the Burst Rule and can use those parameters to customize the process and the content that is generated. For example, the process could use a parameter such as the department to generate a different query, select a different report template, or use a web service to determine the recipient(s) for the information.
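
A parameter-driven process of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the `TEMPLATES` table, the `lookup_recipients` function (a local stand-in for a web service call), the `example.com` addresses, and the shape of the `case` dictionary are not taken from Pentaho.

```python
from string import Template
from typing import Dict, List

# Hypothetical report templates, keyed by department, with a fallback.
TEMPLATES: Dict[str, Template] = {
    "default": Template("Report for $department: overrun $overrun"),
    "Sales": Template("Sales alert: overrun of $overrun"),
}


def lookup_recipients(department: str) -> List[str]:
    """Stand-in for a web service that resolves recipients for a department."""
    return [f"{department.lower()}-manager@example.com"]


def burst_process(case: Dict[str, object]) -> Dict[str, object]:
    """Handle a single case: pick a template, render it, resolve recipients."""
    department = str(case["department"])
    # The department parameter selects the template, falling back to the default.
    template = TEMPLATES.get(department, TEMPLATES["default"])
    content = template.substitute(department=department, overrun=case["overrun"])
    return {"recipients": lookup_recipients(department), "content": content}


if __name__ == "__main__":
    print(burst_process({"department": "Sales", "overrun": 60.0}))
    print(burst_process({"department": "Marketing", "overrun": 25.0}))
```

Keeping the process limited to one case means per-report variations, such as a different template or recipient lookup, stay local to this function.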
