Hitachi Vantara Pentaho Community Wiki
Child pages
  • Pentaho Software Architecture

...

This document provides a detailed view of the software components that together make up the Pentaho open source software suite as it exists today.

At a high level, the software components can be grouped in two ways.  By organization, the detailed list below covers third-party libraries and components that Pentaho has needed to fork and maintain, common libraries and projects that are used for general purposes, pillars that are the core business analytics or data integration elements, tools that provide access to the pillars, and plugins across the pillars that provide additional functionality.  By architectural purpose, the same components fall into four general areas: information delivery, data management / integration, analytics, and platform services.  Each project below is categorized in both ways to give a multi-faceted view of the overall Pentaho architecture.
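
The two-way categorization described above can be pictured as a small data model.  The following Java sketch is illustrative only: the class, enum, and method names are assumptions made for this example and are not part of any Pentaho API.  The sample entries reuse the categories and areas that this page assigns to kettle-vfs, pentaho-ofc4j, and pentaho-commons-database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical catalog model; none of these names come from Pentaho code. */
final class ComponentCatalog {

    /** Organizational grouping, mirroring the section headings on this page. */
    enum Category { THIRD_PARTY_FORK, COMMON_COMPONENT, PILLAR, TOOL, PLUGIN }

    /** Architectural purpose, mirroring the four general areas named in the text. */
    enum Area { INFORMATION_DELIVERY, DATA_MANAGEMENT_INTEGRATION, ANALYTICS, PLATFORM_SERVICES }

    /** One project, tagged in both ways. */
    static final class Component {
        final String name;
        final Category category;
        final Area area;

        Component(String name, Category category, Area area) {
            this.name = name;
            this.category = category;
            this.area = area;
        }
    }

    private final List<Component> components = new ArrayList<>();

    void add(String name, Category category, Area area) {
        components.add(new Component(name, category, area));
    }

    /** Names of all components in the given organizational category, in insertion order. */
    List<String> namesIn(Category category) {
        List<String> names = new ArrayList<>();
        for (Component c : components) {
            if (c.category == category) {
                names.add(c.name);
            }
        }
        return names;
    }

    /** Names of all components in the given architectural area, in insertion order. */
    List<String> namesIn(Area area) {
        List<String> names = new ArrayList<>();
        for (Component c : components) {
            if (c.area == area) {
                names.add(c.name);
            }
        }
        return names;
    }

    /** Sample entries taken from the listing on this page. */
    static ComponentCatalog sample() {
        ComponentCatalog catalog = new ComponentCatalog();
        catalog.add("kettle-vfs", Category.THIRD_PARTY_FORK, Area.DATA_MANAGEMENT_INTEGRATION);
        catalog.add("pentaho-ofc4j", Category.THIRD_PARTY_FORK, Area.INFORMATION_DELIVERY);
        catalog.add("pentaho-commons-database", Category.COMMON_COMPONENT, Area.DATA_MANAGEMENT_INTEGRATION);
        return catalog;
    }

    public static void main(String[] args) {
        ComponentCatalog catalog = sample();
        // Same components, two different views.
        System.out.println(catalog.namesIn(Category.THIRD_PARTY_FORK));
        System.out.println(catalog.namesIn(Area.DATA_MANAGEMENT_INTEGRATION));
    }
}
```

Querying by category and by area returns different slices of the same list, which is the multi-faceted view the paragraph describes.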

...

This detailed software listing is organized roughly in dependency order, with components appearing after the components they depend on.  It should not, however, be used as the official build order of Pentaho.

First Pass:

  • High Level Description
  • Source Path
  • Architectural Owner
  • Architectural Area

Third Party Maintained Forks

...

 Kettle-VFS (Fork of Apache VFS) (MattC)
 Hive JDBC (Will)
 Pentaho OFC4J (Will)

...

kettle-vfs

Kettle VFS is a Pentaho-maintained fork of Apache Commons VFS.

...

Architectural Area: Data Management / Integration

...

hive

Due to the dynamic nature of Hadoop, Pentaho currently maintains its own Hive JDBC driver implementation.

...

Architectural Area: Data Management / Integration

...

pentaho-ofc4j

Pentaho ChartBeans Flash components, which are still used by Pentaho Dashboards and Action Sequences, are based on Open Flash Chart.  OFC4J is a Java-to-JSON converter that is used on the server to generate the correct metadata for the charts.  The project is no longer maintained by its creator.

...

Architectural Area: Information Delivery
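
To make the Java-to-JSON idea concrete, the following minimal sketch renders a bar chart description as a JSON string.  It does not use or reproduce the OFC4J API: the class name `BarChartJson` and its methods are hypothetical, and the JSON layout (a `title` object plus an `elements` array) is assumed to resemble the Open Flash Chart 2 data format.

```java
/** Hypothetical example class; not part of OFC4J or any Pentaho library. */
final class BarChartJson {
    private final String title;
    private final int[] values;

    BarChartJson(String title, int[] values) {
        this.title = title;
        this.values = values.clone();
    }

    /** Escapes backslashes and double quotes so the title is a valid JSON string body. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the chart description by hand, with no external JSON library. */
    String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"title\":{\"text\":\"").append(escape(title)).append("\"},");
        sb.append("\"elements\":[{\"type\":\"bar\",\"values\":[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        sb.append("]}]}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: {"title":{"text":"Sales"},"elements":[{"type":"bar","values":[3,5,8]}]}
        System.out.println(new BarChartJson("Sales", new int[] {3, 5, 8}).toJson());
    }
}
```

A server-side converter of this kind lets the chart be described in Java objects while the Flash component receives only the JSON it expects.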

Common Components

subfloor

Subfloor is Pentaho's common build system.  It is based on Apache Ant and is used by all projects for compilation, assembly, unit testing, and code coverage.

Source Path: https://code.google.com/p/subfloor/  (Note that this location is out of date and should be transitioned to GitHub)

Architectural Owner: Will Gorman

Architectural Area: Engineering Operations

pentaho-commons-database

This commons project provides a GWT thin-client version of the shared database dialog.  Its submodule, pentaho-database-model, was an attempt at a thin Kettle DatabaseMeta implementation, which includes a dialect and JDBC metadata architecture.

Source Path: https://github.com/pentaho/pentaho-commons-database

Architectural Owner: Will Gorman

Architectural Area: Data Management / Integration

Pillars

Tools

Plugins