Hitachi Vantara Pentaho Community Wiki

This page will serve as the central location for organizing integration efforts between Pentaho Data Integration and a number of Data Quality platforms provided by our technology partners.


Integrating Data Quality Solutions with Pentaho

Data Quality as it relates to the Pentaho BI suite and Data Integration refers to the processes and technologies involved in ensuring the conformance of data values to business requirements and acceptance criteria.  Poor Data Quality can have a significant negative impact on business performance in many areas including:

  • Increased costs in direct mailing campaigns caused by bad address data
  • Back office implications in operational areas such as billing, accounting and credit management due to poor quality customer data
  • Compliance, security and/or privacy issues caused by duplicate or bad customer data
  • And the examples go on and on

Your ETL and Data Integration infrastructure is a natural place to integrate data quality best practices and technologies, both to validate that the data you currently have is clean and to act as a 'data quality firewall' ensuring that new data entering the system also conforms to the quality standards of the business.
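To make the 'data quality firewall' idea concrete, here is a minimal sketch in plain Python. It does not use the Pentaho Data Integration API or any partner product. The rule names, the field names (customer_id, postal_code) and the firewall function are hypothetical, chosen only for illustration. Each incoming row is checked against a list of rules: rows that pass continue downstream, and rows that fail are diverted along with the names of the rules they broke.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Rule = Tuple[str, Callable[[Row], bool]]  # (rule name, predicate returning True when the row passes)

# Hypothetical rules for illustration only; real rules would come from
# the business requirements and acceptance criteria.
RULES: List[Rule] = [
    ("has_customer_id", lambda r: bool(r.get("customer_id", "").strip())),
    (
        "postal_code_is_5_digits",
        lambda r: r.get("postal_code", "").isdigit() and len(r.get("postal_code", "")) == 5,
    ),
]


def firewall(rows: List[Row], rules: List[Rule] = RULES):
    """Split rows into those that pass every rule and those that fail at least one."""
    accepted: List[Row] = []
    rejected: List[Tuple[Row, List[str]]] = []
    for row in rows:
        # Collect the names of all rules this row violates.
        failed = [name for name, check in rules if not check(row)]
        if failed:
            rejected.append((row, failed))
        else:
            accepted.append(row)
    return accepted, rejected
```

In an ETL flow, the accepted rows would continue to the target system, while the rejected rows, along with the reasons for rejection, would typically be routed to an error table or a cleansing step.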

Pentaho Data Integration provides a highly scalable and extensible platform for addressing your ETL and Data Integration needs, and its pluggable architecture is a natural fit for integrating with industry-leading Data Quality solutions from a variety of technology partners.  The sections that follow describe several active integration projects between Pentaho Data Integration and products from industry leaders in the Data Quality space.

Highlighted Technology Partners