Hitachi Vantara Pentaho Community Wiki

About this page

Note: This Wiki space is in active development

This space is dedicated to Pentaho Data Integration (aka Kettle) topics covering concepts, best practices and solutions.

It is practically oriented, whereas the basic reference documentation is more detailed and descriptive.

Main categories

  • Planning, Administration and Operations (e.g. Installation, Lifecycle Management, Monitoring, Logging, Exception Handling, Multi-Tenancy)
  • Documentation (Auto-Documentation, Data-Lineage, Process Documentation, References, Dependencies)
  • Connecting with Third-Party Applications (e.g. web services, ERP, CRM systems)
  • Special database issues and experiences
  • Big Data (e.g. Hadoop)
  • Clustering (Basic clustering, failover, load balancing)
  • Performance Considerations
  • Change Data Capture (CDC)
  • Real-Time-Concepts
  • Data Quality, Data Profiling, Deduplication (e.g. Master Data Management - MDM)
  • Special File Processing (e.g. EDI(FACT), HL7 healthcare, large and complex XML files, hierarchical and multiple field formats)
  • Dynamic ETL (metadata-driven ETL, how to change the ETL process and fields dynamically depending on the processed content)
  • Special Job Topics (e.g. launching job entries in parallel, looping)
  • Special Transformation Topics (e.g. Error handling, tricky row and column handling)