Pentaho 3.2 Data Integration: Beginner's Guide

Language: English
Author: Maria Carina Roldan
Paperback: 492 pages (235 mm x 191 mm)
Release Date: April 2010
Publisher: Packt Publishing
ISBN: 978-1-847199-54-6
Foreword by Matt Casters, Chief of Data Integration at Pentaho and Kettle founder
Covers Pentaho Data Integration 3.2
Introduces PDI 4.0 features


As part of Packt's Beginner's Guide series, this book focuses on teaching by example. It walks you through every aspect of PDI with step-by-step instructions in a friendly style, so you can learn at your computer while experimenting with the tool.
The book provides short, practical examples and also builds a small datamart from scratch, both to reinforce the concepts you have learned and to teach you the basics of data warehousing.

What you will learn from this book

  • Install Pentaho Data Integration and get familiar with its graphical designer, Spoon
  • Get data from simple input sources, preview it, and send it back in any of the common output formats
  • Perform both simple and complex transformations on your data and manipulate the flow of data in different ways
  • Explore the various PDI options to validate data and to handle errors
  • Solve real-world problems by combining or splitting the flow of data
  • Solve sophisticated problems such as normalizing data from pivoted tables with ease
  • Perform advanced operations with databases such as loading data warehouse dimensions
  • Create advanced processes by nesting jobs, iterating on jobs and transformations, and creating subtransformations

Table of Contents (full version)

  • Chapter 1 - Getting Started with Pentaho Data Integration (Read an excerpt)
  • Chapter 2 - Getting Started with Transformations
  • Chapter 3 - Basic Data Manipulation
  • Chapter 4 - Controlling the Flow of Data
  • Chapter 5 - Transforming Your Data with JavaScript Code and the JavaScript Step
  • Chapter 6 - Transforming the Rowset
  • Chapter 7 - Validating Data and Handling Errors
  • Chapter 8 - Working with Databases
  • Chapter 9 - Performing Advanced Operations with Databases
  • Chapter 10 - Creating Basic Task Flows
  • Chapter 11 - Creating Advanced Transformations and Jobs
  • Chapter 12 - Developing and Implementing a Simple Datamart (sample chapter)
  • Chapter 13 - Taking it Further
  • Appendix A - Working with Repositories
  • Appendix B - Pan and Kitchen: Launching Transformations and Jobs from the Command Line
  • Appendix C - Quick Reference: Steps and Job Entries
  • Appendix D - Spoon Shortcuts
  • Appendix E - Introducing PDI 4 Features
  • Appendix F - Pop Quiz Answers