Data Profiling (DataCleaner) is fully integrated within Pentaho Kettle / PDI, so you can profile your data directly within Spoon.
- Download/Install the plug-in from http://s3.amazonaws.com/kettle4/kettle-profiling-datacleaner.zip
- Unzip it into the data-integration\plugins\spoon folder of your installation
- Start Spoon
- Open as documented in the Data Profiling (Data Cleaner) section of the Human Inference page.
- Within Spoon, open one of your existing transformations or use the attached sample
- Right-click the step whose data you want to profile and select Profile from the context menu
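
The download and unzip steps above can also be scripted. The following is a minimal sketch in Python, not an official installer: the function names and the command-line argument are hypothetical, and it assumes the archive can be unpacked directly into the plugins/spoon folder of your data-integration directory, as the steps describe.

```python
import sys
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the download step above.
PLUGIN_URL = "http://s3.amazonaws.com/kettle4/kettle-profiling-datacleaner.zip"


def spoon_plugin_dir(pdi_home):
    """Return the plugins/spoon folder under a data-integration directory."""
    return Path(pdi_home) / "plugins" / "spoon"


def extract_plugin(zip_path, pdi_home):
    """Unzip the plug-in archive into <pdi_home>/plugins/spoon."""
    target = spoon_plugin_dir(pdi_home)
    # Create the folder if it is missing (assumption: this is harmless).
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def download_and_install(pdi_home, url=PLUGIN_URL):
    """Download the archive to a temporary file, then extract it."""
    tmp_path, _headers = urllib.request.urlretrieve(url)
    try:
        return extract_plugin(tmp_path, pdi_home)
    finally:
        # Remove the temporary file left behind by urlretrieve.
        urllib.request.urlcleanup()


if __name__ == "__main__":
    # Hypothetical usage: python install_profiling.py C:\pentaho\data-integration
    print(download_and_install(sys.argv[1]))
```

Restart Spoon after running the script so that the plug-in is picked up; the remaining steps (opening a transformation and choosing Profile from the context menu) are still done by hand.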