Hitachi Vantara Pentaho Community Wiki
How to debug a Kettle 4 plugin

Introduction

Since version 4 of Pentaho Data Integration the class path is no longer being scanned for classes that are annotated as a plugin.  Only those jar files that are located in a plugins folder or sub-folder are being scanned.  This was done to speed up the launch of Kettle.  Especially in situations where Spoon had a lot of jar files in the class path, for example on the Pentaho BI server or in the Agile-BI context, this was a concern.

Since the class path is no longer being scanned, the environment variable KETTLE_PLUGIN_PACKAGES, which was introduced in Kettle v3, is no longer used.

The problem

Now that Kettle no longer automatically picks up classes in the class path of your environment, it becomes harder to debug plugins, because only plugins that are correctly deployed as a set of jar files are picked up.  It is of course possible to debug and step through those deployed plugin jar files with your favorite IDE.  However, it is not possible to change anything in them while doing so.  That in turn leads to a very cumbersome build - deploy - restart - debug cycle.

Multiple solutions

Using environment variable

All plugin classes can be described using annotations.  To make Kettle pick up an annotated class that is located in its class path, you can use the new environment variable KETTLE_PLUGIN_CLASSES.

This variable can contain the names of one or more annotated classes (separated by commas).  This makes Kettle pick up the plugin at runtime when it is launched from your IDE.  You can set the variable at launch time as a JVM option using -D, for example:

-DKETTLE_PLUGIN_CLASSES=org.example.plugin.package.Foo
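
Note that -D sets a JVM system property rather than an operating system environment variable. As a rough illustration of what such a setting implies, the following sketch reads a comma-separated class list from that property and reports which names cannot be loaded from the current class path. The class PluginClassList and its methods are hypothetical helpers written for this page; they are not Kettle's actual plugin loader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kettle: shows how a comma-separated
// class list such as KETTLE_PLUGIN_CLASSES could be split and checked.
final class PluginClassList {

    // Split "a.B, c.D" into trimmed, non-empty class names.
    static List<String> parse(String value) {
        List<String> names = new ArrayList<String>();
        if (value == null) {
            return names;
        }
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Return the names that cannot be loaded from the current class path.
    static List<String> unresolved(List<String> names) {
        List<String> missing = new ArrayList<String>();
        ClassLoader loader = PluginClassList.class.getClassLoader();
        for (String name : names) {
            try {
                // initialize=false: only check that the class can be found
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> names = parse(System.getProperty("KETTLE_PLUGIN_CLASSES"));
        System.out.println("Requested plugin classes: " + names);
        System.out.println("Not on the class path:    " + unresolved(names));
    }
}
```

Running this with the same -D option and class path as your Spoon run configuration gives a quick hint whether a class name is misspelled or its project is missing from the class path.
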
Class path configuration fun

In addition to setting the KETTLE_PLUGIN_CLASSES variable, you also need to configure the class path of the Kettle project.

For simple step, job-entry or partitioning plugins it's usually sufficient to put the plugin project in the build path of your Kettle project to make it visible.  However, for more complex plugins with a lot of extra jar files, things are not that simple.  In those situations, changing the build path of the project is no longer optimal, since you will have a hard time knowing which entries the official Kettle project needs and which it does not.

For those situations, it's usually a lot better to change the class path of your Spoon/Pan/Kitchen/Carte Run Configuration (Eclipse terminology).

Please remember that including a project in the class path of the run configuration DOES NOT include the jar files in that project's build path.  You need to add any required extra jar files as well.
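
One way to verify this is to inspect the class path the JVM was actually started with. The sketch below is a hypothetical diagnostic (ClassPathCheck is not part of Kettle) that compares the file names in java.class.path against a list of jars you expect. The jar name in main is a placeholder, and the check ignores class path wildcards and manifest Class-Path entries.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic, not part of Kettle.
final class ClassPathCheck {

    // Return the required jar file names that do not appear as the
    // last path segment of any class path entry.
    static List<String> missingJars(String classPath, List<String> requiredJars) {
        List<String> entryNames = new ArrayList<String>();
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            entryNames.add(new File(entry).getName());
        }
        List<String> missing = new ArrayList<String>();
        for (String jar : requiredJars) {
            if (!entryNames.contains(jar)) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        // Placeholder name: replace with the jars your plugin really needs.
        List<String> required = Arrays.asList("externalrequired.jar");
        System.out.println("Missing from class path: " + missingJars(classPath, required));
    }
}
```

Launching this class with the same run configuration class path as Spoon shows which of the expected jars still have to be added.
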

Using separate Eclipse projects

Of course, when developing plugins, you don't want to mix your code with PDI code. Most often you also want to develop your plugin against a GA build rather than a trunk build. Additionally, you want short debug cycles and the possibility to debug Kettle itself (to hunt down bugs that aren't in your plugin etc.).
The basic idea is to configure two Eclipse projects within one workspace. One will be PDI itself, the other one will be your plugin project. This way, there is no "cumbersome build - deploy - restart - debug cycle" ;-)

For the following, I assume you've created a separate Eclipse workspace, which is fresh and clean.

WARNING: Because PDI is a fairly large project, I recommend modern hardware. I'm using an SSD, so cleaning and building the whole project only takes about 20 seconds.

Step1: Set up the PDI Eclipse project

First you have to set up PDI as a project. More details are described in Building and Debugging with Eclipse.

Before continuing, make sure you're able to run Spoon in debug mode (as described in the article).

In the following, this project will be referred to as the 'Kettle' project.

Step2: Create your PDI plugin project

I assume that you've successfully completed step #1. For this description we will use the public dummy plugin template. Compare and adapt the relevant settings for your own plugin project if necessary.

  1. Check out DummyPlugin3 from the SVN source, http://source.pentaho.org/svnkettleroot/plugins/DummyPlugin3/trunk
  2. Fix all build errors 
    1. Check that a JDK 1.6 Java runtime is installed in your Eclipse workspace. Set it as default. 
    2. Check that the DummyPlugin3 project Java compiler setting is set to "1.6"
  3. Adjust the Kettle core jars to satisfy the compiler (see also README.txt) 
    1. Copy and update the Kettle core jars (remove the old ones from the build path and add the new ones manually). I recommend downloading the binary package that matches the version you checked out in step #1. For example, if you checked out "tags/4.2.0-GA", download "pdi-ce-4.2.0-stable.zip" and take the following jars from it:
      1. kettle-core-4.2.jar
      2. kettle-db-4.2.jar
      3. kettle-dbdialog-4.2.jar
      4. kettle-engine-4.2.jar
      5. kettle-ui-swt-4.2.jar
    2. Copy the "libswt" folder from the Kettle project (step #1) into the DummyPlugin3 project, overwriting the existing one, and update the SWT files. Adjust the platform (win32 etc.) if necessary. 

At this point, the Dummy3 project is error free and ready to be explored ;-)

Step3: Link the plugin project with PDI project

Now we'll link the two projects in order to be able to debug the dummy plugin. To do so, a small adjustment within the Kettle project is necessary.

  1. Delete the folder (in Kettle project) '/plugins/steps/DummyPlugin'. This is a sample step, called 'Example plugin'
    1. Verify by starting Spoon that the step called 'Example plugin' is no longer available.
  2. Create the folder (in Kettle project) '/plugins/steps/Dummy3'
  3. Copy the two files 'DPL.png' and 'plugin.xml' from Dummy3 project folder '/distrib' into the newly created '/plugins/steps/Dummy3' folder. (Don't copy the jar file!)
  4. Open and edit (in Kettle project) '/plugins/steps/Dummy3/plugin.xml'
    1. Make 'libraries' an empty element, because we don't have any JAR files
    2. Remove all other language entries, keeping only 'en_US'
    3. Rename the occurrences of 'Dummy' in the id, description and tooltip to 'Dummy3', to clearly distinguish this plugin from other dummy plugins
    4. EXAMPLE plugin.xml:
      <plugin
         id="DummyPlugin3"
         iconfile="DPL.png"
         description="Dummy3 Plugin"
         tooltip="This is a dummy3 plugin test step"
         category="Transform"
         classname="be.ibridge.kettle.dummy.DummyPluginMeta">
      
         <libraries />
      
         <localized_category>
           <category locale="en_US">Transform</category>
         </localized_category>
         <localized_description>
           <description locale="en_US">Dummy3 plugin</description>
         </localized_description>
         <localized_tooltip>
           <tooltip locale="en_US">This is a dummy3 plugin test step</tooltip>
         </localized_tooltip>
      
      </plugin>
      
  5. Configure Kettle project build path and add Dummy3 project as dependency
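
The edited plugin.xml can be sanity-checked before launching Spoon. The following sketch uses the JDK's DOM parser to read the id and classname attributes and to confirm that the libraries element is empty. PluginXmlCheck is a hypothetical helper written for this page, and the XML is an abbreviated copy of the example above embedded as a string, so adjust it to read your real file if needed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical checker for this page, not part of Kettle.
final class PluginXmlCheck {

    // Parse the XML text and return its root element (<plugin>).
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable plugin.xml", e);
        }
    }

    // Count the element children of the first <libraries> element.
    // 0 means no JAR files are referenced (or the element is absent).
    static int libraryCount(Element plugin) {
        NodeList libs = plugin.getElementsByTagName("libraries");
        if (libs.getLength() == 0) {
            return 0;
        }
        int count = 0;
        NodeList children = libs.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Abbreviated copy of the example plugin.xml shown above.
        String xml =
              "<plugin id=\"DummyPlugin3\" iconfile=\"DPL.png\"\n"
            + "        classname=\"be.ibridge.kettle.dummy.DummyPluginMeta\">\n"
            + "  <libraries />\n"
            + "</plugin>";
        Element plugin = parse(xml);
        System.out.println("id:        " + plugin.getAttribute("id"));
        System.out.println("classname: " + plugin.getAttribute("classname"));
        System.out.println("libraries: " + libraryCount(plugin));
    }
}
```

A library count other than zero would mean the file still references a JAR that was deliberately not copied in this setup.
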

At this point, everything is ready to go. Launch Spoon again; you should now see your Dummy3 plugin among the available transformation steps. You're also able to set breakpoints within PDI itself or in your plugin classes.

Technical Eclipse background or how to add dependencies

The major trick here is that we still launch Spoon from the Kettle project, while the Dummy3 plugin classes are exported via the project settings. This way, the class loaders within Spoon are able to pick them up. The 'plugin.xml' has to be placed within the Kettle project, because the internal plugin mechanisms have to know about it.

Let's have a look into Dummy3 project -> Java build path settings -> Order and Export:


In that dialog, only the source folder is checked. That's why the Kettle project is able to see our Dummy3 classes. Notice that we don't export the JAR files, because the Kettle project already uses them. This way we avoid duplicate SWT classes.
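
To check for such duplicates in practice, you can ask the class loader for every location that provides a given class. The sketch below is a hypothetical diagnostic (DuplicateClassCheck is not part of Kettle) that lists those locations for a class name, defaulting to org.eclipse.swt.SWT. It only inspects the system class loader, which is assumed here to reflect the class path of the Eclipse run configuration; classes loaded from plugin folders are not covered.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic, not part of Kettle.
final class DuplicateClassCheck {

    // List every location visible to the system class loader that
    // provides the given class. More than one entry hints at a duplicate.
    static List<URL> locations(String className) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(
                    ClassLoader.getSystemClassLoader().getResources(resource));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot scan class path for " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.eclipse.swt.SWT";
        List<URL> found = locations(name);
        System.out.println(name + " found in " + found.size() + " location(s):");
        for (URL url : found) {
            System.out.println("  " + url);
        }
    }
}
```

If the same class shows up in more than one location, one of the providing jars is probably exported or added twice.
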

When your plugin needs additional JAR files, follow these rules:

  • Add the "externalrequired.jar" to the Dummy3 plugin project dependencies
    • This alone only enables the compiler to compile your classes.
  • Check whether Kettle already uses this "externalrequired.jar".
    • If yes, you're done.
    • If not, also add the "externalrequired.jar" to the Kettle project's dependencies.
  • To be able to browse the Kettle sources from within your Dummy3 classes, adjust the source folder settings.
    • kettle-core-4.2.jar -> Java Source Attachment -> Workspace -> /Kettle-4.2.0-GA/src-core
    • kettle-db-4.2.jar -> Java Source Attachment -> Workspace -> /Kettle-4.2.0-GA/src-db
    • and so on ...
  • Avoid circular dependencies in any case!