The thin Kettle JDBC driver allows a Java client to query a Kettle transformation remotely using JDBC and SQL.
Available from Kettle v5.0-M1 or higher. http://ci.pentaho.com/job/Kettle/
PLEASE PROVIDE FEEDBACK WHEN TRYING OUT THE DEVELOPMENT SOFTWARE DESCRIBED BELOW!
As with most JDBC drivers, there is a server and a client component to the JDBC driver.
The server is designed to run as a Servlet on the Carte server, the Pentaho Data Integration server or Pentaho Business Analytics platform. At the time of writing only Carte is supported.
Download Kettle 5.0-M1 or later for the server software (Carte), for now go to CI : http://ci.pentaho.com/job/Kettle/
Recent builds will contain all the libraries mentioned below. Kettle jars are in the lib/ folder, the rest in libext/, libext/commons and libext/pentaho.
The carte configuration file accepts a <services> block that can contain <service> elements with the following sub-elements:
<slave_config> <slaveserver> <name>slave4-8084</name> <hostname>localhost</hostname> <port>8084</port> </slaveserver> <services> <service> <name>Service</name> <filename>/path/to/your/service-transformation.ktr</filename> <service_step>Output</service_step> </service> </services> </slave_config>
You can start your Carte server passing the carte configuration xml as the only argument, for example:
sh carte.sh carte-config.xml
Reminder: the default user/password is stored in the pwd/kettle.pwd file and is cluster/cluster.
During execution of a query, 2 transformations will be executed on the server:
These 2 transformations will be visible on Carte or in Spoon in the slave server monitor and can be tracked, sniff tested, paused and stopped just like any other transformation. However, it will not be possible to restart them manually since both transformations are programatically linked.
The JDBC driver uses the following class:
The URL is in the following format:
this example is for a the carte configuration file shown above.
The following standard options are available:
Parameters for the service transformation can be set with the following format: PARAMETER_name=value (so with the option name prepended with "PARAMETER_")
Support for the SQL is minimal at the moment.
The following things are supported, please consider everything else unsupported:
Dates have square brackets around them and the following formats are supported: \[yyyy/MM/dd HH:mm:ss.SSS\], \[yyyy/MM/dd HH:mm:ss\] and \[yyyy/MM/dd\]
Besides the obviously plentiful limitations in the support for the SQL standard, there are a few noteworthy things to note:
Since SQuirrel already contains most needed jar files, configuring it simply done by adding kettle-core.jar as a new driver jar file
The following jar files need to be added:
Simply replace the kettle-*.jar files in the lib/ folder with new files from Kettle v5.0-M1 or higher.
Replace the current kettle-*.jar files with the ones from Kettle v5 or later.
Interactive reporting runs off Pentaho Metadata so this advice also works there.
You need a BI Server that uses the PDI 5.0 jar files or you can use an older version and update the kettle-core, kettle-db and kettle-engine jar files in the /tomcat/webapps/pentaho/WEB-INF/lib/ folder
See Pentaho Interactive reporting: simply update the kettle-*.jar files in your Pentaho BI Server (tested with 4.1.0 EE and 4.5.0 EE) to get it to work.
Example of patching :
matt@kettle:~/pentaho/4.5.0-ee/server/biserver-ee$ rm ./tomcat/webapps/pentaho/WEB-INF/lib/kettle-core-4.3.0-GA.jar matt@kettle:~/pentaho/4.5.0-ee/server/biserver-ee$ rm ./tomcat/webapps/pentaho/WEB-INF/lib/kettle-engine-4.3.0-GA.jar matt@kettle:~/pentaho/4.5.0-ee/server/biserver-ee$ rm ./tomcat/webapps/pentaho/WEB-INF/lib/kettle-db-4.3.0-GA.jar matt@kettle:~/pentaho/4.5.0-ee/server/biserver-ee$ cp /kettle/5.0/lib/kettle-core.jar ./tomcat/webapps/pentaho/WEB-INF/lib/ matt@kettle:~/pentaho/4.5.0-ee/server/biserver-ee$ cp /kettle/5.0/lib/kettle-db.jar ./tomcat/webapps/pentaho/WEB-INF/lib/ matt@kettle:~/pentaho/4.5.0-ee/server/biserver-ee$ cp /kettle/5.0/lib/kettle-engine.jar ./tomcat/webapps/pentaho/WEB-INF/lib/
Fun fact: Mondrian generates the following SQL for the report shown above:
select "Service"."Category" as "c0", "Service"."Country" as "c1", sum("Service"."sales_amount") as "m0" from "Service" as "Service" group by "Service"."Category", "Service"."Country"
You can query a remote service transformation with any Kettle v5 or higher client. You can query the service through the database explorer and the various database steps (for example the Table Input step).
TODO: ask project owners to change the current old driver class to the new thin one.
Partial success as I'm getting some XML parsing errors. However, adding the aforementioned jar files at least allow you to get back query fields:
To be investigated.
The following things are next on the agenda: