Hitachi Vantara Pentaho Community Wiki

Kettle JDBC driver

The thin Kettle JDBC driver allows a Java client to query a Kettle transformation remotely using JDBC and SQL.

Architecture

As with most JDBC drivers, the Kettle JDBC driver has a server component and a client component.
The server component is designed to run as a servlet on the Carte server, the Pentaho Data Integration server, or the Pentaho Business Analytics platform. At the time of writing, only Carte is supported.

The client JDBC driver consists of the kettle-core.jar library, which depends only on Apache Commons HTTP Client and Apache Commons VFS.

Server configuration

The Carte configuration file accepts a <services> block that can contain <service> elements with the following sub-elements:

  • name: the name of the service. Only alphanumeric characters are supported at the moment; spaces are not allowed.
  • filename: the filename of the service transformation (.ktr) that provides the data for the service.
  • service_step: the name of the step that provides the data during querying.

For example:

<services>
  <service>
    <name>Service</name>
    <filename>/home/matt/svn/kettle/trunk/testfiles/sql-transmeta-test-data.ktr</filename>
    <service_step>Output</service_step>
  </service>
</services>
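
Because service names are restricted to alphanumeric characters, it can help to check a name before adding it to the configuration. The following is a minimal sketch only: the class and method names are hypothetical and not part of Kettle, and it assumes "alphanumeric" means ASCII letters and digits.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Kettle: checks a candidate service name. */
public class ServiceNameCheck {

    // Assumption: "alphanumeric" is read as ASCII letters and digits only.
    private static final Pattern ALPHANUMERIC = Pattern.compile("[A-Za-z0-9]+");

    /** Returns true if the name is non-empty and contains only letters and digits. */
    public static boolean isValidServiceName(String name) {
        return name != null && ALPHANUMERIC.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidServiceName("Service"));      // prints true
        System.out.println(isValidServiceName("Sales Report")); // prints false (contains a space)
    }
}
```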

The JDBC Client

The JDBC driver uses the following class:

org.pentaho.di.core.jdbc.ThinDriver

The URL is in the following format:

jdbc:pdi://hostname:port/kettle?option=value&option=value
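
A minimal sketch of a client that uses the driver class and URL format above. It assumes that kettle-core.jar and its dependencies are on the classpath, that Carte runs on localhost port 8080, that a service named "Service" is configured on the server, and that the user name and password shown are valid for that Carte instance. All of these values are placeholders to adapt.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class ThinDriverExample {

    static final String DRIVER_CLASS = "org.pentaho.di.core.jdbc.ThinDriver";

    // Assumptions: host, port, and credentials are placeholders for your Carte setup.
    static final String URL = "jdbc:pdi://localhost:8080/kettle";
    static final String USER = "cluster";
    static final String PASSWORD = "cluster";

    public static void main(String[] args) throws Exception {
        // Load the driver class explicitly in case it is not registered automatically.
        Class.forName(DRIVER_CLASS);

        try (Connection connection = DriverManager.getConnection(URL, USER, PASSWORD);
             Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery("SELECT * FROM Service")) {

            ResultSetMetaData meta = rows.getMetaData();
            while (rows.next()) {
                // Print each row as tab-separated name=value pairs.
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(meta.getColumnName(i)).append('=').append(rows.getString(i));
                }
                System.out.println(line);
            }
        }
    }
}
```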

The following standard options are available:

  public static final String ARG_WEBAPPNAME = "webappname";
  public static final String ARG_PROXYHOSTNAME = "proxyhostname";
  public static final String ARG_PROXYPORT = "proxyport";
  public static final String ARG_NONPROXYHOSTS = "nonproxyhosts";

  • webappname: the name of the web application (a future feature, intended to support running on the DI server).
  • proxyhostname: the hostname of the proxy server for the HTTP connection(s).
  • proxyport: the port of the proxy server.
  • nonproxyhosts: a comma-separated list of hosts for which no proxy should be used.

Parameters for the service transformation can be set using the format PARAMETER_name=value, that is, the parameter name prefixed with "PARAMETER_".
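
The option and parameter conventions above can be sketched as a small URL builder. The class and method names are hypothetical and not part of the driver. The sketch assumes conventional query-string syntax ("?" before the first option, "&" between subsequent ones) and does not escape values; check both points against your driver version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Kettle: assembles a thin-driver JDBC URL. */
public class ThinUrlBuilder {

    private final String host;
    private final int port;
    // LinkedHashMap keeps options in insertion order so the URL is predictable.
    private final Map<String, String> options = new LinkedHashMap<>();

    public ThinUrlBuilder(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Adds a standard option such as proxyhostname or proxyport. */
    public ThinUrlBuilder option(String name, String value) {
        options.put(name, value);
        return this;
    }

    /** Adds a transformation parameter; the name is prefixed with "PARAMETER_". */
    public ThinUrlBuilder parameter(String name, String value) {
        return option("PARAMETER_" + name, value);
    }

    public String build() {
        StringBuilder url = new StringBuilder("jdbc:pdi://")
                .append(host).append(':').append(port).append("/kettle");
        char separator = '?'; // assumption: '?' before the first option, '&' afterwards
        for (Map.Entry<String, String> entry : options.entrySet()) {
            url.append(separator).append(entry.getKey()).append('=').append(entry.getValue());
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Host, proxy, and parameter values are placeholders.
        String url = new ThinUrlBuilder("localhost", 8080)
                .option("proxyhostname", "proxy.example.com")
                .option("proxyport", "3128")
                .parameter("REGION", "EMEA")
                .build();
        System.out.println(url);
        // prints jdbc:pdi://localhost:8080/kettle?proxyhostname=proxy.example.com&proxyport=3128&PARAMETER_REGION=EMEA
    }
}
```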

SQL Support

SQL support is minimal at the moment.

The following constructs are supported:

  • SELECT:
    • *
    • COUNT(field)
    • COUNT(*)
    • COUNT(DISTINCT field)
    • IIF( condition, true-value or field, false-value or field)
    • Aggregates: SUM, AVG, MIN, MAX
    • Aliases, either with the "AS" keyword or separated by one or more spaces, for example SUM(sales) AS "Total Sales" or SUM(sales) TotalSales
  • FROM
    • Strictly one service name
  • WHERE
    • nested brackets
    • AND, OR, and NOT; NOT is supported when applied to a bracketed condition, for example: NOT ( A = 5 OR C = 3 )
    • Operator precedence is taken into account
    • Literals (String, Integer)
    • PARAMETER('parameter-name')='value'  (always evaluates to TRUE in the condition)
  • GROUP BY
    • Grouping on fields only, not on the IIF() function
  • HAVING
    • Conditions should be placed on the aggregate construct, not the alias
  • ORDER BY
    • You can order on any column in the result. Sorting on non-selected columns of the service is not yet supported and is to be added later.
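
To make the supported constructs concrete, the sketch below collects a few queries as Java string constants. The service name "Service" follows the server configuration example; the field names (country, state, sales), the parameter name, and the class itself are hypothetical, so substitute the fields of your own service step. The use of ">" as a comparison operator is also an assumption, and the queries have not been verified against a running server.

```java
/** Hypothetical collection of example queries; field names are placeholders. */
public class ServiceQueries {

    // Aggregate with an alias, GROUP BY on a plain field, HAVING on the
    // aggregate itself (not the alias), ORDER BY on a column in the result.
    static final String TOTAL_BY_COUNTRY =
            "SELECT country, SUM(sales) AS TotalSales FROM Service "
          + "GROUP BY country HAVING SUM(sales) > 100 ORDER BY country";

    // NOT applied to a bracketed condition, with string and integer literals.
    static final String FILTERED =
            "SELECT * FROM Service WHERE NOT ( state = 'NY' OR sales = 0 )";

    // COUNT variants in a single select list.
    static final String COUNTS =
            "SELECT COUNT(*), COUNT(DISTINCT country) FROM Service";

    /** Builds a PARAMETER('name')='value' clause; values are not escaped. */
    static String parameterCondition(String name, String value) {
        return "PARAMETER('" + name + "')='" + value + "'";
    }

    public static void main(String[] args) {
        System.out.println(TOTAL_BY_COUNTRY);
        System.out.println(FILTERED);
        System.out.println(COUNTS);
        // The PARAMETER clause always evaluates to TRUE, so it does not filter rows.
        System.out.println("SELECT * FROM Service WHERE "
                + parameterCondition("REGION", "EMEA") + " AND country = 'BE'");
    }
}
```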