What's a Pentaho Metadata Dialect?

A dialect defines how Pentaho Metadata should generate SQL for a given database type.

Pentaho Metadata's Dialect Plugin System

Introduced in BI Suite Release 3.7, Pentaho Metadata supports additional dialects through it's Service Provider Interface (SPI): SQLDialectInterface. A Pentaho Metadata Dialect Plugin is a jar file that consists of a dialect implementation and some configuration. The SPI enables developers to develop new database dialects which Pentaho Metadata will automatically detect and register.

Implementing SQLDialectInterface

Database types are defined in Kettle with a unique database id, e.g. SampleDB. Dialects relate one-to-one with database type ids defined in Kettle DB Core or a Kettle Database Plugin. A dialect is an implementation of the SPI:

public interface SQLDialectInterface {

  public String getDatabaseType();

  public String quoteStringLiteral(Object str);

  public boolean isSupportedFunction(String functionName);

  public boolean isAggregateFunction(String functionName);

  public boolean isSupportedInfixOperator(String operator);

  public SQLFunctionGeneratorInterface getFunctionSQLGenerator(String functionName);

  public SQLOperatorGeneratorInterface getInfixOperatorSQLGenerator(String operatorName);

  public String getDateSQL(int year, int month, int day);

  public String generateSelectStatement(SQLQueryModel model);

  public int getMaxTableNameLength();
}

A base implementation exists and can be used as a simple starting point for most new dialects. In addition to the SPI, DefaultSQLDialect defines additional extension points to facilitate building new dialects quickly. Below is a dialect that defines the maximum table name length and a trivial method for quoting a String literal for our Sample Database:

public class SampleDialect extends DefaultSQLDialect {
  public SampleDialect() {
    super("SampleDB");
  }

  @Override
  public int getMaxTableNameLength() {
    return 128;
  }

  @Override
  public String quoteStringLiteral(Object str) {
    return "'" + str + "'";
  }
}

Declaring your Dialect as a Service Provider

A jar file containing your new dialect should be created with an additional META-INF entry. Pentaho Metadata utilizes the Java Service Loader API to dynamically look up dialect implementations. The ServiceLoader API requires you list all Service Provider implementations in a file named after the interface of the service they provide in the META-INF/services directory of the jar file:

META-INF/services/org.pentaho.pms.mql.dialect.SQLDialectInterface:
org.pentaho.pms.mql.dialect.SampleDBDialect

Where org.pentaho.pms.mql.dialect.SampleDialect is the implementation of SQLDialectInterface for our fictitious database type: SampleDB.

For more information see Jar Service Provider Packaging Notes from Oracle.

Using your dialect

All that is required is the jar file to be in the classpath. In most applications this is a /lib directory within the application's installation directory.

Sample Source Code

Attached is a sample project with the aforementioned SampleDB Dialect Plugin and a SampleDB Kettle Database Plugin. The attachment is an Eclipse project with independent Ant build scripts based on Subfloor. All dependencies are managed by Ivy. Subfloor will take care of downloading Ivy and resolving all dependencies by running the default target:

../sampledb-plugin> ant

http://wiki.pentaho.com/download/attachments/19235857/sampledb-plugin.tar.gz

Additional Notes

Existing database dialects that ship with Pentaho Metadata can be overridden by supplying a dialect for the same database type id.

Further Reading

For more information on how to test a Kettle Plugin click here: http://wiki.pentaho.com/display/EAI/How+to+debug+a+Kettle+4+plugin

For more general information about Java's ServiceLoader API see these links:
http://download.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html
http://java.sun.com/developer/technicalArticles/javase/extensible/index.html