Hitachi Vantara Pentaho Community Wiki
4.4 Configuring Pentaho for your Hadoop Distro and Version

How to set up and configure Kettle for your specific Hadoop distribution.

This page applies only to Kettle 4.4 (BA Suite 4.8). For 5.0, see the 5.0 version of this page.

The Pentaho applications come pre-configured for Apache Hadoop 0.20.2. If you are using this distro and version, no further configuration is required.

Documentation for configuring Pentaho for distros other than Apache Hadoop 0.20.2 is now located on the Pentaho Infocenter.

Currently supported Hadoop distributions:

Pentaho uses an abstraction layer, called a shim, to keep up with the rapid and never-ending version updates of Hadoop distributions. The following tables show the currently known support and status for each distribution. A minor or patch version change generally does not require a shim update.

Upgrade your Big Data Plugin to version 1.3.3.1

The Big Data plugin has been updated to version 1.3.3.1 and is available for download.

This upgrade works with PDI 4.4 (Suite 4.8) and is compatible with both the Enterprise (EE) and Community (CE) editions of Pentaho. Additional shims that were not shipped with the updated plugin are available on the Additional Shims download page.

Important information about supported Hadoop versions

Pentaho does not ship all available shims with the product. Shims for older distributions, as well as shims created after a release, are available for download. If a note in the tables below says that a later shim also supports your version, Pentaho recommends using the later shim.

See Install Hadoop Distribution Shim for installation instructions.
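As a rough illustration of the last step of a shim installation, the sketch below switches the active shim in the text of a properties file. It assumes, and this page does not confirm, that the Big Data plugin selects its shim through an `active.hadoop.configuration` key in a `plugin.properties` file; check the installation instructions for the exact file and key. The function name `set_active_shim` is hypothetical.

```python
import re

# Assumption: the key name below is not stated on this page; verify it
# against the shim installation instructions before relying on it.
SHIM_KEY = "active.hadoop.configuration"


def set_active_shim(properties_text: str, shim: str) -> str:
    """Return the properties text with the active shim set to `shim`."""
    new_line = f"{SHIM_KEY}={shim}"
    # Match an uncommented "key=value" line; commented lines are left alone.
    pattern = re.compile(
        rf"^[ \t]*{re.escape(SHIM_KEY)}[ \t]*=.*$", re.MULTILINE
    )
    if pattern.search(properties_text):
        return pattern.sub(lambda _m: new_line, properties_text, count=1)
    # Key not present: append it, adding a newline separator if needed.
    needs_newline = bool(properties_text) and not properties_text.endswith("\n")
    separator = "\n" if needs_newline else ""
    return f"{properties_text}{separator}{new_line}\n"
```

The function only transforms text, so reading and writing the actual file is left to the caller. Calling `set_active_shim(text, "cdh42")` on a default configuration would replace a `hadoop-20` value with `cdh42`.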

Apache

| Hadoop Version | Shim | Pentaho Suite Ver | Download | Notes |
|---|---|---|---|---|
| 0.20.x | hadoop-20 | 4.8+ | included | |
| 1.0.x | NS* | | | No support planned (see this blog post) |
| 1.1.x | NS* | | | Not likely to be done, in favor of 1.2.x (PDI-9964) |
| 1.2.x | NS* | | | Possibly in a patch post 5.0, but not committed (http://jira.pentaho.com/browse/PDI-10393) |
| 2.x.x | NS* | | | Distro is Alpha |

Go to Apache releases

Cloudera

| Hadoop Version | Shim | Pentaho Suite Ver | Download | Notes |
|---|---|---|---|---|
| CDH3u3, u4 and u5 | cdh3U4 | 4.8+ | download | Support will be dropped in 5.0 |
| CDH4.0, 4.0.1, 4.1, 4.1.1 | cdh4 | 4.8+ | download | The cdh42 shim also supports this configuration |
| CDH4.1.2 | cdh412 | 4.8 + BD Plugin 1.3.2+ | download | The cdh42 shim also supports this configuration |
| CDH4.1.3 | cdh413 | 4.8 + BD Plugin 1.3.2+ | download | The cdh42 shim also supports this configuration |
| CDH4.2 | cdh42 | 4.8 + BD Plugin 1.3.2+ | included | Backward compatible with all earlier CDH4.x distros |
| CDH4.2.1 | cdh42 | 4.8 + BD Plugin 1.3.3.1+ | included | |
| CDH4.3 | cdh42 | 4.8 + BD Plugin 1.3.3.1+ | included | |
| CDH4.4.x | cdh42 | 4.8 + BD Plugin 1.3.3.1+ | included | |

Go to Cloudera releases

NOTE: The cdh42 shim supports all versions of CDH from 4.0 through 4.4.x.
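The Cloudera rows can be condensed into a small lookup. The following is a minimal sketch, not Pentaho code: the function name `cdh_shim` and the accepted version spellings are assumptions made for illustration. It follows the recommendation to prefer the later shim, so every CDH 4.0 through 4.4.x version resolves to cdh42 even where an older shim (cdh4, cdh412, cdh413) also exists. It does not check the required Big Data Plugin version.

```python
import re
from typing import Optional


def cdh_shim(version: str) -> Optional[str]:
    """Return the recommended shim for a CDH version, or None if unlisted."""
    v = version.strip().lower()

    # CDH3 update releases, e.g. "CDH3u4": the table lists u3, u4 and u5.
    m = re.fullmatch(r"(?:cdh)?3u(\d+)", v)
    if m:
        return "cdh3U4" if 3 <= int(m.group(1)) <= 5 else None

    # CDH4 releases, e.g. "CDH4.1.2" or "4.3": 4.0 through 4.4.x use cdh42.
    m = re.fullmatch(r"(?:cdh)?4\.(\d+)(?:\.\d+)?", v)
    if m:
        return "cdh42" if 0 <= int(m.group(1)) <= 4 else None

    return None
```

A `None` result only means the version does not appear in the table above; it says nothing about whether a custom shim could be built for it.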

DataStax

| Hadoop Version | Shim | Pentaho Suite Ver | Download | Notes |
|---|---|---|---|---|
| DSE 3.0.x | NS* | | | Possibly in a patch post 5.0, but not committed (PDI-8036) |
| DSE 2.2.x | NS* | | | No current plans to support |

Go to DataStax releases

Hortonworks

| Hadoop Version | Shim | Pentaho Suite Ver | Download | Notes |
|---|---|---|---|---|
| HDP 1.2.x | hdp12 | 4.8 + BD Plugin 1.3.2+ | included | |
| HDP 1.3.x | hdp13 | 4.8 + BD Plugin 1.3.2+ | download | |
| HDP 2.x | NS* | | | In a patch post 5.0 (PDI-8962) |
| HDP 1.1 for Win | NS* | | | In a patch post 5.0 (PDI-10266) |

Go to Hortonworks releases

Intel

| Hadoop Version | Shim | Pentaho Suite Ver | Download | Notes |
|---|---|---|---|---|
| IDH 2.3 | ihd23 | 4.8 + BD Plugin 1.3.2+ | download | |

Go to Intel releases

MapR

| Hadoop Version | Shim | Pentaho Suite Ver | Download | Notes |
|---|---|---|---|---|
| 1.1.3, 1.2.0 | mapr | 4.8+ | download | |
| 2.0.x | NS* | | | No support planned (PDI-9648) |
| 2.1.x | mapr21 | 4.8 + BD Plugin 1.3.2+ | included | |
| 3.0.x | NS* | | | Planned for immediately post 5.0 (PDI-10037) |

Go to MapR releases
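The tables mix exact versions (1.1.3) with wildcard entries (2.1.x). One way to match an installed version against such entries is shell-style pattern matching, shown here for the MapR rows. This is an illustrative sketch: the `MAPR_SHIMS` list and the `mapr_shim` function are hypothetical names, and treating `x` as "any patch level" is an assumption about how the table is meant to be read.

```python
from fnmatch import fnmatchcase
from typing import Optional

# (version pattern, shim) pairs transcribed from the MapR table.
# A shim of None stands for "NS" (not supported).
MAPR_SHIMS = [
    ("1.1.3", "mapr"),
    ("1.2.0", "mapr"),
    ("2.0.x", None),
    ("2.1.x", "mapr21"),
    ("3.0.x", None),
]


def mapr_shim(version: str) -> Optional[str]:
    """Return the shim for a MapR version, or None if unsupported or unlisted."""
    for pattern, shim in MAPR_SHIMS:
        # Turn the table's "x" placeholder into a glob wildcard.
        if fnmatchcase(version, pattern.replace("x", "*")):
            return shim
    return None
```

Because unsupported and unlisted versions both return `None`, a caller that needs to tell them apart would have to extend the sketch.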


* NS - Not supported. See Hadoop Configurations for information on how to create or modify a shim to support your configuration.

+ Pentaho Suite Ver is the earliest version of the Pentaho suite that supports this shim. Subsequent Pentaho versions also support this shim unless otherwise noted.
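The "earliest supported version" rule can be checked mechanically by comparing version numbers component by component. The sketch below parses requirement strings in the form used in the tables, such as `4.8+` or `4.8 + BD Plugin 1.3.2+`. The function names are hypothetical, and the "unless otherwise noted" exceptions (for example, cdh3U4 support being dropped in 5.0) are deliberately not modeled.

```python
import re


def parse_version(text: str) -> tuple:
    """Convert a dotted version such as '1.3.3.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_requirement(requirement: str, suite: str, bd_plugin: str) -> bool:
    """Check installed suite and BD plugin versions against a table entry."""
    m = re.fullmatch(
        r"(\d+(?:\.\d+)*)\+?(?:\s*\+\s*BD Plugin\s+(\d+(?:\.\d+)*)\+)?",
        requirement.strip(),
    )
    if m is None:
        raise ValueError(f"unrecognized requirement: {requirement!r}")
    min_suite, min_plugin = m.group(1), m.group(2)

    # Tuples compare element by element, so (1, 3, 3) < (1, 3, 3, 1).
    if parse_version(suite) < parse_version(min_suite):
        return False
    if min_plugin is not None:
        if parse_version(bd_plugin) < parse_version(min_plugin):
            return False
    return True
```

Under these assumptions, a suite at 4.8 with BD Plugin 1.3.2 satisfies `4.8 + BD Plugin 1.3.2+` but not `4.8 + BD Plugin 1.3.3.1+`.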


    The Pentaho support policy for Hadoop is available on the Pentaho Support Plan for Hadoop Distributions page.
