h1. Pentaho Big Data Plugin
{div:style="position: absolute; top: 0px; right: 0px;"}!http://ci.pentaho.com/job/pentaho-big-data-plugin/lastBuild/buildStatus!{div}
The Pentaho Big Data Plugin project provides support for the ever-expanding Big Data community within the Pentaho ecosystem. It is a plugin for the Pentaho Kettle engine and can be used within Pentaho Data Integration (Kettle), Pentaho Reporting, and the Pentaho BI Platform.
<TODO: convert this into a list of currently supported items?> The project's highlights include support for interacting with Apache Hadoop, Apache Hive, Apache HBase, MongoDB, and Cassandra, among other NoSQL data sources, from within the Pentaho ecosystem.

h2. Pentaho Big Data Plugin Features

This project contains the implementations for:
- Pentaho MapReduce: visually design MapReduce jobs as Kettle transformations
- HDFS File Operations
- Hive
- HBase
- Cassandra
- MongoDB

h1. Key Links

- SVN Repository: [svn://source.pentaho.org/svnkettleroot/pentaho-big-data-plugin] (GitHub mirror: <TODO>)
- Documentation: <TODO: add dev doc page and aggregate links to wiki pages such as [Cassandra Input|http://wiki.pentaho.com/display/EAI/Cassandra+Input], [Cassandra Output|http://wiki.pentaho.com/display/EAI/Cassandra+Output], [MongoDB Input|http://wiki.pentaho.com/display/EAI/MongoDB+Input], and [MongoDB Output|http://wiki.pentaho.com/display/EAI/MongoDB+Output]>
-- Link to Kettle plugin development
- CI: [pentaho-big-data-plugin|http://ci.pentaho.com/job/pentaho-big-data-plugin]
- Download: <TODO>

h1. Community and where to find help

The [Big Data Forum|http://forums.pentaho.com/forumdisplay.php?301-Big-Data] exists for both users and developers. The community also manages the ##pentaho IRC channel on irc.freenode.net.

h1. Quick Start: Building the project

The Pentaho Big Data Plugin is built with [Apache Ant|http://ant.apache.org/] and uses [Apache Ivy|http://ant.apache.org/ivy/] for dependency management. You only need Ant 1.8.0 or newer to build the project; the build scripts download Ivy if it is not already installed.

{code}
svn co svn://source.pentaho.org/svnkettleroot/pentaho-big-data-plugin/trunk pentaho-big-data-plugin
cd pentaho-big-data-plugin
ant
{code}
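
Because the build requires Ant 1.8.0 or newer, it can help to check the installed version before running {{ant}}. The following is a minimal sketch, not part of the project's build scripts: the function names {{parse_ant_version}} and {{version_at_least}} are hypothetical, and the parser assumes the usual {{ant -version}} banner of the form "Apache Ant(TM) version X.Y.Z compiled on ...".

```shell
#!/bin/sh
# Hypothetical helpers for checking the Ant version before a build.
# They are illustrative only and not part of pentaho-big-data-plugin.

# Read an `ant -version` banner on stdin and print the dotted version.
# Assumes the banner contains "version X.Y.Z".
parse_ant_version() {
    sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

# Succeed (exit 0) when dotted version $1 is at least dotted version $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_at_least() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}
```

One possible use: capture the version with {{installed=$(ant -version | parse_ant_version)}}, then run {{version_at_least "$installed" 1.8.0 || echo "Ant 1.8.0 or newer is required"}} before invoking the build.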

h1. Developing with Eclipse

We recommend [Apache IvyDE|http://ant.apache.org/ivy/ivyde/] to manage your Ivy dependencies within Eclipse.

# Import pentaho-big-data-plugin into Eclipse
# Resolve the project using IvyDE

If IvyDE is not an option, you can manually add the jars from lib/ and libswt/ to your classpath. This project, like all other Pentaho projects, uses the open-source [Subfloor|http://code.google.com/p/subfloor/] Ant build framework. Running the following targets configures the Eclipse project to reference the required libraries:

{code}
ant resolve create-dot-classpath
{code}

Then import or refresh the project in Eclipse and add the SWT libraries for your architecture, e.g. for Mac OS X x64:

!osx-swt-jars.png|border=1!
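
For the manual route mentioned above (adding the jars from lib/ and libswt/ yourself), it can be useful to list exactly which jars are present. The sketch below is an illustration only: {{jar_classpath}} is a hypothetical helper, and it assumes the jars may sit in subdirectories of the given directories, so it searches them recursively.

```shell
#!/bin/sh
# Hypothetical helper, not part of the project: print a ':'-separated
# list of every .jar file found under the directories given as arguments.
jar_classpath() {
    # Search recursively, sort in byte order for a stable result,
    # then join the paths into a single line separated by ':'.
    find "$@" -type f -name '*.jar' 2>/dev/null | LC_ALL=C sort | paste -s -d: -
}
```

Run from the project root, {{jar_classpath lib libswt}} prints the jars that would need to be added; paths containing newlines are not handled by this sketch.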