Added by Matt Casters, last edited by Jens Bleuel on Mar 08, 2012

Definition

A dynamic cluster is a cluster schema where the slave servers are only known at runtime.
This situation arises where hosts are added or removed at will, such as in cloud computing settings.

Components

These are the components of a dynamic cluster configuration:

Master slave server

The master slave server works just as it did before.  However, as of PDI 3.2 it has gained a new trick: it can accept slave server registrations.
Once a slave server is registered, the master checks it every 30 seconds to see whether it is still available to run ETL jobs.
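
The monitoring behaviour described above can be sketched roughly as follows. This is an illustration in Python, not Carte's actual code: the names SlaveRegistry, Detection and the probe callback are hypothetical, and only the 30-second interval comes from the text.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

CHECK_INTERVAL_SECONDS = 30  # interval stated in the text


@dataclass
class Detection:
    """Hypothetical record a master keeps for each registered slave."""
    name: str
    active: bool = True
    last_active: Optional[float] = None
    last_inactive: Optional[float] = None


class SlaveRegistry:
    """Hypothetical in-memory registry; not the real Carte implementation."""

    def __init__(self, probe: Callable[[str], bool],
                 clock: Callable[[], float] = time.time):
        self._probe = probe  # returns True if the named slave answers
        self._clock = clock
        self._slaves: Dict[str, Detection] = {}

    def register(self, name: str) -> None:
        """Accept a registration and mark the slave as active."""
        self._slaves[name] = Detection(name, True, self._clock())

    def check_all(self) -> None:
        """One monitoring pass: probe every registered slave once."""
        now = self._clock()
        for detection in self._slaves.values():
            detection.active = self._probe(detection.name)
            if detection.active:
                detection.last_active = now
            else:
                detection.last_inactive = now

    def active_slaves(self) -> List[str]:
        """Names of the slaves that answered the most recent probe."""
        return sorted(d.name for d in self._slaves.values() if d.active)

    def run_forever(self) -> None:
        """Repeat the monitoring pass at the fixed interval."""
        while True:
            self.check_all()
            time.sleep(CHECK_INTERVAL_SECONDS)
```

A slave that stops answering stays in the registry but is flagged inactive, so it can be picked up again if it comes back.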

Slave Server

The slave server gained a new trick as well: it can be configured through an XML file.  This is what Carte expects to find in it:

<slave_config>
  <masters>

    <slaveserver>
      <name>master1</name>
      <hostname>localhost</hostname>
      <port>8080</port>
      <username>cluster</username>
      <password>cluster</password>
      <master>Y</master>
    </slaveserver>
  </masters>

  <report_to_masters>Y</report_to_masters>

  <slaveserver>
    <name>slave4-8084</name>
    <hostname>localhost</hostname>
    <port>8084</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>N</master>
  </slaveserver>


</slave_config>

Node descriptions:

  • masters: lists the (master) slave servers to which this slave has to report.
    If this instance is itself a master, it contacts the other masters to get a list of all the slaves in the cluster.
  • report_to_masters: whether to send a message to the defined masters to let them know this instance exists (Y/N).
  • slaveserver: specifies the slave server details of this Carte instance.
    IMPORTANT: the username and password specified here are used by the master instances to connect to this slave.

NOTE: In a <slaveserver> section you can specify the <network_interface> parameter.  At Carte startup, the hostname is then replaced with the primary IP address of the given network interface.
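
To make the layout above concrete, here is a minimal sketch that reads such a file with Python's standard xml.etree.ElementTree module. It is not how Carte itself loads the file; the function names parse_server and parse_slave_config and the returned dictionary keys are made up for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE_CONFIG = """<slave_config>
  <masters>
    <slaveserver>
      <name>master1</name>
      <hostname>localhost</hostname>
      <port>8080</port>
      <username>cluster</username>
      <password>cluster</password>
      <master>Y</master>
    </slaveserver>
  </masters>
  <report_to_masters>Y</report_to_masters>
  <slaveserver>
    <name>slave4-8084</name>
    <hostname>localhost</hostname>
    <port>8084</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>N</master>
  </slaveserver>
</slave_config>"""


def parse_server(node):
    """Turn one <slaveserver> element into a plain dictionary."""
    return {
        "name": node.findtext("name"),
        "hostname": node.findtext("hostname"),
        "port": int(node.findtext("port")),
        "username": node.findtext("username"),
        "password": node.findtext("password"),
        "master": node.findtext("master") == "Y",
        # Optional element; None when it is not present.
        "network_interface": node.findtext("network_interface"),
    }


def parse_slave_config(xml_text):
    """Hypothetical helper: split the config into its three parts."""
    root = ET.fromstring(xml_text)
    report = root.findtext("report_to_masters", default="N").strip().upper()
    return {
        "masters": [parse_server(n) for n in root.findall("masters/slaveserver")],
        "report_to_masters": report == "Y",
        # Direct child only, so the entries under <masters> are not matched.
        "slave": parse_server(root.find("slaveserver")),
    }


config = parse_slave_config(SAMPLE_CONFIG)
print(config["slave"]["name"], "reports to",
      [m["name"] for m in config["masters"]])
```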

The Carte configuration XML file can be passed to Carte as a command line option, for example:

carte.sh /Pentaho/Kettle/slave_dyn_8082.xml
carte.bat \Pentaho\Kettle\slave_dyn_8082.xml

Sample configuration files are located in the pwd directory.

Transformation

On the transformation side, things become a bit easier: all we now need to do is check the "Dynamic" checkbox and specify a master slave server we want to talk to.
You can even specify multiple slave servers for fail-over purposes.

Related HTTP services

/kettle/getSlaves : retrieves the list of Slave Server Detections in XML. The layout of the document is this:

<SlaveServerDetections>
  <SlaveServerDetection>...</SlaveServerDetection>
  <SlaveServerDetection>...</SlaveServerDetection>
  <SlaveServerDetection>...</SlaveServerDetection>
  ...

</SlaveServerDetections>
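
A client could read such a response along these lines. This is a sketch under assumptions: it supposes that each SlaveServerDetection entry uses the layout shown for /kettle/registerSlave on this page and that the slaveserver element carries a name child as in the Carte configuration file. The function names are hypothetical, and fetching the document over HTTP is left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Matches the timestamp style used on this page, e.g. 2008/10/31 21:25:37
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"


def parse_date(text):
    """Return a datetime, or None for an empty or missing element."""
    text = (text or "").strip()
    return datetime.strptime(text, DATE_FORMAT) if text else None


def parse_detections(xml_text):
    """Hypothetical helper: list the detections in a getSlaves response."""
    root = ET.fromstring(xml_text)
    detections = []
    for node in root.findall("SlaveServerDetection"):
        detections.append({
            # Assumes <slaveserver> has a <name> child, as in the config file.
            "name": node.findtext("slaveserver/name"),
            "active": node.findtext("active", default="").strip() == "Y",
            "last_active_date": parse_date(node.findtext("last_active_date")),
            "last_inactive_date": parse_date(node.findtext("last_inactive_date")),
        })
    return detections
```

Filtering the result on the active flag gives the slaves that are currently usable.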

See also class: SlaveServerDetection

/kettle/registerSlave : allows a slave server to register with a master server.  This is the XML to send:

<SlaveServerDetection>
 <slaveserver> ...  </slaveserver>
 <active>Y</active>
 <last_active_date>2008/10/31 21:25:37</last_active_date>
 <last_inactive_date></last_inactive_date>
</SlaveServerDetection>

See also class: SlaveServer
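
As an illustration, the registration document could be assembled like this. The sketch assumes that the elided slaveserver element holds the same fields as in the Carte configuration file; build_registration is a hypothetical name, and the HTTP call itself is omitted because the page does not state the method or authentication details.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Matches the timestamp style used on this page, e.g. 2008/10/31 21:25:37
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"

# Assumed field list, copied from the <slaveserver> block of the config file.
SERVER_FIELDS = ("name", "hostname", "port", "username", "password", "master")


def build_registration(slave, now):
    """Hypothetical helper: build a SlaveServerDetection XML string."""
    detection = ET.Element("SlaveServerDetection")
    server = ET.SubElement(detection, "slaveserver")
    for key in SERVER_FIELDS:
        ET.SubElement(server, key).text = str(slave[key])
    ET.SubElement(detection, "active").text = "Y"
    ET.SubElement(detection, "last_active_date").text = now.strftime(DATE_FORMAT)
    # Left empty: the slave has not been seen as inactive yet.
    ET.SubElement(detection, "last_inactive_date")
    return ET.tostring(detection, encoding="unicode")


slave = {
    "name": "slave4-8084",
    "hostname": "localhost",
    "port": 8084,
    "username": "cluster",
    "password": "cluster",
    "master": "N",
}
print(build_registration(slave, datetime.now()))
```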

The carte configuration xml can be passed to carte by either

 carte.sh foo.xml

OR

 carte.bat foo.xml

http://forums.pentaho.org/showthread.php?t=70053

Comment: Posted by A L at Jun 04, 2009 14:19

can the user direct / customize the use of slaves say 25%, 25%, 50% against 3 slaves?

Comment: Posted by Sparjan Kote at Oct 08, 2013 10:22

Yes, you can accomplish this by partitioning (create a customized partitioner, an example is over here: http://funpdi.blogspot.de/2012/09/bucket-partitioner-plugin.html)

Comment: Posted by Jens Bleuel at Oct 09, 2013 05:12