Hitachi Vantara Pentaho Community Wiki
HierarchicalClusterer

Package

weka.clusterers

Synopsis

Hierarchical clustering class.
Implements a number of classic agglomerative (i.e. bottom-up) hierarchical clustering methods.

Options

The table below describes the options available for HierarchicalClusterer.

Option

Description

debug

If set to true, the clusterer may output additional info to the console.

distanceFunction

Sets the distance function, which measures the distance between two individual instances (or possibly the distance between an instance and the centroid of a cluster, depending on the link type).

linkType

Sets the method used to measure the distance between two clusters.
SINGLE:
finds the single link distance (aka minimum link), which is the smallest distance between any item in cluster1 and any item in cluster2
COMPLETE:
finds the complete link distance (aka maximum link), which is the largest distance between any item in cluster1 and any item in cluster2
ADJCOMPLETE:
as COMPLETE, but with an adjustment, which is the largest within-cluster distance
AVERAGE:
finds the average distance between the elements of the two clusters
MEAN:
calculates the mean distance of a merged cluster (aka group-average agglomerative clustering)
CENTROID:
finds the distance between the centroids of the clusters
WARD:
finds the change in information caused by merging the clusters. The information of a cluster is calculated as the error sum of squares of the centroid of the cluster and its members.
NEIGHBOR_JOINING:
uses the neighbor joining algorithm.
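
To make the difference between some of the link types concrete, the following is a minimal, self-contained sketch of the SINGLE, COMPLETE, AVERAGE and CENTROID measures, assuming Euclidean distance between instances. The class name LinkDistanceSketch and its methods are hypothetical and written only for illustration; this is not Weka's implementation, and it does not use the Weka API.

```java
/** Illustrative sketch only: hypothetical helper, not part of Weka. */
final class LinkDistanceSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** SINGLE: smallest distance between any item of c1 and any item of c2. */
    static double single(double[][] c1, double[][] c2) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] p : c1) {
            for (double[] q : c2) {
                best = Math.min(best, euclidean(p, q));
            }
        }
        return best;
    }

    /** COMPLETE: largest distance between any item of c1 and any item of c2. */
    static double complete(double[][] c1, double[][] c2) {
        double worst = 0.0;
        for (double[] p : c1) {
            for (double[] q : c2) {
                worst = Math.max(worst, euclidean(p, q));
            }
        }
        return worst;
    }

    /** AVERAGE: mean of all pairwise distances between the two clusters. */
    static double average(double[][] c1, double[][] c2) {
        double sum = 0.0;
        for (double[] p : c1) {
            for (double[] q : c2) {
                sum += euclidean(p, q);
            }
        }
        return sum / (c1.length * c2.length);
    }

    /** Component-wise mean of the points in a (non-empty) cluster. */
    static double[] centroid(double[][] c) {
        double[] mean = new double[c[0].length];
        for (double[] p : c) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += p[i] / c.length;
            }
        }
        return mean;
    }

    /** CENTROID: distance between the centroids of the two clusters. */
    static double centroidLink(double[][] c1, double[][] c2) {
        return euclidean(centroid(c1), centroid(c2));
    }

    public static void main(String[] args) {
        double[][] c1 = {{0, 0}, {0, 3}};
        double[][] c2 = {{4, 0}, {4, 3}};
        System.out.println(single(c1, c2));       // prints 4.0
        System.out.println(complete(c1, c2));     // prints 5.0
        System.out.println(average(c1, c2));      // prints 4.5
        System.out.println(centroidLink(c1, c2)); // prints 4.0
    }
}
```

For the two example clusters the pairwise distances are 4, 5, 5 and 4, so the measures disagree about how far apart the clusters are. This is why the choice of linkType can change which clusters are merged first.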

numClusters

Sets the number of clusters. If a single hierarchy is desired, set this to 1.
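
As a rough illustration of what the number of clusters means for a bottom-up method, the sketch below starts with one cluster per value and repeatedly merges the two closest clusters until the requested number remains. It assumes one-dimensional data and single link distance to keep the code short. The class AgglomerativeSketch is hypothetical and is not how Weka implements the clusterer.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: hypothetical helper, not part of Weka. */
final class AgglomerativeSketch {

    /** Single link distance between two clusters of 1-D values. */
    static double singleLink(List<Double> a, List<Double> b) {
        double best = Double.POSITIVE_INFINITY;
        for (double x : a) {
            for (double y : b) {
                best = Math.min(best, Math.abs(x - y));
            }
        }
        return best;
    }

    /** Merges the closest pair of clusters until numClusters remain. */
    static List<List<Double>> cluster(double[] values, int numClusters) {
        int target = Math.max(1, numClusters); // at least one cluster
        List<List<Double>> clusters = new ArrayList<>();
        for (double v : values) {
            List<Double> singleton = new ArrayList<>();
            singleton.add(v);
            clusters.add(singleton); // every value starts as its own cluster
        }
        while (clusters.size() > target) {
            int bestI = 0;
            int bestJ = 1;
            double best = Double.POSITIVE_INFINITY;
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    double d = singleLink(clusters.get(i), clusters.get(j));
                    if (d < best) {
                        best = d;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            // bestJ > bestI, so removing bestJ leaves index bestI valid.
            List<Double> absorbed = clusters.remove(bestJ);
            clusters.get(bestI).addAll(absorbed);
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 1.5, 5.0, 5.2, 9.0};
        System.out.println(cluster(values, 3)); // prints [[1.0, 1.5], [5.0, 5.2], [9.0]]
        System.out.println(cluster(values, 1)); // prints [[1.0, 1.5, 5.0, 5.2, 9.0]]
    }
}
```

With a target of 1 the merging continues until everything is joined, which corresponds to the single hierarchy mentioned above.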

printNewick

Flag to indicate whether the cluster should be printed in Newick format. This can be useful for display in other programs. However, for large datasets a lot of text may be produced, which may be a nuisance when the Newick format is not required.
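
For readers unfamiliar with the Newick format, the sketch below shows how a small binary cluster tree can be written as a Newick string, with nested parentheses for merges and a branch length after each colon. The NewickSketch class, its Node type, and the choice of branch length (parent merge height minus child merge height) are assumptions made for illustration. The labels and branch lengths that Weka actually prints may differ.

```java
/** Illustrative sketch only: hypothetical helper, not part of Weka. */
final class NewickSketch {

    /** A leaf (label set) or an internal merge node (left and right set). */
    static final class Node {
        final String label;  // non-null only for leaves
        final Node left;     // non-null only for merge nodes
        final Node right;    // non-null only for merge nodes
        final double height; // distance at which the merge happened; 0 for leaves

        Node(String label, Node left, Node right, double height) {
            this.label = label;
            this.left = left;
            this.right = right;
            this.height = height;
        }
    }

    static Node leaf(String label) {
        return new Node(label, null, null, 0.0);
    }

    static Node merge(Node left, Node right, double height) {
        return new Node(null, left, right, height);
    }

    /** Serializes the tree; a Newick string ends with a semicolon. */
    static String toNewick(Node root) {
        return subtree(root) + ";";
    }

    private static String subtree(Node n) {
        if (n.label != null) {
            return n.label;
        }
        // Branch length = height of this merge minus height of the child.
        return "(" + subtree(n.left) + ":" + (n.height - n.left.height)
             + "," + subtree(n.right) + ":" + (n.height - n.right.height) + ")";
    }

    public static void main(String[] args) {
        // A and B merge at distance 1.0; the result merges with C at 3.0.
        Node tree = merge(merge(leaf("A"), leaf("B"), 1.0), leaf("C"), 3.0);
        System.out.println(toNewick(tree)); // prints ((A:1.0,B:1.0):2.0,C:3.0);
    }
}
```

Every instance contributes a leaf and every merge a pair of parentheses, so the string grows with the dataset, which is why the output can become very long for large datasets.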

Capabilities

The table below describes the capabilities of HierarchicalClusterer.

Capability

Supported

Class

No class

Attributes

String attributes, Date attributes, Missing values, Unary attributes, Numeric attributes, Binary attributes, Nominal attributes, Empty nominal attributes

Min # of instances

0
