
Classifiers

Rotation Forest

Rodriguez et al.'s method for constructing an ensemble of trees using random subspaces and a principal components transformation applied to the input data. (weka.classifiers.meta.RotationForest). See:

Juan J. Rodriguez, Ludmila I. Kuncheva, Carlos J. Alonso (2006). Rotation Forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(10):1619-1630. (http://doi.ieeecomputersociety.org/10.1109/TPAMI.2006.211)

Thanks to Juan Rodriguez for this contribution.
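
To illustrate the "random subspaces" part of the description above, here is a minimal sketch of splitting the attribute indices into random disjoint groups. In the published method each group would then get its own principal components transformation, which is omitted here. The class and method names are hypothetical, and this is not the code in weka.classifiers.meta.RotationForest.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative only; names are hypothetical and not part of Weka. */
class AttributePartitionSketch {

    /**
     * Shuffles the attribute indices 0..numAttributes-1 and cuts them into
     * disjoint groups of at most groupSize indices each.
     */
    static List<List<Integer>> randomGroups(int numAttributes, int groupSize, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> groups = new ArrayList<>();
        for (int start = 0; start < numAttributes; start += groupSize) {
            int end = Math.min(start + groupSize, numAttributes);
            groups.add(new ArrayList<>(indices.subList(start, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Ten attributes in groups of three; the last group holds the remainder.
        System.out.println(randomGroups(10, 3, 42L));
    }
}
```

Each tree in the ensemble would draw a fresh partition, which is what makes the members differ from one another.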

Multi-class alternating decision trees

Alternating decision trees extended to handle multi-class problems. (weka.classifiers.trees.LADTree). See:

Geoffrey Holmes, Bernhard Pfahringer, Richard Kirkby, Eibe Frank and Mark Hall (2001). Multiclass alternating decision trees. Proceedings of the European Conference on Machine Learning. pp. 161-172. Springer.

Wrapper classifier for the LibLINEAR library

Access to the LibLINEAR library for fast linear support vector machines and logistic regression. (weka.classifiers.functions.LibLINEAR)

Thanks to Benedikt Waldvogel for contributing this wrapper.

Clusterers

K-means

K-means (weka.clusterers.SimpleKMeans) now has an option to use the Manhattan distance function in combination with the component-wise median as the cluster centroids.
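
The pairing of Manhattan distance with a median centroid is not arbitrary: the component-wise median is the point that minimizes the summed Manhattan distance to the cluster members, just as the mean does for squared Euclidean distance. The sketch below shows the two building blocks in plain Java. The class and method names are hypothetical; this is not the SimpleKMeans implementation.

```java
import java.util.Arrays;

/** Illustrative only; names are hypothetical and not part of Weka. */
class ManhattanMedianSketch {

    /** Manhattan (L1) distance: the sum of absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Component-wise median of a non-empty set of points of equal dimension. */
    static double[] medianCentroid(double[][] points) {
        int dims = points[0].length;
        double[] centroid = new double[dims];
        for (int d = 0; d < dims; d++) {
            double[] column = new double[points.length];
            for (int p = 0; p < points.length; p++) {
                column[p] = points[p][d];
            }
            Arrays.sort(column);
            int mid = column.length / 2;
            // Odd count: middle value. Even count: average of the two middle values.
            centroid[d] = (column.length % 2 == 1)
                    ? column[mid]
                    : (column[mid - 1] + column[mid]) / 2.0;
        }
        return centroid;
    }

    public static void main(String[] args) {
        // One outlier (100 in the first coordinate) barely moves the median centroid.
        double[][] cluster = { {1, 1}, {2, 8}, {100, 3} };
        double[] centroid = medianCentroid(cluster);
        System.out.println(Arrays.toString(centroid));           // [2.0, 3.0]
        System.out.println(manhattan(cluster[0], centroid));     // 3.0
    }
}
```

A mean centroid for the same three points would sit near 34 in the first coordinate, which is why the median variant is often preferred when outliers are expected.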

Attribute selection

Scatter search

The sequential scatter search algorithm. (weka.attributeSelection.ScatterSearchV1). See:

Felix Garcia Lopez (2004). Solving feature subset selection problem by a Parallel Scatter Search. Elsevier.

Thanks to Adrian Pino for this contribution.

Command line interface

Improved help

Information about a classifier or clusterer (in addition to its options) is now available from the command line by supplying the -info or -synopsis flag (in conjunction with the -h flag).

Averaged information retrieval statistics

Averaged AUC, F-measure, precision, recall, etc. are now available from the command line as well as in the Explorer and Experimenter GUIs. They appear in the "Weighted Avg." row of the per-class output:

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 1         0          1         1         1          1        Iris-setosa
                 0.96      0.04       0.923     0.96      0.941      0.992    Iris-versicolor
                 0.92      0.02       0.958     0.92      0.939      0.992    Iris-virginica
Weighted Avg.    0.96      0.02       0.96      0.96      0.96       0.994
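
As a rough check on the "Weighted Avg." row, the sketch below recomputes two of its entries from the per-class values. It assumes that each class's statistic is weighted by that class's number of instances, and that the three iris classes hold 50 instances each; neither the weighting scheme nor the counts are shown in the output above, so treat both as assumptions. The class and method names are hypothetical.

```java
/** Illustrative only; names are hypothetical and not part of Weka. */
class WeightedAverageSketch {

    /** Average of per-class values, each weighted by its class's instance count. */
    static double weightedAverage(double[] values, int[] classCounts) {
        double weightedSum = 0.0;
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            weightedSum += values[i] * classCounts[i];
            total += classCounts[i];
        }
        return weightedSum / total;
    }

    public static void main(String[] args) {
        int[] counts = {50, 50, 50};          // assumed class sizes for the iris data
        double[] tpRate = {1.0, 0.96, 0.92};  // per-class TP Rate from the table
        double[] fpRate = {0.0, 0.04, 0.02};  // per-class FP Rate from the table

        System.out.printf("TP Rate: %.2f%n", weightedAverage(tpRate, counts));
        System.out.printf("FP Rate: %.2f%n", weightedAverage(fpRate, counts));
    }
}
```

With equal class sizes the weighted average reduces to the plain mean of the per-class values; the two differ only when the classes are imbalanced.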

KnowledgeFlow

Usability

Usability of the KnowledgeFlow has been improved with a revamped status/log area.


Multi-threaded Classifier component

The Classifier component in the KnowledgeFlow is now multi-threaded and can learn models on multiple cross-validation folds concurrently.
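
The general idea of training on several folds at once can be sketched with a fixed thread pool, one task per fold. This is not the KnowledgeFlow code: the class name, the method names and the trivial "model" (the mean of the training labels) are hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only; names are hypothetical and not part of Weka. */
class ConcurrentFoldsSketch {

    /** Stand-in for training: the "model" is the mean label of the training split. */
    static double trainOnFold(double[] labels, int fold, int numFolds) {
        double sum = 0.0;
        int n = 0;
        for (int i = 0; i < labels.length; i++) {
            // Instance i is held out for testing in fold (i % numFolds).
            if (i % numFolds != fold) {
                sum += labels[i];
                n++;
            }
        }
        return sum / n;
    }

    /** Submits one training task per fold and collects the results in fold order. */
    static double[] trainAllFolds(double[] labels, int numFolds, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int fold = 0; fold < numFolds; fold++) {
                final int f = fold;
                Callable<Double> task = () -> trainOnFold(labels, f, numFolds);
                pending.add(pool.submit(task));
            }
            double[] models = new double[numFolds];
            for (int fold = 0; fold < numFolds; fold++) {
                models[fold] = pending.get(fold).get(); // blocks until that fold finishes
            }
            return models;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("fold training failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] labels = {1, 2, 3, 4, 5, 6};
        for (double model : trainAllFolds(labels, 3, 2)) {
            System.out.println(model);
        }
    }
}
```

Because the folds share no mutable state, the tasks need no locking; the only synchronization point is collecting the futures.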


PMML import

Support for import of PMML models (regression, general regression and neural networks) has moved into the main code base for Weka (weka.core.pmml and weka.classifiers.pmml.consumer). More information on Weka's support for PMML can be found here

Central logging

Weka now logs from the main GUIs to a central file ($HOME/weka.log).

